Fork me on GitHub
#babashka
<
2022-12-29
>
keychera08:12:36

Hello everyone, I am having trouble with babashka (v1.0.168) and bootleg. I am trying to catch the exception from the function convert-to consuming invalid html string and print information about the exception but I can’t seem to catch it, here’s the small snippet to reproduce the issue

(require '[babashka.pods :as pods])
(pods/load-pod 'retrogradeorbit/bootleg "0.1.9")
(require '[pod.retrogradeorbit.bootleg.utils :as utils])

(try (-> (slurp "./invalid.html") (utils/convert-to :hickory))
     (catch Exception e (println (-> e Throwable->map :cause))))
with the following invalid.html
<div id="wrong attr" ""></div>
I got full stack trace of Cannot read EDN:... from [clojure.lang.Util runtimeException "Util.java" 221] printed on the terminal and the script does not terminate, what is going on?

borkdude09:12:02

That's not what I'm seeing, I get:

$ bb /tmp/dude.clj
./invalid.html (No such file or directory)

keychera09:12:11

@U04V15CAJ how about this code without slurping

(require '[babashka.pods :as pods])
(pods/load-pod 'retrogradeorbit/bootleg "0.1.9")
(require '[pod.retrogradeorbit.bootleg.utils :as utils])

(try (-> "<div id=\"wrong attr\"\"></div>" (utils/convert-to :hickory))
     (catch Exception e (println (-> e Throwable->map :cause))))

borkdude09:12:19

I'll try later today, thanks for the clarification

1
borkdude12:12:27

I looked into it. I think what we should do is convert the read error into an exception response in the pod loop. I'll make an issue for this

💯 1
👀 1
borkdude14:12:42

I found a solution

keychera16:12:53

so quick! I saw your commits in the repo, thank you very much @U04V15CAJ! though I can’t try it immediately but I’ll let you know after I try it again

borkdude16:12:51

@U042NRFKHQE which os / are are you using?

keychera17:12:13

I am using MacOS Monterey (v12.4)

borkdude19:12:41

ok, you can install and test it with:

bash <(curl ) --dev-build --dir /tmp
and then /tmp/bb your-script.clj

keychera07:12:16

@U04V15CAJ I just tried it, it seems like it still hangs like previously, I made sure I’m using this version

Babashka v1.0.169-SNAPSHOT

borkdude08:12:36

Are you not using m1? This might explain it since the intel build didn’t run

keychera08:12:00

Yep, it's not M1

borkdude10:12:06

yep, just repeat the --dev-build install once again

keychera10:12:10

just did, it’s working now! printed this successfully and script terminates

Cannot read EDN:  "{:type :element, :attrs {:id \"wrong attr\", :\" \"\"}, :tag :div, :content nil}"
Invalid token: :

borkdude20:12:42

I'm going to look at babashka.http-client. Two small design thingies: 1. {:url ..} or {:uri ..}? The Java terminology is :uri but :url is used in almost all existing clojure http client libraries 2. headers: {"foo" ["a" "b"] "bar" ["a"]} (always use a vector for header values, even if there is one - or {"foo" "a", "bar" ["a" "b"]} (if there is only one value in the headers, don't use a vector) 🧵

borkdude20:12:40

About 1: it seems the question here is drop-in compatibility or not About 2: most headers, like 99% of them are single values, but with set-cookie for example, you can have multiple... Most existing clojure clients use variant 2...

skynet20:12:21

re 1: the URIs that aren't also URLs would be invalid for an http client right?

borkdude20:12:26

Don't know, let me try..

borkdude20:12:15

Message:  invalid URI scheme file
But I think there are URIs like domain socket things that are valid

👍 1
borkdude20:12:40

(I've yet to see a working example of this)

isak20:12:25

To me it seems like following what the existing clj http libraries do makes sense, since the alternative isn't obviously better in every way

borkdude20:12:52

TBH the vector thing: I went with the "dynamic type" if you will in babashka.curl, because I only discovered later that the same header can occur twice and then to not break backwards compatibility, you're kind of forced to do it this way. I can imagine the same thing happened to those other libs earlier on.

isak20:12:48

Ah, interesting

isak20:12:16

I wonder if there are any that would be incorrect to send multiple times

isak20:12:44

Like Content-Length maybe? If so I think that is an argument for the dynamic approach too

👍 1
borkdude20:12:31

Yeah, one could argue that the Java result always has a List because dynamic typing is much harder in Java

1
isak20:12:49

Interesting, not super statically typed out, like plucking out well known properties

borkdude21:12:04

well, the structure is statically typed as Map<String,List<String>> but they have convenience functions for getting the Optional<String> first element ;)

🙂 1
lispyclouds08:12:38

I would second the approach of header values as list or values or a single value. im also fine with it always taking a list too. this should simplify some contajners code 😄

lispyclouds08:12:00

always being a list of values seems more consistent

lispyclouds08:12:33

also this happens in query parameters too

borkdude10:12:58

yes, I'm sensitive to this "consistency" argument, but there's also the "drop-in" argument... Perhaps we could offer an API function to get the first header: (header response "Content-Length") or so

lispyclouds10:12:07

not sure whats the perspective behind the drop-in argument is, is it something the other http libs offer?

borkdude10:12:46

no, but having it "single value or multiple values" will make babashka.http-client work better as a drop-in for babashka.curl, clj-http, etc, etc

borkdude10:12:56

the ring spec applies to http servers though

lispyclouds10:12:56

right, makes sense! im happy with supporting both. being able to take in a vec is great, everything else is a cherry on top 🙂

borkdude10:12:35

I was referring to the response, but I didn't mention this yet - sorry

borkdude10:12:30

so if the response does something else than clj-http, babashka.curl, etc you can't use it as a drop-in

lispyclouds10:12:02

yeah i got confused, makes more sense now

lispyclouds10:12:39

but since it came up, would be nice to have vecs in requests too? 😉

borkdude10:12:08

yes, it's easy to support more flexible input, but not the other way around, so that's not the issue

lispyclouds10:12:34

as for the convenience fn, is that something people do often? most of my experience is to get the whole thing out and deal with it myself

borkdude10:12:03

@U7ERLH6JX Yes, but if you're going to deal with it yourself, you have to know if the format is dynamic (list or single) or not

lispyclouds10:12:39

yeah i mean we can define the spec as value or vec of values and not have the convenience fn. not sure of the value benefit of it?

borkdude10:12:29

if the server decides to send content-length twice, and you expect a single value, your calls could break, unless you explicitly check for vector or single value

borkdude10:12:30

I think I'm just going to re-implement babashka.curl in babashka.http-client since it's the API that follows the conventions that already exist

lispyclouds10:12:09

> you expect a single value, your calls could break the way im thinking here is that maintaining the convenience fn is more work than just setting the expectation of a value | [value] but also agree to the breakage argument

borkdude10:12:57

more work vs having the flexibility to make breaking changes by saying: always use the fn to get the header rather than relying on the raw response

borkdude10:12:11

it's not a lot of work though

lispyclouds10:12:23

yep, its quite easy to write it. my feelings are more from we are adding fns over data and somehow down the line people would go more and more around it as they have use case for multi headers. the data > fn clojure thoughts 🙂

lispyclouds10:12:27

maybe the first version doesnt have this fn and when people ask for it we can add it? people who are using the raw data can continue using it

borkdude10:12:42

the idea of the fn would be to protect people from breaking changes to the data, so that doesn't really make sense

borkdude10:12:37

anyway, got the first tests running, I'll just copy paste bb.curl's test and will make it work

borkdude10:12:29

it would be helpful if someone could look into running a non-third party service to run http tests as they are the single most reason for flakiness on bb CI

lispyclouds10:12:30

it has a docker image

borkdude10:12:09

See for example https://ci.appveyor.com/project/borkdude/babashka/builds/45797186 ERROR in (test-basic-response) (C:\projects\babashka\.\test-resources\lib_tests\hato\client_test.clj:94) verbs exist expected: (= 200 (:status (head "https://httpbin.org/status/200"))) actual: clojure.lang.ExceptionInfo: status: 502 This happens routinely

borkdude10:12:54

Sure, if you know how to run this in circleci / github actions / appveyor...?

lispyclouds11:12:15

should work. they all should have docker

lispyclouds11:12:34

i'll look into it, this should be in the http-client repo's CI right?

borkdude11:12:49

also in bb's own CI (see broken build above)

borkdude11:12:41

another less complex option could be to poll the service before we run the test, if it returns a 502, we just skip the test or so

borkdude11:12:02

would be good to think about a couple alternatives to not make things too complex. e.g. I don't like to run docker locally to run these tests

borkdude11:12:14

another alternative: run this service ourselves and use a token or so

lispyclouds11:12:29

yeah or some Go thing

borkdude11:12:09

or maybe I can just disable these tests on master, while still running them in branches, this should be enough to make sure they keep working (if we don't commit directly to master)

borkdude11:12:21

also it won't break builds during releases, which is the most annoying thing about all of this

lispyclouds11:12:12

yeah this seems to be the first thing to do, since all of the breakages are external. i can find a simpler server thing til then

borkdude11:12:55

The tests are mainly there to ensure that Java interop for http client libraries is working

borkdude11:12:07

and that the SCI mapping for built-in libs is correct

borkdude11:12:20

ok, let's just disable on master

borkdude11:12:41

let's use master / main since I might rename to main in the near future

1
borkdude11:12:01

we could use metadata ^:flaky and disable all flaky tests on the default branch

borkdude12:12:45

I'd say it's quite annoying that the default JDK client doesn't follow redirects. E.g. now you have to do:

(client/get ""
                               {:client (client/client {:follow-redirects :always})})

borkdude12:12:31

I think I'll make bb http-client's default client always follow redirects

borkdude14:12:17

Now ported most of what's in babashka.curl, except for throwing on exceptional status codes, will do that later today https://github.com/babashka/http-client

lispyclouds14:12:28

It says its a drop in replacement for bb curl, but we cant do unix sockets with this right?

borkdude14:12:05

it says mostly

borkdude14:12:40

bb curl won't disappear

borkdude14:12:52

but the unix domain thing will be fixed in 5 years probably ;)

lispyclouds14:12:42

yeah one can be hopeful

bherrmann00:12:34

What was wrong with babaska.curl ? How do I choose between it and the soon-to-be built-in babashka.http-client ? Will babashka.http-client also run on JVM clojure? Perhaps the trade-off/rational could be in the babashka.http-client readme? THANKS!

borkdude07:12:26

Http client will be the recommended one. Bb curl will still be included. Trade-off is that it doesn’t call out to curl so no OS process per call. Also http client supports async.

lispyclouds07:12:25

Also since http kit is included and its client also has these features, would that be marked as deprecated?

borkdude07:12:44

Httpkit client has another set of problems. E.g. the entire response has to be kept in memory at once. Yes, it will also be deprecated

borkdude07:12:47

In practice it’s fine for most scripting but bb client solves all of these (and eventually the Unix domain too)

lispyclouds07:12:14

yeah i was almost always using https://github.com/schmee/java-http-clj for this. good to have an official solution 😄

Al Baker22:12:01

@U04V15CAJ URLis notorious in the java community as having side effects (ie DNS resolution when you create an object), versus URI which enforces the IETF structure, but has no side effects. re: should you support multiple schemes, you could make it open, so http and https resolve, but I could say use tel: handler that I then choose to map that scheme to Twilio and things like that

👍 2