#babashka
2022-01-10
Ben Sless13:01:42

"for the JVM but in babashka it will be slower :/" What's the reason string builders are slower on babashka?

borkdude13:01:47

Because interop is interpreted (and currently not very optimized), such optimization tricks take more time to run than going through the (precompiled!) core functions.

Ben Sless13:01:41

Any thoughts or plans on optimizing it somehow?

Ben Sless13:01:58

I remember 😞

borkdude13:01:29

but this can be mitigated by using reader conditionals and just using the core funs in babashka
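
A minimal sketch of that mitigation, assuming a `.cljc` file and a hypothetical `join-strings` helper (neither comes from the thread). The `:bb` branch sticks to precompiled core functions, while the `:clj` branch keeps the `StringBuilder` trick for the JVM. The `:bb` branch is listed first because babashka takes whichever of `:bb` or `:clj` appears first.

```clojure
;; Save as a .cljc file so reader conditionals are also allowed on the JVM.
(ns example.strings
  (:require [clojure.string :as str]))

(defn join-strings
  "Concatenates a collection of strings.
  Hypothetical helper, for illustration only."
  [strs]
  #?(:bb  (str/join strs)                  ; babashka: precompiled core function
     :clj (let [sb (StringBuilder.)]       ; JVM: explicit StringBuilder interop
            (doseq [^String s strs]
              (.append sb s))
            (.toString sb))))
```

Both branches return the same string, so callers do not need to know which one was selected.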

borkdude13:01:43

and yes, I've long wanted to optimize the interop by using a callsite cache

borkdude13:01:54

but I haven't gotten to it yet

borkdude13:01:16

I still doubt it would be as fast as using the core functions, but it would be much better

borkdude13:01:32

feel free to look into it of course

Ben Sless13:01:46

Maybe I will 🙂

borkdude13:01:01

What happens is that on each interop call there is reflection but the MethodHandle could be cached

borkdude13:01:24

also in analysis the MethodHandle could already be looked up if there are enough type hints

borkdude13:01:53

but this is a trade-off between startup time and runtime, because for functions that never get called, doing this lookup during analysis is just extra cost

borkdude13:01:25

this is why a callsite cache may be a good trade-off
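
The callsite-cache idea can be sketched in plain JVM Clojure as below. This is only an illustration under stated assumptions, not how babashka or SCI implements interop: `find-method` and `make-callsite` are hypothetical names, the method matching ignores primitive parameters and overload ranking, and it only works for public methods on public classes. The point is that the reflective lookup runs once per combination of classes seen at a callsite, and later calls reuse the cached `MethodHandle`.

```clojure
(ns example.callsite-cache
  (:import (java.lang.invoke MethodHandle MethodHandles)
           (java.lang.reflect Method Modifier)))

(defn- find-method
  "Reflectively finds a public instance method by name whose parameter types
  accept the given argument classes. Simplified: primitive parameters never
  match boxed arguments, and the first candidate wins."
  [^Class klass method-name arg-classes]
  (->> (.getMethods klass)
       (filter (fn [^Method m]
                 (let [params (.getParameterTypes m)]
                   (and (= method-name (.getName m))
                        (not (Modifier/isStatic (.getModifiers m)))
                        (= (count params) (count arg-classes))
                        (every? true?
                                (map (fn [^Class p ^Class a]
                                       (or (nil? a) (.isAssignableFrom p a)))
                                     params arg-classes))))))
       first))

(defn make-callsite
  "Returns a fn [target & args] representing one interop callsite.
  Reflection happens only on a cache miss; hits reuse the MethodHandle."
  [method-name]
  (let [cache (atom {})]
    (fn [target & args]
      (let [k  (into [(class target)] (map class) args)
            mh (or (get @cache k)
                   (let [m (find-method (class target) method-name (rest k))]
                     (when-not m
                       (throw (IllegalArgumentException.
                               (str "No matching method: " method-name))))
                     (let [mh (.unreflect (MethodHandles/publicLookup) ^Method m)]
                       (swap! cache assoc k mh)
                       mh)))]
        (.invokeWithArguments ^MethodHandle mh
                              ^java.util.List (into [target] args))))))
```

For example, `(def append (make-callsite "append"))` followed by repeated `(append sb "foo")` calls on a `StringBuilder` performs the reflective search only on the first call.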

borkdude13:01:42

"More research needed"

borkdude13:01:01

but in short, this explains why using "pure" clojure is much faster in babashka

Ben Sless13:01:57

I thought you could even prepare a bunch of hinted functions and replace them at analysis time if you have enough info

borkdude13:01:29

do you mean, have a #(.append ^StringBuilder x ^String y) function and then replace?

borkdude13:01:43

but wouldn't this mean you would have to prepare a gazillion functions?

Ben Sless13:01:53

How many classes do you support?

Ben Sless13:01:22

So emitting functions for all methods would be prohibitive

borkdude13:01:50

This is how I implemented interop in the very beginning. :)

borkdude13:01:57

when I just had a few classes

Ben Sless13:01:19

You could still do that for common cases

Ben Sless13:01:55

Then fallback to a cached method handle

borkdude13:01:49

Could pre-generate an "index" of all methods. If the instance method name is unique, we already know what to call without even knowing the arg types

borkdude13:01:31

and with some type hint info could resolve more things
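
A rough sketch of such an index, assuming plain JVM reflection and a hypothetical `unique-instance-methods` function (not part of babashka). It keeps only the method names that map to exactly one public, non-static, non-bridge method on a class, which are the names that could be resolved without any argument type information.

```clojure
(ns example.method-index
  (:import (java.lang.reflect Method Modifier)))

(defn unique-instance-methods
  "Returns a map of method name -> Method for every public instance method
  of klass whose name is not overloaded. Compiler-generated bridge methods
  are skipped so covariant overrides do not count as overloads."
  [^Class klass]
  (->> (.getMethods klass)
       (remove (fn [^Method m]
                 (or (.isBridge m)
                     (Modifier/isStatic (.getModifiers m)))))
       (group-by (fn [^Method m] (.getName m)))
       (keep (fn [[method-name methods]]
               (when (= 1 (count methods))
                 [method-name (first methods)])))
       (into {})))
```

For `java.lang.String`, a name like `isEmpty` ends up in the index, while an overloaded name like `indexOf` does not and would still need type hints or a runtime lookup.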

kokada14:01:21

This is a fascinating thread BTW

borkdude12:01:53

A related issue: https://github.com/babashka/babashka/pull/1138#issuecomment-1010991962 These kinds of fast-assoc optimizations will just make things slower in babashka (or, in this case, not work at all because that method isn't exposed). So in this case it's better to introduce a :bb reader conditional which does not use the JVM-optimized version.

teodorlu15:01:00

What's an idiomatic way to turn a seq back into string lines, for piping into a command?

seq 10 | bb -e '(->> (line-seq (io/reader *in*)) (drop 2) (drop-last 2))'
("3" "4" "5" "6" "7" "8")

# how can I make it print
3
4
5
# ... ?

borkdude15:01:52

Either (run! println ...) or bb -o ...

👍 1
borkdude15:01:55

you can also replace (line-seq ...) with *input* and then use bb -io '(->> *input* (drop 2) (drop-last 2))'

👍 1
teodorlu15:01:33

seq 10 | bb -io '(->> *input* (drop 2) (drop-last 2))'
3
4
5
6
7
8
Wow - that's actually more compact than I can think of in Bash, AWK, or other Unix tools. Nice.

❤️ 1
joshuamendoza15:01:14

seq 10 | head -n-2 | tail -n+3
3
4
5
6
7
8
This is more succinct in Unix tools terms. However, I don't know how well it performs on large inputs.

🙌 1
teodorlu15:01:57

I guess head and tail are shorter in character count. Still -- I didn't know you could run head and tail in "inverse" (`-n -2`), whereas I was already familiar with drop and drop-last. Another Babashka pro is simple filtering in the middle of the pipe:

seq 10 | bb -io '(->> *input* (drop 2) (remove #{"4" "7"}) (drop-last 2))'
3
5
6
8

borkdude15:01:25

Don't tell anyone but you can also pipe "infinite" collections with babashka.

$ bb -O '(range)' | bb -I '(take 3 *input*)'
(0 1 2)

🤯 3
Ben Sless19:01:59

Used babashka to parse traced pcap files and then ran regression tests for a service, thanks borkdude!

🎉 2