2023-11-11
I am playing around with a faster version of apply for which I end up calling ArrayList.addAll indirectly via AFn from clojure.core. This results in the following error:
[java] ... 1 more
[java] Caused by: java.lang.NullPointerException: Cannot invoke "java.util.Collection.toArray()" because "c" is null
It appears the Clojure compiler - when compiling core - recursively parses all the Java it finds with a parser that doesn't understand Java generics. Is this due to an older version of Java used just to compile core itself, and if so, can I change just this version of Java in my fork?
The faster version of apply does work, but I am just playing around with various ideas I have w/r/t the Clojure compiler itself.
I should have pasted more of the stacktrace. This is during the compilation of core and isn't due to running code but due to the parsing of the code itself.
Do you have a repo with the code we can look at?
In clojure core.clj there are a lot of (nil? x) checks in cond and friends, whereas in a lot of code the variable itself would just be repeated - this is just curiosity.
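For illustration (not from the thread): the two styles being contrasted look roughly like this, and note that the explicit nil? check also treats false differently from a plain truthiness test.
(defn pick-explicit [x]
  ;; explicit nil? check, as in cond and friends in core.clj
  (cond
    (nil? x) :default
    :else x))

(defn pick-truthy [x]
  ;; repeat the variable and rely on truthiness
  (if x x :default))

;; (pick-explicit false) => false
;; (pick-truthy false)   => :default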
That specific pathway - the addAll pathway - isn't the most efficient thing you could do there - it was just what I could do quickly to test things out.
The most efficient thing is to have concrete list implementations that do the concatenation implicitly.
I tried to change the RestFn implementation so the fixed args were always a list, but the code pwned me and I backed off.
What are the performance changes?
Some calls are faster, some slower. The best case scenario, calling with some random access data such as a persistent vector, e.g. (apply f [1 2 3 4]), was quite a bit faster, 3x or more. But as you add scalar arguments in front, (apply f a '(b c d e)), the gains dropped off, from 2x in the best case to 20-30% slower in some cases such as (apply f '(1 2 3 4)). So: much faster if you have a random access argument, down to somewhat slower if you have a 'seq'-based argument. I like the tradeoff as it means people who care about performance can use apply, or rather .applyTo, with an appropriate argument and get decent performance. The impetus for this was https://cnuernber.github.io/ham-fisted/ham-fisted.lazy-noncaching.html#var-cartesian-map and that design - a single tupled argument - will beat any design assuming .apply, but it requires buy-in from the user. In cases like generic map (or https://cnuernber.github.io/dtype-next/tech.v3.datatype.html#var-emap) with more than Y arguments you can get situations where you really want to use apply repeatedly in what amounts to a tight loop. In those cases I both want to do minimal mutable updates to the argument container and I want to call .applyTo with that container as-is, without conversion to a seq.
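For reference, the three call shapes being compared, with + standing in for f (purely illustrative; the speedups quoted above come from the poster's benchmarks, not from these calls):
(apply + [1 2 3 4])      ;; random-access argument: the ~3x faster case
(apply + 1 '(2 3 4 5))   ;; scalar in front of a seq argument: smaller gains
(apply + '(1 2 3 4))     ;; pure seq argument: the somewhat slower case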
In any case - thanks to you, ghadi, and hiredman - I upgraded my jdk, I guess with brew update, and I had never seen that NPE before, and since it was during the compilation process I was confused. It helps to know that nothing parses Java during the compilation process - in retrospect it was an odd assumption to make.
@UDRJMEFSN What about making cartesian-map an inline function that will repack the varargs to a container of your preference? I think it is much easier than trying to speed up apply, which even then will likely still lag behind performance-wise.
This is a good question. The first answer is that, due to the interface of IFn, RestFn will still get an ISeq implementation, which is far slower to work with than the underlying argument container - this is unnecessary. Honestly the only change I really want is to make .applyTo in IFn take an Object and not an ISeq; then you could change nearly nothing else aside from avoiding calling seq in apply itself, and leave it up to the specific implementation of IFn. Those are the majority of changes I made, but I went a step further, making the actual implementation of AFn faster. This is more general and allows someone to get the max performance out of an apply situation that may need it, and it would transparently apply to all Clojure programs with no changes aside from changes in libraries like a few of mine that manipulate the interface of IFn.

For cartesian-map I carefully mutably update the argument container for max performance - that container is an object array and I can either create an instance of java.util.RandomAccess from it, or create a seq, or do nothing. A variadic fn that simply passes the argument through to the code would work great in that case, if the object array isn't converted to something else that has far worse access semantics. So the only strong argument I have is to change the interface of IFn. Everything else can be worked around in code that has different constraints than the general constraints -- threadsafe lazy caching -- of Clojure's core libraries. Even that argument is weakened because, as I mentioned before, for special cases like cartesian-map, simply passing a single argument which is the argument vector, and having the user adjust to a single argument vector, bypasses this whole discussion and is likely faster at the end of the day. That is analogous to your variadic suggestion, except it avoids any manipulation by the Clojure core function machinery that would change and weaken the argument.

Honestly I think we have all seen apply-related noise show up in the profile and I think that is overall unnecessary. Apply can be quite a bit faster and it isn't 'hard', it just needs doing, so I wanted to see how fast it could be.
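A small illustration of the constraint being described (current behavior, not the proposed change): IFn.applyTo only accepts an ISeq, so apply has to realize a seq even when the arguments already sit in a random-access container.
;; works today because the vector is first converted to a seq
(.applyTo + (seq [1 2 3 4]))   ;; => 10
;; the change sketched above would let .applyTo take the container itself
;; (an Object), so an implementation could index into a vector or object
;; array directly instead of walking a seq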
@UDRJMEFSN Here's my take on your problem. This is what I get when I bench your updated cartesian-map:
(api/sum-fast
(cartesian-map
#(h/let [[a b c d] (lng-fns %)]
(-> (+ a b) (+ c) (+ d)))
[1 2 3]
[4 5 6]
[7 8 9]
[10 11 12 13 14]))
;; Time per call: 2.17 us Alloc per call: 1,272b
This is what I get when I bench the original cartesian-map that used apply:
(api/sum-fast
(cartesian-map-apply
#(-> (+ %1 %2) (+ %3) (+ %4))
[1 2 3]
[4 5 6]
[7 8 9]
[10 11 12 13 14]))
Time per call: 18.21 us Alloc per call: 27,137b
Then, I did the following: renamed the original apply-based cartesian-map-apply to cartesian-map-apply* and made it accept a single arity [f args]. I replaced (.applyTo f val-seq) with (f val-seq). Finally, I made a macro facade for this function:
(defmacro cartesian-map-apply [f & args]
  (let [arr (gensym "arr")]
    `(let [f# ~f]
       (cartesian-map-apply*
        (fn [^clojure.lang.ArraySeq val-seq#]
          (let [~arr (.array val-seq#)]
            (f# ~@(for [i (range (count args))]
                    `(RT/aget ~arr (unchecked-int ~i))))))
        [~@args]))))
Benching this version, I get:
Time per call: 2.55 us Alloc per call: 744b
Not too bad, eh? I made cartesian-map-apply a macro so that the PoC is easier to understand. But it can be made an inline function, so that when invoked literally it expands into what I wrote in the macro, but if called via some HOF, it resorts to using apply.
Having said that, 2us for something so trivial still feels woefully slow :thinking_face: Abstractions be damned
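A minimal sketch of that inline-function idea, assuming the cartesian-map-apply* helper from above (the :inline expansion mirrors the macro above; calls through a HOF fall back to apply):
(defn cartesian-map-apply
  ;; sketch: same expansion as the macro when the call site is literal
  {:inline (fn [f & args]
             (let [arr (gensym "arr")]
               `(let [f# ~f]
                  (cartesian-map-apply*
                   (fn [^clojure.lang.ArraySeq val-seq#]
                     (let [~arr (.array val-seq#)]
                       (f# ~@(for [i (range (count args))]
                               `(clojure.lang.RT/aget ~arr (unchecked-int ~i))))))
                   [~@args]))))}
  ;; fallback when passed to a HOF: plain apply over the tuple
  [f & args]
  (cartesian-map-apply* #(apply f %) (vec args)))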
This sequence has 135 members in it I think, and cartesian-map has some setup time. As you move to larger numbers it outperforms for by more, at least in my tests.
Also - make the last argument large - like hamf/range 2000 - and see what happens.
Sure, positional arguments don't make much sense at those sizes. Nor will the user-provided f be positional then. So, taking a tuple seems more reasonable overall.
Oh, I don't mean the expansion of the for in terms of n-args. Agreed about positional arguments completely - I meant the last argument: [10 11 12 13 14] -> (api/range 2000)
Still a 4-arg expansion, just one argument - the last one in the sequence, thus the tightest one to loop over - is the largest.
Implicitly your version is using an ArraySeq and getting the array. That is then using positional getters - my argument is precisely that applyTo should allow such modifications.
(api/range 2000)
Tried this out, and I got 480.36 us in the tuple version vs 876.68 us in my version. The profiler told me it's due to boxed math, so when I rewrote the example to this, I got the performance on par again:
(api/sum-fast
(cartesian-map-apply
(fn [^long a, ^long b, ^long c, ^long d]
(-> (+ a b) (+ c) (+ d)))
[1 2 3]
[4 5 6]
[7 8 9]
(api/range 2000)))
;; 461.51 us
Overall, I agree with you that apply can be made faster, but I sincerely doubt that it can be made fast and efficient enough to consider using it in cases where the performance is truly important. Making allocations when simply invoking a function is a huge waste for hot loops.
Yes I think you are right - the best case is just so much better with some custom work.
Regarding the last example: it will only work for up to 4 arguments since larger primitive arities are not supported by Clojure. So, that's another argument in favor of your tuple approach being more generic and flexible.
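For context, the 4-argument limit comes from Clojure's primitive function interfaces in clojure.lang only covering up to four arguments, so for example this is rejected at compile time:
;; the compiler rejects this: fns taking primitives support only 4 or fewer args
(fn [^long a ^long b ^long c ^long d ^long e]
  (+ a b c d e))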
for was the impetus for cartesian-map -
user> (crit/quick-bench (hamf/sum-fast (lznc/cartesian-map
(fn [data]
(h/let [[a b c d] (lng-fns data)]
(-> (+ a b) (+ c) (+ d))))
[1 2 3]
[4 5 6]
[9 8 9]
(hamf/range 2000))))
Evaluation count : 2058 in 6 samples of 343 calls.
Execution time mean : 324.076022 µs
Execution time std-deviation : 40.485625 µs
Execution time lower quantile : 287.106755 µs ( 2.5%)
Execution time upper quantile : 364.012552 µs (97.5%)
Overhead used : 1.600948 ns
nil
user> (crit/quick-bench (hamf/sum-fast (for [^long a [1 2 3]
^long b [4 5 6]
^long c [9 8 9]
^long d (hamf/range 2000)]
(-> (+ a b) (+ c) (+ d)))))
Evaluation count : 126 in 6 samples of 21 calls.
Execution time mean : 5.615093 ms
Execution time std-deviation : 942.330849 µs
Execution time lower quantile : 4.753947 ms ( 2.5%)
Execution time upper quantile : 6.499548 ms (97.5%)
Overhead used : 1.600948 ns
nil
Yeah, for is a snail. I still like its syntax and the conveniences, so I wrote a non-lazy iterator-based version that we use internally. We don't need it to deal with primitives, though, which is obviously something you need in your kind of work.
There is a long-standing issue in the Clojure JIRA to provide a macro for unrolling application, since a number of core functions do that to varying degrees to avoid rest args and apply. https://clojure.atlassian.net/browse/CLJ-731 seems related
We very, very rarely see anything but the cartesian-map version of for. I personally never use :let or :while. That is related in the sense that while that style of function definition is common in a lot of Clojure code, it is never the most efficient way to do it, especially when called repeatedly in some looping context 🙂
I actually started down the road of writing a for macro in lazy-noncaching then realized all I care about is one version with no fancy stuff.
It is a question of where and how to do the unrolling: in apply itself, or as something like one of those functions mentioned in the ticket where you explicitly unroll for some number of arguments and then fall back to apply
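A minimal sketch of the unroll-then-fall-back shape being described (illustrative only, not taken from the ticket; the helper name is made up):
(defn call-unrolled
  ;; explicit arities for up to four args, then fall back to apply
  ([f] (f))
  ([f a] (f a))
  ([f a b] (f a b))
  ([f a b c] (f a b c))
  ([f a b c & more] (apply f a b c more)))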
I don't think I ever used :while, but :when appears quite often in my code. :let is sometimes useful when I need to calculate something on the "outer" loop value before looping over inner things. Another affordance I added for my team is a flag :keep true, so that only non-nil values coming out of the for body are preserved.
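The :keep flag lives in their internal for variant, not in clojure.core/for; with the standard macro the rough equivalent is:
(remove nil?
        (for [x (range 6)]
          (when (even? x) (* x x))))
;; => (0 4 16)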
(defn cartesian-map
  ([f as] ... (f a) ...)
  ([f as bs] ... (f a b) ...)
  ([f as bs & xss] ... (apply f a b xs) ...))
If you unrolled cartesian-map, yes, agreed, but for cartesian-map specifically I do some extra work in the 2-arg variant to specialize out cases where the return value can be a lazy noncaching random access list as opposed to an iterator-based result. It isn't a generic expansion of a combinator like what Stew suggests. The random access result supports parallel reduction - I guess it's all possible also with the variadic expansion but I lost patience before then.
Maybe I wouldn't have if I had successfully engineered some combinator-based n-arg variant.
So you wrote the same or equivalent macro-expansion of :when and :let in your iterator-based pathway?
@U06PNK4HG can you share your non-lazy for?
@UDRJMEFSN Yeah, I believe it's pretty much equivalent, but I won't bet my money on it. It simply expands into nested loops that go over iterators (so, it assumes that the provided sequences have efficient iterators) and collects the result into a vector (through a transient). It also does the handling of :let and :when; however, I haven't tested all possible permutations and extreme usages of those keywords, we keep the use cases simple after all. I can't really share the code right now, so I bring this up only to say that it's not that hard to write.
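Not the code being described (which isn't public), just a rough sketch of the kind of expansion mentioned - nested loops over iterators collecting into a vector via a transient - shown here for two concrete inputs:
(let [as [1 2 3]
      bs [10 20]
      ^java.util.Iterator it-a (.iterator ^Iterable as)]
  (persistent!
   (loop [acc (transient [])]
     (if (.hasNext it-a)
       (let [a (.next it-a)
             ^java.util.Iterator it-b (.iterator ^Iterable bs)]
         ;; inner loop over bs for the current a
         (recur (loop [acc acc]
                  (if (.hasNext it-b)
                    (recur (conj! acc (+ a (.next it-b))))
                    acc))))
       acc))))
;; => [11 21 12 22 13 23]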
@UEENNMX0T I apologize, but that code is not currently open-sourced, and to be honest I won't be comfortable sharing it publicly in its current state because it has some internal assumptions and hand-waving that may not make sense for the general public.
Am I misunderstanding how clojure.java.shell is supposed to work when you want a custom :env? 🧵
You have a typo: "bla" should be "bar", but fixing it doesn't help. This worked for me tho:
(apply clojure.java.shell/sh ["sh" "bar" :env {"PATH" "/tmp/foo"}])
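One thing worth noting about clojure.java.shell (separate from the typo): when :env is supplied it becomes the subprocess's entire environment rather than being merged into the inherited one, so if the child still needs the rest of the environment you have to merge it yourself, roughly:
(require '[clojure.java.shell :as shell])
;; keep the parent environment and only override PATH
(shell/sh "/bin/sh" "-c" "echo $PATH"
          :env (merge (into {} (System/getenv)) {"PATH" "/tmp/foo"}))
;; :out will be "/tmp/foo\n"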
yeah, with sh that works! so env seems to work; now I need to understand why that doesn't work from inside the clojure-lsp process in the clojure-lsp-intellij plugin 🤪
I love bb process for all the other things it does, but I'm guessing it will have the same issue.
yeah, I'm having a hard time understanding why the IntelliJ Java process on MacBooks can't access brew-installed programs, and I found a whole IntelliJ class called https://github.com/JetBrains/intellij-community/blob/master/platform/util/src/com/intellij/util/EnvironmentUtil.java#L72 with a funny comment on how those things are complicated hehe, now I'm trying to work around that
Oh. Setting PATH and calling a shell script that is on that path and expecting it to work.
That doesn't work: the bar program will see the PATH environment variable once it runs, but to be found in the first place it should already be on the PATH. However:
you can find the executable with fs/which by passing it a custom path:
$ bb -e '(babashka.fs/which "bar" {:paths ["/tmp/FOO"]})'
#object[sun.nio.fs.UnixPath 0x2840cf0 "/tmp/FOO/bar"]
I think you even contributed that
and then:
$ bb -e '(babashka.process/shell (babashka.fs/which "bar" {:paths ["/tmp/FOO"]}))'
:hello
That seems much more straightforward @U04V15CAJ
@UKFSJSM38 were you specifying the PATH env for some reason? Did you need an isolated PATH or something?
Yes, I was specifying it, getting the whole return of IntelliJ's EnvironmentUtil/getEnvironmentMap
https://github.com/clojure-lsp/clojure-lsp-intellij/blob/1addfeeac0bf5a71e7241be423acdb1159ee5fce/src/main/clojure/com/github/clojure_lsp/intellij/server.clj#L56
@UKFSJSM38 Sorry, don't have a full dev env setup on macOS... if no one volunteers by tomorrow, ping me, and I'll setup whatever is necessary and help.