#clojure
2023-11-11
chrisn14:11:57

I am playing around with a faster version of apply for which I end up calling ArrayList.addAll indirectly via AFn from clojure.core. This results in this error:

[java]     ... 1 more
     [java] Caused by: java.lang.NullPointerException: Cannot invoke "java.util.Collection.toArray()" because "c" is null
It appears the clojure compiler - when compiling core - recursively parses all the java it finds with a parser that doesn't understand java generics. Is this due to an older version of java used just to compile core itself and if so can I change just this version of java in my fork? The faster version of apply does work but I am just playing around with various ideas I have w/r/t the clojure compiler itself.

chrisn14:11:44

I should have pasted more stacktrace. This is during the compilation of core and isn't due to running code but due to the parsing of the code itself.

chrisn14:11:40

or at least that is my assumption here - will check concretely in a moment.

chrisn15:11:35

Nope - you were right. Does nil-punning not work in core.clj?

Noah Bogart15:11:42

Do you have a repo with the code we can look at?

ghadi15:11:17

what do you mean nil punning in clojure core?

chrisn15:11:16

In clojure core.clj there are a lot of (nil? x) checks in cond and friends, whereas in a lot of code the variable itself would just be repeated - this is just curiosity.

chrisn16:11:20

if you are curious what I am doing.

hiredman16:11:05

The clojure compiler doesn't parse java ever

hiredman16:11:22

apply works with infinite seqs, adding those to an array list is not possible
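A quick illustration of this point (my example, not from the thread): RestFn only realizes the fixed-arity prefix of the argument seq, which is why apply can accept an infinite seq at all, while eagerly copying the arguments into an ArrayList never could:

(apply (fn [a & more] a) (range))
;;=> 0, returns immediately; only the prefix of the seq is realized
;; (doto (java.util.ArrayList.) (.addAll (range)))  ;; would never terminate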

chrisn16:11:46

Read the code - apply calls seq on its args.

chrisn16:11:56

If it doesn't then lots of things are possible.

chrisn16:11:09

Or rather, read the diffs.

hiredman16:11:36

Not sure what calling seq has to do with infinite seqs or not

hiredman16:11:47

I am looking at the code

chrisn16:11:49

The AFn interface realizes the seq always - it is not lazy

chrisn16:11:00

The RestFn implementation relies on infinite seqs.

chrisn16:11:08

AFn.applyTo

chrisn16:11:38

So the AFn pathway is greedy not lazy.

hiredman16:11:16

Largs can be an infinite sequence and you are addAlling it to an arraylist

chrisn16:11:36

Only if it is either an Object[] or implements java.util.RandomAccess.

chrisn16:11:08

That specific pathway - the addAll pathway - isn't the most efficient thing you could do there - it was just what I could do quickly to test things out.

chrisn16:11:31

The most efficient thing is to have concrete list implementations that do the concatenation implicitly.

chrisn16:11:43

for 1, 2, or 3 args.
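For concreteness, a rough sketch of what such an implicit-concatenation list could look like for a single leading scalar argument - the name and details here are illustrative, not from the actual fork:

;; read-only RandomAccess view over x prepended to tail; no copying, so a
;; RandomAccess-aware applyTo path sees the fixed arg plus the rest args
;; as one list (sketch only - the remaining List methods are omitted and
;; would throw AbstractMethodError if called)
(deftype PrependedList [x ^java.util.List tail]
  java.util.RandomAccess
  java.util.List
  (size [_] (unchecked-inc-int (.size tail)))
  (isEmpty [_] false)
  (get [_ i] (if (zero? i) x (.get tail (unchecked-dec-int i)))))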

hiredman16:11:37

Ah, I see, that is the asrandomaccess method

chrisn16:11:21

I tried to change the RestFn implementation so the fixed args were always a list, but the code pwned me and I backed off.

Noah Bogart19:11:16

What are the performance changes?

chrisn21:11:08

Some calls are faster, some slower. The best case, calling with random access data such as a persistent vector - (apply f [1 2 3 4]) - was quite a bit faster, 3x or more. But as you add scalar arguments in front, (apply f a '(b c d e)), the gains drop off from 2x in the best case to 20-30% slower in some cases such as (apply f '(1 2 3 4)). So: much faster if you have a random access argument, somewhat slower if you have a 'seq' based argument. I like the tradeoff as it means people who care about performance can use apply - or rather .applyTo - with an appropriate argument and get decent performance.

The impetus for this was https://cnuernber.github.io/ham-fisted/ham-fisted.lazy-noncaching.html#var-cartesian-map and that design - a single tupled argument - will beat any design assuming .apply, but it requires buy-in from the user. In cases like generic map (or https://cnuernber.github.io/dtype-next/tech.v3.datatype.html#var-emap) with more than Y arguments you can get situations where you really want to use apply repeatedly in what amounts to a tight loop. In those cases I both want to do minimal mutable updates to the argument container and I want to call .applyTo with that container as-is, without conversion to seq.

👍 1
chrisn21:11:28

In any case - thanks to you, ghadi, and hiredman. I upgraded my jdk, I guess with brew update, and I had never seen that NPE before; since it was during the compilation process I was confused. It helps to know that nothing parses java during the compilation process - in retrospect it was an odd assumption to make.

oyakushev12:11:22

@UDRJMEFSN What about making cartesian-map an inline function that will repack the varargs to a container of your preference? I think it is much easier than trying to speed up apply which even then will likely still lag behind performance-wise.

chrisn14:11:46

This is a good question. The first answer is that, due to the interface of IFn, RestFn will still get an ISeq, which is far slower to work with than the underlying argument container - and that is unnecessary. Honestly the only change I really want is to make .applyTo in IFn take an Object and not an ISeq; then you could change nearly nothing else aside from avoiding calling seq in apply itself, and leave the rest up to the specific implementation of IFn. Those are the majority of changes I made, but I went a step further and made the actual implementation of AFn faster. This is more general, allows someone to get the max performance out of an apply situation that may need it, and would transparently apply to all Clojure programs with no changes aside from changes in libraries, like a few of mine, that manipulate the interface of IFn.

For cartesian-map I carefully mutably update the argument container for max performance - that container is an object array and I can either create an instance of java.util.RandomAccess from it, or create a Seq, or do nothing. A variadic fn that simply passes the argument through to the code would work great in that case if the object array isn't converted to something else that has far worse access semantics. So the only strong argument I have is to change the interface of IFn. Everything else can be worked around in code that has different constraints than the general constraints - threadsafe lazy caching - of Clojure's core libraries. Even that argument is weakened because, as I mentioned before, for special cases like cartesian-map simply passing a single argument which is the argument vector, and having the user adjust to a single argument vector, bypasses this whole discussion and is likely faster at the end of the day. That is analogous to your variadic suggestion except that it avoids any manipulation by the clojure core function machinery that would change and weaken the argument.

Honestly I think we have all seen apply-related noise show up in the profile and I think that is overall unnecessary. Apply can be quite a bit faster and it isn't 'hard', it just needs doing, so I wanted to see how fast it could be.
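To make the shape of that idea concrete, here is a small sketch in the direction described - dispatch on the argument container at the call site and only fall back to the seq-based path when needed. This is an illustration (apply-fast is a hypothetical helper), not the actual diff:

(defn apply-fast
  [^clojure.lang.IFn f args]
  (if (instance? java.util.RandomAccess args)
    (let [^java.util.List l args
          n (.size l)]
      (cond
        (== n 0) (.invoke f)
        (== n 1) (.invoke f (.get l 0))
        (== n 2) (.invoke f (.get l 0) (.get l 1))
        (== n 3) (.invoke f (.get l 0) (.get l 1) (.get l 2))
        ;; beyond a few args, fall back to the existing seq-based applyTo
        :else (.applyTo f (seq args))))
    (.applyTo f (seq args))))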

oyakushev20:11:49

@UDRJMEFSN Here's my take on your problem. This is what I get when I bench your updated cartesian-map:

(api/sum-fast
         (cartesian-map
          #(h/let [[a b c d] (lng-fns %)]
             (-> (+ a b) (+ c) (+ d)))
          [1 2 3]
          [4 5 6]
          [7 8 9]
          [10 11 12 13 14]))

;; Time per call: 2.17 us   Alloc per call: 1,272b
This is what I get when I bench the original cartesian-map that used apply:
(api/sum-fast
         (cartesian-map-apply
          #(-> (+ %1 %2) (+ %3) (+ %4))
          [1 2 3]
          [4 5 6]
          [7 8 9]
          [10 11 12 13 14]))

Time per call: 18.21 us   Alloc per call: 27,137b
Then, I did the following: renamed the original apply-based cartesian-map-apply to cartesian-map-apply* and made it accept a single arity [f args]. I replaced (.applyTo f val-seq) with (f val-seq). Finally, I made a macro facade for this function:
(defmacro cartesian-map-apply [f & args]
  (let [arr (gensym "arr")]
    `(let [f# ~f]
       (cartesian-map-apply*
        (fn [^clojure.lang.ArraySeq val-seq#]
          (let [~arr (.array val-seq#)]
            (f# ~@(for [i (range (count args))]
                    `(RT/aget ~arr (unchecked-int ~i))))))
        [~@args]))))
Benching this version, I get:
Time per call: 2.55 us   Alloc per call: 744b
Not too bad, eh? I made cartesian-map-apply a macro so that the PoC is easier to understand. But it can be made an inline function, so that when invoked literally it expands into what I wrote in the macro, but if called via some HOF, it resorts to using apply.
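For reference, the inline-function variant could look roughly like this - a sketch of what is described above, reusing the cartesian-map-apply* helper from the previous message. When the var is invoked literally the :inline expansion fires; when it is passed to a HOF the ordinary apply-based body runs:

(defn cartesian-map-apply
  {:inline
   (fn [f & args]
     (let [arr (gensym "arr")]
       `(let [f# ~f]
          (cartesian-map-apply*
           (fn [^clojure.lang.ArraySeq val-seq#]
             (let [~arr (.array val-seq#)]
               (f# ~@(for [i (range (count args))]
                       `(aget ~arr ~i)))))
           [~@args]))))}
  [f & args]
  ;; fallback when used as a value, e.g. passed to a higher-order fn
  (cartesian-map-apply* #(apply f %) (vec args)))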

oyakushev20:11:14

Having said that, 2us for something so trivial still feels woefully slow :thinking_face: Abstractions be damned

chrisn20:11:13

Also - make the last argument large - like hamf/range 2000 and see what happens.

chrisn20:11:20

less alloc per call is pretty nice though.

chrisn20:11:24

This sequence has 135 members in it I think, and cartesian-map has some setup time. As you move to larger numbers it outperforms for by more, at least in my tests.

oyakushev20:11:30

> Also - make the last argument large - like hamf/range 2000 and see what happens.
Sure, positional arguments don't make much sense at those sizes. Nor will the user-provided f be positional then. So, taking a tuple seems more reasonable overall.

chrisn20:11:59

Oh I don't mean the expansion of the for in terms of n-args. Agreed about positional arguments completely - I meant the last argument [10 11 12 13 14] -> (api/range 2000)

chrisn20:11:33

Still the 4-arg expansion; just one argument - the last one in the sequence, thus the tightest one to loop over - is the largest.

chrisn20:11:15

Implicitly your version is using an ArraySeq and getting the array. That is then using positional getters - my argument is precisely that applyTo should allow such modifications.

chrisn20:11:32

Potentially you want a double or long array or some other such thing.

chrisn20:11:43

Or some simple wrapper thereof.

oyakushev20:11:47

> (api/range 2000)
Tried this out, and I got 480.36 us in the tuple version vs 876.68 us in my version. The profiler told me it's due to boxed math, so when I rewrote the example to this, performance was on par again:

(api/sum-fast
         (cartesian-map-apply
          (fn [^long a, ^long b, ^long c, ^long d]
            (-> (+ a b) (+ c) (+ d)))
          [1 2 3]
          [4 5 6]
          [7 8 9]
          (api/range 2000)))

;; 461.51 us

chrisn21:11:41

beautiful 🙂.

oyakushev21:11:23

Overall, I agree with you that apply can be made faster, but I sincerely doubt that it can be made fast and efficient enough to consider using it in cases where the performance is truly important. Making allocations when simply invoking a function is a huge waste for hot loops.

chrisn21:11:49

Yes I think you are right - the best case is just so much better with some custom work.

oyakushev21:11:08

Regarding the last example: it will only work for up to 4 arguments since larger primitive arities are not supported by Clojure. So, that's another argument in favor of your tuple approach being more generic and flexible.

chrisn21:11:42

for was the impetus for cartesian-map -

user> (crit/quick-bench (hamf/sum-fast (lznc/cartesian-map 
                                        (fn [data]
                                          (h/let [[a b c d] (lng-fns data)]
                                            (-> (+ a b) (+ c) (+ d))))
                                        [1 2 3]
                                        [4 5 6]
                                        [9 8 9]
                                        (hamf/range 2000))))
Evaluation count : 2058 in 6 samples of 343 calls.
             Execution time mean : 324.076022 µs
    Execution time std-deviation : 40.485625 µs
   Execution time lower quantile : 287.106755 µs ( 2.5%)
   Execution time upper quantile : 364.012552 µs (97.5%)
                   Overhead used : 1.600948 ns
nil
user> (crit/quick-bench (hamf/sum-fast (for [^long a [1 2 3]
                                             ^long b [4 5 6]
                                             ^long c [9 8 9]
                                             ^long d (hamf/range 2000)] 
                                         (-> (+ a b) (+ c) (+ d)))))
Evaluation count : 126 in 6 samples of 21 calls.
             Execution time mean : 5.615093 ms
    Execution time std-deviation : 942.330849 µs
   Execution time lower quantile : 4.753947 ms ( 2.5%)
   Execution time upper quantile : 6.499548 ms (97.5%)
                   Overhead used : 1.600948 ns
nil

oyakushev21:11:24

Yeah, for is a snail. I still like its syntax and the conveniences, so I wrote a non-lazy iterator-based version that we use internally. We don't need it to deal with primitives, though, which is obviously something you need in your kind of work.

hiredman21:11:46

There is a long-standing issue in the clojure jira to provide a macro for unrolling application, since a number of core functions do that to varying degrees to avoid rest args and apply. https://clojure.atlassian.net/browse/CLJ-731 seems related

harold 2
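The unrolling pattern hiredman mentions, in its simplest form (a toy example of the core idiom, not taken from CLJ-731):

;; explicit low arities avoid rest-arg allocation and apply entirely;
;; only the variadic arity pays that cost
(defn unrolled-add
  ([] 0)
  ([a] a)
  ([a b] (+ a b))
  ([a b & more] (reduce + (+ a b) more)))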
oyakushev21:11:14

Nothing inspires more confidence than the words "Patch away" next to 2011 😄

🤝 1
chrisn21:11:05

We very very rarely see anything but the cartesian-map version of for. I personally never use :let or :while. That is related in the sense that, while that style of function definition is common in a lot of Clojure code, it is never the most efficient way to do it, especially when called repeatedly in some looping context 🙂

chrisn21:11:51

I actually started down the road of writing a for macro in lazy-noncaching then realized all I care about is one version with no fancy stuff.

hiredman21:11:08

It is a question of where and how to do the unrolling: in apply itself, or as something like one of those functions mentioned in the ticket where you explicitly unroll for some number of arguments and then fall back to apply.

oyakushev21:11:38

I don't think I ever used :while but :when appears quite often in my code. :let is sometimes useful when I need to calculate something on the "outer" loop value before looping over inner things. Another affordance I added for my team is a flag :keep true, so that only non-nil values coming out of the for body are preserved.

hiredman21:11:34

(defn cartesian-map
  ([f as] ... (f a) ...)
  ([f as bs] ... (f a b) ...)
  ([f as bs & xss] ... (apply f a b xs) ...))
If you unrolled cartesian-map

chrisn22:11:46

yes agreed, but for cartesian-map specifically I do some extra work in the 2-arg variant to specialize out cases where the return value can be a lazy noncaching random access list as opposed to an iterator-based result. It isn't a generic expansion of a combinator like what Stew suggests. The random access result supports parallel reduction - I guess it's all possible with the variadic expansion as well, but I lost patience before then.

chrisn22:11:33

maybe I wouldn't have if I had successfully engineered some combinator based n-arg variant.

chrisn22:11:35

So you wrote the same or equivalent macro-expansion of :when and :let in your iterator-based pathway?

Noah Bogart22:11:32

@U06PNK4HG can you share your non-lazy for?

oyakushev23:11:32

@UDRJMEFSN Yeah, I believe it's pretty much equivalent, but I won't bet my money on it. It simply expands into nested loops that go over iterators (so it assumes that the provided sequences have efficient iterators) and collects the result into a vector (through a transient). It also handles :let and :when; however, I haven't tested all possible permutations and extreme usages of those keywords - we keep the use cases simple after all. I can't really share the code right now, so I bring this up only to say that it's not that hard to write. @UEENNMX0T I apologize, but that code is not currently open-sourced, and to be honest I won't be comfortable sharing it publicly in its current state because it has some internal assumptions and hand-waving that may not make sense for the general public.

👍 1
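For readers curious what such a thing can look like, here is a minimal independent sketch of the idea as described above - nested iterator loops collecting into a transient vector, with plain bindings plus :when only. This is not oyakushev's code:

(defmacro efor
  "Eager, iterator-based analogue of for; supports plain bindings and :when."
  [bindings body]
  (let [acc (gensym "acc")
        emit (fn emit [pairs]
               (if (empty? pairs)
                 `(vswap! ~acc conj! ~body)
                 (let [[sym expr & more] pairs]
                   (if (= sym :when)
                     `(when ~expr ~(emit more))
                     `(let [it# (.iterator ~(with-meta expr {:tag 'Iterable}))]
                        (loop []
                          (when (.hasNext it#)
                            (let [~sym (.next it#)]
                              ~(emit more)
                              (recur)))))))))]
    `(let [~acc (volatile! (transient []))]
       ~(emit bindings)
       (persistent! @~acc))))

(efor [a [1 2 3] b [4 5 6] :when (odd? a)] (+ a b))
;;=> [5 6 7 7 8 9]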
chrisn00:11:28

makes complete sense. (->> lznc/cartesian-map lznc/remove vec) with a generated fn.

ericdallo15:11:22

Am I misunderstanding how clojure.java.shell is supposed to work when you want a custom :env? 🧵

1
ericdallo15:11:27

I have a /tmp/foo/bar executable script with echo "foo:bar" in its content

ericdallo15:11:00

so if I run PATH=/tmp/foo bar in a terminal I get the foo:bar output as expected

ericdallo15:11:31

but when running (apply clojure.java.shell/sh ["bla" :env {"PATH" "/tmp/foo"}])

ericdallo15:11:46

I get an error=2, No such file or directory exception

lread15:11:16

You have a typo, "bla" should be "bar", but fixing that doesn't help. This worked for me tho:

(apply clojure.java.shell/sh ["sh" "bar" :env {"PATH" "/tmp/foo"}])

ericdallo15:11:59

ah my bad, on my machine it's bla, I rewrote it to bar when typing here hehe

simple_smile 1
ericdallo15:11:10

it's odd that it works for you :thinking_face:

lread15:11:23

I added "sh"

ericdallo15:11:49

yeah, with sh that works! so env seems to work; now I need to understand why that doesn't work from inside the clojure-lsp process in the clojure-lsp-intellij plugin 🤪

ericdallo15:11:53

thanks lread!

lread15:11:05

My pleasure. Also remember our old and dear friend: Windows!

ericdallo15:11:27

ah yeah, the CI always reminds me of that haha

simple_smile 1
ericdallo15:11:46

I was considering using bb.process, but good to know old java.shell works

lread15:11:14

I love bb process for all the other things it does, but I'm guessing it will have the same issue.

lread15:11:43

(when setting PATH)

ericdallo15:11:02

yeah, I'm having a hard time understanding why the IntelliJ java process on MacBooks can't access brew-installed programs, and I found a whole IntelliJ class called https://github.com/JetBrains/intellij-community/blob/master/platform/util/src/com/intellij/util/EnvironmentUtil.java#L72 with a funny comment on how complicated those things are hehe, now I'm trying to work around that

lread15:11:31

Well, it's nice that you can keep your sense of humor! simple_smile I'm sure that helps!

ericdallo15:11:09

Oh yeah, for sure 😅 Especially as a Linux user...

simple_smile 1
borkdude16:11:45

What issue would bb process also have?

lread16:11:34

Just try the above with bb process; I think it is probably normal behaviour.

borkdude16:11:24

What "above"? It wasn't clear to me what the problem was.

lread16:11:43

Oh. Setting PATH and calling a shell script that is on that path and expecting it to work.

lread16:11:24

I can try and show in a few mins…

borkdude16:11:25

That doesn't work since the bar program will see the PATH environment variable, but to find bar in the first place it already needs to be on the PATH. However:

borkdude16:11:59

you can find the executable with fs/which by passing it a custom path:

$ bb -e '(babashka.fs/which "bar" {:paths ["/tmp/FOO"]})'
#object[sun.nio.fs.UnixPath 0x2840cf0 "/tmp/FOO/bar"]
I think you even contributed that

👀 1
borkdude16:11:42

and then:

$ bb -e '(babashka.process/shell (babashka.fs/which "bar" {:paths ["/tmp/FOO"]}))'
:hello 
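The same idea with plain clojure.java.shell would be to resolve the executable yourself and hand sh the absolute path - a sketch using ericdallo's /tmp/foo path from above:

(require '[babashka.fs :as fs]
         '[clojure.java.shell :as shell])

;; resolve bar against the custom path, then call it by absolute path
(shell/sh (str (fs/which "bar" {:paths ["/tmp/foo"]})))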

lread17:11:49

That seems much more straightforward @U04V15CAJ. @UKFSJSM38 were you specifying the PATH env for some reason? Did you need an isolated PATH or something?

ericdallo17:11:55

BTW I need a mac user to test that fix if any of you wanna try 😄

lread17:11:05

@UKFSJSM38 Sorry, don't have a full dev env setup on macOS... if no one volunteers by tomorrow, ping me, and I'll set up whatever is necessary and help.

ericdallo17:11:26

no problem, thanks for the help already @UE21H2HHD!

👍 1