Hi, I’m a bit confused about clojure.core.reducers in CLJS.
When I benchmark a real scenario, the r/* functions are usually a bit slower.
And my rule of thumb has always been - if you don’t have a lot of stuff and can’t parallelize it (CLJS problem), then don’t use it. But this surprised me:
cljs.user> (simple-benchmark [] (into [] (filter odd? (range 100))) 1000)
[], (into [] (filter odd? (range 100))), 1000 runs, 18 msecs
nil
cljs.user> (simple-benchmark [] (into [] (r/filter odd? (range 100))) 1000)
[], (into [] (r/filter odd? (range 100))), 1000 runs, 7 msecs
nil
So I would like to revisit my rule of thumb, thank you 🙂.
I couldn’t reproduce the transduce issue I mentioned, something between chair and monitor probably.
I also played with Iterator and Transduce + persistent! in completing, it is nice to have so many ways.
Now try the transducer arity. :)
cljs.user> (simple-benchmark [] (into [] (filter odd?) (range 100)) 1000)
[], (into [] (filter odd?) (range 100)), 1000 runs, 23 msecs
nil
Huh!
Ah, wait, it's simple-benchmark. I mistook it for quick-bench that might not even exist for CLJS.
I was never afraid to use transducers in CLJ. But when I tried to use transducers and benchmark them in one part of my application in CLJS, the browser crashed. So I lowered the simple-benchmark number of runs and it didn’t crash, but the time was horrible. I don’t get it.
I'm being particular about it because microbenchmarking is notoriously bad as a source of conclusive metrics for anything with a JIT. And I might even be too narrow here.
Not exactly related, but not too long ago I found out that adding a single def with a constant that wasn't even used made a program significantly slower. Don't recall the exact details but something like from 20 to 50% slower.
Oh, so the crash was when the real code was tested? Can you perhaps make a public script that can be used to reproduce it?
It will probably take a while, but it looks like an interesting topic to explore.
I’ll try to make some repos, tests, so we can talk about it in more detail :-).
Yeah, that would be great.
just a word of warning: never trust any benchmark run from the REPL! you are almost guaranteed to get substantially different numbers for regularly compiled (and optimized) code.
this is due to the eval part of the REPL. browsers often do not optimize code that was created via eval at all, so the numbers you get are basically useless
Oops, forgot that part.
How would you benchmark it?
put it in a regular file that runs it somehow, make a release build and run it
i.e. call the benchmark from the :init-fn or :main for :node-script
it might not change the numbers at all, but rules out eval funkiness
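A minimal sketch of that advice, assuming shadow-cljs with a :node-script build. The namespace bench.core, the build id :bench and the output path are made up for illustration; only simple-benchmark, r/filter and the :node-script/:main options come from the tools themselves.

```clojure
;; src/bench/core.cljs (hypothetical file and namespace names)
;;
;; Assumed shadow-cljs.edn, shown here only as a comment:
;;   {:source-paths ["src"]
;;    :builds {:bench {:target    :node-script
;;                     :main      bench.core/main
;;                     :output-to "out/bench.js"}}}
;;
;; Build and run a release (optimized) build, outside of any REPL eval:
;;   npx shadow-cljs release bench
;;   node out/bench.js
(ns bench.core
  (:require [clojure.core.reducers :as r]))

;; Realize the input once so the benchmark does not measure range itself.
(def data (vec (range 1000)))

(defn main
  "Entry point called by the :node-script build; prints one timing line per variant."
  [& _args]
  (simple-benchmark [] (into [] (filter odd? data)) 1000)
  (simple-benchmark [] (into [] (r/filter odd? data)) 1000)
  (simple-benchmark [] (into [] (filter odd?) data) 1000))
```

The numbers may well come out the same as in the REPL, but at least the code under test has gone through the regular compile path.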
That’s interesting. I didn’t know there was no point in benchmarking outside of a release build. I started writing a script that I evaluate with
clj -M --main cljs.main --compile hello-world.core --repl
and there r/* are still the fastest.
The results are not really different with an optimized build.
clj -M -m cljs.main --optimizations advanced -c hello-world.core
After I run
node out/main.js
It still shows r/* as the fastest
;; data = (range 1000)
[], (into [] (filter odd? data)), 1000 runs, 39 msecs
[], (into [] (r/filter odd? data)), 1000 runs, 20 msecs
[], (into [] (filter odd?) data), 1000 runs, 22 msecs
;; data = (into [] (repeatedly 100 #(hash-map :a 10 :b 100 :c "abcd" :d "zzzz")))
[], (into [] (->> data (map (fn* [p1__595#] (update p1__595# :a inc))) (filter (fn* [p1__596#] (odd? (:a p1__596#)))))), 1000 runs, 21 msecs
[], (into [] (->> data (r/map (fn* [p1__597#] (update p1__597# :a inc))) (r/filter (fn* [p1__598#] (odd? (:a p1__598#)))))), 1000 runs, 11 msecs
[], (into [] (comp (map (fn* [p1__599#] (update p1__599# :a inc))) (filter (fn* [p1__600#] (odd? (:a p1__600#))))) data), 1000 runs, 13 msecs
(and I think the crash with transducers in my app that I mentioned could be connected to ClojureScriptStorm, so let’s forget about that for now, until I find a case where it is much slower than other approaches)
I'm not surprised the first one is slowest
I'm a bit surprised the transducers are a tad slower
essentially reducers are based on dedicated protocols, which avoids the extra allocations the seq variant does
I’m trying to figure out when to use Reducers. From these examples it would seem that always. In my application, on the other hand, they were a LITTLE slower.
I'd go with transducers as they should apply more generally
seq things will always be slower than either reducers or transducers
fastest is probably always reduce
Another thing to try - extracting (filter odd?) into its own def that isn't under simple-benchmark.
that doesn't cost anything
Well, so is a def. And yet - and I believe you've witnessed it - I've found a case where it ended up slowing things down.
On the other hand, if (filter odd?) makes something slow, then you're probably unlikely to be able to predict anything with a small benchmark.
@thheller By reduce you mean Reducer, right? Using reduce instead of map + filter speeds it up but not much.
I mean actual reduce, reducers are still an abstraction layer on top of that
[], (reduce (fn [acc item] (conj acc (let [x (update item :a inc)] (if (odd? (:a x)) (conj acc x) acc)))) [] data), 1000 runs, 19 msecs
And with that, you're opening a door to the realm of transients. Which might be slower for small data and faster otherwise.
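One thing worth checking before comparing numbers: the reduce call quoted above wraps the whole let in an extra (conj acc ...), so every step appends a vector to acc instead of just the item. A sketch of what was probably intended, next to a transient variant and the transducer form, is below. The function names and the sample data are made up for illustration.

```clojure
;; Hypothetical sample data, shaped like the maps used in the thread.
(def data (vec (repeat 100 {:a 10 :b 100 :c "abcd" :d "zzzz"})))

(defn inc-odd-reduce
  "Plain reduce over a persistent vector: bump :a, keep items whose :a is odd."
  [items]
  (reduce (fn [acc item]
            (let [x (update item :a inc)]
              (if (odd? (:a x)) (conj acc x) acc)))
          []
          items))

(defn inc-odd-transient
  "Same logic, but accumulating into a transient and freezing it at the end."
  [items]
  (persistent!
   (reduce (fn [acc item]
             (let [x (update item :a inc)]
               (if (odd? (:a x)) (conj! acc x) acc)))
           (transient [])
           items)))

(defn inc-odd-xf
  "Transducer form; into uses a transient internally for vectors."
  [items]
  (into []
        (comp (map #(update % :a inc))
              (filter #(odd? (:a %))))
        items))
```

All three should return the same vector, so any timing difference is down to the mechanism rather than the result.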
(def data (vec (range 500)))
(js/console.log "1")
(simple-benchmark [] (into [] (filter odd? data)) 1000)
(js/console.log "2")
(simple-benchmark [] (into [] (r/filter odd? data)) 1000)
(js/console.log "3")
(simple-benchmark [] (into [] (filter odd?) data) 1000)
(js/console.log "4")
(simple-benchmark [] (persistent! (reduce (fn [acc x] (if (odd? x) acc (conj! acc x))) (transient []) data)) 1000)
18, 8, 8, 6 are the results
or if I increase to (range 5000) it is 133, 65, 69, 50 (using chrome)
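Small caveat on the fourth variant: as written, its if keeps the even numbers while the other three keep the odd ones. With (range 500) both halves have 250 elements, so the timing is probably unaffected, but for a like-for-like comparison the branches can be swapped. A sketch with all four variants as functions (the namespace and function names are made up for illustration):

```clojure
(ns example.filter-variants
  (:require [clojure.core.reducers :as r]))

(def data (vec (range 500)))

(defn via-seq [coll]
  ;; lazy seq filter, then poured into a vector
  (into [] (filter odd? coll)))

(defn via-reducer [coll]
  ;; reducer: r/filter returns a reducible, into reduces over it
  (into [] (r/filter odd? coll)))

(defn via-transducer [coll]
  ;; transducer arity of filter
  (into [] (filter odd?) coll))

(defn via-transient [coll]
  ;; hand-written reduce over a transient, keeping the odd numbers
  (persistent!
   (reduce (fn [acc x] (if (odd? x) (conj! acc x) acc))
           (transient [])
           coll)))
```

Checking that the four functions return equal vectors before timing them is a cheap way to rule out this kind of slip.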
Transducers are just reducers with a little bit of extra overhead. I think in Clojure JVM as well reducers are a tad faster. This seems to check out.
So why don’t we use r/* all the time? 🙂
For me, transducers are around 4% faster on data size >= 100 in Chrome.
Reducers are a worse abstraction. IIRC there was a mention that if transducers had been thought of first, reducers wouldn't even exist.
Cause transducers are nicer to use
yeah reducers in CLJS make little sense given there is never any parallel stuff
surprised they win over transducers, but it's close enough to not matter
What material difference does 2ms over 1000 runs make compared to a nicer API surface and broader usability?
Well, that’s why I haven’t used Reducers for a long time and today I was surprised how much faster they are. And I also “remembered” that I shouldn’t use them unless I have more than a thousand elements, but that’s obviously not true 🙂.
The 1000 items probably comes from a different mindset. Not that "it's faster" but "it's not worth it to make it faster". Of course, you can have 1000 iterations that each processes 1000 items, but then you'd probably use transients directly anyway.
into already uses transients, the reduce variant just had to use them manually
FWIW, I don't even think about such things, at all. I use transducers all the time, occasionally a lazy seq here and there. If something is slow in an optimized build, I profile and fix it, and invariably the fix is much more involved than replacing a seq with a transducer or something like that. The last performance fix that I did has improved the performance of a particular activity by 10x. I had to use nested transients and pass them through a relatively deep call tree.
I usually start with seqs and optimize directly going to reduce where necessary 😛
That's especially true if you write code for a browser. Especially so if it's not a mobile browser. "If it can be computed between frames, I don't care." :)
@p-himik I agree. I’m asking out of curiosity, I don’t need to tune the performance within these milliseconds. And you’re probably right about my misunderstanding of the reducer “thousand rule”, which is why I’m surprised.
I care very much if it takes up valuable time in my frame 😛
Yeah, I'd probably use seqs more frequently if not for the laziness that can blow up who knows where. As into is unconditionally eager, I can start thinking with my spinal column and not get myself into trouble.
Hmm, "how much faster"? I mean, they're barely faster in your benchmarks. Normally people use them for the parallel r/fold in Clojure JVM. I don't think there's a reason to use them in ClojureScript. Transducers are materially faster than lazy seqs, and also provide the benefit of being eager. But reducers are only marginally faster, and don't have any other benefits over transducers.
for really critical stuff I also often bypass cljs structures completely and just use arrays and interop
By the way, thanks for showing reduce + persistent!, it’s not a pattern I often think about.
but it is rare to get to that level
@didibus in the simplest test it is 2x faster than seq filter.
I mean faster than transducer.
The choice is, if you want an eager and more efficient version of the seq functions, are you going to use transducers or reducers. And here I mean that reducers aren't attractive because they have fewer features, aren't part of the core namespace, and bring only marginally better performance. So almost always transducers seem the better choice.
Also, I guess there is filterv and mapv. If that's all you need, then just using them as an eager variant is good enough, and I'd expect faster than both reducers and transducers.
I understand that, that’s why I’m trying to find out why the transducer takes so extremely long in one case that the browser prompts me to “stop script”, but I can’t replicate it yet 🙂.
Are you using Firefox, by any chance?
Firefox is just consistently at least 50% slower than Chrome, in whatever CLJS benchmarks I've tried.
That 50% slowdown that I mentioned before? It was Chrome getting down to the performance level of Firefox after introducing an additional def that put the size of the ns object right beyond the threshold Chrome had for when some particular optimizations stop working.
Fighting against monopoly is a form of masochism 🙂. Anyway - the bigger / more real data I use, the more the results confirm what has been said here. But I am surprised by one thing I didn’t know (the BIG difference between lazy-seq and vector in this case).
(defn active? [item]
  (:active? item))
(defn add-1 [item]
  (update item :value inc))
(defn odd-item? [item]
  (odd? (:value item)))
(defn special? [item]
  (:special? item))
(def complex-data (repeatedly 10000 #(hash-map :active? (rand-nth [true false])
                                               :special? (rand-nth [true false])
                                               :value (rand-nth [1 2 3 4 5 6 7 8 9]))))
(defn example-transformation [items vectorize?]
  (into []
        (->> (if vectorize? (into [] items) items)
             (filter active?)
             (map add-1)
             (filter odd-item?)
             (remove special?))))
(example-transformation complex-data true), 1000 runs, 1329 msecs
(example-transformation complex-data false), 1000 runs, 2059 msecs
I thought you attributed that to issues with flow storm
mapv and filterv are just a direct operation over the vector. A reducer applies a reducing function over a collection directly. A transducer applies a reducing function over a reducing context. A lazy seq lazily applies a function to elements pulled in batches, but needs to create intermediate seqs to maintain immutability and reuse. Basically, when things manipulate the data structure more directly, it's faster.
That said, mapv and filterv when combined don't perform loop fusion, whereas reducers and transducers do. So while individually I'd expect them to be faster, as they have less overhead, when combined they might take longer because they iterate twice.
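To make the loop fusion point concrete, here is a small sketch; the names two-pass and fused are made up for illustration. Both functions return the same vector, the difference is only in how many times the data is walked and whether an intermediate vector is built.

```clojure
(def items (vec (range 10)))

(defn two-pass
  "mapv builds a complete intermediate vector, then filterv walks it again."
  [coll]
  (filterv odd? (mapv inc coll)))

(defn fused
  "The composed transducer applies inc and odd? to each element in one pass,
  with no intermediate collection."
  [coll]
  (into [] (comp (map inc) (filter odd?)) coll))
```

Which one is actually faster still depends on the runtime and the data, so this only shows the structural difference, not a performance claim.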
About Flowstorm - I hope so, I’m yet to find out. I just meant that in a normal situation I probably wouldn’t even think of using clojure.core.reducers. And yes, now I get you. In my case I have many maps and filters, so a transducer should be best.
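For reference, a sketch of how the example-transformation pipeline from earlier could look as a single transducer. The names pipeline-xf, example-transformation-xf, example-transformation-seq and sample are made up for illustration; the four helper functions are repeated so the snippet stands on its own.

```clojure
(defn active? [item] (:active? item))
(defn add-1 [item] (update item :value inc))
(defn odd-item? [item] (odd? (:value item)))
(defn special? [item] (:special? item))

;; The same four steps as the ->> version, composed into one transducer.
;; comp applies transducers left to right, so the order matches the ->> form.
(def pipeline-xf
  (comp (filter active?)
        (map add-1)
        (filter odd-item?)
        (remove special?)))

(defn example-transformation-xf [items]
  (into [] pipeline-xf items))

;; Seq version, kept only to compare results.
(defn example-transformation-seq [items]
  (into []
        (->> items
             (filter active?)
             (map add-1)
             (filter odd-item?)
             (remove special?))))

;; Small deterministic sample instead of the random complex-data.
(def sample
  [{:active? true  :special? false :value 2}
   {:active? false :special? false :value 2}
   {:active? true  :special? true  :value 2}
   {:active? true  :special? false :value 3}])
```

Using a fixed sample instead of rand-nth makes it easy to confirm both versions agree before benchmarking them against each other.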
It gets tricky sometimes. I've seen combinations of mapv and filterv be faster still. It depends on small factors. Iterating twice might be faster than the overhead of additional function calls being made by reducers or transducers for example. Depending on the runtime, the size of the collection, the underlying collection type, the CPU caches, etc.
By the way, I kinda like this library: https://github.com/johnmn3/injest for when you're playing around. It lets you swap ->> for others that automatically rewrite the same thread code to use transducers, reducers or core.async.
Great, thanks for that. I’m glad I played. I had the wrong idea about how much faster the vector is than lazy-seq/list.
I'm looking at either https://github.com/storybookjs/storybook or https://github.com/cjohansen/portfolio for isolated component development in a CLJS project of mine. At a high level, both are great at generating snapshots of components in certain states, but I'd prefer to use the CLJS-first Portfolio.
However, I'm tempted by Storybook's controls (https://storybook.js.org/docs/essentials/controls), which provide a really nice UI for experimenting with different combinations of component state. It seems like the closest thing Portfolio has for experimenting with different component states with quick feedback is tap> in the REPL.
That leads me to two questions:
1. Is there a similar UI-based way to quickly experiment with component data in Portfolio?
2. And even if not, should I instead be persuaded to simply use tap> and the REPL?
If anyone has any input on deciding between these two tools in general, I'm open to that as well!
here is a quick example of how you could use storybook https://github.com/thheller/shadow-cljs-storybook-reagent
thanks @thheller! I'll give it a try
did you manage to make storybook work? I had an issue when going from v6 to v7 because it was doing some code analysis and my generated js was not compliant