
@mfikes Pretty certain that the (list ..) creation is the main culprit for CLJS-2486. The macro could probably be improved to just output (List. ... (List. ...)). Though I agree that IndexedSeq is right here since it'll make destructuring faster.


@rauh I tried (List. nil key (List. nil val () 1 nil) 2 nil) as a replacement for (IndexedSeq. #js [key val] 0 nil) and for

(simple-benchmark [me (first {:a 1})] (first me) 10000000)
I get speedups of 0.88 for V8, 0.83 for SpiderMonkey, and 0.73 for JavaScriptCore. (In other words, the List. construct is slower than the IndexedSeq. construct.)


@mfikes Yeah, I guess 2 obj creations vs 1 obj creation. Though it's much faster than doing (list key val), right?


The list macro creates a bunch of IIFEs and conj calls which are all unnecessary, I think.
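A rough illustration of the shape being described (a sketch, not the exact macroexpansion):

```clojure
;; Illustrative only: the old expansion effectively builds the list
;; through repeated conj calls (each of which just wraps a List.
;; constructor call), rather than emitting the constructors directly.
(conj (conj (.-EMPTY List) 41) 17)
;; vs. direct construction:
;; (List. nil 17 (List. nil 41 () 1 nil) 2 nil)
```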


Yeah, specifically looking at V8: list: 596 ms, IndexedSeq.: 163 ms, List.: 185 ms


So many things to optimize, so little time 😄


I won't create a ticket for list macro. I don't use it...


Good call on the IndexedSeq @mfikes. It's a shame it's not quite as fast as the old PV-based map entry.


Interestingly, it seems that trying to optimize the list macro might slow down JavaScriptCore:

Benchmarking with V8
[], (first (list 17 41)), 10000000 runs, 437 msecs
[], (first (List. nil 17 (List. nil 41 () 1 nil) 2 nil)), 10000000 runs, 234 msecs
Benchmarking with SpiderMonkey
[], (first (list 17 41)), 10000000 runs, 438 msecs
[], (first (List. nil 17 (List. nil 41 () 1 nil) 2 nil)), 10000000 runs, 380 msecs
Benchmarking with JavaScriptCore
[], (first (list 17 41)), 10000000 runs, 337 msecs
[], (first (List. nil 17 (List. nil 41 () 1 nil) 2 nil)), 10000000 runs, 352 msecs
Benchmarking with Nashorn
[], (first (list 17 41)), 10000000 runs, 4130 msecs
[], (first (List. nil 17 (List. nil 41 () 1 nil) 2 nil)), 10000000 runs, 5782 msecs
Benchmarking with ChakraCore
[], (first (list 17 41)), 10000000 runs, 871 msecs
[], (first (List. nil 17 (List. nil 41 () 1 nil) 2 nil)), 10000000 runs, 660 msecs


@mfikes That's very strange; I wouldn't expect that. Probably some weird optimizations happening. Also, the list macro will produce slightly better code since you're passing in constants.


Probably optimized the heck out of those conj calls (which also do nothing else but (List. ...)); in real-world scenarios there is no way those are faster (probably).


Also, I never benchmark with inline code in simple-benchmark since that code is not within a function (which is not realistic). So I always call a fn and make sure the Closure Compiler doesn't inline that fn (using (goog/exportSymbol "benchfn1" the-fn)).


Code like that used to explode on JSC; now it's faster than flat-array code?


@favila FWIW, CLJS-910 is fixed on modern versions of macOS and likely fixed on iOS as well


@rauh Does this match what you were describing as a valid way to completely avoid inlining?

(defn bench-fn1 [] (list 17 41))
(defn bench-fn2 [] (List. nil 17 (List. nil 41 () 1 nil) 2 nil))
(goog/exportSymbol "indirect1" bench-fn1)
(goog/exportSymbol "indirect2" bench-fn2)
(simple-benchmark [] (js/indirect1) 1e7)
(simple-benchmark [] (js/indirect2) 1e7)


(Hoping ^ is valid, because it does lead to a perf improvement across all engines.)


@mfikes Yes, exactly like that. I kinda feel like the simple-benchmark macro could probably do something like that automatically.


One minor thing I discovered with that benchmark is that the first one warms up the second, so you really need to run them separately.


It might make sense to give the list fn 1- and 2-arity cases and then call those from the macro.
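A hedged sketch of that idea (names and the fallback arity are illustrative, not an actual cljs.core change):

```clojure
;; Sketch only: dedicated 1- and 2-arity cases that construct the
;; List directly, which the list macro could delegate to.
(defn list
  ([] ())
  ([a] (List. nil a () 1 nil))
  ([a b] (List. nil a (List. nil b () 1 nil) 2 nil))
  ([a b & more]                     ; illustrative fallback only
   (cons a (cons b more))))
```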


Benchmarking cljs is so tricky...


I've updated the benchmark in … It still looks like a good change on the surface. The change to the macro is trivial, so I'm putting that through some tests.


Oh 1617: I hadn't considered the right-to-left eval. Now that let binding makes sense... I guess the macro could also check for symbol? and not just consts.


The revised macro I'm looking at is this. We get the left-to-right out of it for free:

(core/defmacro list
  ([] '(.-EMPTY cljs.core/List))
  ([x & xs]
   (core/let [cnt (core/inc (count xs))]
     `(cljs.core/List. nil ~x (list ~@xs) ~cnt nil))))
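For reference, under that macro (list 17 41) should expand to roughly:

```clojure
;; Expansion sketch: each element becomes a direct List. constructor
;; call (with a precomputed count), recursing down to the empty list.
(cljs.core/List. nil 17
  (cljs.core/List. nil 41
    (.-EMPTY cljs.core/List) 1 nil)
  2 nil)
```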


oh right...


In other words, it seems this simpler variant is universally applicable (regardless of whether the arguments are consts).


If that pans out, then this is an excellent optimization for the non-const case 🙂


Wow @rauh! For whatever reason, your suggested optimization does really well in V8 (graphs included in …