@dnolen is there a reason ICounted is not implemented for ChunkedSeq? The perf difference is massive, constant time vs linear time basically
(there's a ticket with a patch from 2018 https://clojure.atlassian.net/browse/CLJS-2471)
(def v4 (seq (vec (range 1e5))))
(simple-benchmark [] (count v4) 1e4)
== master
Node/V8
[], (count v4), 10000 runs, 5097 msecs
Bun/JSC
[], (count v4), 10000 runs, 6764 msecs
== with ICounted implemented
Node/V8
[], (count v4), 10000 runs, 1 msecs
Bun/JSC
[], (count v4), 10000 runs, 0 msecs@roman01la did you check to see if is implemented in Clojure, and if not if there's a discussion about it?
Here's Clojure implementation https://github.com/clojure/clojure/blob/6a4ba6aedc8575768b2fff6d9c9c7e6503a0a93a/src/jvm/clojure/lang/PersistentVector.java#L512-L514
nice! go for it!
no reason it wasn't done far as I know
cool! the patch is already there π https://clojure.atlassian.net/browse/CLJS-2471
I can update it against master if you want
that would be great
I just looked through both codebases.
Clojure implements chunked seqs as classes in 4 places, with 2 of them being counted (`PersistentVector$ChunkedSeq` and LongRange), while the other 2 are not (`ChunkedCons` and Range).
Meanwhile ClojureScript has ChunkedCons, ChunkedSeq, IntegerRange, Range. The IntegerRange type is already counted. I'm speculating that ChunkedSeq was overlooked because it's not embedded inside PersistentVector like it is on the JVM, and also because it's hard to copy everything exactly the same way π
I recently hit a difference between Clojure and ClojureScript: if you conj an existing value into a set in Clojure, then it always tests if the value is already present, and returns this if it is. However, ClojureScript explicitly creates a new object every time.
(let [s #{1}] (identical? s (conj s 1)))
;; Clojure => true
;; ClojureScript => false
This happens for each implementation of Set, with the ClojureScript implementations looking like:
(-conj [coll o]
(PersistentHashSet. meta (assoc hash-map o nil) nil))
(-conj [coll o]
(PersistentTreeSet. meta (assoc tree-map o nil) nil))
Is this difference intentional?
Fortunately, it doesn't happen for maps, where ClojureScript and Clojure will both return the original object when adding a redundant key/value. This would allow for a small change if we want parity:
(-conj [coll o]
(let [m (assoc hash-map o nil)]
(if (identical? m hash-map)
coll
(PersistentHashSet. meta m nil))))
(-conj [coll o]
(let [m (assoc tree-map o nil)]
(if (identical? m tree-map)
coll
(PersistentTreeSet. meta m nil))))
@quoll This behavior bit me a few times in the past:
(def x1 (with-meta 'x {:tag 'boolean}))
(def m (assoc {x1 x1} 'x 'x))
(meta (get (conj #{x1} 'x) 'x))
=> {:tag boolean}
I guess it's just the way it works on Clojure/ClojureScript, so just a note.The final line makes sense to me. The thing inside the set has metadata. We add something that is evaluated as equal, and therefore it doesn't get added. So when we look up that same object in the set via an "equivalent" key, we get the original object, which is the one with metadata.
It's also why
(meta (ffirst m)) => {:tag boolean}
and
(meta (second (first m))) => nil
It's the consequence of allowing equality between things that have differing metadata.
I actually reported a bug around similar behavior some years ago: https://clojure.atlassian.net/browse/CLJS-2736
Looking a value up in a set used to return the thing that had been the lookup argument, rather than the thing that was inside the set.
The behavior differed between Clojure and ClojureScript. At the time, I got around it by replacing the set with a map of values to themselves.
Hi David. I've a question about your comment on that ticket:
> it triggers lite mode test failures because we had missing coverage
I'm not sure which tests you run for this, sorry.
I mostly stick to script/test* but also tried clj -M:lite.test.build. But you seem to be mentioning another test.
If you can point me at it, then I could take a look?
oh ha, it's pretty hard to run all the tests locally - there's just too many. One moment I can show you
https://github.com/clojure/clojurescript/actions/runs/18513863715
you'll see there's a lite test run - which uses different compiler flags.
I always set up a PR for patches, yours is here - https://github.com/clojure/clojurescript/pull/272
Thanks for this. I was just keeping it to a few of the JS engines (GraalVM, V8, Javascript Core), and only trying to run the tests I thought I needed. At this point though, I just want to find whatever it is that failed for you π
https://github.com/clojure/clojurescript/actions/runs/18512262627/job/52755552761?pr=272
you didn't break anything, it's just that your tests showed that there was too few test covering this
in :lite-mode we swap in different data structures and they don't implement this behavior
I did add unnecessary tests, but thatβs because it just felt like the existing tests were too few π
I'm also looking into improving equiv perf for persistent vector when compared with a chunked seq. Specialized code path is already faster but with ICounted for ChunkedSeq the gains are more significant, 3-5x
(def v1 (vec (range 1e2)))
(def v2 (seq (vec (range 1e2))))
(def v3 (vec (range 1e5)))
(def v4 (seq (vec (range 1e5))))
(simple-benchmark [] (= v1 v2) 1e6)
(simple-benchmark [] (= v3 v4) 1e4)
== master
Node/V8
[], (= v1 v2), 1000000 runs, 1795 msecs
[], (= v3 v4), 10000 runs, 13003 msecs
Bun/JSC
[], (= v1 v2), 1000000 runs, 1351 msecs
[], (= v3 v4), 10000 runs, 13377 msecs
== equiv speciailized on ChunkedSeq + ICounted
Node/V8
[], (= v1 v2), 1000000 runs, 573 msecs
[], (= v3 v4), 10000 runs, 5492 msecs
Bun/JSC
[], (= v1 v2), 1000000 runs, 234 msecs
[], (= v3 v4), 10000 runs, 2499 msecs@roman01la sure - noting the order of the eq check matters here, (= chunked vec) vs. (= vec chunked)
right, I'll extend ChunkedSeq as well will create a ticket for that
@dnolen do you think it makes sense to specialise -equiv for other data types: PersistentVector <> LazySeq <> Cons <> ChunkedCons? I think those are the most commonly used ones. On the other hand this is going to complicate -equiv implementation for all of them.