#clojure-dev
2023-02-03
slipset09:02:26

Just stumbled upon https://ask.clojure.org/index.php/9010/distinct-should-support-sets. Basically I have a fn that takes a coll and I wanted to be sure it was distinct regardless of type. I see this question has no Jira ticket on it. Any chance of that happening?

slipset09:02:24

Sort of related, but why isn’t distinct implemented as (seq (set coll))?

favila11:02:56

Distinct is lazy and preserves order. That does neither

Alex Miller (Clojure team)14:02:08

btw, the transducer version will work with sets

slipset14:02:07

Does that make it worse or better?

slipset14:02:20

(sorry late Friday here)

Alex Miller (Clojure team)14:02:21

it provides a solution now in lieu of a change

Alex Miller (Clojure team)14:02:35

but I will file a ticket for it
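A minimal sketch of the two points above, assuming nothing beyond clojure.core: the transducer arity of distinct accepts a set (the linked question reports that the sequence arity does not), and distinct keeps first occurrences in input order, which (seq (set coll)) does not promise. The var names and sample data are made up for illustration.

```clojure
;; Hypothetical input: any set works here.
(def s #{3 1 2})

;; Transducer arity of distinct, applied via into: it accepts a set as input.
(def from-set (into [] (distinct) s))

;; Same transducer on a vector: first occurrences are kept, in input order.
(def from-vec (into [] (distinct) [3 1 3 2 1]))

;; (seq (set coll)) also removes duplicates, but it builds the whole set
;; eagerly and gives no ordering guarantee.
(def via-set (seq (set [3 1 3 2 1])))

(println from-set from-vec via-set)
```

(sequence (distinct) coll) is the lazy counterpart of the into call if a seq is wanted rather than a vector.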

JoshLemer17:02:45

I notice that there is an opportunity to reduce a lot of allocations in int-maps during iteration if we have Leaf nodes (https://github.com/clojure/data.int-map/blob/master/src/main/java/clojure/data/int_map/Nodes.java#L567) extend clojure.lang.MapEntry (since that is basically what they are anyway). Then its iterator can return this instead of new MapEntry(key, value) (https://github.com/clojure/data.int-map/blob/master/src/main/java/clojure/data/int_map/Nodes.java#L596), and similarly its reduce method can invoke on this rather than new MapEntry(key, value) (https://github.com/clojure/data.int-map/blob/master/src/main/java/clojure/data/int_map/Nodes.java#L614). On an int-map of 50k entries I observe about a 40% speedup in reduce by doing that.

JoshLemer17:02:26

a complication is that int MapEntry.count() { return 2 } clashes in both meaning and signature with long INode.count() (which should return 1). So INode.count would need to be renamed to like nodeCount() or something.

dmiller22:02:51

I don't have a way to check against the latest build of Clojure (JVM), but I'm wondering about the effects of commit https://github.com/clojure/clojure/commit/b2366fa5c748f9d600879c3e0b549e631a5b386f on LongRange chunking. Prior to this commit, LongRange had logic similar to what is still in Range for chunking: forceChunk made sure that the LongChunk it created had size no more than CHUNK_SIZE = 32. In the new code, the LongChunk will have size equal to the size of the LongRange. I'm guessing this will be a surprise to anyone trying (take 1 (map f (range 1000000))), who will likely expect f to be called at most 32 times. (I'm basing this on comparing my current install of clj = 1.11.1 against the latest ClojureCLR, which has this change. I used a side-effecting f. For clj, f is called 32 times. For ClojureCLR, well, at least I only tested it against 100 and not 1,000,000. Tracing through my code vs. the current JVM code, I don't think the problem is just on my side.)

Alex Miller (Clojure team)22:02:50

this has been changed back in master

Alex Miller (Clojure team)22:02:38

or maybe it hasn't been ok'ed yet

dmiller22:02:33

doesn't show in the repo yet. My usual assumption is that something is broken in my code. It took me a long time to think of checking the commit history on that file.

Alex Miller (Clojure team)22:02:46

actually had a missing field set so was not showing up in the right list, so thanks! ;)

dmiller22:02:30

Also didn't think to check JIRA 🙂 And all I was trying to do was come up with a final example for a blog post.
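For anyone wanting to repeat the chunking check described above, here is a rough sketch of that kind of experiment. The counter and this particular f are assumptions for illustration, not the code dmiller ran. The number of calls depends on the Clojure build: the discussion above reports 32 on clj 1.11.1.

```clojure
;; Counts how many times the mapped function actually runs.
(def calls (atom 0))

;; Hypothetical side-effecting f: bumps the counter and returns its argument.
(defn f [x]
  (swap! calls inc)
  x)

;; Only one element is requested, but map consumes the range one chunk at a
;; time, so f runs once for every element of the first chunk.
(def result (doall (take 1 (map f (range 1000000)))))

(println "result:" result "calls to f:" @calls)
```

If the first chunk spans the whole range, as described for the commit in question, the counter would end up at the size of the range rather than at CHUNK_SIZE.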