This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2023-01-07
Curious if anyone has any insight to share in this thread that might be useful/interesting https://twitter.com/quoll/status/1611001864891277312
I haven’t dug into JIRA to see if there’s any extensive murmur3 design discussion on this just yet that would cover this particular curiosity.
I don’t think there’s anything in jira but there was a doc I did with measurements on a variety of potential functions, comparing perf, hash distribution, and specific tests for common key types
That was like a decade ago, don’t think I’m going to take the time to hunt for it unless there’s some problem we’re trying to solve
The rehash on strings is to improve distribution because the default algorithm is pretty bad for that
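(An illustrative sketch of the point above, not taken from the conversation. It assumes the "default algorithm" is Java's `String.hashCode` polynomial and uses a murmur3-style 32-bit integer mix as the rehash. The function names are hypothetical and this is not a copy of Clojure's source. It shows that raw hashes of short strings sit in a narrow numeric band, while the mixed values are scattered across the 32-bit space.)

```python
import string

MASK = 0xFFFFFFFF  # keep arithmetic to 32 bits, as Java ints would

def java_string_hash(s):
    """Java-style String.hashCode: h = 31*h + char (ASCII input assumed)."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & MASK
    return h

def rotl32(x, r):
    """Rotate a 32-bit value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK

def murmur3_mix_int(k):
    """Murmur3-style mix of one 32-bit int with seed 0 (illustrative)."""
    k = (k * 0xCC9E2D51) & MASK
    k = rotl32(k, 15)
    k = (k * 0x1B873593) & MASK
    h = rotl32(k, 13)            # seed 0 xor k is just k
    h = (h * 5 + 0xE6546B64) & MASK
    h ^= 4                       # length in bytes of the input int
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK
    h ^= h >> 16
    return h

# All two-letter lowercase strings: "aa", "ab", ..., "zz"
keys = [a + b for a in string.ascii_lowercase for b in string.ascii_lowercase]
raw = [java_string_hash(k) for k in keys]
mixed = [murmur3_mix_int(h) for h in raw]

print("raw span:  ", max(raw) - min(raw))
print("mixed span:", max(mixed) - min(mixed))
```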
Java and I think Scala do a rehash in the hashed colls, and we did look at that option too
Iirc I also looked at avalanche, sip, maybe some others. Kind of hard to remember now
Yeah I seem to remember a fair bit of discussion about murmur3’s selection but maybe that was in IRC or something. As you say, it’s been a while!
The 8 queens stuff from Paul Butcher that really drove this was due to hashed keys of colls of either strings or numbers (don’t remember which), so it was a key consideration to have colls of common hashed values produce good distributions
So for numbers (which hash to themselves), it was previously trivial to have small sets create hash collisions and that was a key thing to avoid
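(A minimal sketch of the collision problem described above, not taken from the conversation. It assumes a Java-style `List.hashCode` recipe, `h = 31*h + element`, with small integers hashing to themselves. The names are hypothetical, and this is only meant to show how easily small collections of small numbers collide under that kind of scheme.)

```python
from itertools import product

MASK = 0xFFFFFFFF  # keep arithmetic to 32 bits

def java_list_hash(xs):
    """Java-style List.hashCode over ints that hash to themselves."""
    h = 1
    for x in xs:
        h = (31 * h + x) & MASK
    return h

# Two different pairs, same hash: 31*(31+0)+31 == 31*(31+1)+0 == 992
print(java_list_hash((0, 31)), java_list_hash((1, 0)))

# Every pair of small ints (0..39): far fewer distinct hashes than keys
pairs = list(product(range(40), repeat=2))
distinct = {java_list_hash(p) for p in pairs}
print(len(pairs), "pairs,", len(distinct), "distinct hashes")
```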
*nod* The notes in the equality and hashing section do a great job of explaining this imo, so again, many thanks for that
I’m sure having a good tested OSS version to absorb was part of it too. Some of that was work Rich did w/o me so I don’t know the details
thanks for all of the added context. I may quote you in the thread if that’s alright. I didn’t actively ask if there was any particular problem to be solved, but it seemed quite specific and so I kind of assumed it may be related to real work.
Well, those are my vague decade old memories, hope they are correct ;)
(Though as an aside, a hearty thank you for the historical notes section in the equality and hashing section. They are great notes.)
I think a lot of those are due to Andy Fingerhut so thanks to him
You are welcome!