clojure-dev

dominicm 2025-03-13T10:09:08.078119Z

We've encountered a fun bug in some code, whereby:

user=> (set [Float/NEGATIVE_INFINITY Double/NEGATIVE_INFINITY 0])
#{0 ##-Inf}
user=> (set [Float/NEGATIVE_INFINITY 0 Double/NEGATIVE_INFINITY])
#{0 ##-Inf ##-Inf}
I've tracked this down to the two forms of infinity sharing 0 as the starting segment of their hash while having different hashes overall, even though they're Util.equiv because both are Numbers and both are infinity. I know something is done to make 1 and 1N share a hash; should something similar happen for the infinities so that they share a hash and this doesn't happen? Happy to point at the deeper parts of the code I went through while tracking this down. This was quite informative:
user=> (defn get-segment [hash shift] (bit-and (bit-shift-right hash shift) 0x1f))
#'user/get-segment
user=> (defn hash-segments [x] (map get-segment (repeat (hash x)) (range 0 31 5)))
#'user/hash-segments
user=> (map hash-segments [Float/NEGATIVE_INFINITY Double/NEGATIVE_INFINITY 0])
((0 0 0 0 24 31 31) (0 0 0 0 31 31 31) (0 0 0 0 0 0 0))

dominicm 2025-03-13T10:15:48.086759Z

I've just spotted this happens with (float 1) and (double 1) too:

❯ clj
Clojure 1.12.0
user=> (hash (float 1))
1065353216
user=> (hash (double 1))
1072693248
user=> (defn get-segment [hash shift] (bit-and (bit-shift-right hash shift) 0x1f))
#'user/get-segment
user=> (defn hash-segments [x] (map get-segment (repeat (hash x)) (range 0 31 5)))
#'user/hash-segments
user=> (hash-segments (float 1))
(0 0 0 0 24 31 0)
user=> (hash-segments (double 1))
(0 0 0 0 31 31 0)
user=> (= (float 1) (double 1))
true
user=> (set [(float 1)])
#{1.0}
user=> (set [(float 1) 0])
#{0 1.0}
user=> (set [(float 1) 0 (double 1)])
#{0 1.0 1.0}
user=> (set [(float 1) (double 1)])
#{1.0}
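
The two hash values above can be reproduced from the raw IEEE 754 bits. This is a minimal sketch, assuming that `hash` on a Float or Double falls through to Java's `hashCode` (which matches the numbers printed on 1.12.0); `float-bits-hash` and `double-bits-hash` are hypothetical helper names, not part of Clojure:

```clojure
;; Assumption: for Float and Double, `hash` ends up at Java's hashCode.
;; Helper names are illustrative only.

(defn float-bits-hash
  "Float.hashCode is the 32-bit IEEE 754 pattern of the value."
  [x]
  (Float/floatToIntBits (float x)))

(defn double-bits-hash
  "Double.hashCode folds the high 32 bits of the 64-bit pattern onto the low 32."
  [x]
  (let [bits (Double/doubleToLongBits (double x))]
    (unchecked-int (bit-xor bits (unsigned-bit-shift-right bits 32)))))

(float-bits-hash 1)   ;; => 1065353216 (0x3F800000)
(double-bits-hash 1)  ;; => 1072693248 (0x3FF00000)
```

The two encodings use different exponent and mantissa widths, so equal values generally produce different bit patterns and therefore different hashes.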

oyakushev 2025-03-13T10:55:41.238179Z

Can be simplified to this:

(assoc (hash-map (float 1) true, 0 true)
       (double 1) true)
=> {0 true, 1.0 true, 1.0 true}

(assoc (hash-map (float 1) true)
       (double 1) true)
=> {1.0 true}
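
A rough way to see why the extra `0` key matters, assuming the hash map consumes the hash 5 bits per level as the `get-segment` helper earlier suggests. `segments` and `first-split-level` are hypothetical names used only for this sketch:

```clojure
;; Sketch only: mirrors the 5-bit segment walk from get-segment above.
(defn segments [h]
  (map #(bit-and (bit-shift-right h %) 0x1f) (range 0 31 5)))

(defn first-split-level
  "Index of the first 5-bit segment where the hashes of a and b differ, or nil."
  [a b]
  (first (keep-indexed (fn [i [sa sb]] (when (not= sa sb) i))
                       (map vector (segments (hash a)) (segments (hash b))))))

(first-split-level (float 1) 0)           ;; => 4
(first-split-level (double 1) 0)          ;; => 4
(first-split-level (float 1) (double 1))  ;; => 4
```

With only `(float 1)` in the map, `(double 1)` lands in the same top-level slot (segment 0), so the two keys are compared with `=` and treated as one key. Once `0` is present, it shares segments 0 through 3 with the float, which pushes the float down to level 4, slot 24. `(double 1)` then follows the same path to level 4 but lands in slot 31, which is empty, so it is most likely inserted without ever being compared to the float.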

dominicm 2025-03-13T11:49:21.156329Z

Yep, happens with hash maps too, as sets use hashmaps under the hood.

2025-03-13T23:05:18.668249Z

It is strongly recommended not to mix floats and doubles in Clojure collections. See, for example, this statement on https://clojure.org/guides/equality: "`hash` is consistent with = for numbers, except for special float and double values. Recommendation: Convert floats to doubles with (double x) to avoid this issue."
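
A minimal sketch of that recommendation, widening every float to a double before it goes into a collection. `normalize-number` is a hypothetical helper name, not something from Clojure or the guide:

```clojure
;; Hypothetical helper: widen floats to doubles, leave everything else alone.
;; float? is true for both Float and Double; (double x) is a no-op on a Double.
(defn normalize-number [x]
  (if (float? x) (double x) x))

(set (map normalize-number [(float 1) 0 (double 1)]))
;; => #{0 1.0}

(count (set (map normalize-number
                 [Float/NEGATIVE_INFINITY 0 Double/NEGATIVE_INFINITY])))
;; => 2
```

Note that widening a float that is not exactly representable, such as `(float 0.1)`, gives a double that differs from the literal `0.1`, so this only removes the float/double hash mismatch. It does not make approximate values equal.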

2025-03-13T23:17:09.444139Z

In particular this section: https://clojure.org/guides/equality#_other_cases_of_hash_inconsistent_with

2025-03-13T23:24:03.868189Z

Even if you restrict yourself to double type only, if you expect to be able to do arithmetic on doubles and then look up the results as a key in some keyed data structure like a set or a map, there are things like this (which you may already be aware of, due to the approximate nature of most floating point numbers):

2025-03-13T23:24:05.913419Z

user=> (= (+ 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1 0.1) 1.0)
false
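
For direct comparisons of computed doubles, a tolerance check is a common workaround. This is a sketch only; `approx=` is a hypothetical helper and the tolerance is an arbitrary choice for the example:

```clojure
;; Hypothetical helper: true when a and b differ by at most `tolerance`.
(defn approx= [a b tolerance]
  (<= (Math/abs (- (double a) (double b))) tolerance))

(def total (reduce + (repeat 10 0.1)))

(= total 1.0)             ;; => false
(approx= total 1.0 1e-9)  ;; => true
```

This does not help with set or map lookups, which still use `=` and `hash` on the exact value.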

oyakushev 2025-03-13T23:35:03.866809Z

Sure, anything related to floating-point and equality goes under UB in my book. But the behavior discovered by Dominic is still peculiar, even if only for curiosity reasons.

2025-03-13T23:37:40.125489Z

Agreed, it is peculiar. Agreed also that it might not be widely known, and that it is unexpected for many on their first encounter with it. It seems very likely to remain part of Clojure's behavior.