Fork me on GitHub
#clojure-dev
<
2019-04-12
>
slipset12:04:50

So over in #clojure, I’ve been discussing with myself why (sort - [1 3123 6]) work predicably as long as - doesn’t return a number larger or smaller than an integer.

slipset12:04:22

It basically comes down to this line here:

slipset12:04:40

Would a ticket for making this work for the case where n is bigger/smaller than an int be worth it?

slipset12:04:27

Here’s the code that started my investigation:

slipset12:04:32

user=> (sort - [23413461236412630 10000000000000000 1 0 -1])
(-1 0 1 23413461236412630 10000000000000000)

gfredericks13:04:25

@slipset I'm not sure there's a reasonable way to "fix" that; - is just not a good comparator function

gfredericks13:04:53

the Comparator interface defines the return value as an int, so I'm not sure what else you would do there

dpsutton13:04:02

public int compare(Object obj1, Object obj2): is the interface for comparable.

gfredericks13:04:20

to make that sort work you'd have to expand sort so that it takes some broader definition of a comparator

gfredericks13:04:48

I doubt there's a compelling use case for that kind of change; I'm getting a headache just trying to think about what (sort - ...) is supposed to do

gfredericks13:04:11

I think it's supposed to be the same as (sort ...)?

slipset13:04:27

@gfredericks thing is all fns in Clojure implement Comparator.

gfredericks13:04:07

@slipset sure, but in general a function could be even less appropriate for use as a Comparator than - is

gfredericks13:04:14

e.g., hash-map also implements Comparator

gfredericks13:04:29

would you want there to be some way of using hash-map with sort also? where do you draw the line?

gfredericks13:04:03

I don't think functions implement Comparator because they're expected to all be useful comparators, but just so that when you use a jvm interface that requires a comparator you don't have to reify it, you can just use an appropriate function

gfredericks13:04:51

all functions are also Runnable, even though only functions with 0-arg arities can have .run called without an exception

slipset13:04:23

The thing is that passing str as a comparator throws, whereas passing - sometimes give the wrong results.

gfredericks13:04:16

Are you concerned that somebody might be tempted to use -?

gfredericks13:04:57

Was it a situation where you were programmatically picking a comparator and you used - as the comparator for normal sorting order?

slipset13:04:42

Of course I should have reached for compare, but - holds the contracts when applied to integers.

gfredericks13:04:26

the arg contracts, maybe; not the return contract, apparently 🙂

gfredericks13:04:13

there's also (comparator <)

slipset13:04:14

I think that - over Integers returns an int?

gfredericks13:04:36

yeah but over larger numbers it can return larger numbers, as in this case

gfredericks13:04:55

actually I think you could overflow ints using -

gfredericks13:04:06

(- Integer/MIN_VALUE Integer/MAX_VALUE)

gfredericks13:04:19

or just (- Integer/MIN_VALUE 1)

gfredericks13:04:53

I agree it's an easy mistake to make; the only idea I can think of would be a check in the .compare impl that throws an exception if the return value can't be losslessly cast to an int

gfredericks13:04:06

I'm not sure if that would be prohibitive performance-wise though

slipset13:04:33

You could do that, or you could check if the Number that was returned was less than, equal to, or greater than 0

gfredericks13:04:56

and then return -1,0,1 you mean?

slipset13:04:12

But java.lang.Number doesn’t have the required functions it seems, so you’d have to figure out the type and coerce it correctly and so on. Probably not worth it 🙂

gfredericks13:04:31

yeah, that's not terrible actually; it's just subject to A) performance B) could it possibly break anything relying on the current behavior an alternative that's a bit easier on B is to only change the return value if it's out of range; if it's less than Integer/MIN_VALUE, you return Integer/MIN_VALUE, and similarly on the positive side

gfredericks13:04:32

what if you called .longValue

gfredericks13:04:42

I wonder if that can ever flip the sign of a number the way .intValue does

gfredericks13:04:52

.doubleValue could be used also

gfredericks13:04:55

apparently .longValue can

user=> (.longValue -139783912983471927356912387921386334)
6181708512281126050

gfredericks13:04:50

this is promising w.r.t. .doubleValue

user=> (.doubleValue -139378391298347192735691238792138633782139865127983479821395627895189234198273356812756812735819723567812873919356981235178904354893251237851298356123897561235879261389751287356182973589172369812758917236589712365871234523134443289689238945934327896327887888888888888888888888888888888888888888882389572983478927893562398578929834652394876234987982345238945)
##-Inf
user=> (.doubleValue 139378391298347192735691238792138633782139865127983479821395627895189234198273356812756812735819723567812873919356981235178904354893251237851298356123897561235879261389751287356182973589172369812758917236589712365871234523134443289689238945934327896327887888888888888888888888888888888888888888882389572983478927893562398578929834652394876234987982345238945)
##Inf

gfredericks13:04:28

notably the methods on Number mention "truncation" for every method except .doubleValue and .floatValue

gfredericks13:04:36

for those they only mention "rounding"

gfredericks13:04:38

ratio & bigdec seem to work right also

gfredericks13:04:22

so it seems like it doesn't have any logical problems; so performance would be the big thing

slipset13:04:35

Could probably use clojure.lang.Numbers functions for this

gfredericks13:04:45

I guess if you get a NaN you should throw an exception

gfredericks13:04:22

actually (.intValue ##NaN) returns 0 apparently 😄

😬 8
souenzzo21:04:40

javascript detected

gfredericks13:04:59

I wonder if that's in the floating point spec or just something the jvm authors made up

slipset13:04:23

Funny how deep the rabbit holes become once you start digging…

andy.fingerhut14:04:44

@slipset I would suggest reading the Clojure guide article I wrote on comparators. It explicitly warns against using - in any way for a comparator function, because the Java return type is a 32-bit int. There are very special cases where it is safe, but fewer than the cases where it is a subtle bug waiting to happen: https://clojure.org/guides/comparators

andy.fingerhut14:04:41

If you want descending numeric order for a sort or sorted collection, #(compare %2 %1) is about the safest thing I know

andy.fingerhut14:04:36

And avoids the Clojure/JVM possible performance issue of using > where it calls > twice if the two arguments are equal.

andy.fingerhut14:04:58

which is not an issue if you have none or a small fraction of equal elements in the things to be sorted.

slipset14:04:25

Nice writeup. Seems like you’ve covered most of what I’ve learned today 🙂

andy.fingerhut14:04:47

Please suggest additions if you learn new things 🙂

andy.fingerhut14:04:48

It probably wouldn't hurt to mention ##NaN's in that article somewhere.

slipset14:04:08

Will do. I understand and accept why most things are as they are wrt to comparators, the only thing I guess I don’t fully accept is that we silently get wrong results sometimes when using - as a comparator.

andy.fingerhut14:04:44

You mean, why does Clojure call .intValue and not do any kind of bounds checking?

slipset14:04:59

But I guess that falls into the same category as the discussion we’ve had multiple times regarding how the various set functions should work for non-valid inputs.

slipset14:04:16

I guess the answer is that it works for correct programs 🙂

slipset14:04:27

and for uncorrect programs, it’s undefined.

andy.fingerhut14:04:45

For all I know there is a 10 year old Clojure Google group message covering that topic, but I haven't gone looking for one.

andy.fingerhut14:04:35

When you have the source code and an open source license, you don't have to fully accept anything as it is. That said, I fully deeply understand that having a locally modified version is often not the right thing.

slipset14:04:23

The other benefit of OSS is that I could read the code and figure out why it behaved like this.

andy.fingerhut14:04:05

Definitely. It doesn't always answer "why?", but at least does answer "what?"