#clojure-dev
2019-01-26
slipset10:01:47

Really appreciate this, as I’m preparing for https://2019.flatmap.no/talks/assum

borkdude19:01:15

is clojure directly linked for performance, or also so third-party libs can’t mess with clojure internally?

Alex Miller (Clojure team)21:01:19

We’re cool with messing :)

andy.fingerhut19:01:09

I believe the motivation was primarily the small but measurable performance improvements.

andy.fingerhut19:01:54

The fact that things like tools.trace no longer work for core functions calling other core functions seems like primarily a disadvantage to me, but it is pretty easy to build a non-direct-linked version of Clojure if you want to enable that.

Alex Miller (Clojure team)21:01:52

We build and publish the slim one for that so no need to build

andy.fingerhut19:01:28

caveat: I'm going from memory and observation-from-a-distance here, not actual knowledge of the full reasons.

borkdude19:01:09

The reason I’m asking: I’m running into an issue with instrumenting hash-map, which doesn’t work in Clojure, because spec uses binding in spec-checking-fn and that call is expanded into a non-directly-linked call to hash-map, which then results in a loop. A solution would be to inline the hash-map call. If the argument was that third-party libs can’t mess with var redefs, then that would align with that change and there would be an extra motivation. e.g.

$ clj
Clojure 1.10.0
user=> (binding [*print-length* 1] (println (range 10)))
(0 ...)
nil
user=> (alter-var-root #'clojure.core/hash-map (fn [v] (fn [& kvs])))
#object[user$eval138$fn__139$fn__140 0x5f7b97da "user$eval138$fn__139$fn__140@5f7b97da"]
user=> (binding [*print-length* 1] (println (range 10)))
Execution error (NullPointerException) at user/eval144 (REPL:1).
null

andy.fingerhut20:01:17

Would building your own non-direct-linked version of Clojure help you with your current purposes, even if others did not have that version of Clojure readily available via Maven/Clojars?

borkdude20:01:32

no, it would make this issue worse 🙂

borkdude20:01:06

the point here is that the call to hash-map is not happening against the directly linked version

borkdude20:01:33

if you write (binding [*print-length* 1] ...) there is going to be a call to the var #'hash-map

andy.fingerhut20:01:34

Oh, because hash-map is used in the implementation of spec itself?

borkdude20:01:56

because binding is used in the implementation of spec (spec-checking-fn)

andy.fingerhut20:01:18

Yeah, using spec / trace / etc. on functions that are themselves low enough in the implementation that the implementations of those mechanisms rely upon them makes my brain hurt. I know you press on sometimes in spite of that hurt 🙂

borkdude21:01:59

I found a nice performance improvement in binding but it didn’t solve the issue:

user=> (time (dotimes [_ 10000000] (binding [*print-length* 10])))
"Elapsed time: 3754.649969 msecs"
nil
user=> (time (dotimes [_ 10000000] (binding* [*print-length* 10])))
"Elapsed time: 1356.306343 msecs"

borkdude21:01:57

If you’re interested, the speedup comes from creating the hash-map in binding at macro-expansion time
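
The idea of building the map at macro-expansion time can be sketched as follows. This is an illustration only: the log does not show the actual binding*, so the macro body below is an assumption. Where clojure.core/binding expands into a (hash-map ...) call, this version emits a map literal, so the expansion contains no call through the var #'hash-map.

```clojure
;; Hypothetical sketch: the name binding* comes from the log, the body is assumed.
(defmacro binding*
  [bindings & body]
  (assert (vector? bindings) "binding* needs a vector")
  (assert (even? (count bindings)) "binding* needs an even number of forms")
  (let [;; Built while the macro expands: keys are (var sym) forms,
        ;; values are the still-unevaluated value expressions.
        var-map (into {} (map (fn [[sym v]] [(list 'var sym) v])
                              (partition 2 bindings)))]
    `(do
       ;; var-map is emitted as a map literal, so no (hash-map ...) call
       ;; appears in the expansion. The value expressions are evaluated
       ;; in the iteration order of this map.
       (push-thread-bindings ~var-map)
       (try
         ~@body
         (finally
           (pop-thread-bindings))))))

;; Usage: the body sees the new thread-local value.
(binding* [*print-length* 10]
  *print-length*)
```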

borkdude21:01:06

but I’m no longer sure if I diagnosed the issue correctly, since that didn’t help 🙂 the perf improvement could be worth a ticket maybe

borkdude21:01:10

Oh, wait, it does solve the issue, but it’s not the only one

hiredman22:01:24

it has the potential to execute side effecting expressions in a different order

borkdude22:01:00

it being what?

hiredman22:01:22

constructing a hash map at macro expansion time

borkdude22:01:42

not something code should rely on I think

hiredman22:01:00

sure, but it is a potential backwards incompatibility to make a judgement call about

borkdude22:01:17

I wonder who is using side-effects in binding and relying on the order.

andy.fingerhut22:01:04

When clojure.core/hash changed from Clojure 1.5.0 to Clojure 1.6.0, and thus the order of return for seq changed for hash-based maps and sets, there was a smattering of example-based tests on projects, including some of Clojure's, that needed updating because they began failing.

andy.fingerhut22:01:47

I do not recall any of them intentionally relying on the order, but they had repeatable results as long as the hash function remained the same.

andy.fingerhut22:01:34

For the insanely curious, here was one of those commits to restore some at-that-time-recently-deleted-tests: https://github.com/clojure/clojure/commit/91dd867b4229a31d4d915aece97f41b3811cf4d4

borkdude22:01:34

@andy.fingerhut yes, I remember that. I also had to deal with that. We used a datomic query function that relied on the order of map-entries, it was very bad to begin with and our pain was deserved.

andy.fingerhut22:01:49

binding is probably not as pervasively used as seq on maps and sets, but where it is used, perhaps that is the kind of change in behavior hiredman is raising a flag about.

borkdude23:01:15

why is/can the order of map-entries be different at compile time btw?

hiredman23:01:38

because the iteration order on clojure's hash maps is based on the trie structure, which is based on the hash of the key, which is, to the degree that it is a good hash function, random

hiredman23:01:04

so when you write code in some order in a text file, and a macro or the reader turns it into a hash map, then hands it to the compiler, the compiler will iterate it in some order based on the hash, not the order in the text file
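
A small sketch of that point, reading map literals from text and inspecting what the reader returns. The sample keys are made up for illustration. The size at which a literal stops being an array map (more than 8 entries in Clojure 1.10) is an implementation detail, not a documented guarantee.

```clojure
;; Ten entries, written in alphabetical order in the source text.
(def text "{:a 1 :b 2 :c 3 :d 4 :e 5 :f 6 :g 7 :h 8 :i 9 :j 10}")

;; For a literal of this size the reader returns a hash map.
(def m (read-string text))

(println (class m))
;; Iteration follows the hash trie, not the order in the text.
(println (keys m))

;; A small literal is read as an array map, which keeps the written order.
(def small (read-string "{:b 1 :a 2}"))

(println (class small))
(println (keys small))
```

Anything that walks the larger map, such as a compiler emitting code for each entry, visits the entries in hash order, which need not match the order in which they were typed.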

borkdude23:01:05

@hiredman I’m aware that side effects can happen in a different order than written down. E.g.

#{(do (println 1) 1) (do (println 2) 2) (do (println 3) 3)}
1
3
2
#{1 3 2}
My question was about the difference in order between creating the map at compile time and at run time in binding. Can you give an example where this would matter?

borkdude23:01:47

(I’m afk now… late here)