#clojure
2022-11-26
henrik01:11:58

I was looking through the old transducer talks, and Rich made a point that the non-transducer arities of functions like map, filter, etc. could essentially be implemented with a call to the transducer arity of the same function. This is not how they ended up implemented in Clojure core, though. Why is this? Do they incur costs not worth paying when not comped with other transducers?
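For illustration, here is a minimal sketch of what "calling the transducer arity" could look like. The names `my-map` and `my-filter` are hypothetical and this is not how clojure.core defines `map` and `filter`; it only assumes that `sequence` is an acceptable way to get a lazy result from a transducer.

```clojure
;; Hypothetical wrappers, NOT the clojure.core implementation.
;; Each collection arity simply delegates to the transducer arity.
(defn my-map
  [f coll]
  (sequence (map f) coll))

(defn my-filter
  [pred coll]
  (sequence (filter pred) coll))

(my-map inc [1 2 3])           ;; => (2 3 4)
(my-filter even? (range 10))   ;; => (0 2 4 6 8)
(take 3 (my-map inc (range)))  ;; => (1 2 3), works on an infinite input
```

Note that the single-collection case is the only one sketched here; realization (chunking) behaviour may differ from the core functions.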

seancorfield01:11:21

Probably because it was simpler (and safer) to keep the existing implementation, rather than rewrite all of that code?

henrik01:11:21

Fair enough; safer. Simpler, is it really? Slightly different code in two places executing something that hopefully ends up being the same in the end. Surely it would be simpler to just issue the call to the other arity?

seancorfield01:11:04

Sorry, I meant the process would be simpler: only adding the new arity and not touching the existing code.

skylize01:11:52

Transducers are inherently greedy. And the core functions are mostly inherently lazy. I would expect some difficulty making those congruent. No?

seancorfield01:11:58

(map f coll) = (sequence (map f) coll) is lazy.

henrik01:11:55

And you can do some funky things with eduction to make it super-lazy (and skip chunking)

henrik01:11:11

I think that’s one of the really neat things about transducers: they can be lazy or greedy as required, given the context they’re used in.
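A small sketch of that point: the same transducer applied in an eager and in a lazy context. The name `xf` is just an illustrative binding.

```clojure
;; One transducer, reused in three different contexts.
(def xf (comp (map inc) (filter even?)))

(into [] xf (range 10))         ;; eager, builds a vector => [2 4 6 8 10]
(transduce xf + 0 (range 10))   ;; eager, reduces to a value => 30
(take 2 (sequence xf (range)))  ;; lazy, infinite input is fine => (2 4)
```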

seancorfield01:11:35

eduction produces a reducible, not a lazy sequence, but yeah eduction is an interesting one...

skylize01:11:36

I definitely did not know that.

henrik01:11:26

No, you’re right. I mean, you can make it behave lazily. I’ll whip up an example.

hiredman01:11:37

eduction doesn't cache values
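A quick way to see this at the REPL, as a sketch: the `calls` atom and the name `ed` are only there to count how often the mapping function runs.

```clojure
;; Counter used only to observe how many times the mapping fn is invoked.
(def calls (atom 0))

(def ed
  (eduction (map (fn [n] (swap! calls inc) (inc n)))
            [1 2 3]))

(seq? ed)        ;; => false, it is a reducible/iterable, not a seq
(reduce + 0 ed)  ;; => 9
(reduce + 0 ed)  ;; => 9, the transformation runs again from scratch
@calls           ;; => 6, three calls per reduction, nothing was cached
```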

seancorfield01:11:53

It's probably worth noting that the chunking behavior of sequence does seem to match the core lazy functions tho'...

skylize01:11:40

Don't the core functions do chunking too? How does it differ from sequence?

henrik01:11:04

(defn print-inc
  [n]
  (println "inc" n)
  (inc n))


(defn print-even?
  [n]
  (println "even?" n)
  (even? n))


(->> (range 1000)
  (map print-inc)
  (filter print-even?)
  (take 2))
;; 64 printlns


(sequence
  (eduction
    (map print-inc)
    (filter print-even?)
    (take 2)
    (range 1000)))
;; 8 printlns

hiredman01:11:07

sequence constructs an iterator then builds chunking and a seq on top of the iterator

hiredman01:11:31

So it doesn't chunk based on the chunking of the incoming seq

seancorfield01:11:31

(take 2 (filter #(do (println "n" %) (even? %)) (range 100)))
prints up to n 31 but
(take 2 (sequence (filter #(do (println "n" %) (even? %))) (range 100))) 
prints up to n 64

skylize01:11:31

@U06B8J0AJ In addition to Sean's first answer, this ^ likely also contributed to the decision not to reimplement. Some potential for breakage of code built around old chunking size.

henrik01:11:58

Well, if you skip eduction, the behaviour is identical. Edit: no, I seem to be wrong.

(sequence (comp
            (map print-inc)
            (filter print-even?)
            (take 2))
  (range 1000))
;; 8 printlns

seancorfield01:11:06

(transduce (take 2) conj [] (eduction (filter #(do (println "n" %) (even? %))) (range 100)))
only prints up to n 2

seancorfield01:11:57

Code shouldn't rely on chunking size -- it's an implementation detail 🙂

henrik01:11:51

Yeah, indeed

hiredman01:11:32

Reducing over range directly skips creating a seq so there is no chunking

hiredman01:11:47

Things get weird combining a chunked seq like from a vector with sequence, because the vector seq and the iterator seq from sequence could have differing ideas about chunks

skylize01:11:58

Core Team won't guarantee the implementation details, but they definitely still seem to weigh such breakage potential as a meaningful factor regardless.

henrik01:11:40

Yeah, possibly. So we have two theories:
• Minimum amount of update to the codebase needed, and
• Keep some implementation details from changing

skylize01:11:31

Those aren't conflicting theories, if that's not clear. I would bet the first one held more weight.

henrik01:11:25

No, could be either or both, or some unknown third case. “Couldn’t be arsed”, “the existing code has a special place in my heart”, etc.

henrik01:11:17

Or perhaps, performance characteristics that would change slightly, but unnecessarily.

henrik01:11:27

I wonder if Clojure would have looked slightly different if protocols and transducers had been there from the beginning.

seancorfield01:11:00

I think Rich has said "yes" about that at some point...?

seancorfield01:11:19

(def. about protocols -- but I think about transducers too)

henrik02:11:20

Aha! How uncharacteristic. I know from the anniversary chat that he doesn’t subscribe to the many-worlds theory 🙂

seancorfield02:11:27

Well, the implementation would be different. The language would be largely identical I suspect.

seancorfield02:11:54

There would probably be less Java and more Clojure if protocols were there from the start.

henrik02:11:04

Even if the language wouldn’t change, it might have had an impact on codebases. We might have seen more extends around. And, I guess, fewer ->>s if transducers were imprinted on users early on.

henrik02:11:57

It’d probably be a mixed bag, just like now.

Carlo09:11:00

I'm starting a project with this deps.edn file:

{:deps {io.github.nextjournal/clerk {:mvn/version "0.11.603"}
        link.szabo.mauricio/spock {:mvn/version "0.1.1-SNAPSHOT"}}
 :paths ["resources/jpl.jar"]
 :aliases {:prolog {:jvm-opts ["-Djava.library.path=resources"]}}}
and when I fire up the repl I get the warning:
WARNING: Use of :paths external to the project has been deprecated, please remove: resources/jpl.jar
how could I solve it while maintaining the resources/jpl.jar file?

p-himik09:11:16

Do you want to simply add that jar to the classpath? If so, you can use :local/root within :deps: https://clojure.org/guides/deps_and_cli#local_jar

🙌 1
Carlo09:11:24

Awesome suggestion as usual @U2FRKM4TW! This is what worked in the end:

{:paths ["src"]
 :deps {io.github.nextjournal/clerk {:mvn/version "0.11.603"}
        link.szabo.mauricio/spock {:mvn/version "0.1.1-SNAPSHOT"}
        jpl/jpl {:local/root "./resources/jpl.jar"}}
 :aliases {:prolog {:jvm-opts ["-Djava.library.path=./resources"]}}}

👍 1
Carlo09:11:11

also pinging @U3Y18N0UC as this is probably worthy of inclusion in the spock readme