#clojure
2019-10-01
jumar06:10:51

I'm still puzzled by eduction and what others claim about it. Both Alex Miller ("eduction is pretty specialized in that it is a delayed eager evaluation") and Chouser (http://chouser.n01se.net/apply-concat/ - at the very end) seem to suggest that the whole sequence is realized once you ask for the very first element. However, my experiments suggest something else (only the first chunk is realized): https://github.com/jumarko/clojure-experiments/blob/master/src/clojure_experiments/transducers/eduction.clj#L37-L42 Am I missing something?

jaihindhreddy07:10:30

The source of clojure.core.Eduction clarifies what's happening here a bit.

(deftype Eduction [xform coll]
   Iterable
   (iterator [_]
     (clojure.lang.TransformerIterator/create xform (clojure.lang.RT/iter coll)))

   clojure.lang.IReduceInit
   (reduce [_ f init]
     ;; NB (completing f) isolates completion of inner rf from outer rf
     (transduce xform (completing f) init coll))

   clojure.lang.Sequential)
eduction gives us something that implements Iterable (and is thereby seqable) and IReduceInit. If you treat it as a seq, then the work is done lazily in chunks, but if you just educe it some more, all the transformations fold together, and still no work is done.

jaihindhreddy07:10:48

@jumar You also might wanna re-watch the "Transducers" talk at Strange Loop where eduction was implemented by using sequence for the seq part and transduce for the IReduceInit part (as it still is...)

jaihindhreddy07:10:39

Does this help?

user=> (def xf (map #(do (println %) (inc %))))
#'user/xf
user=> (def e (eduction xf (range 5)))
#'user/e
user=> (def s (seq e))
0
#'user/s
user=> (first s)
1
2
3
4
1
user=> s
(1 2 3 4 5)
user=> s
(1 2 3 4 5)
user=> (def more-e (eduction xf e))
#'user/more-e
user=> (into [] more-e)
0
1
1
2
2
3
3
4
4
5
[2 3 4 5 6]
user=> (into [] more-e)
0
1
1
2
2
3
3
4
4
5
[2 3 4 5 6]
user=> 

jaihindhreddy07:10:53

Subsequent uses of s don't do the work again and again because s is a lazy seq, which caches what it has realized, whereas each use of IReduceInit on an eduction does the work again.
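A minimal sketch of that difference, counting calls with an atom instead of printing. The names `calls`, `counting-xf` and `ed` are made up for this illustration; only `eduction`, `into`, `seq` and `doall` come from clojure.core.

```clojure
;; Hypothetical names for illustration only.
(def calls (atom 0))

;; A transducer that counts how many inputs it has transformed.
(def counting-xf (map (fn [x] (swap! calls inc) (inc x))))

(def ed (eduction counting-xf (range 5)))

;; Reducing goes through IReduceInit and re-runs the xform on every pass.
(def first-pass (into [] ed))
(def second-pass (into [] ed))
(def calls-after-reducing @calls) ; 5 inputs, 2 passes

;; A seq obtained once and held onto caches what it has realized.
(def cached (seq ed))
(doall cached)
(doall cached)
(def calls-after-seq (- @calls calls-after-reducing)) ; one more pass only
```

The second `doall` walks the already-realized seq, so the counter does not move.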

jumar07:10:18

Thanks for the explanation but I'm not sure I'm following. Here we are talking about eduction, when used as IReduceInit (via into), doing the work again and again (which is basically eduction's contract). However, I'm talking about the impression I get from the comments above, which claim that when you realize just the first element you somehow get the whole thing realized. Is that because I treat it as a lazy seq via first? What else should I be doing to "realize only the first element" yet get the whole thing realized? into is obviously going to realize the whole thing, so I'm not sure how that is different from sequence or any other transducer-related stuff.

jumar07:10:44

@U883WCP5Z any insights into this part?

jaihindhreddy08:10:57

sequence is not lazy in the output production, it's lazy in the input consumption. This is because transducers can be expansive. For example, if we have a mapcat in our transducer, consuming one input may produce multiple outputs, so all those will get realized.
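A rough sketch of the expansive case, assuming a made-up transducer `expanding-xf` that turns each input into three outputs. Two hypothetical counters record how many inputs were consumed and how many outputs were produced by the time `first` returns. Exact numbers depend on chunking, so the comments only state lower bounds.

```clojure
;; Hypothetical counters and transducer, for illustration only.
(def inputs-seen (atom 0))
(def outputs-made (atom 0))

(def expanding-xf
  (comp (mapcat (fn [x] (swap! inputs-seen inc) [x x x])) ; 1 input -> 3 outputs
        (map (fn [x] (swap! outputs-made inc) x))))       ; counts each output

;; Asking for one element consumes at least one input,
;; and that single input already yields three outputs.
(def first-out (first (sequence expanding-xf [:a :b])))
```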

jaihindhreddy08:10:28

Transducers can also be reductive, so a filter in the transducer means putting one input in may give us no output, so the impl of sequence and the seqable part of eduction, when asked to realize one element (by calling first perhaps) will keep feeding inputs until an element is realized. But more than one may be realized, because of expansive transducers.
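The reductive case can be sketched the same way. `fed` and `picky-xf` are hypothetical names; the counter sits in front of the filter so it records every input that was fed in, whether or not it survived.

```clojure
;; Hypothetical names, for illustration only.
(def fed (atom 0))

(def picky-xf
  (comp (map (fn [x] (swap! fed inc) x)) ; counts every input fed in
        (filter #(> % 2))))              ; drops 1 and 2

;; To produce a first element, inputs 1, 2 and 3 must all be fed in.
;; Chunking may pull more than that, so 3 is only a lower bound.
(def first-kept (first (sequence picky-xf [1 2 3 4])))
```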

jumar08:10:08

Right, I think I got this part. But still, my original question is about eduction, and that people claim it should consume all the input elements once asked for the first element (as Alex called it, "delayed eager evaluation"). But that's not what I observe. Or perhaps I'm misinterpreting them?

jaihindhreddy08:10:30

"delayed eager evaluation" correctly describes the IReduceInit part of eductions. We can keep creating more eductions from eductions, and all that happens is the recipes are combined, not executed. That's the "delayed" part. The "eager" part applies when the eduction is used via IReduceInit; if seq is called on it instead, then it isn't eager.

jaihindhreddy09:10:43

So Eductions do provide "delayed eager evaluation" but also "delayed lazy evaluation" by being seqable.
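A small sketch of the "delayed" part, assuming a hypothetical helper `counting` that wraps a function in a counting `map` transducer. Stacking one eduction on another runs nothing; the work happens only when the result is reduced.

```clojure
;; Hypothetical helper and names, for illustration only.
(def work-done (atom 0))

(defn counting
  "Returns a map transducer that applies f and counts each call."
  [f]
  (map (fn [x] (swap! work-done inc) (f x))))

(def base (eduction (counting inc) (range 3)))
(def stacked (eduction (counting #(* 2 %)) base))

(def work-before @work-done)       ; 0: building eductions runs nothing

(def total (reduce + 0 stacked))   ; (0 1 2) -> (1 2 3) -> (2 4 6) -> 12

(def work-after @work-done)        ; 6: each stage ran once per element
```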

jumar09:10:57

Hmm, that makes sense but is it then different from sequence (apart from the caching stuff)?

jumar18:10:03

Maybe @U064X3EF3 could shed some light on this; that is, how the "delayed eager evaluation" was meant.

Alex Miller (Clojure team)19:10:35

maybe I shouldn't have said "eager" there

Alex Miller (Clojure team)19:10:36

it's a delayed reduction, how eager it is depends on what you do with it

jumar19:10:43

Ah, okay - that makes more sense 🙂. Thanks!

jumar07:10:03

Btw. it was a good point with seq and subsequent calls not re-realizing the "collection" backed by eduction, but when I just try via first two times then it's being realized two times:

(let [xs (eduction (map #(doto % print inc)) my-seq)]
  (first xs)
  (first xs))
;;=> prints: 
0123456789101112131415161718192021222324252627282930313201234567891011121314151617181920212223242526272829303132

jumar07:10:47

compared to sequence which realizes only once:

(let [xs (sequence (map #(doto % print inc)) my-seq)]
  (first xs)
  (first xs))
;; prints:
01234567891011121314151617181920212223242526272829303132

jaihindhreddy08:10:05

Each time we call seq on eduction, the work is done again. eduction itself isn't a seq, it's seqable, meaning we can call seq and get back an ISeq, and each time we do this on an eduction, we do the xform again.

jaihindhreddy08:10:14

Try this:

(let [xs (seq (eduction (map #(doto % print inc)) my-seq))]
  (first xs)
  (first xs))

rickmoynihan08:10:11

> Btw. it was a good point with seq and subsequent calls not re-realizing the “collection” backed by eduction, but when I just try via first two times then it’s being realized two times:
The reason for this is that (do (first e) (first e)) is essentially (do (first (seq e)) (first (seq e))), i.e. you’re throwing away the intermediate lazy seq. It’s different from (let [s (seq e)] (first s) (first s)), as there you hold onto the lazy seq for the second call to first and benefit from the caching the lazy seq provides.
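The same point as a sketch with a counter in place of the prints. `runs` and `ed` are hypothetical names, and the input is kept to three elements so that it fits in a single chunk.

```clojure
;; Hypothetical names, for illustration only.
(def runs (atom 0))
(def ed (eduction (map (fn [x] (swap! runs inc) x)) [10 20 30]))

;; Each call to first builds (and then discards) a fresh seq over the eduction.
(first ed)
(first ed)
(def runs-without-holding @runs)

(reset! runs 0)

;; Holding the seq lets the second call reuse the cached elements.
(let [s (seq ed)]
  (first s)
  (first s))
(def runs-with-holding @runs)
```

With this small input the first variant does twice the work of the second.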

Prateek Khatri10:10:50

Using https://github.com/plumatic/schema, can I create a schema which has an OR condition on a map’s keyword? For example, I want to make sure that only {:handler {s/Keyword (s/pred #(fn? %))}} or {:handler-fn {s/Keyword (s/pred #(fn? %))}} should be allowed. This basically says that only :handler or :handler-fn is allowed in this map’s first key. What would a schema look like in this case?

lispyclouds10:10:23

@UM6UT4D5L this works?

(s/defschema MySchema (s/either {:handler {s/Keyword (s/pred #(fn? %))}}
                                 {:handler-fn {s/Keyword (s/pred #(fn? %))}}))

lispyclouds10:10:44

(s/validate MySchema {:handler-fn {:key1 identity}})

Prateek Khatri11:10:10

@U7ERLH6JX Thanks! It worked 🙂

carocad19:10:13

hey guys, quick question. I am doing some clojure parsing and just came across the case of the keyword :/. It is hard-coded in the test.check library but from what I understand from both edn format spec and clojure reader website this should not be a valid keyword. Is this a bug or are the docs outdated ? :thinking_face:

skuttleman19:10:43

According to the reader spec, "Keywords are like symbols", and / is a valid symbol: https://clojure.org/reference/reader Maybe that's why?

carocad19:10:36

mmmm maybe I am misreading this edn spec?
> / by itself is a legal symbol, but otherwise neither the prefix nor the name part can be empty when the symbol contains /.

carocad19:10:11

so / is legal but he/ is not ?

andy.fingerhut19:10:49

I believe the quick answer is that there are multiple functions for reading Clojure code and edn, and they are not all identical in what they permit vs. what they do not permit. If you want a bright shiny line between legal vs. not legal, you will find implementations that differ slightly, and the English specs are slightly ambiguous in some of these corner cases.

andy.fingerhut19:10:19

When you say that the keyword :/ is hard coded in the test.check library, I am not sure what you mean. Do you mean that keyword appears in test.check's source code?

andy.fingerhut19:10:28

Clojure's built-in reader accepts :/ as a keyword, apparently. Not something I had ever tried before.
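This is easy to check at a REPL. The sketch below only exercises `clojure.core/read-string` and `keyword`; it makes no claim about other readers, which, as noted above, may differ. `slash-kw` is a made-up name.

```clojure
;; Hypothetical name, for illustration only.
(def slash-kw (read-string ":/"))

(def is-keyword (keyword? slash-kw))  ; the core reader yields a keyword
(def kw-namespace (namespace slash-kw)) ; no namespace part
(def kw-name (name slash-kw))           ; the name is the slash itself

;; The same keyword can also be built programmatically.
(def same-as-built (= slash-kw (keyword "/")))
```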

carocad19:10:18

this is what I meant with hard-coded https://github.com/clojure/test.check/blob/master/src/main/clojure/clojure/test/check/generators.cljc#L1480 It seems that it was made on purpose since it cannot be generated the “standard way” or at least it would be quite difficult.

noisesmith19:10:35

what does "cannot be generated the standard way" mean?

noisesmith19:10:43

based on reading the other code, adding a special case to frequency seems to ensure known degenerate cases are exercised

Alex Miller (Clojure team)19:10:08

Gary and I had an extended conversation about this. Unfortunately, I don't remember any of it. :)

Alex Miller (Clojure team)19:10:42

but he probably does, and there may even be a test.check ticket about it

Alex Miller (Clojure team)19:10:54

or at least check the blame for the commit

andy.fingerhut19:10:12

That case was added to test.check generator with this commit: https://github.com/clojure/test.check/commit/21936eed5b4f3b04e9791697a2bae21115254467 . Commit message mentions adding it explicitly, but not sure if there is any further recording of details of why.

andy.fingerhut19:10:44

test.check has code for programmatically and pseudo-randomly generating a large variety of what it considers valid data, and probably :/ was not easily generated via the "general case" code, so he added a special case to make sure it was generated with a minimum frequency, which from the code I would guess is at least 1/(100+1) fraction of the time, but that is only my guess there.

andy.fingerhut20:10:19

It is also not clear to me how, from reading this page https://clojure.org/reference/reader, you conclude that :/ is not a valid keyword.

andy.fingerhut20:10:51

But as I mentioned earlier, the English is likely to be slightly ambiguous in these kinds of corner cases.

carocad20:10:10

oh no, I thought that from the edn spec. See message above. The part on the prefix, name part 😅

carocad20:10:57

In any case, at least now I know that this is an intentional case and not a bug, which is what I was trying to figure out. Thanks a lot guys 🙂

walterl00:10:38

project idea: Smiley Oriented Programming:

(def happy (keyword ")"))
(def sad (keyword "("))

(println "yay!" happy)
(println "aw." sad)

JanisOlex19:10:12

question: given clojure map, how to convert it to java HashMap? bonus if the same can be done recursively for map of maps of maps of lists of sets etc... so sets to HashSets, Lists to ArrayLists, maps to HashMaps?

carocad19:10:34

(new java.util.HashMap {:a 2 :b 2}) works transparently 🙂

skuttleman19:10:22

clojure maps implement java.util.Map. Does it specifically have to be a HashMap? (doto (java.util.HashMap.) (.putAll {"foo" "bar" :baz :quux}))
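For the recursive "bonus" part of the question, one possible sketch is a small recursive function. `->java` is a hypothetical helper written for this illustration, not something from clojure.core. It assumes map keys should be left untouched (keywords stay `clojure.lang.Keyword` instances) and that anything that is not a map, set or sequential collection is passed through unchanged.

```clojure
(defn ->java
  "Hypothetical helper: recursively copies Clojure collections into
  mutable java.util collections. Map keys are left as they are."
  [x]
  (cond
    (map? x)
    (let [m (java.util.HashMap.)]
      (doseq [[k v] x]
        (.put m k (->java v)))
      m)

    (set? x)
    (let [s (java.util.HashSet.)]
      (doseq [v x]
        (.add s (->java v)))
      s)

    ;; vectors, lists and seqs all become ArrayLists
    (sequential? x)
    (let [l (java.util.ArrayList.)]
      (doseq [v x]
        (.add l (->java v)))
      l)

    :else x))

(def converted (->java {:a [1 #{2}] :b {:c '(3 4)}}))
```

Converted values can still be read from Clojure with `get`, `first` and friends, since those functions also work on java.util collections.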

JanisOlex19:10:38

hmmm not specially 🙂 ok need to test that

JanisOlex19:10:44

Okay it works. What about keywords? If something is keyword in clojure, how can I ask for that key

JanisOlex19:10:25

Like {:key "Value"} in Clojure. How would Java treat the :key part of the map? What key would I have to pass for such a map to get the "Value"?

skuttleman19:10:23

The usual ways:
(:key (java.util.HashMap. {:key "Value"}))
(get (java.util.HashMap. {:key "Value"}) :key)
(.get (java.util.HashMap. {:key "Value"}) :key)

Alex Miller (Clojure team)19:10:40

are you asking from the Java side?

Alex Miller (Clojure team)19:10:59

the key will be a clojure.lang.Keyword instance in the Map, if so

JanisOlex19:10:46

thanks, all looks fine now and working

noisesmith19:10:49

from the java side you can use the invoke method of clojure.core/keyword to get a keyword from a string for lookup

noisesmith19:10:56

there might even be something for this in RT (of course you can also just use the read method on ":foo")
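A sketch of what the Java-side lookup could look like, written here as Clojure interop so it can be tried at a REPL; each call maps one-to-one onto the equivalent Java method call. It assumes the `clojure.java.api.Clojure` entry point (`var` and `read`) and `clojure.lang.Keyword/intern` as ways to obtain the keyword; `hm` and `kw-fn` are made-up names.

```clojure
;; Hypothetical names, for illustration only.
(def ^java.util.Map hm (java.util.HashMap. {:key "Value"}))

;; 1. Intern the keyword directly (Keyword.intern("key") in Java).
(def via-intern (.get hm (clojure.lang.Keyword/intern "key")))

;; 2. Look up clojure.core/keyword through the Java API and invoke it.
(def ^clojure.lang.IFn kw-fn
  (clojure.java.api.Clojure/var "clojure.core" "keyword"))
(def via-var (.get hm (.invoke kw-fn "key")))

;; 3. Read the keyword from its printed form.
(def via-read (.get hm (clojure.java.api.Clojure/read ":key")))
```

All three produce the same interned keyword, so each lookup finds the same entry.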