2017-09-12
I was wondering if there is any reason why r/take-while, r/take and r/drop in reducers don't use folder (instead of reducer)? With the following practical effects:
(require '[clojure.core.reducers :as r])
(time (->> (range 50000)
           (into [])
           (r/map range)
           (r/mapcat conj)
           (r/drop 0)
           (r/filter odd?)
           (r/fold +)))
;; "Elapsed time: 45516.963356 msecs"
;; 10416041675000
(time (->> (range 50000)
           (into [])
           (r/map range)
           (r/mapcat conj)
           (r/filter odd?)
           (r/fold +)))
;; "Elapsed time: 9190.562896 msecs"
;; 10416041675000
they are stateful, and the state results from a linear scan, which doesn't parallelize as a tree
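To make the tree shape concrete, here's a small sketch; the segment size of 4 is artificially small just to force splitting, and the exact segment boundaries are an implementation detail:
(require '[clojure.core.reducers :as r])
(r/fold 4                          ; tiny segment size, only for illustration
        (fn ([] []) ([l r] [l r])) ; combinef: pair up the two combined halves
        conj                       ; reducef: collect each leaf segment
        (vec (range 12)))
;; => [[[0 1 2] [3 4 5]] [[6 7 8] [9 10 11]]]
Each leaf segment is reduced independently and the results are combined pairwise, so there is no single left-to-right pass through which a drop counter could be decremented in order.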
not sure I get that. The reduction of a single chunk happens on a single core. So would the results be different if the drop were foldable?
Still trying to understand. Created:
(defn drop2
  [n coll]
  (r/folder
   coll
   (fn [rf]
     (let [nv (volatile! n)]   ; mutable drop counter
       (fn [result input]
         (let [n @nv]
           (vswap! nv dec)
           (if (pos? n)
             result            ; still dropping: discard this input
             (rf result input))))))))
and used it in place of r/drop. What would you expect to happen / not happen?
and it assumes things are happening in order, which, when running in parallel, they by definition are not
actually, the volatile doesn't appear to escape each partition, so it's not a race, but it's not tracking a global count
right, you end up dropping from each partition vs. dropping from the whole thing, which is an entirely different behavior from the non-folding drop
drop2 returns the same results as r/drop, and it's executing in parallel (as in fork-join parallel), but the timings are much longer.
that can happen if, for example, you are pushing in something that doesn't fold in parallel (like a seq), or you are dropping 0, which does nothing, etc.
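For reference, a sketch of that first failure mode: sources without a parallel coll-fold implementation quietly fall back to a single sequential reduce:
(require '[clojure.core.reducers :as r])
(r/fold + (r/map inc (range 10)))        ; a range/seq: serial, no fork-join
;; => 55
(r/fold + (r/map inc (vec (range 10))))  ; a vector: folds in parallel
;; => 55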
If I (println (Thread/currentThread)) from inside the (fn [result input]) I see fork-join threads. I thought that indicated going fork-join correctly.
I'm executing the fold from the beginning, alternating r/drop and drop2 with different initial ranges. I was expecting different results (even across repetitions of the same run) in case of race conditions. Not sure that's a good definition of correctness, though.
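A sketch of that check, reusing drop2 as defined above; note that a stable result across runs suggests, but does not prove, the absence of a race:
(require '[clojure.core.reducers :as r])
;; run the identical fold several times; more than one distinct sum
;; would demonstrate nondeterminism
(let [v (vec (range 100000))]
  (distinct (repeatedly 10 #(r/fold + (drop2 10 v)))))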
user=> (require '[clojure.core.reducers :as r])
nil
user=> (defn drop2
         [n coll]
         (r/folder
          coll
          (fn [rf]
            (let [nv (volatile! n)]
              (fn [result input]
                (let [n @nv]
                  (vswap! nv dec)
                  (if (pos? n)
                    result
                    (rf result input))))))))
#'user/drop2
user=> (def x (vec (range 10000)))
#'user/x
user=> (def y (->> x (drop2 10) (r/fold +)))
#'user/y
user=> (def z (->> x (drop 10) (reduce +)))
#'user/z
user=> (= z y)
false
user=> z
49994955
user=> y
49969955
there's a defequivtest macro in the test suite that checks for seq/reducer equivalence: https://github.com/clojure/clojure/blob/master/test/clojure/test_clojure/reducers.clj
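In the same spirit, a minimal hand-rolled equivalence check (a sketch, not the actual defequivtest macro); it relies on fold falling back to a serial reduce when given a plain reducer:
(require '[clojure.core.reducers :as r]
         '[clojure.test :refer [is]])
;; r/drop builds a reducer with no parallel coll-fold, so this fold
;; degrades to a sequential reduce and matches the ordinary drop
(let [v (vec (range 10000))]
  (is (= (reduce + (drop 10 v))
         (r/fold + (r/drop 10 v)))))
;; => true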
with most of c.c.reducers effectively obsoleted by transducers, I think it would be a useful exercise to translate some fold examples to use transducers instead
(r/fold ((drop 10) +) (vec (range 10000)))
looks like it's returning different results each run?
what is happening in your scenario: you create one reducing function, ((drop 10) +), and it's passed to a bunch of threads that all use it in an unspecified order
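To make the contrast concrete, a sketch: a stateless transducer is safe to share across fold's threads, while an order-dependent one like drop needs a sequential transduce:
(require '[clojure.core.reducers :as r])
;; stateless step: the shared reducing function is safe in parallel
(r/fold + ((filter odd?) +) (vec (range 10000)))
;; => 25000000
;; stateful, order-dependent step: keep the reduction sequential
(transduce (drop 10) + (vec (range 10000)))
;; => 49994955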
my pleasure. the relationships between Transducers / Reducers / Fold have not been made explicit
I think the "problem" is work stealing. A chunk can always fly on a thread where the volatile doesn't have the right count, since it already processed a chunk