This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2020-04-28
Channels
- # announcements (11)
- # aws (2)
- # babashka (35)
- # beginners (173)
- # calva (3)
- # chlorine-clover (2)
- # cider (17)
- # clara (2)
- # clj-kondo (28)
- # cljs-dev (11)
- # cljsrn (53)
- # clojure (178)
- # clojure-argentina (1)
- # clojure-europe (12)
- # clojure-germany (5)
- # clojure-italy (4)
- # clojure-nl (5)
- # clojure-spec (25)
- # clojure-uk (88)
- # clojurescript (109)
- # conjure (34)
- # cursive (2)
- # data-science (35)
- # datomic (15)
- # emacs (6)
- # events (1)
- # fulcro (28)
- # graphql (15)
- # helix (21)
- # hoplon (7)
- # jobs (4)
- # jobs-discuss (1)
- # joker (15)
- # lambdaisland (1)
- # lein-figwheel (4)
- # local-first-clojure (1)
- # malli (8)
- # meander (17)
- # off-topic (33)
- # parinfer (2)
- # rdf (16)
- # re-frame (3)
- # reagent (21)
- # reitit (14)
- # remote-jobs (5)
- # ring (8)
- # rum (1)
- # shadow-cljs (184)
- # sql (2)
- # testing (1)
- # tools-deps (23)
mawning!
ooh do you get to be King For A Day?
(hoping they don't spill your blood to fertilise the fields at the end)
you wouldn't want to have to elect a head of state though, would you
I'd prefer the German model. No one actually knows who the Federal President of Germany is.
same with India I suppose now you mention it
he's not objectionable, so no one remembers him
morning
@thomas a schrödinger president ?
nah he was president in the 90s
But as in a cat: yes, @mccraigmccraig
But @alex.lynham proves my point completely... no one knows who the German President is. Therefore it is the solution. (On a different note, why do we need a head of state anyway? surely it is just an invented artifact)
someone to sacrifice in time of famine
pestilence might do
I’ve just looked at a function I wrote and have realised that I might be abusing reducers or underusing loop/recur…
(defn gen-data [{:keys [init scale count noise-factor acceleration] :or {acceleration 0}}]
(-> (reduce
(fn [{:keys [scale] :as state} i]
(-> state
(update :scale + acceleration)
(update :result conj (+ init (* scale i) (* (next-rand) noise-factor)))))
{:scale scale
:result []}
(range count))
:result))
Am I overthinking this?
I don’t know, just thought that the rationale I was using was a bit odd? I mean I’m building a complex accumulator and then reducing over it, then once I’m finished, using a keyword to pull out the result…
Hence asking, is this normal? Or is there a better way of expressing this? Does someone look at this code and think ick, why didn’t you use x
?
Mentally I think I tend to treat loop recur as something to pull out only if required…
Yea, so it might not be a bad idea to rewrite it then =)… But do you find yourself doing this kind of thing when there is a sequence involved? ie, building this -> {:keys [scale] :as state}
and {:scale scale :result []}
or should this sort of structure generally imply `loop`/`recur` or something else instead of reduce?
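For comparison, a hedged sketch of the same accumulation written with `loop`/`recur`; it assumes the next-rand helper from the snippet above (e.g. something like rand):

```clojure
;; loop/recur sketch of gen-data (gen-data-loop is a hypothetical name).
;; next-rand is assumed to be the caller's noise function, e.g. rand.
;; Note: in the reduce version the destructured `scale` is the
;; pre-update value, so here we conj with the current scale
;; before stepping it by acceleration.
(defn gen-data-loop
  [{:keys [init scale count noise-factor acceleration] :or {acceleration 0}}]
  (loop [i 0, scale scale, result []]
    (if (= i count)
      result
      (recur (inc i)
             (+ scale acceleration)
             (conj result (+ init (* scale i) (* (next-rand) noise-factor)))))))
```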
@folcon in this case, you can easily break this into 2 phases: Calculating scales & calculating the results:
(defn gen-data
[{:keys [scale acceleration count init]}]
(map (fn [scale] (+ init scale))
(take count (iterate #(+ % acceleration) scale))))
This tidies it up a little. I'd probably use an anonymous literal #()
for the (+ init scale ...)
part, but I wanted to name it for example purposes.
That’s not quite doing the same thing? Original fn:
(gen-data {:init 10 :scale 0 :count 20 :noise-factor 0 :acceleration 2})
#_=> [10.0 12.0 18.0 28.0 42.0 60.0 82.0 108.0 138.0 172.0 210.0 252.0 298.0 348.0 402.0 460.0 522.0 588.0 658.0 732.0]
Yours:
(gen-data {:init 10 :scale 0 :count 20 :noise-factor 0 :acceleration 2})
#_=> (10 12 14 16 18 20 22 24 26 28 30 32 34 36 38 40 42 44 46 48)
@folcon I simplified the "math" for demonstration purposes. You need to do the (+ init (* scale i) (* (next-rand) noise-factor))
still :)
@folcon this impl gives me identical results (although I made up next-rand
)
(defn gen-data2
[{:keys [scale acceleration count init noise-factor]}]
(map-indexed
(fn [i scale] (+ init (* scale i) (* (next-rand) noise-factor)))
(take count (iterate #(+ % acceleration) scale))))
Something you might note, and this will be dependent on your domain, the count
is redundant now.
You can just return an infinite sequence of data, and then the consumer can call take
themselves. This is not possible with a reduce-based solution.
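That suggestion might look like this sketch (gen-data-seq is a hypothetical name; next-rand again assumed, e.g. rand):

```clojure
;; Infinite variant: no count parameter; the consumer takes what it needs.
;; next-rand is assumed to be e.g. clojure.core/rand.
(defn gen-data-seq
  [{:keys [scale acceleration init noise-factor]}]
  (map-indexed
   (fn [i scale] (+ init (* scale i) (* (next-rand) noise-factor)))
   (iterate #(+ % acceleration) scale)))

;; consumer decides how much to realise:
(take 5 (gen-data-seq {:init 10 :scale 0 :noise-factor 0 :acceleration 2}))
```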
criterium should be better but:
(defn gen-data+acceleration-old [{:keys [init scale count noise-factor acceleration] :or {acceleration 0}}]
(-> (reduce
(fn [{:keys [scale] :as state} i]
(-> state
(update :scale + acceleration)
(update :result conj (+ init (* scale i) (* (next-rand) noise-factor)))))
{:scale scale
:result []}
(range count))
:result))
(defn gen-data+acceleration
[{:keys [scale acceleration count init noise-factor]}]
(map-indexed
(fn [i scale] (+ init (* scale i) (* (next-rand) noise-factor)))
(take count (iterate #(+ % acceleration) scale))))
(time
(dotimes [_ 100]
(gen-data+acceleration-old {:init 10 :scale 0 :count 2000 :noise-factor 0 :acceleration 2})))
"Elapsed time: 1407.115525 msecs"
(time
(dotimes [_ 100]
(gen-data+acceleration {:init 10 :scale 0 :count 2000 :noise-factor 0 :acceleration 2})))
"Elapsed time: 0.410662 msecs"
(it's a lazy seq, so you're not doing anything until you print it, or otherwise do something)
(time
(dotimes [_ 100]
(doall
(gen-data+acceleration-old {:init 10 :scale 0 :count 2000 :noise-factor 0 :acceleration 2}))))
"Elapsed time: 1046.060648 msecs"
=> nil
(time
(dotimes [_ 100]
(doall
(gen-data+acceleration {:init 10 :scale 0 :count 2000 :noise-factor 0 :acceleration 2}))))
"Elapsed time: 198.482497 msecs"
=> nil
My experience has been that combining reducers gets you a performance boost. Historically I've taken several separate filter/map/etc. and combined them into a single reduce producing a single result & seen orders of magnitude improvements.
Hmm, I guess in the reduce case you're constantly iterating on a hashmap, and my version can avoid that. You just pay the lazy seq cost which I guess is cheap relatively. Useful to know.
I’m a little late to the golf party here; but if you take @U09LZR36F’s improvements and port it to use a transducer
it’s almost 2x quicker in my crude experiments:
(defn gen-data+acceleration-transduce [{:keys [init scale count noise-factor acceleration] :or {acceleration 0}}]
(transduce
(comp (take count)
(map-indexed (fn [i scale]
(+ init (* scale i)
(* (next-rand) noise-factor)))))
conj!
(transient [])
(iterate #(+ % acceleration) scale)))
Note this version also uses a transient
which leads to a further small improvement over using conj
and a standard vector; though the transient doesn’t seem to make as big a difference as you might think.
conj! should do that in the arity-1 function body
should do that in the arity-1 function body
oh actually 👀 not sure it does
I think this is right:
(defn gen-data+acceleration-transduce [{:keys [init scale count noise-factor acceleration] :or {acceleration 0}}]
(persistent! (transduce
(comp (take count)
(map-indexed (fn [i scale]
(+ init (* scale i)
(* (next-rand) noise-factor)))))
conj!
(iterate #(+ % acceleration) scale))))
i.e. it does call (transient [])
for you for an initial value; but won’t call persistent!
— which makes sense, as I guess you may still want to do more on the transient value.
Conceptually I like that the function generates infinite data and the consumer can use take instead of there being a count
Yeah — I chose to keep the existing contract for comparison.
And I agree with you about not limiting it artificially.
Transducers are supposed to give you that flexibility by splitting the xform
from the starting sequence; however here the xforms are pretty tightly coupled to the starting sequence.
You could probably refactor to take extra xforms to comp in, e.g. you could supply (take 10)
as an argument — but I think that would start getting a bit messy.
Though you could possibly supply the collection of scaled accelerations, and take
on that; prior to transducing.
agreed, and in this case I don’t think it makes sense; mainly suggesting it as a potential solution in other cases when this sort of thing arises
yeah they are — though I do sometimes struggle with splitting their concerns, whilst retaining things a lazy solution might give… the above tight coupling between input seq and the xform being an example of it
Thanks for this @U06HHF230! I use transducers a lot, but pretty much exclusively in the context of (into [] some-xf coll)
. I really need to get comfortable using it in other ways =)…
One thing I still find difficult is working out good ways of thinking around composing them, I’m also still not super comfortable just pulling an xform from a place and intuiting, oh, it just fits here…
I suppose I don’t have a good feel for treating them similarly to how I treat higher order functions…
TBH that way is even nicer too:
(defn gen-data+acceleration-transduce [{:keys [init scale count noise-factor acceleration] :or {acceleration 0}}]
(into [] (comp (take count)
(map-indexed (fn [i scale]
(+ init (* scale i)
(* (next-rand) noise-factor)))))
(iterate #(+ % acceleration) scale)))
And it will already use a transient
and conj!
so should be just as fast
Comparing:
(time
(dotimes [_ 100]
(doall
(gen-data+acceleration-transduce {:init 10 :scale 0 :count 2000 :noise-factor 0 :acceleration 2}))))
#_#_=> "Elapsed time: 83.439927 msecs"
(time
(dotimes [_ 100]
(doall
(gen-data+acceleration-into-transduce {:init 10 :scale 0 :count 2000 :noise-factor 0 :acceleration 2}))))
#_#_=> "Elapsed time: 112.475216 msecs"
Not quite as good as your non-into version…
Really need to spend some more time with criterium and up my benchmarking / profiling >_<… But in my mind not worth doing that until I have an app that’s got some meat to it =)… Almost there!
I suspect that difference is due to unreliable benchmarking that criterium would help iron out; if you look at the definition of into
, it’s essentially exactly what I wrote above.
i.e. into
is just a thin layer on transduce
in that call path.
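For reference, a lightly simplified sketch of what the transducer arity of into does in clojure.core (into-sketch is a made-up name for illustration):

```clojure
;; Simplified sketch of clojure.core/into's 3-arity: when the target
;; supports transients it transduces with conj!, then restores
;; persistence and the original metadata; otherwise it falls back
;; to plain conj.
(defn into-sketch [to xform from]
  (if (instance? clojure.lang.IEditableCollection to)
    (with-meta (persistent! (transduce xform conj! (transient to) from))
               (meta to))
    (transduce xform conj to from)))
```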
After spending some time restarting because of editor existential failure
(criterium/bench
(doall
(gen-data+acceleration-old {:init 10 :scale 0 :count 2000 :noise-factor 0 :acceleration 2})))
Evaluation count : 35280 in 60 samples of 588 calls.
Execution time mean : 1.639113 ms
Execution time std-deviation : 53.330771 µs
Execution time lower quantile : 1.588486 ms ( 2.5%)
Execution time upper quantile : 1.788930 ms (97.5%)
Overhead used : 4.082033 ns
Found 3 outliers in 60 samples (5.0000 %)
low-severe 1 (1.6667 %)
low-mild 2 (3.3333 %)
Variance from outliers : 19.0160 % Variance is moderately inflated by outliers
(criterium/bench
(doall
(gen-data+acceleration {:init 10 :scale 0 :count 2000 :noise-factor 0 :acceleration 2})))
Evaluation count : 44100 in 60 samples of 735 calls.
Execution time mean : 1.239084 ms
Execution time std-deviation : 87.727749 µs
Execution time lower quantile : 1.162137 ms ( 2.5%)
Execution time upper quantile : 1.502526 ms (97.5%)
Overhead used : 4.082033 ns
Found 4 outliers in 60 samples (6.6667 %)
low-severe 2 (3.3333 %)
low-mild 2 (3.3333 %)
Variance from outliers : 53.4285 % Variance is severely inflated by outliers
(criterium/bench
(doall
(gen-data+acceleration-transduce {:init 10 :scale 0 :count 2000 :noise-factor 0 :acceleration 2})))
Evaluation count : 100620 in 60 samples of 1677 calls.
Execution time mean : 593.705999 µs
Execution time std-deviation : 23.468176 µs
Execution time lower quantile : 568.977017 µs ( 2.5%)
Execution time upper quantile : 649.404382 µs (97.5%)
Overhead used : 4.082033 ns
Found 5 outliers in 60 samples (8.3333 %)
low-severe 3 (5.0000 %)
low-mild 2 (3.3333 %)
Variance from outliers : 25.4815 % Variance is moderately inflated by outliers
(criterium/bench
(doall
(gen-data+acceleration-into-transduce {:init 10 :scale 0 :count 2000 :noise-factor 0 :acceleration 2})))
Evaluation count : 90540 in 60 samples of 1509 calls.
Execution time mean : 652.417338 µs
Execution time std-deviation : 20.273083 µs
Execution time lower quantile : 630.189989 µs ( 2.5%)
Execution time upper quantile : 697.334784 µs (97.5%)
Overhead used : 4.082033 ns
Found 6 outliers in 60 samples (10.0000 %)
low-severe 5 (8.3333 %)
low-mild 1 (1.6667 %)
Variance from outliers : 17.4268 % Variance is moderately inflated by outliers
I suspect that small difference between the into
variant and the transduce
one is just the instance check and the attaching of metadata that into
does, and you’ll find that if you increase your count to maybe 50,000 or something larger that the difference will become much less significant.
Depends on how you plan on using this though.
Attaching metadata? I didn’t know that into
adds metadata. They’re basic functions to generate sample data, so I won’t be generating large numbers =)…
Yeah just that: (meta (into (with-meta [1 2 3] {:a :b}) [4 5 6])) ;; => {:a :b}
It’s the same semantics with most of the core collection functions; that metadata should be forwarded onto the new collections as you build them.