This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2019-07-15
Channels
- # aleph (9)
- # announcements (6)
- # beginners (42)
- # calva (4)
- # cider (9)
- # clara (2)
- # clj-kondo (1)
- # cljdoc (108)
- # cljs-dev (10)
- # clojure (25)
- # clojure-brasil (1)
- # clojure-chicago (1)
- # clojure-europe (4)
- # clojure-italy (42)
- # clojure-nl (14)
- # clojure-uk (66)
- # clojurebridge (3)
- # clojurescript (23)
- # clojutre (2)
- # community-development (1)
- # cursive (2)
- # datomic (4)
- # figwheel-main (21)
- # fulcro (23)
- # jobs-discuss (1)
- # kaocha (1)
- # off-topic (10)
- # pedestal (4)
- # reitit (2)
- # shadow-cljs (41)
- # spacemacs (7)
- # sql (20)
- # xtdb (3)
morning
måning
Say you have the result of a partition (say 10 lists (but could vary!) of 100,000 items each), and you wanted to hand off each part to be processed in parallel. Is there a particular approach one might take?
quick google gives me this too, https://github.com/reborg/parallel I think reborg is in this chat too iirc?
@dharrigan put descriptions of the work on a manifold stream, map over the stream with the work function (doing the work either async or on a separate thread), set a buffer size to control concurrency, reduce to get results
i imagine something very similar with core.async would be good too
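A sketch of that core.async variant (the partition sizes and worker count here are made up for illustration; `pipeline-blocking` is the stock way to apply a transducer over a channel with a bounded number of worker threads):

```clojure
(require '[clojure.core.async :as a])

;; hypothetical work: inc every item in each partition
(let [in  (a/to-chan! (partition-all 3 (range 12)))  ; one message per partition
      out (a/chan 16)]
  ;; 4 worker threads apply the transducer to messages from `in`;
  ;; pipeline-blocking preserves the input order on `out`
  (a/pipeline-blocking 4 out (map #(mapv inc %)) in)
  (a/<!! (a/into [] out)))
;;=> [[1 2 3] [4 5 6] [7 8 9] [10 11 12]]
```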
can you clarify what "put descriptions of the work" means? (I've had a basic introduction to manifold with some kafka stuff, so sorta familiar with it, to a basic level).
@dharrigan just a plain old clojure data structure describing the work to be done... {:type :http-fetch :url "
etc
or a variant or whatever else makes sense... just pure data though
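A minimal sketch of that manifold pattern (the job shape and `do-work` function are invented for illustration; the buffer size is what bounds how many jobs are in flight at once):

```clojure
(require '[manifold.stream :as s]
         '[manifold.deferred :as d])

;; plain-data work descriptions, as suggested above (shape is made up)
(def jobs (map (fn [p] {:type :inc-all :items p})
               (partition-all 3 (range 12))))

(defn do-work [{:keys [items]}]
  (d/future (mapv inc items)))  ; run each job on a separate thread

@(->> (s/->source jobs)
      (s/map do-work)       ; stream of deferreds
      (s/buffer 4)          ; at most ~4 jobs in flight
      (s/realize-each)      ; wait for each deferred's result
      (s/reduce conj []))   ; gather results into a vector
```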
understood thanks @mccraigmccraig and @guy
I didn't do much 🤷 but thank you for the thanks 🙂
I'm a bit late but https://juxt.pro/blog/posts/multithreading-transducers.html talks about parallelising inside a transducer chain
Even more late - I didn't see the question @dharrigan :( this is the most basic solution if I understand correctly: (pmap #(map inc %) (partition-all 10 (range 100)))
. This hands over your partitions to availableProcessors + 2 parallel threads (12 + 2 = 14, assuming you have 12 cores) at a time. It's also lazy (considering when to realize the next chunk)
actually I think that might be Budapest Keleti
and thanks to @ben.hammond too! 🙂
I think I still have to get my head around transducers. I mean I sorta get them, but I need to feel them 🙂
they look a bit intimidating at first but you can think of it like:
outermost level is usually
(fn [rf] ...
where `rf` stands for reducing function. but you can think of it like saying
> "Hook me up to the outside world"
and then inside that is this weird arity-3 function which always feels really clumsy to me - each arity means a different thing, which is usually a big no-no
(fn []
means
> give me your initial value
(fn [acc]
means
> stop writing please and finish up
(fn [acc next-val]
means
> Process this
and then you can drip-feed values to the outside world by calling the rf
that you received previously
but you don't have to
or you can call it a lot when you are feeling chatty
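Putting those three arities together, a hand-rolled version of the `map` transducer looks like this (`my-map` is just an illustrative name; clojure.core's `map` already does this when called with one argument):

```clojure
(defn my-map [f]
  (fn [rf]                          ; "hook me up to the outside world"
    (fn
      ([] (rf))                     ; give me your initial value
      ([acc] (rf acc))              ; stop writing please and finish up
      ([acc x] (rf acc (f x))))))   ; process this, drip-feed result to rf

(transduce (my-map inc) + 0 [1 2 3])
;;=> 9
```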
Thanks, very helpful. I'm also trying to understand when/where I should use them? I mean, they're not appropriate for everything (or are they?). I guess it comes down to understanding when it's time to transduce and when it's not...
comes down to scale; if you are processing zillions of rows with lots of steps, then lazy sequences are slow and hard to control because each step gets its own chunk of lazy sequence.
you can't 'just stop' processing because each step has to chew through its own chunk of data
each one consuming memory and potentially causing head-retention
whereas because transducers are plugged together into each other's rf
they work in exact lockstep; one incoming item gets fully processed before the next one gets taken
which is gentler on the memory usage
and other system resources (like file handles)
and mean that if you decide that your big job is taking too long, you can just interrupt it and it will immediately stop
(rather than interrupting it and waiting for all the intermediate lazy sequences to drain)
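That early-termination behaviour is easy to see with an infinite input: `(take 3)` wraps the accumulator in `reduced` after three items, and because the steps run in lockstep the whole pipeline stops immediately, with no intermediate lazy seqs to drain:

```clojure
;; (range) is infinite, but the pipeline halts as soon as (take 3)
;; signals `reduced` -- only three items ever flow through (map inc)
(transduce (comp (map inc) (take 3)) conj [] (range))
;;=> [1 2 3]
```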
I quite like the ephemeral nature of Slack in some ways
I hope that each time I retell this story, I do it a little better
But @ben.hammond your link is for reading tonight 🙂
@otfrom I think we need to start a support group 🙂 https://twitter.com/otfrom/status/1150880887313182721?s=19
I'm good with anything that has the magic "J" ingredient in it... I love jalapenos!
There's a brand of jalapeno crisps here -- made from crinkle-cut fresh jalapenos and then deep-fried -- that I love, and my fridge is full of all sorts of hot sauces to put on any food I make. I also have some ghost pepper salt which is awesome on eggs 🙂
@otfrom the crisps I mentioned -- also another desk snack 🙂
(and, yes, this thread in the background!)
omg! thank goodness I don't have access to the crispy jalapenos!