#core-async
2021-02-01
Faiz Halde05:02:47

Purely for learning purposes: why does

(require '[clojure.core.async :refer [chan go >!! close! <!! >! <! thread]])

(def c (chan))

(do
  (go
    (do
      (doseq [i (range 1000000)]
        (>! c i))
      (close! c)))

  (go
    (time
     (loop [i (<! c)]
       (when i
         (recur (<! c)))))))
perform better than
(require '[clojure.core.async :refer [chan go >!! close! <!! >! <! thread]])

(def c (chan))

(do
  (thread
    (do
      (doseq [i (range 1000000)]
        (>!! c i))
      (close! c)))

  (thread
    (time
     (loop [i (<!! c)]
       (when i
         (recur (<!! c)))))))
I understand that when there are lots of go processes, the cheaper switching pays off compared to using threads. But with chan size = 1 and two goroutines, the threads from the go thread pool are ultimately going to block on the channel until a put happens. So essentially, at a low level, it is the same as thread blocking, right?

phronmophobic05:02:28

for the go example, it could theoretically just bounce back and forth between pushing and pulling on the same thread.

Faiz Halde07:02:27

ya that’s what i thought too but the thread ids were surprisingly different
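
One way to check which pool threads a pair of go blocks actually runs on is to record the current thread name on both sides of the channel. This is only a sketch: `thread-name` and `observed-threads` are hypothetical helper names, and the set of names returned will vary from run to run.

```clojure
(require '[clojure.core.async :refer [chan go >! <! <!! close!]])

(defn thread-name
  "Name of the thread currently executing this code."
  []
  (.getName (Thread/currentThread)))

(defn observed-threads
  "Runs one producer and one consumer go block over an unbuffered channel
  and returns the set of thread names seen on either side."
  [n]
  (let [c (chan)
        _producer (go
                    (dotimes [i n]
                      ;; Tag every message with the thread that put it.
                      (>! c [i (thread-name)]))
                    (close! c))
        consumer (go
                   (loop [seen #{}]
                     (if-let [[_ putter] (<! c)]
                       (recur (conj seen putter (thread-name)))
                       seen)))]
    ;; A blocking take is fine here because we are outside any go block.
    (<!! consumer)))

(println "threads involved:" (observed-threads 1000))
```

Seeing more than one name here does not mean an OS thread was blocked; it only shows which pool threads picked up the parked continuations.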

phronmophobic07:02:41

Interesting. not sure how to investigate further without using a profiler like https://github.com/clojure-goes-fast/clj-async-profiler

noisesmith16:02:29

at a low level a process / thread context switch is much more expensive than a goroutine state machine context switch

noisesmith16:02:41

for example the saving of state needs to save all state when doing an OS context switch, while go reuses most of its state and just switches out state machines to pick up different blocks. even if multiple thread ids are being used, it's a fixed number of them with cooperative context switching, and it never spends time in OS allocated work slices waiting on a signal like thread can.

cassiel20:02:57

Rather a newbie-flavoured question here re: combining core.async with Stuart Sierra’s Component machinery. Which would be better style? (i) Have one of the component modules own all the channels in the system, and do (chan) and (close!) on them all, with other components referring to them when firing up their (go) blocks (ii) Assuming a pipeline of consume-produce components, have each one create and close the channel it sends to only (which it can be assumed to “own”)?

hiredman20:02:21

I would not do that

cassiel20:02:05

I think you got in as I accidentally posted … I went back and edited.

hiredman20:02:16

yes, I would do i

hiredman20:02:29

i is making a global singleton

cassiel20:02:45

Hmm, (ii) feels better structured to me, but (i) feels simpler and more reliable.

cassiel20:02:43

Bonus question: given that components are firing up their own go threads/coroutines, should they also be responsible for shutting them down? Or should the go blocks be required to exit on closed input channels, so that close! everywhere brings them down? I’m veering towards the latter: close the channels to shut down the go blocks.

hiredman20:02:08

i is only more reliable if you don't understand that component does things synchronously, so if you have asynchronous tasks (via go blocks, threads, threadpools, whatever) you need to bridge that divide

hiredman20:02:25

components must be responsible for shutting them down

hiredman20:02:52

your stop function shouldn't return until all the async tasks you have started have exited (it is not enough to signal them to stop)

cassiel20:02:08

(Aside: I’m in CLJS so it’s all coroutined.) The only way to ensure a shutdown then is to make every go block hang on an alt! and have explicit shutdown channels. That feels like a lot of machinery to achieve something which feels like it should be simpler.

cassiel20:02:35

Unless I’m missing an obvious pattern to do that.

hiredman20:02:00

that is not entirely correct

cassiel20:02:26

Am happy to be enlightened…!

hiredman20:02:35

it is often easier to use an explicit shutdown channel, but you can make all your go blocks check for reading nil from the input channel

hiredman20:02:54

oh, in cljs you are screwed anyway

cassiel20:02:57

Yes - they do - so a closed channel will always bring down a go-block consumer.
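
The two shutdown styles being compared can be sketched roughly as follows. The function names `consume-until-closed` and `consume-until-stopped` and the `handle` callback are made up for illustration. The requires are written for Clojure on the JVM; the go-loop bodies themselves use only parking operations, so the same shape should carry over to CLJS.

```clojure
(require '[clojure.core.async :refer [chan go-loop >!! <! <!! alts! close!]])

(defn consume-until-closed
  "Exits when `in` is closed and drained (a take returns nil)."
  [in handle]
  (go-loop []
    (when-some [msg (<! in)]
      (handle msg)
      (recur))))

(defn consume-until-stopped
  "Exits as soon as `stop` is closed or yields a value, even if `in`
  still has messages buffered. Also exits if `in` itself is closed."
  [in stop handle]
  (go-loop []
    ;; :priority true makes alts! check `stop` before `in`.
    (let [[msg port] (alts! [stop in] :priority true)]
      (when (and (= port in) (some? msg))
        (handle msg)
        (recur)))))

;; Usage: the closed-channel variant drains its backlog before exiting.
(let [in   (chan 10)
      seen (atom [])
      done (consume-until-closed in #(swap! seen conj %))]
  (>!! in 1)
  (>!! in 2)
  (close! in)
  (<!! done)   ; wait for the go-loop to finish (JVM only)
  @seen)       ; => [1 2]
```

The nil-check version needs no extra channel but only stops after the backlog is processed; the `alts!` version can abandon buffered messages, which may or may not be what you want.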

hiredman20:02:07

you can't bridge the async -> sync divide

hiredman20:02:28

so in clj the component's stop will close the input, which will signal the go block to exit, and then do a blocking take from the channel returned when that go block is started, to ensure that after the component is stopped the go loop has exited

hiredman20:02:02

(if you just close the channel, your component may return from stop while the go loop is processing a back log of messages in the channel)
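
A minimal sketch of that stop sequence in clj, using plain functions rather than a real Component record. `start-worker`, `stop-worker`, and the `handle` callback are hypothetical names; in an actual component these bodies would live in the Lifecycle `start` and `stop` methods.

```clojure
(require '[clojure.core.async :refer [chan go-loop >!! <! <!! close!]])

(defn start-worker
  "Starts a go-loop that processes messages from a fresh input channel.
  Returns the input channel plus the channel returned by the go block."
  [handle]
  (let [in   (chan 100)
        done (go-loop []
               (when-some [msg (<! in)]
                 (handle msg)
                 (recur)))]
    {:in in :done done}))

(defn stop-worker
  "Closes the input to signal shutdown, then blocks until the go-loop
  has worked through its backlog and exited."
  [{:keys [in done]}]
  (close! in)
  ;; Blocking take on the go block's own channel: it only completes
  ;; once the loop body has returned.
  (<!! done)
  nil)

;; Usage: every message put before stop is handled before stop returns.
(let [seen   (atom [])
      worker (start-worker #(swap! seen conj %))]
  (doseq [i (range 3)]
    (>!! (:in worker) i))
  (stop-worker worker)
  @seen)   ; => [0 1 2]
```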

hiredman20:02:19

of course you can't do blocking takes in cljs

cassiel20:02:26

I was about to say…!

cassiel20:02:33

There’s no way to block on a channel from within the “main thread” in component, so I guess all I can do is arrange for go blocks to stop on their next read-from-closed, and be content with that.

hiredman20:02:33

the "correct" thing would be to write your own version of component where the lifecycle protocol is itself asynchronous
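
As a rough idea of what an asynchronous lifecycle could look like: the protocol below is entirely hypothetical (`AsyncLifecycle`, `start-async`, `stop-async`, and the `Worker` record are not part of Component). Each method returns a channel instead of the component itself, so a caller in CLJS can park on it inside a go block rather than block. The usage lines at the end use `<!!` and so only apply on the JVM.

```clojure
(require '[clojure.core.async :refer [chan go go-loop <! <!! >!! close!]])

;; Hypothetical protocol: both methods return a channel.
(defprotocol AsyncLifecycle
  (start-async [this] "Channel yielding the started component.")
  (stop-async [this] "Channel yielding the stopped component once its go blocks have exited."))

(defrecord Worker [handle in done]
  AsyncLifecycle
  (start-async [this]
    (let [in   (chan 100)
          done (go-loop []
                 (when-some [msg (<! in)]
                   (handle msg)
                   (recur)))]
      (go (assoc this :in in :done done))))
  (stop-async [this]
    (go
      (close! in)
      ;; Park (not block) until the worker loop has drained and exited.
      (<! done)
      (assoc this :in nil :done nil))))

;; JVM usage; in CLJS the same calls would sit inside a go block with <!.
(let [seen    (atom [])
      started (<!! (start-async (->Worker #(swap! seen conj %) nil nil)))]
  (>!! (:in started) :a)
  (>!! (:in started) :b)
  (<!! (stop-async started))
  @seen)   ; => [:a :b]
```

A system-level start or stop would then have to chain these channels in dependency order, which is the part Component normally does synchronously.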

cassiel20:02:53

Sometimes something can be too correct!

cassiel20:02:14

OK, so it feels like I have to live with some decoupling here - I can shut down a component, and make it shut channels, but have to code the go blocks so that they obligingly shut themselves down when the thread jumps to them.

cassiel20:02:56

Am happy to go with a singleton channel “owner” which mints fresh ones on start and closes them all on stop. I don’t think that leads to any deadlocks or orphaned go contexts.

cassiel20:02:24

Thanks very much for the discussion - it’s been illuminating.

noisesmith20:02:46

"orphaned go contexts" - IIRC go blocks are stored on channels, if they aren't running or waiting on a channel they go out of scope and get gc'd

cassiel20:02:49

I believe so - they’re just parked contexts. But I think I read something recently that hinted that they came from a pool. As I think about it, that doesn’t make sense to me - I don’t see any reason why they’d need to be limited.

noisesmith20:02:20

threads are pooled, the blocks are "unlimited"
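
To illustrate the difference between pooled threads and parked blocks, the sketch below parks many go blocks at once and then releases them. `park-many` is a made-up helper. Each block gets its own channel because a single channel rejects more than 1024 pending takes.

```clojure
(require '[clojure.core.async :refer [chan go <! <!! close!]])

(defn park-many
  "Parks n go blocks, each on its own channel, then closes every channel
  and counts how many blocks ran to completion."
  [n]
  (let [gates  (doall (repeatedly n chan))
        ;; Each go block parks on its gate; no pool thread is held
        ;; while it waits.
        blocks (mapv (fn [g] (go (<! g) :done)) gates)]
    (run! close! gates)
    (count (filter #{:done} (map <!! blocks)))))

(park-many 10000)   ; => 10000
```

Ten thousand parked blocks cost some heap for their state machines, but they all share the same small pool of threads when they resume.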

cassiel20:02:27

Sure - though in CLJS I don’t have threads and am just jumping around inside a coroutine state machine, so I guess there are no limits of any kind.

noisesmith20:02:31

well "only one thread ever exists" is a bit of a limit 😄

cassiel20:02:52

True - but once you get over that, the sky’s the limit.