core-async 2019-12-17 | Slack Archive

otfrom09:12:19

where are the best posts/articles/books/chapters to help understand how to use core.async to build data processing pipelines? I've found this one: https://adambard.com/blog/stream-processing-core-async/ but would like to find some more (and/or hear opinions about this one)

Alex Miller (Clojure team)14:12:50

Clojure Applied has a chapter on that

otfrom17:12:55

@alexmiller cool. I'll review that.

Kevin19:12:35

If I understand correctly, doing IO in a go block is bad because it blocks the thread that the go block is running on. So if you want to read from a database (for example), you could use core.async/thread to spawn a new thread, then park the go block while waiting on the spawned thread's channel using <!. Is this correct?

hiredman19:12:20

yes

hiredman19:12:07

async/thread just gets the work off the go block threadpool, so you can use something else to do that as well
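
A minimal sketch of the pattern Kevin describes, assuming a hypothetical blocking function `fetch-user` that stands in for a database read (the name and the returned map are made up for illustration):

```clojure
(require '[clojure.core.async :as async :refer [go <! <!!]])

;; Hypothetical stand-in for a blocking database read.
(defn fetch-user [id]
  (Thread/sleep 50)                     ; simulate blocking IO
  {:id id :name "example"})

(defn fetch-user-async
  "Returns a channel that will receive the user map."
  [id]
  (go
    ;; async/thread runs its body on a separate thread and returns a
    ;; channel; <! parks the go block until the result arrives, so no
    ;; go pool thread is held while the IO is in progress.
    (let [user (<! (async/thread (fetch-user id)))]
      (assoc user :fetched? true))))

;; Outside a go block, use the blocking take to read the result.
(def user (<!! (fetch-user-async 42)))
```

Here `user` ends up as the map from `fetch-user` with `:fetched? true` added.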

Kevin19:12:25

Right

Kevin19:12:43

Does thread have an unbounded thread pool?

hiredman19:12:51

yes

Kevin19:12:52

If it does, then you’d have to manage it yourself, right?

Kevin19:12:57

If you just spawn threads unmanaged, you could use up all your resources I would assume?

hiredman19:12:58

it depends, but yes it is often useful to limit that in some way, which is one reason you might use something other than core.async/thread (at work we have something similar that uses a threadpool we control)

hiredman19:12:20

but we have a few places where a singleton go loop sometimes calls async/thread and waits on the result, so by construction it can't create many threads
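
A rough sketch of that "bounded by construction" shape, not hiredman's actual code. The `slow-double` function and the `start-worker` helper are hypothetical names used only for illustration:

```clojure
(require '[clojure.core.async :as async
           :refer [chan go-loop <! >! <!! >!! close!]])

;; Hypothetical blocking operation.
(defn slow-double [n]
  (Thread/sleep 10)
  (* 2 n))

(defn start-worker
  "Starts a single go loop that handles requests from `in` one at a
  time and writes results to `out`. Closes `out` when `in` is closed."
  [in out]
  (go-loop []
    (if-some [n (<! in)]
      (do
        ;; The loop parks here until the thread finishes, so it never
        ;; has more than one async/thread call in flight.
        (>! out (<! (async/thread (slow-double n))))
        (recur))
      (close! out))))

(def in  (chan 10))
(def out (chan 10))
(start-worker in out)

(doseq [n [1 2 3]] (>!! in n))
(close! in)

;; async/into collects everything from `out` once it closes.
(def results (<!! (async/into [] out)))
```

Because there is only one loop and it waits for each result before taking the next request, the number of threads it creates at any moment is at most one, regardless of how many requests are queued on `in`.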

Alex Miller (Clojure team)19:12:14

there's nothing special about the threads created by thread

Kevin19:12:21

Ok, but in the case where, for example, traffic dictates the number of spawned threads, you’d definitely need a pool

Alex Miller (Clojure team)19:12:32

use any thread pool you like, just communicate via channels
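
One way this could look, as a sketch under assumptions: a fixed-size `java.util.concurrent` pool (the size 4 is arbitrary) and a hypothetical helper `submit-blocking` that is not part of core.async:

```clojure
(require '[clojure.core.async :as async
           :refer [chan go <! <!! put! close!]])
(import '(java.util.concurrent Executors ExecutorService))

;; A bounded pool we control; the size 4 is an arbitrary assumption.
(def ^ExecutorService io-pool (Executors/newFixedThreadPool 4))

(defn submit-blocking
  "Runs the no-arg function f on io-pool and returns a channel that
  receives its result (hypothetical helper, not part of core.async)."
  [f]
  (let [out (chan 1)]
    (.execute io-pool
              (fn []
                (try
                  (let [v (f)]
                    ;; Channels cannot carry nil, so only put non-nil values.
                    (when (some? v) (put! out v)))
                  (finally (close! out)))))
    out))

;; A go block parks on the returned channel just as it would with
;; the channel returned by async/thread.
(def result
  (<!! (go (<! (submit-blocking #(do (Thread/sleep 20) :done))))))
```

When all four pool threads are busy, further tasks wait in the executor's queue rather than spawning more threads. Call `(.shutdown io-pool)` when the pool is no longer needed, since its threads are not daemon threads.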

Kevin19:12:01

All right, sounds good, then I have one more question..

Alex Miller (Clojure team)19:12:13

thread is just a helper and does the extra return channel thing - there's almost nothing there

noisesmith19:12:35

it's mostly the binding conveyance, by lines of code
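
A small illustration of what binding conveyance means here, assuming a made-up dynamic var `*request-id*`. The body of `async/thread` sees the caller's dynamic bindings, while a bare `Thread` given a plain `fn` sees only the root value:

```clojure
(require '[clojure.core.async :as async :refer [<!!]])

;; Hypothetical dynamic var used only for this illustration.
(def ^:dynamic *request-id* :none)

;; async/thread conveys the caller's dynamic bindings to its body and
;; returns the body's value on a channel.
(def on-async-thread
  (binding [*request-id* :req-1]
    (<!! (async/thread *request-id*))))

;; A bare Thread running a plain fn does not get those bindings.
(def on-plain-thread
  (let [p (promise)]
    (binding [*request-id* :req-1]
      (doto (Thread. (fn [] (deliver p *request-id*))) .start))
    @p))
```

`on-async-thread` is `:req-1` and `on-plain-thread` is `:none`. If you swap in your own thread pool, `bound-fn` from clojure.core is one way to get the same conveyance.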

Kevin19:12:23

(I might be misunderstanding some concepts) Let’s say you limit your thread pool to X amount of threads. If all X threads are busy you will have to wait for a thread to free up. How is this different from bumping up the go loop thread pool to X amount of go loop threads?

noisesmith19:12:27

if you did that, you'd easily have code that succeeds in local / staging / tests and fails under real load

noisesmith19:12:30

for one thing

Kevin19:12:46

Why is that?

Kevin19:12:12

I assume because you can’t replicate the load, but how is that different from testing a regular thread pool?

noisesmith19:12:13

because you can starve the go block thread pool; if that happens faster / at lower usage, you can catch it more easily

noisesmith19:12:48

go blocks can do coordination faster and cheaper than a thread pool if used correctly - because they context switch without system calls

noisesmith19:12:53

(or at least can)

hiredman19:12:02

the main thing is the go block threadpool is a shared resource

hiredman19:12:23

other libraries, other parts of your code, etc may want to use it, so if you are blocking it that is a problem

Kevin19:12:43

Ah I see

Kevin19:12:48

That makes sense

Kevin19:12:07

But doesn’t that mean that if no libraries are using core.async, and you manage the go loop pool, then it would work the same as managing your own thread pool?

Kevin19:12:22

Hypothetically

hiredman19:12:43

sort of, it is a complicated threadpool where tasks run for a bit, then get put on the back of the queue, which is more complicated to manage than a threadpool that pulls a task and runs it to completion

hiredman19:12:58

assuming you are actually running go blocks doing channel stuff

hiredman19:12:10

which if you aren't, there is no reason to use the go block pool at all

Kevin19:12:21

Go blocks doing channel stuff?

hiredman19:12:39

reading and writing to channels

noisesmith19:12:47

I don't have proof but my personal theory is that the go block thread pool was intentionally shrunk as an anti-foot-gun measure, to lead core.async users toward the kind of constructions that actually benefit in any way from core.async

hiredman19:12:16

when a go block reads from a channel, the continuation of the block is added as a callback to the channel, and the go block stops running so some other go block can run on that thread, and once something is written to that channel the callback is put back on the queue
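
A sketch of that behaviour from the outside, using only public core.async functions. `take!` registers a callback on a channel directly, and the second part starts far more go blocks than there are pool threads (the count 1000 is arbitrary) to show that parked blocks do not each hold a thread:

```clojure
(require '[clojure.core.async :as async
           :refer [chan go <! <!! >!! take!]])

;; take! registers a callback on the channel; the callback is invoked
;; once a value is written.
(def c (chan))
(def seen (promise))
(take! c (fn [v] (deliver seen v)))
(>!! c :hello)
(def callback-value @seen)

;; Many go blocks, each parked on its own channel. Parked blocks are
;; just pending callbacks, so they do not each occupy a thread.
(def n 1000)
(def inputs  (vec (repeatedly n chan)))
(def outputs (mapv (fn [ch] (go (inc (<! ch)))) inputs))

;; Writing a value lets the matching go block resume and finish.
(doseq [[i ch] (map-indexed vector inputs)]
  (>!! ch i))

(def total (reduce + (map <!! outputs)))
```

`callback-value` is `:hello`, and `total` is the sum of 1 through 1000, that is 500500.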

Kevin19:12:14

Right

Kevin19:12:56

But if you didn’t do that then there wouldn’t really be a point to using core async, right? It’s all about channel communication

hiredman19:12:13

yep

hiredman19:12:57

but I dunno, you seem to be asking wild blue sky questions

Kevin19:12:07

haha sorry

Kevin19:12:12

I’m just trying to understand