#core-async
2019-10-08
dharrigan16:10:25

So, I'm trying to reason about core.async and how to use it. I'm new, so lots to learn. I have this, but I'm not sure if it's the correct way to use core.async. Please ask questions if the intent isn't clear:

dharrigan16:10:33

(let [c (chan)]
  (go-loop []
    (when-let [ids (a-function-that-gathers-ids)]
      (do-something-with-ids-dumping-results-into-channel c)
      (<!!
       (go
        (a-function-that-updates-something-with-the-results (alts!! [c (timeout 5000)])))))
    (Thread/sleep 60000)
    (recur)))

dharrigan16:10:20

a-function-that-gathers-ids reads from a database, and do-something-with-ids... reads from a RESTful API.

dharrigan16:10:25

(using clj-http)

hiredman16:10:02

you don't want anything to block in the dynamic extent of a go block, meaning you should not do any blocking operations or call any functions or methods that block from a go block

hiredman16:10:24

so for example <!! blocks, so you should not call it from a go block

hiredman16:10:33

the general form of what you want to do is something like:

(go-loop [] (<! (thread (some-blocking-thing))) (recur)) 

hiredman16:10:13

you have a go loop, you do the blocking operations on a real thread, then use <! to get the result
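
A minimal sketch of that shape applied to the original polling loop. The names `gather-ids`, `fetch-results`, `update-with-results`, `updates`, and `start-poller` are hypothetical stand-ins, not anything from the discussion; the stop channel is an added assumption so the loop can be shut down:

```clojure
(require '[clojure.core.async :as a :refer [go-loop thread timeout alts! <!]])

;; Hypothetical stand-ins for the blocking database read, REST call, and update.
(def updates (atom 0))
(defn gather-ids [] [1 2 3])
(defn fetch-results [ids] (mapv inc ids))
(defn update-with-results [results] (swap! updates + (count results)))

(defn start-poller
  "Polls once per interval-ms until stop-ch is closed or receives a value.
  Blocking work runs on a real thread via `thread`; the go-loop only parks."
  [interval-ms stop-ch]
  (go-loop []
    ;; `thread` returns a channel that yields the body's result when it finishes.
    (<! (thread
          (when-let [ids (seq (gather-ids))]
            (update-with-results (fetch-results ids)))))
    ;; Park on a timeout channel rather than calling Thread/sleep in the go block.
    (let [[_ port] (alts! [stop-ch (timeout interval-ms)])]
      (when-not (= port stop-ch)
        (recur)))))
```

Calling `(start-poller 60000 stop-ch)` would match the one-minute cadence of the original snippet, and closing `stop-ch` ends the loop.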

hiredman17:10:47

operations with double bangs (<!!, >!!, alts!!) block the real jvm thread they are run on, while operations with a single bang (<! , >!, alts!) sort of act like they block (the execution of the go block will stop until that operation completes), but they actually release the jvm thread to do other work while waiting
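
A small sketch of the difference, assuming an unbuffered channel `c` and the illustrative names `result` and `answer`: the parking take `<!` appears only inside the go block, while the blocking `>!!` and `<!!` are called from the ordinary calling thread.

```clojure
(require '[clojure.core.async :refer [chan go <! >!! <!!]])

(def c (chan))

;; Inside a go block: parking take. The pool thread is released while waiting.
(def result (go (inc (<! c))))

;; Outside any go block: blocking put, then blocking take of the go block's result.
(>!! c 41)
(def answer (<!! result)) ;; => 42
```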

dharrigan17:10:56

I see, okay, thank you for the explanation. Will experiment 🙂

dharrigan17:10:18

It always confuses me when I should do async or block, esp with network/db operations

hiredman17:10:37

unless you know for sure otherwise, the safe assumption is anything that does IO (network, db, file, etc.) is going to block, so give it a real thread

dharrigan18:10:58

that's great advice, thanks!

dharrigan20:10:09

Generally, why do blocking operations in a thread and not in a go block? Shouldn't the go block run in a separate thread from the main thread and suspend itself if something blocking occurs (thus the main thread keeps going?)

noisesmith20:10:06

go blocks use a thread pool of restricted size

noisesmith20:10:23

they aren't meant for worker threads, they are meant for coordination between channels

noisesmith20:10:57

jvm threads don't suspend themselves in the way the go block abstraction does

noisesmith20:10:58

sure, threads don't block each other (unless they monopolize resources), but context switches at the OS level (which threads do) are much more expensive than context switches in a state machine in a thread (what go blocks do)

noisesmith20:10:13

which is why we even have go blocks

dharrigan20:10:50

Okay, cool, but aren't go blocks meant to be "light-weight" and thus hundreds (thousands?) can be created? I appreciate the explanation of the context, but are you saying that go blocks are designed for just channel (message) coordination? Simply trying to understand when/where to use a go block vs. threads 🙂

noisesmith20:10:14

you can create thousands of go blocks, right, but only N of them can be running

hiredman20:10:01

I really want to say "there is no main thread" because the jvm is a real multithreaded runtime and threads are all the same, but technically there is often a thread with the name "main" which is just the first thread the jvm starts

hiredman20:10:10

hundreds is nothing

hiredman20:10:09

generally normal threads are light weight enough to run hundreds of thousands

hiredman20:10:43

go blocks only do anything useful when used with core.async channels

hiredman20:10:31

(core.async channels however are pretty useful outside of go blocks)

dharrigan20:10:12

I see, so go blocks for channels. So what's the approach then? If I have something, i.e., in the original question I asked, that does two things (go out to a db and read results, then go out to the web for further results), and I want those two operations to be tucked away, doing their own thing without disturbing the "main" 🙂 flow (so-to-speak), should I do what you kindly put up above?

dharrigan20:10:32

something that would "scale" (magical word!)

hiredman20:10:16

I think you need to get more specific about what you want and use fewer magic words 🙂

noisesmith20:10:46

(go-loop [] (let [x (<! rq-chan) db-info (<! (thread (lookup x)))] (<! (thread (web-search db-info))) (recur)))

dharrigan20:10:09

very interesting. thanks noisesmith

noisesmith20:10:27

I almost missed the first chan

hiredman20:10:32

but like, why use a go block at all there?

dharrigan20:10:01

(to be honest, I assumed go-block since I thought it was 'the thing to do' (tm))

hiredman20:10:18

you likely just need future

noisesmith20:10:32

yeah - my snippet doesn't really make sense without other coordination with async blocks

noisesmith20:10:48

(other channels used to coordinate or buffer)

dharrigan20:10:38

You know, as a newbie, there are precious few examples of how to use go blocks/threads in clojure that do "real-world-bread-and-butter-stuff" (tm) of writing to a db, reading from a web service and coordinating that. I'm welcome to be shown wrong! 🙂

hiredman20:10:50

because just use threads

hiredman20:10:53

if you are not comfortable writing multithreaded software with real threads, I don't know that core.async is going to solve anything for you

hiredman20:10:55

like, absent any other requirements it sounds like you just want (future (do-something-else (do-something)))
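
Spelled out as a sketch using only clojure.core, where `do-something` and `do-something-else` are hypothetical stand-ins for the database read and the web call, and `job` is an illustrative name:

```clojure
;; Hypothetical stand-ins for the two blocking steps.
(defn do-something [] (Thread/sleep 50) [1 2 3])
(defn do-something-else [rows] (reduce + rows))

;; Both steps run in sequence on a background thread; the caller is not held up.
(def job (future (do-something-else (do-something))))

;; Dereference only when the result is needed; the timeout guards against hanging.
(deref job 1000 :timed-out) ;; => 6
```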

dharrigan20:10:14

Oh, learning, and trying - all the cool kids seem to be using core async and go blocks these days....so trying to understand if that's for me! 🙂

dharrigan20:10:08

I really appreciate your very helpful feedback! I have much studying to do! 🙂

dharrigan20:10:31

(both of you!) 🙂

noisesmith20:10:38

my heuristic is that every PR that first adds core.async to a project has subtle bugs where the code only works accidentally, and the problem being solved doesn't actually need core.async - I've yet to see it proved wrong

noisesmith20:10:22

but that being said, there are coordination tasks where core.async helps a lot (eg. when you have an expensive thing to process and something might be in flight or need to be retried...)

hiredman20:10:25

every pr that adds core.async or maybe just every pr

dharrigan20:10:56

Perhaps future is all I need for now 🙂 keep it simple 🙂

hiredman20:10:22

yeah, core.async, in my view is all about communication between logical threads of execution, in your example you don't have any of that

noisesmith20:10:28

there's also claypoole which has more flexibility than future but works in the same basic paradigm

dharrigan20:10:30

It's definitely an area I need to understand a whole lot more

hiredman20:10:03

for personal stuff if I am playing around with flow control or a consensus algorithm, core.async ends up being a way to experiment with that stuff without having to re-invent how communication happens

hiredman20:10:44

for work we use core.async pretty heavily for our chat system, which lends itself to a sort of agent view of the world, lots of process loops exchanging messages

hiredman20:10:43

there are some cases where you might use something like pipeline-blocking from core.async without explicitly having a model of multiple communicating processes, but for the most part core.async is for communication between multiple things
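
A rough sketch of that kind of use, assuming a hypothetical blocking `lookup` function and small buffered channels named `in` and `out`; `pipeline-blocking` runs the transducer on real threads and keeps results in input order:

```clojure
(require '[clojure.core.async :as a])

;; Hypothetical blocking call, standing in for a database or HTTP lookup.
(defn lookup [id] (Thread/sleep 20) {:id id})

(def in (a/chan 10))
(def out (a/chan 10))

;; Up to 4 lookups run in parallel on real threads; `out` closes when `in` closes.
(a/pipeline-blocking 4 out (map lookup) in)

(doseq [id [1 2 3 4]] (a/>!! in id))
(a/close! in)

;; Collect everything from `out` into a vector once it closes.
(def results (a/<!! (a/into [] out)))
```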

dharrigan20:10:21

thank you all! 🙂