Fork me on GitHub
#core-async
<
2020-01-23
>
dpsutton03:01:56

I'm doing dining philosophers in cljs and wanted to recreate the standard deadlock situation before moving to the standard deadlock prevention strategies. However, I'm unable to actually trigger deadlock here. Does anyone see anything that would not allow the cyclic deadlock?

dpsutton04:01:07

playing with it some more, it seems that there is no context switching between the (<! left) and (<! right) takes. if i put a small timeout channel take there it does happen. I thought it was a bit more cooperative multitasking there but it doesn't seem like it

hiredman07:01:02

That is because you are buffering the channels

hiredman07:01:10

The buffer immediately accepts 1 item, without having to wait for a consumer

didibus22:01:03

Why would they switch?

didibus22:01:38

If you find what you're looking for in the channel immediately, it won't context switch

didibus22:01:54

It will once something is waiting only

didibus22:01:04

Otherwise it would just slow down the whole process

didibus22:01:35

But, it does switch from the put to the take

didibus22:01:42

You can think of it as a rendez-vous

didibus22:01:06

put waits for a taker and take waits for a puter

didibus22:01:15

Well, up to the buffer, so like @hiredman said, in your case the puter won't wait for the taker unless the buffer is full and the taker won't wait for the puter unless the buffer is empty

didibus22:01:49

So the buffers control when things get to switch in a way. That said, from what I understand, two takes in a row won't switch if the take immediately succeeds.

didibus22:01:10

I believe the context switch is round-robin? Actually, can someone confirm? So if you have something putting super quickly, and something taking a more slowly, it won't starve other processes, since between each rendez-vous of the put/take, it'll make a visit to the other processes

didibus22:01:10

I like to think of it as I am the process. As I'm doing something, I won't stop to do anything else, until I reach a point where I can't continue on my current task, because I'm waiting for something, that's when I'll wonder off and see if there's anything else I can do in the mean time. Rince and Repeat

hiredman22:01:00

the context switch is a fifo queue

hiredman22:01:14

as in, go blocks are scheduled by being put on the end of a queue, which some threads are pulling tasks off and running

hiredman22:01:00

there is no central list of go blocks that is traversed to schedule them

hiredman22:01:53

the way they are schedule is the go macro turns them into callbacks attached to channels, and when something happens on the channel the callback is put on the queue

Alex Miller (Clojure team)22:01:55

having no central scheduler/controller is one of the most clever parts of core.async imo (esp when you look at alts etc)

didibus22:01:02

Couldn't that open the door for starvation?

didibus22:01:23

If two process just ping-pong with each other forever

hiredman22:01:02

they still wait in the queue

hiredman22:01:38

if A and B are ping ponging, then first A runs, which puts B on the queue and puts A "to sleep", and then B is taken off the queue and run, which puts A on the queue and puts B "to sleep"

hiredman22:01:30

at any point something else could be put on the queue which would be process beforeish(there are acutally multiple threads consuming and running tasks) the next A or B

hiredman22:01:43

there have been bugs related to this I think in the cljs implementation, where it would try to optimize by skipping the queue step, would easily results in starvation on a js vm because you only have one thread

didibus22:01:49

That's exactly the scenario I was thinking. I can see how the thread pool for processed on the JVM would help a bit, but it still seems at least theoretically prone to it

hiredman22:01:37

no, given a fifo queue it won't happen, if you go through each step of A and B ping ponging, and assume at some point something else puts something on the queue, you can show eventually it reaches the front of the queue ahead of any work for A and B

didibus22:01:36

Right, but how is that something else going to do that, if A and B are ping ponging

hiredman22:01:19

in the specific case of core.async, core.async itself has multiple threads in its pool and the jvm has multiple threads outside of core.async

hiredman22:01:48

but even if you are restrict yourself to a single thread and only core.async, it is still the case

hiredman22:01:58

in order for a task to be starved it must exist first

hiredman22:01:56

if the task exists for core.async it is either currently running, attached as a callback to some channel, or in the queue to be run

hiredman22:01:19

if it is attached to some channel that means it is waiting for activity on that channel and has nothing to do

hiredman22:01:51

when a task has nothing to do it cannot be starved (by definition starvation is a task with work to do not being able to get time to run)

hiredman23:01:16

so in order for a task to be starved it is either currently running or currently in the queue

didibus23:01:17

"in order for a task to be starved it must exist first", hum, is this true? wouldn't task start processing before every go block is evaluated?

didibus23:01:26

Say you had a go-loop just taking and puting on a channel

hiredman23:01:26

if you write a go block you cannot complain it is being starved because it is never run in my repl

didibus23:01:57

and later you have another go trying to put a "stop" on it, which should sto the go-loop from ping-ponging

hiredman23:01:03

if you write a program you cannot say it is being starved by the scheduler if you never run it

didibus23:01:07

And we are in ClojureScript with only one thread ?

hiredman23:01:35

the problem with taking clojurescript as an example is I have seen people writing patches for cljs choose performance over what in my estimation is "correct" over and over again, so it feels like thin ice

hiredman23:01:45

The key invariants are: at some point one of the processes must yield (do some channel operation that can't complete immediately so it registers itself as a callback on the channel) and when it does complete the task must pass through the fifo queue before running again

didibus23:01:52

(def a (async/chan 1))
(async/put! a 10)
(do (async/go-loop [a a]
      (let [v (async/<! a)]
        (println v)
        (async/>! a 20)
        (when v
          (recur a))))
    (async/go (async/>! a false)))

didibus23:01:03

I think something like that

didibus23:01:32

Logically, given a single threaded environment, I feel it would starve the second go process, and thus infinite loop

hiredman23:01:33

that is not two processes ping ponging

didibus23:01:32

Hum..., isn't each recur a new process?

hiredman23:01:13

and even if it was, you would have successive processes doing stuff to a, not two processes ping ponging

didibus23:01:49

True, sorry, I guess I meant with a buffer of 0:

(def a (async/chan))
  (async/put! a 10)
  (do (async/go-loop [a a]
        (let [v (async/<! a)]
          (println v)
          (async/>! a 20)
          (when v
            (recur a))))
      (async/go (println "ending the loop")
                (async/>! a false)))

didibus23:01:29

Otherwise the first go-loop never yielded

hiredman23:01:43

not sure what changing the buffer size is supposed to change about it not being two processes ping ponging

didibus23:01:26

Well, on the take of a, a continuation of the rest is created and put on the queue no ?

hiredman23:01:34

a ping pong is A sending a ping to B then waiting for a pong, B waiting for the ping, then sending a pong to A, then A sending a ping to B

didibus23:01:06

right, so here I'm using the same loop for that, put its like sending a ping to itself 😛

didibus23:01:17

Okay, I mean you can do it with two seperate go block as well

didibus23:01:27

But I was seeing this as recursion

hiredman23:01:28

recursion is not the same thing as multiple processes

hiredman23:01:12

if instead of saying "couldn't two ping ponging processes lead to starvation" you said "couldn't an infinite loop that does nothing but read and write to a channel that no one else is reading and writing to lead to starvation", the answer is yes, because those are different things

didibus23:01:57

(do
  (def a (async/chan))
  (async/put! a 10)
  (async/go-loop [a a]
    (when a
      (async/>! a 20)
      (recur a)))
  (async/go-loop [a a]
    (when a
      (println (async/<! a))
      (recur a)))
  (async/go (println "ending the loop")
            (async/close! a)))

hiredman23:01:02

but even then I think it maybe a case where cljs has "optimized" things and it breaks things

didibus23:01:17

What about this, this infinite loops, but isn't that due to starvation?

didibus23:01:35

unless I just have a bug

didibus23:01:54

Even in Clojure it infinite loops

hiredman23:01:35

nah, I was wrong, I was just checking the code

hiredman23:01:16

closing a doesn't end the loops

hiredman23:01:30

because you don't stop looping when a is closed

didibus23:01:35

well, possibly my repl chokes because of the print, and it could eventually close

didibus23:01:59

Wouldn't, oh ya

hiredman23:01:05

a is always truthy

didibus23:01:25

ya, got confused with my old version

didibus23:01:57

Hum... okay not sure how to create a repro of the scenario I'm imagining then 😛

hiredman23:01:01

on clojure that will terminate always anyway because you need at least 8 tasks not yielding

didibus23:01:32

Alright, well I can't make it fail 😛

didibus23:01:43

This works as well, in clojure and cljs:

(do
  (def a (async/chan))
  (def keep-going (atom true))
  (async/go-loop [a a]
    (when @keep-going
      (async/>! a 20)
      (recur a)))
  (async/go-loop [a a]
    (when @keep-going
      (println (async/<! a))
      (recur a)))
  (async/go (println "ending the loop")
            (reset! keep-going false)
            (async/close! a)))

hiredman00:01:18

Strictly speaking this is only pinging, no ponging. For ping-ponging to occur the second go loop needs to send a response to the first, and the first will have to wait for the response, and that will need another channel

didibus00:01:31

Well, fair enough.

didibus00:01:48

But it still demonstrates my idea where something could starve

didibus23:01:44

So, something makes the third go block execute, even though the first and second ping/pong between putting and taking

didibus23:01:50

I think its got something to do with the evaluation actually

didibus23:01:53

Ya, if I just send those 3 forms to the REPL, then it infinite loops, but wrapped in a do it doesn't, will try from a file