
can i tell, either in clojure or in the jvm, whether the core async threadpool is deadlocked?


for example, i see this in a jvm thread analyzer. do these eight threads with this stacktrace mean deadlock?


at the same time, perhaps i should trust the analyzer


"waiting on notification" is the opposite of blocked

👍 4

ok thanks. one-ish more dumb question: what should i look for in a thread dump, for blocked threads?


calls to IO or CPU intensive methods, or non-parked waiting on locks (which should be very rare in idiomatic clojure code, outside interop)
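To make the waiting-versus-blocked distinction concrete, here is a minimal sketch that asks the running JVM for the state of each pool thread instead of reading a dump by hand. It uses only the standard `java.lang.management` API. The class name and the `async-dispatch-` name prefix are assumptions for illustration; check the thread names in your own dump. Note that `findDeadlockedThreads` only reports cycles of threads waiting on locks, so it will not flag a pool whose threads are all stuck in long IO calls.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

public class PoolThreadStates {
    // Assumed name prefix of the core.async dispatch threads; adjust to match your dump.
    static final String PREFIX = "async-dispatch-";

    /** Counts live threads whose name starts with the prefix, grouped by Thread.State. */
    static Map<Thread.State, Integer> statesByPrefix(String prefix) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (ThreadInfo info : mx.getThreadInfo(mx.getAllThreadIds())) {
            // getThreadInfo returns null entries for threads that have already died.
            if (info != null && info.getThreadName().startsWith(prefix)) {
                counts.merge(info.getThreadState(), 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Number of threads the JVM reports as part of a lock cycle; 0 when it finds none. */
    static int lockCycleThreads() {
        long[] ids = ManagementFactory.getThreadMXBean().findDeadlockedThreads();
        return ids == null ? 0 : ids.length;
    }

    public static void main(String[] args) {
        System.out.println(PREFIX + " states: " + statesByPrefix(PREFIX));
        System.out.println("threads in a lock cycle: " + lockCycleThreads());
    }
}
```

Idle pool threads parked waiting for work show up as `WAITING` or `TIMED_WAITING`. Pool threads that stay `RUNNABLE` inside IO or heavy computation, or `BLOCKED` on a monitor, are the ones worth a closer look.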


ok--thank you--damn. it makes sense that deadlock is a condition, and not a denoted state.


yeah, by its nature core.async obscures things that would be obvious or impossible in sync code


the way I usually present it to people who are considering adopting it is that async is a liability, and you need to have a big enough benefit from the async to buy into the corresponding liability


yeah--thanks again. looks like we’re not deadlocked, which (ha) means we avoided that landmine, and are back to the drawing board.


I had a lot of problems with core.async thread pool becoming exhausted, causing hangs. I've been pulling my hair out over this until I found that -Dclojure.core.async.pool-size=96 helps. Not a solution, I know. The problem is that in a large application many libraries might make use of core.async, not just the main code.


I reworked every go block in my app to make sure it doesn't perform computation or block on I/O and I've still been getting hangs with the default thread pool size. Apparently the combination of my use of core.async and the libraries causes the issue. Extremely annoying, as this can happen in production after a minute or a month of uptime.


the more reliable approach is never making calls inside go blocks that don't return quickly - you could use jstack (or the equivalent Ctrl-\ shortcut in the terminal running the JVM, which sends SIGQUIT) to see all stack traces


this is not a small codebase, and i can’t guarantee that blocking calls aren’t being made


if a code base might be deadlocking, and it has gotten so large without the people building it making sure that isn't happening as they are building it, then it might be junk


looks like it’s not; it’s my naivete of reading thread dumps. but i appreciate the feedback.


the big symptom of the core.async threadpool being deadlocked is that when you create a go block that reads from a channel, the part of the go block after the read never runs

👍 4

that happens if all the threads are blocked


the lesser symptom is a reduced number of go blocks running in parallel (which is harder to quantify and test) and that happens if only some of the threads in the pool are taken away

👍 4
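The two symptoms above can be reproduced in miniature with a plain fixed-size `ExecutorService`. This is an analogy, not core.async itself: the class name, the method `probeRuns`, and the latch standing in for a blocking call are all hypothetical, and the pool size of eight is chosen only to match the eight threads mentioned earlier in the conversation.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class PoolStarvationDemo {
    /**
     * Occupies `blocked` threads of a fixed pool of `poolSize` threads, then
     * reports whether a trivial probe task completes within `timeoutMs`.
     */
    static boolean probeRuns(int poolSize, int blocked, long timeoutMs) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < blocked; i++) {
                // Stand-in for a blocking call made on a pool thread.
                pool.submit(() -> { release.await(); return null; });
            }
            Future<String> probe = pool.submit(() -> "ran");
            try {
                probe.get(timeoutMs, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException e) {
                return false; // every pool thread is occupied, so the probe is still queued
            }
        } finally {
            release.countDown(); // unblock the stand-in tasks
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("8 of 8 threads blocked, probe runs: " + probeRuns(8, 8, 500));
        System.out.println("7 of 8 threads blocked, probe runs: " + probeRuns(8, 7, 500));
    }
}
```

With every thread occupied the probe stays queued, which mirrors the code after a channel read never running. With one thread free the probe still runs, only with less parallelism available, which is the lesser and harder to measure symptom.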

indeed. i worry about it in general because the thing that stops us from using up the threadpool is us knowing what’s going into core.async and what we put on threads. as best as i can tell, we are careful about it.


you can also reduce the size of the threadpool to try and make a total blockage easier to trigger




there is also a newish feature, a property you can set that will warn you if you use blocking channel ops in go blocks

👍 4

the system properties are clojure.core.async.go-checking and clojure.core.async.pool-size


clojure.core.async.go-checking is not particularly fancy, so it is prone to false positives

Alex Miller (Clojure team) 18:04:24

Assuming a positive is an error, I think it’s more prone to false negatives actually

Alex Miller (Clojure team) 18:04:50

That is, not telling you about a problem, rather than incorrectly telling you there is a problem