Fork me on GitHub
#core-async
<
2017-06-23
>
neumann01:06:40

I'm trying to debug a performance problem that seems like a combination of core.async, threads and deref.

neumann01:06:09

I have (core.async/thread ...) that processes work it receives via a channel and send the result back via a separate channel. It has a thread-local atom to preserve state. When I reset! that atom, it's very slow and visual VM shows about 25% of the total time spent in the app is in

neumann01:06:16

clojure.core$promise$reify__7005.deref()

neumann01:06:34

When I comment out the reset! the performance issues go away.

neumann01:06:09

Well, it improves significantly.

neumann01:06:27

I also have a few hundred async-thread-macro-### threads in Visual VM.

noisesmith02:06:14

sounds like the code that runs async/thread might be starting up more times than you thought?

noisesmith02:06:53

spending your time in promise$reify.deref means that some lib you use waits on a result in one thread and delivers in the other - looking at more of the call stack should figure out what you are actually waiting on

neumann02:06:16

@noisesmith Thanks for your help. It looks like one of my components was starting a thread and not exiting properly on "stop". Each restart (via reloaded repl) was making more and more threads.

neumann02:06:43

After more investigation of the call stack, almost all of those promise deref calls are for alts!! with a data channel and a timeout channel

neumann02:06:15

That makes me wonder if there is a better way to timeout waiting on a channel.

neumann03:06:23

The java.util.concurrent.CountDownLatch.await() is taking up about 16% of the application time.

neumann03:06:01

Well, the deref on the promise wrapped around that.

noisesmith03:06:49

when I'm profiling I count anything that is an intentional wait, whether that's sleeping or waiting on a contdown for a timer, as a non-issue

neumann03:06:16

I see there is a "Self Time" and a "Self Time (CPU)"

noisesmith03:06:17

because assumedly as soon as there was real work to do, it would pre-empt my waiting (if I'm doing it right at all)

neumann03:06:37

The "Self Time (CPU)" is 0.

neumann03:06:59

(This is Visual VM.)

neumann03:06:06

What you're saying totally makes sense.

noisesmith03:06:20

yeah, that shouldn't be hurting anything -- that said it is possible to construct things with core.async and other buffering / queueing mechanisms that hurt performance in ways that are really hard to diagnose when profiling

noisesmith03:06:27

but spending time sleeping isn't proof that's happening

neumann03:06:21

Ah...I see. Right. Of course it will take a lot of wall clock time for a timeout. That totally makes sense now.

neumann03:06:36

And while I was at it, I did manage the find the thread that's doing all the work and I have schema checking on for all function calls. Turns out that is the bottleneck!

neumann03:06:59

@noisesmith Thanks again for your help. I appreciate it!