Any experience reports of using the new alpha of core.async with vthreads JVM option set to target?
We're trialing that at work and, making no code changes -- just updating the version and providing that JVM option for all invocations (incl. AOT during JAR building -- I think, if I got it right) -- we see quite a change in thread usage:
The process went into a GC tailspin overnight and pegged the server at 100% so I'm rolling back changes. I realized I'm probably not running the process with that JVM option right now, so that fits with Alex's comment about the AOT-target version most likely being identical to the runtime version. So what I started out testing -- and what I'm back testing now -- is just the updated core.async without targeting vthreads, just running in default mode. I'm going to run like this for a few hours to check it is stable, and then I'll ensure that vthread=target property is enabled and restart the process, and I expect we'll see climbing heap again. Will report back in a few hours.
Assuming you are in Java w vthreads you should not need the flag, it will use vthreads (plus you’re compiling to vthread assuming code already)
That is, the default (non compiled) is vthreads
You did see significant change in thread usage, further evidence for that
I guess this sentence is what I find confusing in the async virtual threads post about this: "If AOT compiling, go blocks will always use IOC (no change)."
We AOT our uberjars, so this sounds like we should get the old core.async behavior -- without that vthreads=target property?
Here are my three possible scenarios -- can you explain what, if any, differences I should see between them:
1. update core.async to alpha2, make no changes to anything, AOT without vthreads=target, run without vthreads=target
2. update core.async to alpha2, make no changes to anything, AOT without vthreads=target, run with vthreads=target
3. update core.async to alpha2, make no changes to anything, AOT with vthreads=target, run without vthreads=target
I believe I was running scenario 1. successfully at the start of this thread(!) -- no async-io threads, lots of forkjointhreads, higher throughput, higher cpu, stable heap. Then I switched to scenario 3. and saw the same overall pattern except heap steadily climbed until we hit a GC spiral and 100% cpu.
presuming you are AOTing everything and running on Java 21+, the runtime setting is making no difference. so 1+2 will have go state machines (running on vthreads) and 3 will have vthread go blocks
Running on JDK 24, yes. And always AOT'ing everything (because it makes a massive difference to startup speed when we restart services).
so you will still see a thread difference even with 1+2 b/c io-threads and go threads are running vthreads instead of platform threads, but doing the same thing they used to
So 1. & 2. are equivalent, regardless of the JVM property at runtime? And 3. is the different scenario. Okay, so it seems the non-state-machine scenario is the one with the bad heap behavior...
yeah, you're compiling so the flag at compilation time is the important one
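To make the three scenarios concrete, here is a minimal sketch that encodes only what has been said in this thread so far, assuming JDK 21+. The function `go-block-impl` and its keys are hypothetical names for illustration; this is not a core.async API, and it deliberately ignores the runtime property because, per the above, it makes no difference once everything is AOT'd.

```clojure
;; Hypothetical summary of the discussion above, NOT part of core.async.
;; Assumes JDK 21+ (virtual threads available).
(defn go-block-impl
  "Returns which flavor of go block the thread above says you end up with.
   :aot?              - true if the code was AOT compiled
   :compile-time-prop - value of clojure.core.async.vthreads when the code
                        was compiled (nil if unset)"
  [{:keys [aot? compile-time-prop]}]
  (cond
    (= "target" compile-time-prop) :vthread-go-blocks   ; scenario 3
    aot?                           :ioc-state-machines  ; scenarios 1 and 2
    :else                          :vthread-go-blocks)) ; not AOT'd, JDK 21+

;; Scenarios 1 and 2: AOT without the property
(go-block-impl {:aot? true :compile-time-prop nil})
;; Scenario 3: AOT with the property
(go-block-impl {:aot? true :compile-time-prop "target"})
```

The point of writing it this way is that the runtime value of the property never appears as an input, which is why scenarios 1 and 2 collapse into one.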
We're not using io-thread (yet) -- no code changes, just update core.async.
How would you like me to proceed with debugging the apparent heap problem with scenario 3.?
depends what tools you have to investigate the leak. we will try to replicate independently.
No specific tools. We could probably get you a heap dump if we re-run scenario 3. (remind me how best to do that for the format you want)
The app where we're seeing this is the heaviest core.async use in our whole suite, and it seems fairly easy to trigger it...
a heap dump is likely to be huge and not very useful to us w/o code. better to monitor heap histograms via jcmd over time
Thanks for your help Sean. Just catching up now. Quick question: the problematic scenario is #3, and it is run with the vthreads flag unset?
something like jcmd <pid> GC.class_histogram | head -50 periodically should give you a clue
@fogus Correct. As it turns out, none of my testing has had that vthreads flag set at runtime, hence no scenario 2.
@alexmiller Thanks. I will try that command when I get back around to testing the AOT-compiled-with-vthreads=target scenario again.
Or use Java Flight Recorder for allocation/leak hints:
• Increase stack depth: `jcmd <pid> JFR.configure stackdepth=256`
• Start: `jcmd <pid> JFR.start name=leak settings=profile duration=5m filename=/tmp/leak.jfr include=OldObjectSample`
• Wait 5m to accumulate to the .jfr file
• Open the .jfr file in Java Mission Control, go to Memory -> Old Object Sample, sort by Allocation Site or Object Type
I don't have JMC locally, but I can capture a .jfr file if you want to examine it?
In the meantime, the GC histogram quickly shows this as the #1 memory usage, steadily increasing:
1: 34449 52606000 jdk.internal.vm.StackChunk (java.base@24.0.1)
1: 47076 69978016 jdk.internal.vm.StackChunk (java.base@24.0.1) (and still growing)
This is output from the GC histogram. The first block is from the alpha2 core.async with no property set and no code changes. The subsequent blocks are from the same app AOT'd with vthreads=target
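For tracking growth between two histogram samples, a small sketch like the following may help. It assumes each line has the shape shown above (rank, instance count, bytes, class name, optional module); `parse-histogram-line` and `growth` are hypothetical helper names, not part of any tool.

```clojure
;; Hypothetical helpers for comparing `jcmd <pid> GC.class_histogram` lines.
(defn parse-histogram-line
  "Parses a line such as
   \"1: 34449 52606000 jdk.internal.vm.StackChunk (java.base@24.0.1)\"
   into a map, or returns nil if the line does not match."
  [line]
  (when-let [[_ rank instances bytes class-name]
             (re-matches #"\s*(\d+):\s+(\d+)\s+(\d+)\s+(\S+).*" line)]
    {:rank      (Long/parseLong rank)
     :instances (Long/parseLong instances)
     :bytes     (Long/parseLong bytes)
     :class     class-name}))

(defn growth
  "Difference in instances and bytes between two parsed samples of one class."
  [before after]
  {:class     (:class after)
   :instances (- (:instances after) (:instances before))
   :bytes     (- (:bytes after) (:bytes before))})

;; The two StackChunk samples quoted above:
(growth
  (parse-histogram-line "1: 34449 52606000 jdk.internal.vm.StackChunk (java.base@24.0.1)")
  (parse-histogram-line "1: 47076 69978016 jdk.internal.vm.StackChunk (java.base@24.0.1)"))
;; => {:class "jdk.internal.vm.StackChunk", :instances 12627, :bytes 17372016}
```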
I have leak.jfr.gz (about 6.7M) if you're interested @alexmiller (or @ghadi)?
Old gen (G1) CPU times on that server since starting that AOT'd-with-vthreads=target version -- the other servers are not doing any old gen GC:
And young gen CPU times are steadily increasing:
OK, I'm stopping this test and rolling back to alpha2 without vthreads=target (scenario 1).
LMK if you need more information, or need me to run more tests.
seems like what I would expect with vthread go blocks not being "done" / gc released
But with the state machine, and running on vthreads, that doesn't happen?
🤷 can't explain it, just observing atm
I guess the first Q would be: how can I confirm I set the option correctly for AOT?
The thread state change is definitely nice to see:
Hmm, I think I've confirmed that the JVM option didn't affect compile-clj...
With the default it should compile to go blocks, and you’ll see state machine classes (I think they have “state” in their name). You will want to set the new prop to target, and then you shouldn’t see those
You’ll need to set that prop when calling compile-clj
Right, for running everything, we have the property set to target. My question is just about AOT -- compile-clj -- how do I ensure that property is set?
My :build alias has:
:build
{:extra-deps {ws/build {:local/root "projects/build"}}
 :jvm-opts ["--enable-preview"
            "-client"
            "-Dclojure.core.async.go-checking=true"
            "-Dclojure.core.async.vthreads=target"
            ...
but I don't know if that is correctly affecting the compile-clj call in build.clj -- is there a way to verify that?
I tried (println (System/getProperty "clojure.core.async.vthreads")) at the top of the main ns so it would print when it was loaded (for AOT compilation) and it prints nil 😐
in the call to compile-clj, use :java-opts ["-Dclojure.core.async.vthreads=target"]
compile is forked, so the props on the build process itself are irrelevant of course
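A minimal build.clj sketch of that suggestion, assuming tools.build is on the build classpath. The `class-dir` value, the `"src"` directory, and the `aot` function name are placeholders for illustration; only the `:java-opts` entry comes from the advice above.

```clojure
(ns build
  (:require [clojure.tools.build.api :as b]))

;; Placeholder values; adjust to the real project layout.
(def class-dir "target/classes")
(def basis (delay (b/create-basis {:project "deps.edn"})))

(defn aot
  "AOT compiles the project. compile-clj forks a separate JVM, so the
   property has to be passed via :java-opts; :jvm-opts on the :build
   alias only affect the build process itself."
  [_]
  (b/compile-clj {:basis     @basis
                  :src-dirs  ["src"]
                  :class-dir class-dir
                  :java-opts ["-Dclojure.core.async.vthreads=target"]}))
```

With this in place, a `(println (System/getProperty "clojure.core.async.vthreads"))` at the top of an AOT'd namespace should print `target` during compilation rather than `nil`, since the property now reaches the forked compile process.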
Ah, that's what I'm missing...
Thank you! I couldn't remember how/where to set that.
very curious about observable change in latency/throughput
Yeah, I'll have to rebuild with this change and do another round of testing -- what about clojure.core.async.go-checking=true? Is that runtime only or would adding it to compile-clj affect anything?
well, it's irrelevant if you're using vthreads
I'd certainly not turn it on in production regardless
Heh, no, that's dev/test only. I was just curious.
We do have clojure.spec.check-asserts=true in production tho' 🙂
you do you :)
Now I have that fixed, it's nice to see the size reduction in the JAR files:
About to replace this JAR file: build-2025-10-20_17.32.42
-rw-r--r--. 1 tomcat tomcat 63664369 Oct 20 17:52 /var/www/worldsingles/build/uberjars/wsmessaging-1.0.0.jar
With this updated JAR file: build-2025-10-20_21.34.44
-rw-r--r--. 1 tomcat tomcat 59678051 Oct 20 21:49 /tmp/wsmessaging-1.0.0.jar
You could also exclude tools.analyzer.jvm under core.async too if you wanted
Good to know, thanks.
@seancorfield curious what the source of those graphs is. Looking at improving our insight into our prod jvms.
New Relic.
I love New Relic. We have the Linux agent installed on all our servers, and the Java agent in all our applications, as well as a bunch of custom metrics we compute and post to New Relic. Happy to talk your ears off about it 🙂
good to hear, I’ve only used it ages ago with ruby on rails and liked it a lot for that. I see you wrote https://corfield.org/blog/2013/05/01/instrumenting-clojure-for-new-relic-monitoring/, are you still annotating functions like this today?
We use Paul Rutledge's library for it these days... lemme get the deets...
https://github.com/RutledgePaulV/newrelic-clj -- so you can use defn-traced
@alexmiller @ghadi FYI: with the compile-clj :java-opts change so that AOT picks up target, we're seeing slowly increasing heap memory usage that we didn't see previously (with target just set for runtime). I'll be tracking this overnight to see what happens. If it still looks "bad" in the morning, I'll roll back to the earlier version of the day, and run that for a full day and see if heap is stable.
Can you think of anything that might affect heap/GC between the AOT'd with target version and the regular runtime target version?
No, should be the same bytecode either way
Vthreads store their stack on the heap so there is more heap usage in general, but I would expect gc to reclaim as go blocks complete
The four hours with today's first version, heap was very stable. The three hours with the AOT change have seen steady heap climbing.
In either case go blocks are just compiled as normal clojure code and run on vthreads so not much magic. I guess it’s possible something is retaining refs and preventing cleanup, but can’t imagine what that would be that’s different between those
Okay. I'll see where we are in the morning. I'm only running this on one server in the cluster so far.
the state machine doesn’t use threads regardless of JVM/vthreads, it’s hand-rolled inversion of control (callbacks)
vthreads will use memory for parked stacks
@richhickey I think https://clojure.org/news/2025/10/01/async_virtual_threads is pretty confusing on when vthreads will be used. When we upgraded core.async, we saw a big shift in threads from async-io (went away) to fjpool-worker (big increase). But we AOT everything and were not using the vthreads=target option for compile-clj -- so, based on your comment, we get IOC behavior and no vthreads that way? (just the change in thread types/names seen in that initial graph).
If we AOT with vthreads=target, we'll get only vthreads from go blocks -- and it seems, from discussions with @ghadi, that some vthreads have strong references and won't get GC'd, depending on how they're created (executors create vthreads that can be GC'd, the Thread API creates vthreads that can't be GC'd until they're completed -- based on a loom-dev mailing list discussion).
If we weren't AOT'ing, we'd get vthreads instead of IOC just by virtue of running on JDK 24 (21+). And we'd likely still be in this situation (large heap, GC unable to recover space) I think?
I think we can recover GC'ability of vthreads created by the Thread API if we add -Djdk.trackAllThreads=false but that feels like a workaround. I haven't looked at the core.async source (yet) but I'm curious how it creates vthreads -- executors or the thread API...
(Ghadi helped me identify some problematic code in one of our apps that creates a huge number of go blocks and associated channels -- which doesn't matter for IOC but creates vthreads in the new model and those channels were not always closed and the go blocks were left hanging... and un-GC'able)
My first cut at trying to repro was unsuccessful until I added a bit that didn't close its channels. Then I saw the same behavior.
@seancorfield Looking to summarize for my own understanding.
Throughout your experiments, the constants in play were:
• JDK 24
• Using alpha2 of core.async
• No source code changes
• Always AOT
The only changing variable was:
• AOT without -Dclojure.core.async.vthreads=target and everything was fine
• AOT with -Dclojure.core.async.vthreads=target and you saw the GC issues
Regarding the problematic code, was the channel not being closed the channel returned by the go block? Or a manually created channel within the go block?
We were creating a huge number of channels and go blocks in this app and they were only closed on the unhappy path which was rare. The assumption was they'd be GC'd. Which they are in the "old" IOC world, but in the VT world, a vthread that isn't terminated doesn't get GC'd by default, depending on exactly how it is created (according to the loom-dev mailing list discussion from a year ago).
We're fixing our code to avoid creating so many channels/`go` blocks -- it wasn't really "buggy" code but it was a bit sloppy (but I suspect we create channels/`go` blocks all over the place that we just assumed would get GC'd when they went out of scope).
(and, yes, your bullets accurately reflect how we have things set up)
I'll let Sean answer his specific case, but an easy way to replicate is to make a bunch of go blocks in a loop that read from a channel that never receives a value. The problem is not exactly related to channel closing but more so to whatever causes a go block to not complete.
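A minimal sketch of that replication idea, assuming core.async is on the classpath. `spawn-parked-go-blocks!` is a hypothetical name; the count passed in is arbitrary.

```clojure
(require '[clojure.core.async :as a])

;; Hypothetical repro helper: every go block parks on a channel that never
;; receives a value and is never closed, so none of the go blocks complete.
(defn spawn-parked-go-blocks!
  "Starts n go blocks that can never finish. Nothing holds on to the
   channels afterwards, so any retained memory comes from the go blocks
   themselves."
  [n]
  (dotimes [_ n]
    (let [never (a/chan)]
      (a/go (a/<! never)))))

;; Example: create a batch, then watch the heap from outside the process.
(spawn-parked-go-blocks! 10000)
```

Calling this repeatedly while sampling `jcmd <pid> GC.class_histogram` is one way to compare builds: per the discussion here, the parked blocks are expected to be collectable with IOC state machines, while vthread go blocks may keep their parked stacks on the heap.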
I would expect that a lot of core.async code in the wild creates go blocks that never complete -- because it has never caused any problems and go blocks (and channels) "feel" very lightweight...
I'm sure you're right
It's not going to bite you in the new world unless you create a LOT of those, however...
Reading the loom-dev mailing list thread about vthreads and memory leaks was very interesting -- esp. the difference between executor-managed vthreads and the Thread API vthreads.
Links?
https://mail.openjdk.org/pipermail/loom-dev/2024-July/006895.html
That's what Ghadi sent me. There are also some StackOverflow discussions about the memory leaks -- can't remember what search term I used when I stumbled across those. I was researching the -Djdk.trackAllThreads=false property.
Thanks!
> make a bunch of go blocks in a loop that read from a channel that never receives a value
Never receive a value, or the channel they read from never closes? I think core.async as a library exercises good go hygiene by ensuring go terminates when the read source closes (e.g. see core.async/pipe as an example). But I'm not sure that practice or discipline is followed in the wild.
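As an illustration of that hygiene, here is a sketch of a consumer whose go block ends as soon as its input channel closes, assuming core.async is on the classpath. `consume!` and `handle` are hypothetical names, not library functions.

```clojure
(require '[clojure.core.async :as a])

;; Hypothetical consumer: the loop exits when `in` is closed, because a take
;; from a closed, drained channel yields nil.
(defn consume!
  "Calls handle on every value taken from in. Returns the go block's
   channel, which closes once in has been closed and drained."
  [in handle]
  (a/go-loop []
    (when-some [v (a/<! in)]
      (handle v)
      (recur))))

;; Usage: closing the source lets the go block complete.
(let [in   (a/chan 4)
      seen (atom [])
      done (consume! in #(swap! seen conj %))]
  (a/>!! in 1)
  (a/>!! in 2)
  (a/close! in)
  (a/<!! done)   ; returns once the go block has completed
  @seen)
```

The design choice is that completion is tied to the source's lifecycle: whoever owns `in` only has to close it, on the happy path as well as the unhappy one, for the go block to finish.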
Okay, just pushed a version of that app to one server, with -Djdk.trackAllThreads=false and compile-clj passed vthreads=target and our runaway go block creation addressed. Or at least that particular known issue fixed. We'll see how much of a leak we have now...
did you swap out the executor?
or just the sysprop
We use component in our projects and I try to aggressively manage our channel lifecycle as part of the system so we don't leave garbage laying around. But we still largely ignore the return chan from go blocks. Would that cause issues in the virtual thread world?
@seancorfield I wasn't able to make the leak go away without swapping vthread executor (as well as prop...)
in the IOC world, only channels (used by a go block) refer to the go block data and thus when those channels are GCed (not necessarily closed), so too can the go block data be GCed
@ghadi “swapping vthread executor” for what?
jdk.trackAllThreads seems a culprit
defaults to true?
Given the advice is to set jdk.trackAllThreads to false, I assume it now defaults to true -- and from what I've seen discussed online that changed around JDK 21?
this negative interaction is so basic and predictable, sheesh
how could anyone write a large vthreaded IO app with this default?
I've been trying to use vthreads since JDK 19 and have had to roll back nearly every attempt 🙂
I managed to switch some of our future stuff over to vthreads about a year ago. Not sure what JDK version we'd upgraded to by that point, but that had failed previously.
I thought it was interesting at Conj last year, when I asked the room when Alex et al were talking about Clojure 1.12 "how many of you are using vthreads in production?" (very few) "who has had problems with them?" (those few hands all stayed up).
Seems like more tweaks are needed around vthreads in Java/JDK-land to make them reliable in production 😞
makes me appreciate our IOC impl all the more 🙂
Yup, pretty amazing what you can already do with core.async -- and via a library as well!!
thanks for the report Sean - we’re running some tests now but think we have enough info for the moment
need trackAllThreads=false, and swap ExecutorService (which tracks all spawned so that .close() works) with an Executor that spawns without tracking
(import 'java.util.concurrent.Executor)
;; spawns one virtual thread per task, without tracking it
(reify Executor
  (execute [_ r]
    (Thread/startVirtualThread r)))
trackAllThreads=false was necessary but insufficient
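To show the two kinds of executor side by side, here is a small sketch that runs the same task on each. It assumes JDK 21+; `tracked`, `untracked`, and `run-on` are hypothetical names, and the sketch only demonstrates that both accept work through the plain `Executor` interface, not the GC behavior itself.

```clojure
(import '(java.util.concurrent Executor Executors))

;; ExecutorService flavor: described above as tracking what it spawns
;; so that .close() can work.
(def tracked (Executors/newVirtualThreadPerTaskExecutor))

;; Bare Executor flavor: starts a virtual thread and keeps no reference to it.
(def untracked
  (reify Executor
    (execute [_ r]
      (Thread/startVirtualThread r))))

(defn run-on
  "Runs f on the given executor and returns a promise of its result."
  [^Executor ex f]
  (let [p (promise)]
    (.execute ex (fn [] (deliver p (f))))
    p))

;; Both are interchangeable from the caller's point of view:
@(run-on tracked   (fn [] (+ 40 2)))
@(run-on untracked (fn [] (+ 40 2)))
```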
@fogus ^
FWIW, we're using Executors/newVirtualThreadPerTaskExecutor in a few places, but they are all where the VT is guaranteed to complete, and we're using Thread/startVirtualThread in many more places.
I thought, based on some of the discussions on loom-dev, that the Thread API was the unsafe one and executors were the safe one -- but @ghadi your code suggests the opposite? Did I misunderstand how vthread tracking works there?
The green line is the server that is running the vthreads version -- I'll let it run a bit longer but I think heap is just going to keep climbing 🙂
yes it’s doomed, but (kindly) capture a heap dump before you nuke
Haha... Okay... That command is either in this thread or our DM. I'm on my phone right now.
jcmd ${PID} GC.heap_dump ${output.file}
then pray you have enough space on target fs 🙂
@ghadi LMK when you're around and I'll wormhole you that heap dump.