New to core.async. I have a go block that calls a function, which calls Java's BufferedReader.readLine(). Is this allowed or recommended? I think not, because readLine blocks. I am asking because it seems to work correctly.
I might be getting confused in the macro meta, but what is a function boundary when there’s no state machine? In an ideal world without any thread pinning (as discussed in the link I posted) the whole JVM becomes one big Giant Go 🙂 But I might be misunderstanding what you meant.
… if there’s no state machine… can (go ...) even “realize” it’s crossing a fn boundary?
I am guessing impl details/plans that I’m not familiar with, but I would have assumed “no” (unless some minimum amount of code analysis is still going to take place, as part of (go ...) )
There is no analysis or rewriting when running on vthreads, but you still should not do things that don’t work now in go (parking ops across function boundaries) - if you want to do things like that, use io-thread and blocking ops
Existing code gets the benefits. New code should be intentional
Yes, I was wondering if it would still analyze just to keep it backwards compatible and error on crossing a function boundary. Otherwise you just have to be careful if you want your Go block to be backwards compatible with versions of core.async that don't use vthreads, or if running on a JVM that doesn't support them, like an older JDK. Asking because in theory they could have taken it any possible way, like keep go as-is and introduce vgo or whatever. But to be honest, I like just saying: look, this is core.async 2.0, Go no longer has any limitations and no longer colors your code, it can yield from inside an inner function, etc. You just have to have a modern JDK for it. It would not work in cljs anymore though.
That’s why there is a new thing - use io-thread if you want new semantics
(:require [clojure.core.async :refer [io-thread] :rename {io-thread go}])
😛
This thread was very interesting and illuminating! Thanks didibus and Alex for that dialogue.
I think I'm still a bit confused about the new/planned semantics for core.async, curious if it's just me. I'd have expected it more that:
1. thread is still used for blocking IO (but uses virtual threads if they are available)
2. go works the same as always, uses the state machine
3. vgo or some new thing is introduced, is like go but without any of the state machine constraints and runs on virtual threads
Where now it seems that:
1. thread is the same as always, but why use it anymore, feels deprecated?
2. go works either the same as always or if configured, will run without any of the state machine constraints and use virtual threads instead
3. io-thread, is like thread, except it uses virtual threads if configured to do so
I think it's the addition of io-thread that confuses me. Why would I use io-thread? Without vthread, it's same as thread, and with vthread it is same as go?
unclear to me too, but I think maybe "thread-for-implementing-an-event-loop" might be a good way to think about io-thread
I think the key is that (as always) we are trying to retain semantics and make this a non-breaking version upgrade while adding new capabilities. So thread and go are semantically the same as they always were, except if you are on 21 (and ... extra conditions not repeated) then go is more efficient in that it uses vthreads by default.
AND you have a new thing io-thread that is explicitly about running on vthreads (when available) and supporting blocking IO (which go does not).
ugh
I screwed that up I think, io-thread would be the opposite then maybe?
if you want to use blocking io, use io-thread
if you want mixed workload semantics (IO or compute), use thread
and your existing code continues to work like it did
going forward, you probably won't need or want to use go unless you expect to run in pre 21 or mixed jvm envs where you want parking semantics
The two things about this, though, are:
1. Go has actually changed in a way that makes it very easy to use it and introduce incompatibilities. I use a macro from a lib that I didn't know used a HOF internally, it works, I never notice, and it doesn't work on the non-vthread Go variant. Or if I'm new to core.async, I won't even "learn" how to write a Go block the old way and will just start writing non-compatible Go blocks
2. io-thread has io in its name, but in reality, it would be awesome for async coordination and lightweight compute as well
I also feel like, the only use-case virtual threads are not a good fit for, isn't covered by anything, which is when kicking off a long async compute task. Like if we had:
• compute-thread <-- bounded to num of core, unbounded pending task queue
• thread <-- use for IO
• go <-- use for async coordination
I'd say that would make sense, and you also accept that core.async 2.0 is not compatible with no virtual thread JVMs, because Go has changed.
but it is compatible and go semantics have not changed
we're trying to thread a lot of needles at the same time here. the assumption here is that pre-21 jvms are going to be increasingly less relevant in the near future (Java 25 LTS is out in like 5 months)
I think most users want Go semantics to change, in that, yes it is still used for async coordination, but we want it to support parking across functions.
it is a far greater good for people to mostly get the (presumably) big benefits of vthreads and no analyzer than to protect people from doing things they already are not doing (because it doesn't work now)
But I don't think we should pretend that go that can park across functions is semantically unchanged... Because, that's a pretty huge semantic change
it's capable of that, but that doesn't mean we are saying you should do that
if you are going to do that, use io-thread
That's where I get confused. Because I don't want to do io, I want to use map.
it's a fair point, but I don't think it's the most important consideration
originally we called it task but that really didn't say anything relevant - the important thing here is that it's running on a thread capable of blocking for io (ie vthreads)
I agree that I think it's best to be able to use core.async where the analyzer is gone (maybe it doesn't even pull in the dependency for it), and so the namespace takes a lot less time to initialize, and Go uses vthreads and is more performant. But I feel we're playing semantics with trying to say it's still "compatible", when in reality it's more like, if you restrict yourself to a subset of the new Go that works with the old Go, then it is backward compatible. At that point, let's just call it a breaking change no?
And if we acknowledged that, then why even introduce an io-thread? Just use Go for IO and everything else. And now the only thing you could argue Go is a bad fit for is long compute, but there isn't anything for that even now.
it's very importantly not a breaking change - you can upgrade without anything breaking (and as someone with like 1000 nubank services lying around, that is something I am acutely aware of)
old stuff continues to do what it does. new stuff can intentionally say what it needs
there are things for long compute now - platform threads
and thread is for mixed use, which also includes ... compute
Hum... ok right, I guess I was thinking more cross-compatibility, maybe more forward-breaking. You make a good point. So I think it's still io-thread that I'm confused about. If we forgo forward-compatibility (new code works with old jdk), which Go won't be. What's the point of io-thread?
so you are saying that you explicitly want to do blocking io
which you can't do in go
But I can do it in Go, when running in the new version. And it'll literally expand to the same code.
another more subtle compatibility issue is cljs - the original portability goal here was making the same go blocks work in both clj and cljs, which they still do
so for the 1000th time - we are not adding any ALLOWED semantics in go
I mean, io-thread looks like to me it's designed so if you write code that targets vthreads, and then run it on an old JVM, it would still work.
Except, my Go blocks will likely not work, because I'll probably have started using map, reduce, run! and so on inside them.
"probably" = you chose to write brand new code, knowing the limitations
if you are choosing to write new code, use io-thread
I don't really know what else to say here, you can choose to agree or not
No it's good. I think I understood. It was mostly confusion I wanted cleared up in the mental model. So it could be something like:
1. Code written with non vthread and older version of core.async will still work with the new core.async using vthread, and you get upgraded to vthread for free <-- Awesome!
2. io-thread exists if you want to write code that could work in ClojureScript and with JVMs that don't have vthreads, but if you want that, also be very careful your Go blocks are compatible with state machine Go. <-- Kind of tricky to pull off, but available if you need to
3. thread <-- Will always use an OS thread?
1. yes
2. io-thread won't be in cljs - you have to stick to go for that
3. yes (jvm only of course)
And last thing to make sure I really got it 😛 io-thread will use vthread, but fallback to OS threads if vthreads are not available. And that's the difference with thread, where the latter is always OS threads even when vthreads are available?
yes
> I agree that I think it's best to be able to use core.async where the analyzer is gone (maybe it doesn't even pull in the dependency for it), and so the namespace takes a lot less time to initialize
we can't kill the dependency (although if you know you're not using it, you can exclude it), but the load of the analyzer namespace will be conditional and it will only be loaded if needed, which yes, does greatly reduce the loading time of the async ns
From the perspective of ClojureScript, it makes sense to not expand the officially allowed semantics of go because if you start using map reduce etc in go (which is technically possible with JVM 21+, as discussed above) that won’t be Clojure -> ClojureScript compatible.
Clarification: “using map reduce ” as in, “crossing fn boundaries”… Of course you can use pure map/reduce/etc, but you can’t park inside.
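To make the "crossing fn boundaries" point concrete, here is a minimal sketch. take-all is a hypothetical helper name, not something from core.async. The commented-out form puts <! inside an anonymous fn, which the state-machine go cannot treat as a parking op; the loop/recur version keeps every <! lexically inside the go block, so it should behave the same with or without vthreads.

```clojure
(require '[clojure.core.async :as a :refer [go chan <! <!! >!!]])

;; Not portable: the <! sits inside an anonymous fn, so the state-machine
;; go cannot see it as a parking op.
;; (go (mapv (fn [c] (<! c)) chans))

;; Portable: every <! stays lexically inside the go block via loop/recur.
(defn take-all
  "Hypothetical helper: returns a channel that yields a vector holding
  one value taken from each channel in chans, in order."
  [chans]
  (go
    (loop [remaining chans
           acc       []]
      (if-let [c (first remaining)]
        (recur (rest remaining) (conj acc (<! c)))
        acc))))

;; Usage: two buffered channels, each preloaded with one value.
(<!! (take-all [(doto (chan 1) (>!! 1))
                (doto (chan 1) (>!! 2))]))
;; => [1 2]
```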
Historically - nothing blocking, with VirtualThreads that’s becoming less of an issue, but I believe the changes haven’t officially landed yet/still in flux; there are ways to manually “tweak” core.async into VirtualThreads but I wouldn’t recommend them for someone “new” to core.async;
If you want something blocking inside a (go ...) you can do (<! (thread do-blocking-stuff-here))
The cost is a bit of memory usage for a real thread but at least you won’t run into a deadlock
thread is clojure.core.async/thread … basically like a (future ...) but puts the return value on a channel
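A small sketch of that pattern applied to the original readLine question. read-first-line and first-line-chan are made-up names for illustration, and an in-memory StringReader stands in for a real stream.

```clojure
(require '[clojure.core.async :as a :refer [go thread <! <!!]])
(import '(java.io BufferedReader StringReader))

;; Hypothetical helper: a blocking read that should not run directly in go.
(defn read-first-line [^BufferedReader rdr]
  (.readLine rdr))

(defn first-line-chan
  "Returns a channel that will receive the first line read from rdr."
  [rdr]
  (go
    ;; thread runs the blocking call on its own thread and returns a channel;
    ;; the go block only parks on that channel.
    (<! (thread (read-first-line rdr)))))

;; Usage, with an in-memory reader standing in for a real stream:
(<!! (first-line-chan (BufferedReader. (StringReader. "hello\nworld"))))
;; => "hello"
```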
If you block inside (go ...) it will likely work “correctly” when you try one at a time, but with any type of volume/concurrency it will work right up to the point of a total deadlock (I believe)
And actually… all of the above might no longer apply… given: https://ask.clojure.org/index.php/14428/core-async-beta1-cached-thread-pools-hundreds-threads-doing
It sounds like (and I verified via YourKit) core.async will actually spin up the total number of threads required to process the task… so perhaps it will never totally deadlock anymore.
;; assumes (require '[clojure.core.async :as a])
(dotimes [n 1000]
  (a/go
    (Thread/sleep 1000)
    (println "I slept fine!")))
… creates ~1000 threads named async-io-...
Yes, go blocks are now multiplexed on an unbounded CachedThreadPool.
But I have not been able to really reason about the impact, does it mean you can freely block inside go now, with no impact to performance and so on? Or would you still want to use thread or io-thread when you know you will need to block or compute something that will take a long time?
Logically, it now appears there is no difference between thread and go, except that go still rewrites things into the state machine and can't cross function boundaries. But it would mean a non-blocking go will be more "OS thread efficient", as it will quickly release the OS thread back to the pool to be reused while "parked", instead of locking the OS thread and its resources.
But, if you are doing blocking IO, it will just lock up the thread, but won't prevent other Go blocks from running, and now this thread takes as much resources as would have been taken if you'd wrapped it in a thread or io-thread. Maybe someone needs to benchmark.
Yeah… it feels like it should be the same as doing a bunch of thread calls inside go but yeah – benchmark
I would only caution about the:
> no impact to performance
each “real” thread takes some amount of memory… so that’s one impact… once they are truly non-blocking VirtualThreads… then yes, virtually no impact on performance at that point 🙂
> virtually no impact on performance at that point
iseewhatyoudidthere
Nothing has changed semantically. Go blocks should not block for IO. If you want to do blocking IO, use either thread or better if using the latest version io-thread
@alexmiller I think one confusion I have:
1) If that's the case, then is the plan to have io-thread use vthreads, but Go and Thread will continue to be scheduled on an unbounded CachedThreadPool?
2) I'm also curious about the pattern where inside a Go block you used to wrap a blocking call in thread and <! from it. Does it still make sense to force an extra thread for the blocking, a state machine transition and all that, when you are already inside a thread you could just block?
3) Finally, if I was going to spawn a big computation, some O(n^2) thing that will take say 20 seconds to compute, in the past I would wrap that in a future from inside the go block, but now it seems it could just happen inside go as well, since it doesn't matter if you lock up the go thread for 20 seconds. Though ideally you don't want too much heavy compute contending for the CPU and it's probably still better to run it in a CPU-bounded pool. But are there any recommendations here for these uses?
@potetm Haha, but from the Oracle docs, they say that vthreads can possibly hurt performance, though maybe not, but shouldn't improve it. It only gives you better throughput, but not latency. At least last time I checked. Though I could see overall latencies improving if your number of real threads goes down, and hence RAM frees up and so on. So I think they mean more that vthreads don't go any faster at computing, and possibly have a tiny overhead in scheduling over real threads. But if the switch frees up resources I think it could still result not just in better throughput but also overall better latencies.
i have no comment. i was only acknowledging a nice little pun.
As we move forward, both io-thread and go will be scheduled on vthreads (when available and pursuant to some config, details coming). We are not changing the semantics of go (because sometimes users of go will not be in environments where vthreads are available), so you still should not do io blocking in go. Similarly, you should not do big compute in a go block - if it’s on a vthread you are subject to the scheduler and could get paused. Use thread for that, or future or whatever.
If you want to do io-blocking, use io-thread. In that context, if running with vthreads, then no, there is no reason to spin a thread to do the io and block on the channel, just do the blocking op, the jvm will park as needed.
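As a rough sketch of what that looks like, assuming a core.async release that already ships io-thread (first-line-io is a made-up name, and the StringReader stands in for a real blocking stream):

```clojure
(require '[clojure.core.async :as a :refer [io-thread <!!]])
(import '(java.io BufferedReader StringReader))

(defn first-line-io
  "Hypothetical helper: returns a channel that receives the first line of rdr."
  [^BufferedReader rdr]
  ;; The blocking call is made directly in the body; there is no inner
  ;; (thread ...) plus <! hop as there would be inside a go block.
  (io-thread (.readLine rdr)))

;; Usage, with an in-memory reader standing in for a real stream:
(<!! (first-line-io (BufferedReader. (StringReader. "hello\nworld"))))
;; => "hello"
```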
thread has always been a mixed use case expectation and will continue to schedule on platform threads. We are likely to also add future variants to Clojure itself (future-compute and future-io, existing future will stay as mixed workload).
To summarize, there are three workload types:
• io - blocking io allowed (prefer vthread)
• compute - no blocking io (platform thread)
• mixed - may have both compute and blocking io (platform thread)
There are three task constructs in core.async:
• io-thread - blocking io allowed, io workload
• thread - compute or io allowed, mixed workload
• go - no blocking, no compute, will run on vthreads if available, primarily for channel coordination.
• there is no “compute” task in core.async, just do normal Clojure stuff
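One way to read the summary above, as a hedged sketch rather than an official recipe: slow-read and pipeline are made-up names, Thread/sleep stands in for real blocking IO, and io-thread is assumed to be available in the core.async version in use. The go block only coordinates channels; the blocking work goes to io-thread and the compute goes to thread.

```clojure
(require '[clojure.core.async :as a :refer [go thread io-thread <! <!!]])

(defn slow-read
  "Hypothetical stand-in for a blocking IO call."
  []
  (Thread/sleep 50)
  [1 2 3 4])

(defn pipeline
  "Returns a channel that yields the sum of squares of the values read."
  []
  (go
    (let [xs    (<! (io-thread (slow-read)))                 ; io workload
          total (<! (thread (reduce + (map #(* % %) xs))))]  ; compute on a platform thread
      ;; the go block itself only takes from channels
      total)))

(<!! (pipeline))
;; => 30
```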
@alexmiller Thank you for the clarifications! Out of curiosity, is the go state machine macro still going to be relevant assuming VirtualThreads are available? (I realize it is still very much needed for older JVMs, ClojureScript, etc)
We are going to retain the machinery for now - you may still need it if a) you are running from source and not yet on Java 21, b) you are compiling an application that may be deployed in jvms that do or do not have vthreads, c) you are using libs that already compiled core.async on an older version (datomic client in particular), d) you want to upgrade but can’t assess how this will change your application right now.
But, if you are on 21+ and haven’t set any of the new flags, by default go will be run on a vthread and parking ops will become blocking op equivalents (which the JVM may park if the block is paused in a vthread). In this case there is no go block analyzer/expander, you’re just running Clojure code in a vthread
Very well thought out as always 🙂
> 21+ and haven’t set any of the new flags, by default go will be run on a vthread
Oh so… no state machine magic transformation in that case? Or you’d specifically have to use thread ?
No go state machine (JVM vthreads do the same semantics for you)
Nice! 🙂
The jvm caught up to what we were doing 10 years ago (mostly joking, I know there’s a ton of hard work there)
hahah
I still haven’t fully internalized the improvements made around vthreads in JVM 24 vs 21+ (up to 24) and when it’s actually “truly” non-blocking
I’ll do my own research there, thanks for your time!
it is complicated for sure - the usual sources of blocking are locks (j.u.c.locks participate as of 21, synchronized monitors as of 24), socket stuff (since 21), file ops (since 21, but there is some nuance here - in some cases I believe these can still pin the carrier but I haven't really followed it). one very subtle "lock" that I believe is still pinning is what happens during class loading (although that's unlikely to be an issue). so generally if you are "blocked" waiting the jvm can unpin the vthread from a carrier thread and schedule it for other work, then reschedule you when the thing you are waiting for has completed. the vthread stack is saved on the heap while parked.
Yeah, I just watched this, I thought it did a pretty good job explaining a number of nuances in ~7 minutes https://www.youtube.com/watch?v=QDk1c0ifoNo
Cool, thanks. One follow up, if Go is not using the machinery, it means it could be made to cross function boundaries no? But doing so would make a go block not backward compatible. Is the plan to keep those restrictions in Go running in vthread to maintain backwards compatibility, or allow function boundaries to cross if running with vthreads, and those libs using Go that way must specify a min JVM version of 21?