2025-08-11 clojurescript | Clojure Slack Archive

clojurescript

whilo 2025-08-11T20:13:58.524879Z

I am trying to make async-await as fast as possible in cljs atm. The reason is that I recently lifted https://github.com/whilo/persistent-sorted-set/blob/cljs-async-io/src-clojure/me/tonsky/persistent_sorted_set.cljs (used by DataScript and Datahike) to asynchronous execution (for async storage IO), which will allow us to lift the query engine to asynchronous execution in the browser (ultimately yielding a convenient frontend programming model). Unfortunately, the CPS/state-machine transforms of core.async or missionary's https://github.com/leonoel/cloroutine/ are fairly heavy and induce overhead for the transformed code that is noticeable in high-perf persistent data structure implementations (I get a 5-30x slowdown just for the transform, even if awaited values yield immediately in the persistent sorted set implementation). I have found, forked and ported this purely syntactic CPS transform that only induces callbacks where await shows up in the syntax, yielding something fairly similar to what you would write manually with callbacks in JS (which has the fastest perf. in general): https://github.com/whilo/await-cps/blob/master/src/await_cps/ioc.clj Examples can be found here: https://github.com/whilo/await-cps/blob/master/test/comprehensive_test.cljs I am curious what other people think, whether there are major problems with such an approach, and whether it would be useful to others.

raspasov 2025-08-26T09:19:21.942539Z

Are you performing the benchmarks with :advanced ClojureScript compilation? Anecdotally, I’ve observed substantial performance improvements from :advanced in the past but I don’t have solid numbers.

raspasov 2025-08-26T09:23:40.568809Z

I just learned: apparently, the Closure compiler is so “smart” that it might “optimize away by removal” trivial benchmark code: https://blog.fikesfarm.com/posts/2017-11-18-clojurescript-performance-measurement.html (and ways to prevent that)
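One general way to keep an optimizer from discarding benchmark work is to make the result observable. The following is a minimal TypeScript sketch of that idea, not the method from the linked post; `bench` and `checksum` are hypothetical names, and whether a given compiler would actually remove an unobserved loop depends on the compiler and its settings.

```typescript
// Hypothetical benchmark helper: every result feeds a running sum that is
// printed and returned, so the measured work has an observable effect.
function bench(label: string, iterations: number, f: (i: number) => number): number {
  let sink = 0;
  const start = Date.now();
  for (let i = 0; i < iterations; i++) {
    sink += f(i); // the result is consumed, not thrown away
  }
  const elapsedMs = Date.now() - start;
  console.log(`${label}: ${elapsedMs} ms (sink=${sink})`);
  return sink;
}

// Usage: the returned checksum is used by the caller as well.
const checksum = bench("add", 1000, (i) => i + 1);
```

Printing or returning the sink is the important part: the timing loop alone has no side effect an optimizer is obliged to preserve.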

gaverhae 2025-08-26T09:38:04.874369Z

Outside of the JVM space, lots of compilers do that kind of thing. In the JVM space, there's also a chance the JIT might do it, but it's a bit easier to fool.

raspasov 2025-08-26T09:42:37.713829Z

I think that because Google Closure specifically performs dead-code elimination, it recognizes that a piece of code has no effect and removes it. On the JVM, the compiler/JIT doesn’t ever entirely remove code, I think.

gaverhae 2025-08-26T09:44:39.492559Z

That sort of depends on your definition - it can sometimes bypass instance checks and if statements, but it's indeed nowhere near the kinds of optimizations GCC can do.

✔️ 1

raspasov 2025-08-26T09:44:47.315339Z

As in, it might remove indirection and do inlining, but the thing still definitely runs.

borkdude 2025-08-12T09:04:15.771469Z

@whilo FYI, squint and cherry do support async/await (using js-await) syntax. It's just a syntactic thing, no complicated transformations except for tracking implicit IIFEs created by let bindings etc. Not suggesting you would use this, but I thought I'd just mention it.

lilactown 2025-08-12T15:41:47.311759Z

personally, if performance is my main goal I would write the CPS myself and expose a channel or promise interface. it will allow you to better analyze and fix bottlenecks in your code without macro magic in the way

lilactown 2025-08-12T15:43:22.627409Z

it sounds like this is purely for slicing up computation, not I/O, is that right?

lilactown 2025-08-12T15:49:11.441149Z

also is core.async 5-10x slower than sync, or than your await impl?

whilo 2025-08-12T18:34:46.762449Z

I did more micro benchmarks just looking at the transformation overhead of the CPS transform vs. the go macro and sync code (without any await), and core.async is 1.5-3x slower than the synchronous code, while this CPS transform is 1-1.3x slower. The main bottleneck in my persistent sorted set benchmarks is the fact that await suspends, which induces these massive slowdowns in general. I think for all the async solutions it is desirable to detect whether a value is available without suspending and keep processing in this case. This CPS is still probably faster than the alternatives in general, but it is not the main bottleneck.
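The idea of continuing without suspension when a value is already available can be sketched as follows. This is an illustrative TypeScript sketch under assumptions, not code from await-cps or core.async; `MaybeAsync`, `awaitFast`, `load` and the cache are hypothetical names.

```typescript
// A value that is either already available or still pending.
type MaybeAsync<T> = T | Promise<T>;

// Hypothetical helper: run the continuation immediately when the value is
// available, and only go through Promise scheduling when it is pending.
function awaitFast<T, R>(value: MaybeAsync<T>, k: (v: T) => MaybeAsync<R>): MaybeAsync<R> {
  if (value instanceof Promise) {
    return (value as Promise<T>).then(k); // slow path: suspend until resolved
  }
  return k(value as T); // fast path: no suspension
}

// Hypothetical storage lookup: cached keys return synchronously,
// anything else simulates an asynchronous read.
const cache = new Map<string, number>([["a", 1]]);
function load(key: string): MaybeAsync<number> {
  const hit = cache.get(key);
  return hit !== undefined ? hit : Promise.resolve(0);
}

const fast = awaitFast(load("a"), (n) => n + 1);       // plain number, computed synchronously
const slow = awaitFast(load("missing"), (n) => n + 1); // a Promise, resolved later
```

The cost is that callers have to handle both shapes of result, which is one reason this check is usually hidden inside a macro or library rather than written by hand.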

whilo 2025-08-12T18:35:36.636769Z

@borkdude Interestingly even in JS async-await is supposedly a lot slower than normal callbacks. My (limited) understanding is that it is also transpiled into a state machine somehow.

borkdude 2025-08-12T18:36:13.780509Z

it's not compiled into a bunch of unreadable promise code?

borkdude 2025-08-12T18:36:34.109449Z

(or whatever intermediate format the VM may use)

whilo 2025-08-12T18:36:49.656499Z

@lilactown The point is that the code you would write by hand is nonetheless a mechanical transformation that should be macroexpandable. This is what I am aiming for by modifying await-cps.

borkdude 2025-08-12T18:36:58.844889Z

I think it might be possible to transpile async/await for a version of JS that doesn't support it using babel, but could be wrong

whilo 2025-08-12T18:40:01.919679Z

CPS is a very powerful mechanism to implement language features, for instance https://probprog.github.io/anglican/index.html, the probabilistic programming system/runtime I worked on, uses a custom CPS to fork program state at sampling points. Having a near perfect CPS that expands into understandable code is generally valuable, I think. Clojure libraries often go through tools.analyzer, which then renders the whole thing fairly opaque.

whilo 2025-08-12T18:53:40.100229Z

In the ideal world you don't even need to know about async/await and clutter your code; effectively it is just a way to dispatch into an external mechanism that your programming model allows. The JVM now has green threads/fibers for that, which is a good example of strong runtime support. Unfortunately, in JS nobody seems to care deeply enough about fixing structural limitations like that.

whilo 2025-08-12T18:55:05.855589Z

When the CPS transform is fast enough it could provide this automatically, but it loses stacks atm. because of trampolining, which is not nice for debugging.
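To illustrate why trampolining costs stack information, here is a minimal generic trampoline in TypeScript. It is a sketch of the general technique, not the await-cps implementation; `Bounce`, `trampoline` and `countDown` are hypothetical names.

```typescript
// A step either finishes with a value or hands back the next step to run.
type Thunk<T> = () => Bounce<T>;
type Bounce<T> = { done: true; value: T } | { done: false; next: Thunk<T> };

// The trampoline runs every step from one loop, so the call stack stays flat.
function trampoline<T>(start: Thunk<T>): T {
  let step = start();
  while (true) {
    if (step.done) return step.value;
    step = step.next();
  }
}

// Recursion expressed as bounces: each call returns instead of nesting.
function countDown(n: number, acc: number): Bounce<number> {
  return n === 0
    ? { done: true, value: acc }
    : { done: false, next: () => countDown(n - 1, acc + n) };
}

// 100000 steps run without deepening the call stack.
const total = trampoline(() => countDown(100000, 0));
```

Because every step is invoked from the loop in `trampoline`, a stack trace taken inside a step shows only the loop and that step, not the chain of logical callers that led there.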

2025-08-11T23:56:37.688099Z

The clj side of core.async has an optimization where if it detects that an expression doesn't contain terminals it doesn't transform it, which sounds like what you want with your ioc, but it hasn't been ported to the cljs side

👍 1

whilo 2025-08-12T00:00:29.654969Z

I see. Are there benchmarks of how much slower typical Clojure code is when transformed into core.async's state machine?

whilo 2025-08-12T00:05:07.439579Z

One thing that bothers me about both core.async's transform and cloroutine is that they effectively transform everything. The interesting thing about this ioc is that it leaves synchronous sections synchronous and only injects callbacks where needed.
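The shape of output such a selective transform aims for can be sketched by hand. The TypeScript below is illustrative only and is not generated by await-cps; `readNode` and `sumNode` are hypothetical names, and `readNode` stands in for an asynchronous storage read.

```typescript
type Callback<T> = (value: T) => void;

// Hypothetical async primitive. It calls back immediately here to keep the
// sketch self-contained; a real storage read would call back later.
function readNode(id: number, k: Callback<number[]>): void {
  k([id, id + 1, id + 2]);
}

// Only the single await point becomes a callback; the code before and after
// it stays ordinary synchronous code.
function sumNode(id: number, k: Callback<number>): void {
  const offset = id * 2;           // synchronous section, left as written
  readNode(offset, (keys) => {     // the await point
    let total = 0;
    for (const x of keys) {
      total += x;                  // synchronous again inside the continuation
    }
    k(total);
  });
}

let result = -1;
sumNode(1, (v) => {
  result = v; // 2 + 3 + 4
});
```

A full state-machine transform would instead rewrite the arithmetic and the loop into state transitions too, which is the overhead being discussed here.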

2025-08-12T00:06:19.602849Z

No, that is what I am saying: the clj side of core.async has this thing it calls "rawcode", where if it detects no channel ops in an expression it does not translate it

whilo 2025-08-12T00:15:10.762209Z

https://github.com/clojure/core.async/blob/254cdb6938256679c300d69ee1fac163020690ba/src/main/clojure/clojure/core/async/impl/go.clj#L194
