I recently ported https://github.com/whilo/persistent-sorted-set/blob/cljs-async-io/src-clojure/me/tonsky/persistent_sorted_set.cljs to support async storage IO, which will allow us to lift the query engine to asynchronous execution in the browser as well (ultimately yielding a convenient frontend programming model). Unfortunately, the CPS/state-machine transforms in core.async and missionary's cloroutine are fairly heavy and add overhead to the transformed code that is noticeable in high-performance persistent data structure implementations (I get a 5-30x slowdown). I found, forked, and ported a purely syntactic CPS transform that only introduces callbacks where await shows up in the syntax, yielding something fairly similar to what you would write manually with callbacks in JS (which is the fastest option): https://github.com/whilo/await-cps/blob/master/src/await_cps/ioc.clj Examples can be found here: https://github.com/whilo/await-cps/blob/master/test/comprehensive_test.cljs I am curious what other people think.
Interesting! I should have published my work on this sooner. I also ported datascript and persistent sorted set to async and found similar issues with core.async. My notes say that I had a 10x slowdown in a simple micro benchmark loop comparing core.async to promesa. I couldn't figure out why, so I went with promesa. Even with promesa there was quite a bit of slowdown compared to sync execution. This was particularly an issue for me because I planned to cache the most-used index fragments in memory, and I wanted the cached results to match the performance of regular datascript. The problem is that promesa, or any async execution at all, is just not going to match sync execution: every function boundary adds nanoseconds that can really add up over the course of a pull query. I came up with this small wrapper around promesa which only awaits if the result is a promise: https://github.com/panterarocks49/durable-persistent-sorted-set/blob/cljs-durability/src-clojure/me/tonsky/maybe_promise.cljc.
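To illustrate the "only await if the result is a promise" idea, here is a minimal sketch in TypeScript rather than ClojureScript. The names `MaybePromise`, `isPromise`, `thenMaybe`, and `loadNode` are hypothetical and are not taken from maybe_promise.cljc; the cache and storage are stand-ins.

```typescript
// Hypothetical names throughout; this is not the API of maybe_promise.cljc.
type MaybePromise<T> = T | Promise<T>;

function isPromise<T>(v: MaybePromise<T>): v is Promise<T> {
  return typeof v === "object" && v !== null && typeof (v as any).then === "function";
}

// Apply f directly when the value is already available;
// chain with .then only when the value is a promise.
function thenMaybe<T, U>(v: MaybePromise<T>, f: (x: T) => MaybePromise<U>): MaybePromise<U> {
  return isPromise(v) ? v.then(f) : f(v);
}

// Stand-in for an index-fragment cache: hits return plain values,
// misses return a promise (here resolved immediately for illustration).
const cache = new Map<number, string>([[1, "cached-node"]]);

function loadNode(address: number): MaybePromise<string> {
  const hit = cache.get(address);
  return hit !== undefined ? hit : Promise.resolve(`stored-node-${address}`);
}

const fromCache = thenMaybe(loadNode(1), (n) => n.toUpperCase());   // plain string
const fromStorage = thenMaybe(loadNode(2), (n) => n.toUpperCase()); // Promise<string>
```

With this shape, a chain of cache hits never allocates a promise, which is presumably why cached reads can stay close to sync speed.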
I'm curious, how did you handle the lazy iterators with async? This was a big problem for me; I ended up making Iter prefetch all the chunks from storage so that iteration was sync
but that obviously has downsides when you want to start at X position and not consume the whole set
buffer individual values from pages and fetch page when buffer is exhausted
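A rough sketch of that buffering scheme, in TypeScript with plain callbacks. `PageStorage`, `BufferedIter`, and `take` are hypothetical names, pages are assumed to be non-empty arrays, and the in-memory storage calls back immediately only to keep the example short.

```typescript
type Callback<T> = (value: T) => void;

// Hypothetical storage interface: delivers one page of values per call,
// or undefined once the page index is past the end.
interface PageStorage<T> {
  fetchPage(index: number, cb: Callback<T[] | undefined>): void;
}

class BufferedIter<T> {
  private storage: PageStorage<T>;
  private nextPage: number;
  private buffer: T[] = [];
  private pos = 0;

  constructor(storage: PageStorage<T>, startPage: number) {
    this.storage = storage;
    this.nextPage = startPage;
  }

  // Calls cb with the next value, or with undefined at the end.
  next(cb: Callback<T | undefined>): void {
    if (this.pos < this.buffer.length) {
      cb(this.buffer[this.pos++]); // served from the buffer, no storage access
      return;
    }
    // Buffer exhausted: fetch the next page, then continue from it.
    this.storage.fetchPage(this.nextPage++, (page) => {
      if (page === undefined || page.length === 0) {
        cb(undefined);
        return;
      }
      this.buffer = page;
      this.pos = 1;
      cb(page[0]);
    });
  }
}

// Collect up to n values. This recurses per element, so a real consumer
// would need a trampoline for large n when callbacks fire synchronously.
function take<T>(iter: BufferedIter<T>, n: number, cb: Callback<T[]>): void {
  const out: T[] = [];
  const step = (): void => {
    if (out.length === n) { cb(out); return; }
    iter.next((v) => {
      if (v === undefined) { cb(out); return; }
      out.push(v);
      step();
    });
  };
  step();
}

const pages = [[1, 2, 3], [4, 5], [6]];
let fetchCount = 0;
const inMemoryStorage: PageStorage<number> = {
  fetchPage(index, cb) {
    fetchCount++;
    cb(pages[index]); // a real backend would call back later
  },
};

let firstFour: number[] = [];
take(new BufferedIter(inMemoryStorage, 0), 4, (values) => { firstFour = values; });
```

Taking four values touches only the first two pages, so starting at some position does not force the whole set to be loaded.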
Ah I see, so any of the consuming functions would have to use those async variants of into to process the Iter? I wanted to avoid that in the beginning of my port, but I think I might go back and add it for the places where it would be useful
yes, fundamentally we are just sugaring callbacks, so most things get +2 or +3 to arity, and we're going to support sync & async paths with a macro. the async parts there will be spun off into a new lib
it will be nbd to add bindings to promises etc, but the internals are just callbacks and run without a userspace scheduler. it's pretty quick
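As a sketch of what "+2 to arity" can look like, assuming the two extra arguments are a success and an error continuation (the thread does not spell this out). `lookupAsync`, `sumAsync`, and `toPromise` are hypothetical names, written in TypeScript for illustration.

```typescript
type Resolve<T> = (value: T) => void;
type Reject = (error: Error) => void;

// A sync version would be lookup(index, key). The callback version takes
// two extra arguments: one continuation for the result, one for errors.
function lookupAsync(
  index: Map<string, number>,
  key: string,
  resolve: Resolve<number>,
  reject: Reject,
): void {
  const v = index.get(key);
  if (v === undefined) reject(new Error(`missing key: ${key}`));
  else resolve(v);
}

// Composition stays in callback style: no promises, no scheduler.
function sumAsync(
  index: Map<string, number>,
  keys: string[],
  resolve: Resolve<number>,
  reject: Reject,
): void {
  let total = 0;
  const step = (i: number): void => {
    if (i === keys.length) { resolve(total); return; }
    lookupAsync(index, keys[i], (v) => { total += v; step(i + 1); }, reject);
  };
  step(0);
}

// A promise binding can be layered on at the edge for callers that want one.
function toPromise<T>(run: (resolve: Resolve<T>, reject: Reject) => void): Promise<T> {
  return new Promise<T>((resolve, reject) => run(resolve, reject));
}

const index = new Map<string, number>([["a", 1], ["b", 2]]);

let total = -1;
sumAsync(index, ["a", "b"], (v) => { total = v; }, (e) => { throw e; });

let failure = "";
sumAsync(index, ["a", "zzz"], () => {}, (e) => { failure = e.message; });

const asPromise = toPromise<number>((res, rej) => sumAsync(index, ["a", "b"], res, rej));
```

Only the outermost caller pays for a promise allocation; everything inside runs as ordinary function calls.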
Oh I think I was looking at the wrong branch without the await code. I see the async-seq code now
what are you using the requires-storage-access? for? Did you port any of datahike yet?
I'm curious how this benchmarks compared to my solution, do the benchmarks work for the async code yet?
starting that this week provided christian is happy with how this propagates up to query.cljc
from christian on saturday:
### Iteration Performance ###
=== Sync full iteration ===
Sync: mean=0.098ms median=0.096ms p95=0.101ms
Async: mean=0.257ms median=0.243ms p95=0.356ms
Overhead: 2.62x (+161.6%)
=== Sync slice (100 elements) ===
Sync: mean=0.014ms median=0.011ms p95=0.028ms
Async: mean=0.030ms median=0.027ms p95=0.032ms
Overhead: 2.19x (+118.7%)
last thurs
### Bulk Operations ###
=== Sync conj 100 elements ===
Sync: mean=0.167ms median=0.145ms p95=0.207ms
Async: mean=0.170ms median=0.165ms p95=0.207ms
Overhead: 1.02x (+2.0%)
=== Sync conj 1000 elements ===
Sync: mean=0.939ms median=0.884ms p95=1.178ms
Async: mean=1.649ms median=1.630ms p95=2.219ms
Overhead: 1.76x (+75.7%)
=== Sync conj 10000 elements ===
Sync: mean=11.565ms median=11.162ms p95=12.095ms
Async: mean=28.680ms median=32.859ms p95=35.435ms
Overhead: 2.48x (+148.0%)
=== Sync conj 100000 elements ===
Sync: mean=218.058ms median=206.321ms p95=299.677ms
Async: mean=363.330ms median=361.486ms p95=450.930ms
Overhead: 1.67x (+66.6%)
missionary was like a 5x hit, core async much higher
Is that from the https://github.com/whilo/persistent-sorted-set/blob/await-cps-io/bench-clojure/me/tonsky/persistent_sorted_set/bench.cljc in the repo or somewhere else?
i think that's in the test-clojure lib
if not, in the latest await-cps. there's been a lot of vibe coding going on
haha I was just asking chatgpt about CPS vs promises. I spent quite a while scratching my head about why promises were so slow, and am wondering if CPS completely removes the overhead
my main contribution has been threatening to write them by hand
this is what you want https://github.com/whilo/await-cps/tree/fast-path
that is a lot of code to digest but yeah I think we are talking about the same thing
ok yeah, that's another optimization: if the value is on hand, invoke the callback rather than yield. everything else yields
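A small sketch of that fast path in TypeScript, not taken from the fast-path branch. `getNode`, `fetchFromStorage`, and the `memory` map are hypothetical, and `setTimeout` stands in for real storage IO that completes on a later tick.

```typescript
type Callback<T> = (value: T) => void;

// Nodes already held in memory.
const memory = new Map<string, string>([["root", "root-node"]]);

// Hypothetical slow storage: always completes on a later tick.
function fetchFromStorage(address: string, cb: Callback<string>): void {
  setTimeout(() => cb(`loaded-${address}`), 0);
}

function getNode(address: string, cb: Callback<string>): void {
  const hit = memory.get(address);
  if (hit !== undefined) {
    cb(hit); // fast path: value is on hand, invoke the callback directly
    return;
  }
  // Slow path: yield to storage and continue when it calls back.
  fetchFromStorage(address, (node) => {
    memory.set(address, node);
    cb(node);
  });
}

const events: string[] = [];
getNode("root", (n) => events.push(`got ${n}`));
getNode("leaf-7", (n) => events.push(`got ${n}`));
events.push("calls returned");
// "got root-node" is recorded before "calls returned";
// the leaf-7 callback fires only after the current tick finishes.
```

One consequence of this design is that callers cannot assume whether their callback runs before or after `getNode` returns, so continuation code should not depend on that ordering.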
I'm trying to run that benchmark, do you have any idea why it's not finding the await-cps library? looks like the deps point to the old await-cps on clojars which doesn't have the cljs implementation. Is there a mechanism for overriding it that I'm missing?
Easiest might be to override with git dep to the fast-path branch
I spoke to him earlier today about polishing a release, it's coming
gotcha thanks!
I used promises for now to make the interface explicit. But I am working with @pat to just do callbacks without any intermediate data structures and allocations.