datahike

whilo 2023-12-04T20:40:50.573589Z

A new version, 0.6.1555, has just landed. It writes all index updates in parallel and should shrink the latency of a commit (to storage) down to two round trips to the store. The writer also automatically batches transactions and commits them together if you submit them in parallel, which significantly improves throughput in that case. Throughput is now basically limited only by how fast transactions can be processed, not by store throughput and latency. This has also been stress tested. The S3 backend has been updated as well, and I would be curious to see whether S3 now works for people in practical use cases. I have tested the code on many machines and am confident that Datahike's write performance should be competitive with the alternatives (maybe not necessarily faster, but in return we have the most horizontally read-scalable distributed memory model). Query performance improvements are next in the pipeline.

🎉 5
grounded_sage 2023-12-04T23:03:55.334139Z

Combined with S3 Express One Zone, this could be a really competitive option: https://aws.amazon.com/s3/storage-classes/express-one-zone/

whilo 2023-12-05T03:17:25.511679Z

Interesting, I have not looked into that yet.

whilo 2023-12-04T20:43:41.824979Z

Datahike can now do (on the JVM) what I always wanted it to do: be like Datomic, but openly accessible and not confined to one backend. Next is an update of the README, which is seriously outdated, and improvements to the documentation. Please point out things you are interested in and that should be clarified 🙂.

✨ 5
whilo 2023-12-04T20:48:09.835729Z

I am also revisiting cljs support with @pat and would like to see Datahike work in logseq, for instance. I saw yesterday that @tiensonqin has also started working on durability for the persistent sorted set. The main open question for me there is how well asynchronous processing will fit into existing pipelines.

silian 2023-12-04T21:16:02.027589Z

Wow, bravo to the Datahike team. Impressive work. Looking forward to testing!

❤️ 6