2025-10-19 datalevin | Clojure Slack Archive

datalevin 2025-10-19

Huahai 2025-10-19T04:40:20.138769Z

An update: I am working on a fork of LMDB so that we can have two important features in our KV store: order statistics and prefix compression. Order statistics enable efficient range counts and efficient sampling, which are critical for significantly cutting down query planning time. Prefix compression can reduce the DB size and may even speed up reads, as we will have fewer pages on disk. I will strive to incur minimal write overhead in the implementation of these features. Stay tuned.

❤️ 2

🎉 10
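
To make the order-statistics idea concrete, here is a minimal sketch in Python. It is not the dlmdb implementation: it uses a static balanced binary tree rather than a B+ tree, assumes distinct keys, and every name in it (`CountedNode`, `rank`, `range_count`, `select`, `sample`) is hypothetical. The point is only that once each node stores the size of its subtree, a range count becomes two root-to-leaf descents instead of a scan, and uniform sampling becomes a lookup by position.

```python
import random


class CountedNode:
    """Search-tree node that also stores the number of keys in its subtree."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.count = 1 + size(left) + size(right)


def size(node):
    """Number of keys under `node` (0 for an empty subtree)."""
    return node.count if node is not None else 0


def build(sorted_keys, lo=0, hi=None):
    """Build a balanced tree from a sorted list of distinct keys."""
    if hi is None:
        hi = len(sorted_keys)
    if lo >= hi:
        return None
    mid = (lo + hi) // 2
    return CountedNode(
        sorted_keys[mid],
        build(sorted_keys, lo, mid),
        build(sorted_keys, mid + 1, hi),
    )


def rank(node, key):
    """Number of stored keys strictly less than `key`."""
    r = 0
    while node is not None:
        if key <= node.key:
            node = node.left
        else:
            # The whole left subtree and this node are smaller than `key`.
            r += size(node.left) + 1
            node = node.right
    return r


def range_count(root, lo, hi):
    """Count keys k with lo <= k < hi without visiting them one by one."""
    return max(0, rank(root, hi) - rank(root, lo))


def select(node, i):
    """Return the i-th smallest key (0-based)."""
    while node is not None:
        left = size(node.left)
        if i < left:
            node = node.left
        elif i == left:
            return node.key
        else:
            i -= left + 1
            node = node.right
    raise IndexError("index out of range")


def sample(root, k, rng=random):
    """Uniformly sample k distinct keys by choosing random positions."""
    return [select(root, i) for i in rng.sample(range(size(root)), k)]
```

For example, with `root = build(list(range(0, 100, 2)))`, the call `range_count(root, 10, 20)` counts the keys 10, 12, 14, 16 and 18 by comparing two ranks, and `sample(root, 3)` draws three keys without scanning the tree. A query planner can use counts and samples of this kind to estimate how selective a clause is.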

Huahai 2025-10-20T05:03:57.929129Z

https://github.com/huahaiy/dlmdb

Huahai 2025-10-20T05:04:41.748999Z

We have mostly achieved our performance goals.

🙌 1

maxweber 2025-10-19T05:19:41.015219Z

Last week I worked with LMDB for another purpose and came across the 511-byte LMDB key limit. I'm now trying to understand what Datalevin's limits (https://github.com/juji-io/datalevin/blob/master/doc/limits.md) are in this regard. I understood that attribute names are not allowed to be longer than 511 bytes (not a problem for most folks, I guess). But what is the limit for the whole datom / triple? Does Datalevin use the btree of LMDB, or does it use its own index structure which is just stored inside LMDB?

Huahai 2025-10-19T05:22:37.803119Z

The 511-byte limit is a compile-time setting, so if you want, you can make a maxkeysize=0 build that does not have any limit. However, Datalevin does use the default build, which has the 511-byte key limit. For the triples, our limit is 2GB. So yes, we have our own way to store large blobs in LMDB.

maxweber 2025-10-19T07:27:43.632969Z

Thanks a lot for the reply. Is it a btree, an lsm-tree or something else? So in essence Datalevin uses LMDB like Datomic uses "storages" to store its segments, which contain Datomic's own index structure?

Huahai 2025-10-19T15:34:02.128129Z

It is a B+ tree. We don't use LMDB as a generic KV store (that would leave performance on the table); instead we use the features of LMDB to the maximum, e.g. Txn reset/renew, DUPSORT, DBI, etc. We are now at a point where we need to extend LMDB itself.

👍 1
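
For readers unfamiliar with DUPSORT, the following is a toy in-memory model in Python of what that LMDB flag provides: one key can hold many values, the values are kept in sorted order, and an identical key/value pair is stored only once. It does not call LMDB, and the class name `DupSortModel`, its methods, and the entity-to-attribute layout in the usage note are hypothetical illustrations, not Datalevin's actual storage layout.

```python
import bisect
from collections import defaultdict


class DupSortModel:
    """Toy model of a DUPSORT-style table: each key maps to a sorted set of values."""

    def __init__(self):
        self._values = defaultdict(list)

    def put(self, key, value):
        """Insert `value` under `key`, keeping values sorted and free of duplicates."""
        vals = self._values[key]
        i = bisect.bisect_left(vals, value)
        if i == len(vals) or vals[i] != value:
            vals.insert(i, value)

    def get_all(self, key):
        """Return all values stored under `key`, in sorted order."""
        return list(self._values.get(key, ()))

    def count(self, key):
        """Number of values stored under `key`."""
        return len(self._values.get(key, ()))
```

As a usage sketch, putting `b"name=Ann"` and then `b"age=42"` under the key `b"e1"` yields `[b"age=42", b"name=Ann"]` from `get_all(b"e1")`, because byte strings sort lexicographically. A layout like this lets a store keep a short key once and hang many sorted items off it, instead of repeating the key in every entry.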

Anton Shastun 2025-10-19T17:39:11.900669Z

@huahaiy are there any docs to read about the extension of LMDB?

Huahai 2025-10-19T17:52:59.098279Z

not right now

Anton Shastun 2025-10-19T17:54:12.092579Z

will be really interesting to read about

Huahai 2025-10-19T19:34:36.185069Z

will do

2025-10-19T22:41:32.800229Z

I dimly remember that it was possible to use Datalevin with an off-the-shelf LMDB .so file (such as might have been packaged with a Linux distro) ... will doing that remain an option?

Huahai 2025-10-20T01:33:52.416019Z

it remains an option with degraded performance compared with using our forked LMDB.

Huahai 2025-10-20T04:54:52.281869Z

Here are some descriptions of the new features: https://github.com/huahaiy/dlmdb

👍 1
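
As a rough illustration of the prefix compression mentioned at the start of this thread, here is a minimal Python sketch of front coding, where each key in a sorted run is stored as the length of the prefix it shares with the previous key plus the remaining suffix. This is one common technique and only an assumption about the general idea; the scheme dlmdb actually uses may differ, and the function names `common_prefix_len`, `compress` and `decompress` are hypothetical.

```python
def common_prefix_len(a: bytes, b: bytes) -> int:
    """Length of the longest common prefix of two byte strings."""
    n = min(len(a), len(b))
    i = 0
    while i < n and a[i] == b[i]:
        i += 1
    return i


def compress(sorted_keys):
    """Encode each key as (shared_prefix_length, suffix) relative to the previous key."""
    entries = []
    prev = b""
    for key in sorted_keys:
        shared = common_prefix_len(prev, key)
        entries.append((shared, key[shared:]))
        prev = key
    return entries


def decompress(entries):
    """Rebuild the original keys from (shared_prefix_length, suffix) pairs."""
    keys = []
    prev = b""
    for shared, suffix in entries:
        key = prev[:shared] + suffix
        keys.append(key)
        prev = key
    return keys
```

For the keys `b"user/alice"`, `b"user/alina"` and `b"user/bob"`, `compress` stores 15 suffix bytes in place of 28 key bytes (plus one small integer per key), and `decompress` restores the originals. Sorted keys in a B+ tree page tend to share long prefixes, which is why fewer pages may be needed on disk.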
