datalevin

Huahai 2026-02-04T00:14:18.955349Z

0.10.4 is released with document database feature: https://github.com/datalevin/datalevin/blob/master/CHANGELOG.md#0104-2026-02-03

👍 5
🔥 7
Ahmed Hassan 2026-02-04T13:17:12.205459Z

https://github.com/datalevin/datalevin/blob/master/doc/idoc.md How can we update/transform the document or nested value/object in it?

Huahai 2026-02-04T15:21:08.487689Z

A document is just a value, you just refresh the value. The system will figure out the changes and update the appropriate paths.
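
In plain Clojure terms, "refreshing the value" could look like the minimal sketch below. The document shape and the names `doc` and `doc'` are hypothetical. Writing the new value back to the database is left out on purpose, since the exact transaction form depends on your schema.

```clojure
;; Hypothetical document; any nested Clojure map behaves the same way.
(def doc
  {:profile {:name "Ann" :age 41 :middle "Q"}
   :tags    ["a" "b"]})

;; Build the new value with ordinary core functions.
(def doc'
  (-> doc
      (assoc-in [:profile :age] 42)      ; set a nested value
      (update :profile dissoc :middle)   ; drop a nested key
      (update :tags conj "c")))          ; append to a nested vector

doc'
;; => {:profile {:name "Ann", :age 42}, :tags ["a" "b" "c"]}
```

The new map is then stored as the attribute's value in place of the old one.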

Huahai 2026-02-04T15:24:51.982059Z

The keyword here is "auto" indexing.

Huahai 2026-02-04T15:32:01.456619Z

Of course, if you want the DB to surface some ways to update an idoc directly, I am open to that. How do you like a transaction function, e.g. :db.fn/patchIdoc as part of the tx data?

Huahai 2026-02-04T16:11:43.428529Z

Something like:

[:db.fn/patchIdoc e a [[:set [:profile :age] 42] [:unset [:profile :middle]] [:update [:tags] :conj "c"]]]
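
To make the intended meaning of those three operations concrete, here is a rough pure-Clojure sketch of how such a patch list could be interpreted. `:db.fn/patchIdoc` is only a proposal at this point in the thread, so `apply-patch`, `dissoc-in`, and the `update-fns` table are all hypothetical names, and the semantics are an assumption, not Datalevin's implementation.

```clojure
(def update-fns
  "Assumed lookup table from op keyword to function (hypothetical)."
  {:conj conj, :inc inc, :dec dec})

(defn dissoc-in
  "Removes the key at the end of a non-empty `path`."
  [m path]
  (if (= 1 (count path))
    (dissoc m (first path))
    (update-in m (butlast path) dissoc (last path))))

(defn apply-op
  "Applies one [op path & args] vector to `doc`."
  [doc [op path & args]]
  (case op
    :set    (assoc-in doc path (first args))
    :unset  (dissoc-in doc path)
    :update (let [[f-key & f-args] args]
              (apply update-in doc path (update-fns f-key) f-args))))

(defn apply-patch
  "Applies the ops in order, left to right."
  [doc ops]
  (reduce apply-op doc ops))

(apply-patch {:profile {:age 41 :middle "Q"} :tags ["a" "b"]}
             [[:set [:profile :age] 42]
              [:unset [:profile :middle]]
              [:update [:tags] :conj "c"]])
;; => {:profile {:age 42}, :tags ["a" "b" "c"]}
```

An unknown op makes `case` throw, which keeps the sketch limited to simple point updates.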

Huahai 2026-02-04T16:13:13.118889Z

So you don't have to update the doc in your own code. It's more convenient.

Ahmed Hassan 2026-02-04T17:50:12.713989Z

Will Datalevin load the whole document into memory (as Clojure data structures) to update it, in the current implementation?

Huahai 2026-02-04T17:53:37.551119Z

The document is the value of a Datom. So, even if you patch the doc, the whole document is still replaced. The only thing you save with direct updates is path index work. The system doesn't have to update all paths, just the paths that have changed. That is already the case. With a direct update transaction function, we can be even faster, for we don't need to run the diff routine to figure out which paths changed. We just change the paths that you touch.
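
As an illustration of what "figuring out which paths changed" can mean, here is a minimal sketch that flattens two versions of a document into path/value pairs and compares them. It is not Datalevin's actual diff routine. `leaf-paths` and `changed-paths` are hypothetical helpers, and vectors are treated as single leaf values for simplicity.

```clojure
(require '[clojure.set :as set])

(defn leaf-paths
  "Flattens nested maps into a map of path vector -> leaf value.
  Non-map values (including vectors) and empty maps count as leaves."
  ([doc] (leaf-paths [] doc))
  ([prefix doc]
   (if (and (map? doc) (seq doc))
     (into {} (mapcat (fn [[k v]] (leaf-paths (conj prefix k) v))) doc)
     {prefix doc})))

(defn changed-paths
  "Returns the set of paths whose leaf value differs between the two docs,
  including paths present in only one of them."
  [old-doc new-doc]
  (let [a (leaf-paths old-doc)
        b (leaf-paths new-doc)]
    (set (for [p (set/union (set (keys a)) (set (keys b)))
               :when (not= (get a p ::missing) (get b p ::missing))]
           p))))

;; Only the two touched paths are reported; [:tags] is unchanged.
(changed-paths {:profile {:age 41 :middle "Q"} :tags ["a" "b"]}
               {:profile {:age 42} :tags ["a" "b"]})
```

A patch-style transaction function would already know the touched paths, so this comparison step could be skipped.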

Ahmed Hassan 2026-02-04T17:57:44.443129Z

The :set, :unset, :update language seems similar to EditScript.

Ahmed Hassan 2026-02-04T18:00:28.701459Z

This kind of DSL can end up needing control flow like IF/ELSE/COND. https://www.mongodb.com/docs/manual/reference/operator/aggregation/cond/

Huahai 2026-02-04T18:02:07.651569Z

I would not want to make it more complicated than simple point updates. If you want complex updates, you can do them in your own code.

Huahai 2026-02-04T18:02:43.926899Z

This is just a fast path for simple update operations. For extensive changes, you'd better do them in user code. That's why I didn't introduce such a thing in the beginning.

👍 1
Huahai 2026-02-04T18:12:29.030729Z

It is more convenient to use Clojure code to update a nested document, using Specter, Meander or whatever. It's up to you. DB is for persistence, integration, and performance. An idoc here is just a value in a Datom, for those cases where you want the flexibility. This design still encourages proper data modeling with attributes. This is not mongodb.

1
Ahmed Hassan 2026-02-04T18:19:22.732239Z

I agree. It seems like :db.fn/patchIdoc won't be required.

Ahmed Hassan 2026-02-04T18:20:18.898729Z

The triple store already fulfills granular update requirements.

Ahmed Hassan 2026-02-04T18:20:48.164039Z

In MongoDB, a document is the only way to model data.

Huahai 2026-02-04T18:21:08.693849Z

It is not required, but it is nice to have a fast path for small changes, e.g. incrementing a counter deep in the doc, assoc'ing a value in a nested map, etc.

👍 1
Huahai 2026-02-04T18:22:53.724599Z

Indexing by path does provide new capabilities. A triple store is mostly flat. The containment relationship is not strictly maintained. So adding a document indexing feature complements existing solutions.

👍 1
Huahai 2026-02-04T18:24:11.289849Z

E.g. you can make up more attributes, but :level-1/level-2-level-3 would become unwieldy.

Ahmed Hassan 2026-02-04T18:29:01.710279Z

There's the :db/isComponent attribute as well at the schema level, but using :db.type/idoc we can store a list of dependent entities.

Huahai 2026-02-04T18:40:44.647839Z

idoc is mostly useful when you are already using some documents for data, or when the use case calls for a document. This feature provides some convenience in working with the documents and, more importantly, lets you do integrated queries together with other data in the DB. The ACID tx property is nice too.

👍 1
chromalchemy 2026-02-06T14:54:10.955149Z

@huahaiy For a local (offline) data store, I have been saving data from an API to disk (as Transit JSON) and loading that into memory on program start (then using native Clojure and Specter as a query/update language). Now I’ve outgrown that approach and am using Datalevin to free up working memory (and add a formal query language). Would an API data snapshot like that be a good candidate for the document feature? Or is this more in the ballpark of traditional data loading? I’m open to best practices. Maybe I should update this API data directly in the Datalevin DB (when online), instead of saving to disk manually first? (Then there is no “document”, just the DB.) But since I am already saving to disk, the document API is intriguing. I am particularly interested in maximal schema flexibility. Also, when you discussed updating values in a document, does that actually write changes to the document on disk (document mutated by transactions)?

Huahai 2026-02-06T23:40:36.711729Z

Idoc is a data type, so when you transact an idoc, in addition to saving the value as part of a datom, Datalevin builds an index for the internals of the idoc. So yes, the document is saved at least twice: once as a serialized binary blob, and again as a paths, values -> doc id mapping and other structures. These are all on disk.
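
To picture the "paths, values -> doc id" mapping, here is a toy in-memory sketch. It only illustrates the idea of such an inverted index and says nothing about Datalevin's real on-disk layout. `leaf-paths` and `index-doc` are hypothetical helpers, and the doc ids and documents are made up.

```clojure
(defn leaf-paths
  "Flattens nested maps into a map of path vector -> leaf value."
  ([doc] (leaf-paths [] doc))
  ([prefix doc]
   (if (and (map? doc) (seq doc))
     (into {} (mapcat (fn [[k v]] (leaf-paths (conj prefix k) v))) doc)
     {prefix doc})))

(defn index-doc
  "Adds every [path value] pair of `doc` to `index`, pointing at `doc-id`."
  [index doc-id doc]
  (reduce (fn [idx [path v]]
            (update idx [path v] (fnil conj #{}) doc-id))
          index
          (leaf-paths doc)))

(def index
  (-> {}
      (index-doc 1 {:profile {:age 42} :city "Oslo"})
      (index-doc 2 {:profile {:age 42} :city "Lima"})))

;; Which docs have age 42 under [:profile :age]?
(get index [[:profile :age] 42])
;; => #{1 2}
```

A lookup by path and value then returns the matching doc ids without scanning the serialized documents.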

Huahai 2026-02-06T23:55:02.981489Z

Updating your API data directly in the DB should be fine, as long as your API data is not too big. We do have a size limit of 2Gb for a data value.

siavash mohammady 2026-02-04T03:20:32.185589Z

Hi, what is the downside of VAE removal besides the retraction?

Huahai 2026-02-04T03:34:43.308569Z

Nothing. Only upside.

👍 1