2024-05-14 datahike | Clojure Slack Archive

datahike

alekcz 2024-05-14T15:14:20.709189Z

Is there a way to load my datoms directly into datahike without having to write a migration script?

whilo 2024-05-14T21:55:11.426799Z

How do you do it at the moment? I am not sure what you mean by a migration script. There is load-entities, which allows you to do a raw import from another database.

alekcz 2024-05-14T23:41:05.356479Z

I think that's what I need. I was trying sherpa and not winning.

alekcz 2024-05-14T18:57:00.353049Z

Second question: my CBOR backup is ~100MB, but the Postgres database is 16GB. Is this normal?

whilo 2024-05-14T21:52:17.685889Z

@alekcz360 you can apply (gc! @conn) to collect all snapshots that are older than the current one. That should bring the 16GB down a lot: https://github.com/replikativ/datahike/blob/main/src/datahike/experimental/gc.cljc#L45

whilo 2024-05-14T21:54:00.785259Z

This is because, by default, all intermediate snapshots created after each transaction are preserved and remain accessible to all distributed readers as long as they stick around. The hitchhiker-tree was more efficient in handling write operations, but we have not yet ported its functionality over to the persistent-sorted-set. Datomic uses a transaction log overlay before writing to the indices. This has the advantage of creating fewer index fragments on copy-on-write operations, but it requires coordinating explicitly with the transactor to fetch the latest log, which, from what I understand, adds a lot of complexity to the Datomic design. The hitchhiker-tree has the logs integrated fractally into each node, which is in a sense optimal, but there is a subtle trade-off between reapplying the logged changes on every read and applying them once at write time to produce an optimal B-tree. Currently the latter is happening, without the logs, at the cost of more redundancy in storage usage.

whilo 2024-05-14T22:02:58.219969Z

So for now you can just invoke gc! at regular intervals and it should keep the memory usage much lower.
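
A minimal sketch of what invoking gc! at regular intervals could look like, assuming a plain JVM scheduler is acceptable. The name schedule-gc! and its parameters are hypothetical and not part of datahike; the garbage-collection function is passed in as gc-fn, so the sketch makes no assumption about its exact namespace or return value beyond the (gc! @conn) call shown above.

```clojure
(import '(java.util.concurrent Executors ScheduledExecutorService TimeUnit))

(defn schedule-gc!
  "Hypothetical helper: calls (gc-fn @conn) once immediately and then every
  interval-hours. Returns the executor; call .shutdown on it to stop."
  ^ScheduledExecutorService [conn gc-fn interval-hours]
  (let [^ScheduledExecutorService executor (Executors/newSingleThreadScheduledExecutor)
        ^Runnable task (fn []
                         ;; Catch exceptions so that one failed run does not
                         ;; cancel all subsequent scheduled runs.
                         (try
                           (gc-fn @conn)
                           (catch Exception e
                             (println "gc run failed:" (.getMessage e)))))]
    (.scheduleAtFixedRate executor task 0 (long interval-hours) TimeUnit/HOURS)
    executor))
```

With the gc! function from the linked namespace referred in, this could be started as (schedule-gc! conn gc! 24). If gc! runs asynchronously, the scheduled task only kicks off the collection and does not wait for it to finish.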

alekcz 2024-05-14T23:41:21.821279Z

Thanks. I'll give that a shot
