datahike

whilo 2024-10-09T07:00:18.006509Z

I think this is fairly close to optimal. The main remaining issue is to write full tree snapshots less often by using a tx-log in the db root, which is what I am looking into right now.

🙏 1
lambdam 2024-10-09T10:15:00.653279Z

Hello, another question about what seems to be strange behavior with :attribute-refs? enabled. I declared an attribute with the following schema:

{:db/ident :foo/bar
 :db/valueType :db.type/keyword
 :db/cardinality :db.cardinality/many}
Then when I transact this:
(d/transact conn [{:db/id id
                   :foo/bar [:plop]}])
It works, but when I transact the same information with :db/add:
(d/transact conn [[:db/add id :foo/bar :plop]])
I get the following error:
Bad entity attribute :foo/bar at [:db/add id :foo/bar :plop], expected reference number
In Datomic, these two forms of tx-data are equivalent, and when I turn off :attribute-refs? I don't get the error. Is this known behavior and/or a limitation? Thanks again.

lambdam 2024-10-10T10:01:42.732599Z

@whilo I just tried this on a project where we use a recent version of Datomic. This is how a datom is printed in the :tx-data key of a tx-report:

#datom[17592186157768 127 :foobar 13194140623859 true]
The attribute is clearly printed as an integer. And (:a datom) returns the integer.

whilo 2024-10-10T17:27:02.918389Z

I opened a bug report here: https://github.com/replikativ/datahike/issues/717. Feel free to add any information if needed. I expect to fix this soonish.

Jonas Östlund 2024-10-09T10:23:44.721369Z

Which version of Datahike are you using? I believe I made this possible in the following pull request that got merged: https://github.com/replikativ/datahike/pull/698/files#diff-5c1ac5307484c8750cc542aa3136d7fd96a4ae4b49bf28c27fdb02320d8f20c8R202

lambdam 2024-10-09T10:25:29.749019Z

I'm using the latest one, 0.6.1575. I'll try right now with 0.6.1591.

👍 1
lambdam 2024-10-09T10:31:27.534629Z

Thanks a lot. It works fine with the latest version!

Jonas Östlund 2024-10-09T10:31:43.209249Z

Great!

whilo 2024-10-09T17:36:48.183529Z

I saw that in the tx-report the new datoms are now also reported with integers for attributes when attribute-refs are on. I am not sure whether this is what Datomic does, but I guess not.
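
For readers who want keyword attributes back when inspecting such datoms, here is a minimal toy sketch in plain Clojure. It is not Datahike or Datomic code: vectors stand in for datoms, and both `ref->ident` and `readable-datom` are hypothetical names invented for this illustration. The mapping of 127 to :foo/bar is likewise assumed.

```clojure
;; Toy model only: plain vectors stand in for datoms, and `ref->ident`
;; is a hypothetical lookup table, not a Datahike or Datomic API.
(def ref->ident
  {127 :foo/bar}) ; assumed mapping, for illustration only

(defn readable-datom
  "Replaces an integer attribute with its ident when the lookup knows it;
  returns the datom fields unchanged otherwise."
  [lookup [e a v tx added?]]
  [e (get lookup a a) v tx added?])

(readable-datom ref->ident [17592186157768 127 :foobar 13194140623859 true])
;; => [17592186157768 :foo/bar :foobar 13194140623859 true]
```

In a real database the lookup would have to come from the stored schema rather than from a hand-written map.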

whilo 2024-10-09T06:55:09.087149Z

For anybody who would like to reduce the storage size, these are the most compact settings for datahike at the moment:

{:store  {:backend ...
          :config {:compressor {:type :lz4}}}
 :keep-history? false
 :attribute-refs? true
 :schema-flexibility :write}

whilo 2024-10-09T06:59:18.487669Z

@dam @alekcz360 I did some REPL explorations with benchmarks. After gc!, a datom in this setting, consisting of 4 longs (8 bytes) = 16 bytes, costs ~20.1 bytes per index in a database with 10k stored datoms, and the size scales linearly from there up to 100k. This is also the case without lz4 compression, since the value is a number in my benchmark. If you store string data, lz4 should greatly reduce the size.
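
As a rough way to turn the per-index figure into a total, here is a back-of-the-envelope sketch in plain Clojure. The ~20.1 bytes per datom per index comes from the message above. The function name `estimated-bytes` is hypothetical, and the index count of 3 is an assumption, not something measured in the benchmark.

```clojure
;; Back-of-the-envelope estimate, assuming linear scaling as reported above.
;; The index count passed in below (3) is an assumption, not a measured value.
(defn estimated-bytes
  "Estimated storage in whole bytes for n-datoms, given the cost per datom
  per index and the number of indices that hold every datom."
  [n-datoms bytes-per-datom-per-index n-indices]
  (Math/round (double (* n-datoms bytes-per-datom-per-index n-indices))))

(estimated-bytes 10000 20.1 3)
;; => 603000

(estimated-bytes 100000 20.1 3)
;; => 6030000
```

Adjust the last argument to however many indices your configuration actually maintains. The estimate only covers numeric values like those in the benchmark.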