#datomic
2024-03-08
Loic 00:03:03

Hi guys, I have a question about dumping Datomic data into a file using history.

(defn dump-props-history
  "Given a datomic `db`, query all the prop transactions between `t1` and `t2`.
   Return a coll of prop transactions following the [[prop-pull-pattern]]."
  [db t1 t2]
  (->> (d/q '[:find ?prop ?tx-time
              :in $ ?t1 ?t2
              :where
              [?t :db/txInstant ?tx-time]
              [(<= ?t1 ?tx-time)]
              [(<= ?tx-time ?t2)]
              [?prop :prop/id ?id]]
            (d/history db) t1 t2)
       (mapv (fn [[e t]]
               (d/pull (d/as-of db t) prop-pull-pattern e)))))
The prop entity has 6 attributes, and :prop/id is the :db.unique/identity attribute. When I transact a new version of a prop, a few attributes change (prop/status, prop/op-at, etc.), so d/history returns all the transactions for each attribute of a prop. However, I just want to dump the prop before and after each new version (all the attributes are always updated at once). Therefore I use d/as-of to get the db snapshot at the time ?tx-time of the prop update. It seems to work, but it is extremely slow and very demanding in terms of heap. Is there a better way to do it? I could not make it work using d/history only.

favila 01:03:12

Perhaps this?

favila 01:03:14

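(defn dump-props-history
  [db t1 t2]
  (let [;; History (assertions and retractions) up to and including t2.
        lookup-db (-> (d/history db)
                      (d/as-of t2))
        ;; History restricted to the range t1..t2. d/since excludes its
        ;; argument, so step back 1 ms to keep transactions at exactly t1.
        change-db (-> (d/history db)
                      (d/since (java.util.Date. (dec (inst-ms t1))))
                      (d/as-of t2))]
    (->> (d/q '[:find ?prop ?tx
                :in $ldb $cdb
                :where
                ;; ?prop had a :prop/id datom at some point up to t2
                [$ldb ?prop :prop/id]
                ;; ...and some datom about ?prop was transacted in the range
                [$cdb ?prop _ _ ?tx]]
              lookup-db change-db)
         ;; Pull each changed prop as of the transaction that changed it.
         (mapv (fn [[e t]]
                 (d/pull (d/as-of db t) prop-pull-pattern e))))))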

favila 01:03:07

Your query is the cross-product of all transactions in your range and all props that ever existed

favila 01:03:30

This narrows the dbs to only props whose ids were asserted before the end of your range, then narrows it again to only prop+tx pairs where something happened to the prop

favila 01:03:49

that should reduce the number of duplicate pulls significantly to only those where some change happened to the prop
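
A rough illustration of the difference, in plain Clojure rather than Datomic. The `datoms` vector is a hypothetical in-memory stand-in for history datoms shaped as `[entity attribute value tx]`, with small integers in place of real entity and transaction ids. It only sketches why two unjoined clauses multiply out while a shared `?prop`/`?tx` binding does not:

```clojure
;; Hypothetical stand-in for history datoms: [entity attribute value tx].
(def datoms
  [[1 :prop/id     "a"     100]
   [1 :prop/status :open   100]
   [2 :prop/id     "b"     101]
   [2 :prop/status :open   101]
   [1 :prop/status :closed 102]])

;; Every entity that ever had a :prop/id datom.
(def props (set (for [[e a] datoms :when (= a :prop/id)] e)))

;; Every transaction that appears in the data.
(def txs (set (map #(nth % 3) datoms)))

;; Unjoined clauses: each prop is paired with each tx,
;; whether or not the prop changed in that tx.
(def cross-product (set (for [p props, t txs] [p t])))

;; Joined clauses: only [prop tx] pairs where a datom
;; about that prop was actually recorded in that tx.
(def changed-pairs
  (set (for [[e _ _ tx] datoms :when (contains? props e)] [e tx])))

(count cross-product) ; => 6 (2 props x 3 txs)
(count changed-pairs) ; => 3
```

With real data the gap grows with both the number of props and the number of transactions in the range, and each extra pair costs one d/as-of plus d/pull.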

Loic 03:03:06

Thank you very much for the help @favila! It is indeed really fast now and works as expected. I am wondering why lookup-db is needed. I am planning to dump the props history every 24h, so I think I just need change-db, no?

favila 03:03:14

If you only use change-db, you will only see :prop/id datoms asserted or retracted within the since-asOf range

favila 03:03:49

Presumably you want changes to prop entities “created” (id asserted) before t1?

favila 04:03:43

And to prop entities “deleted” (id retracted) before t2?

favila 04:03:09

If you want something different you can adjust to taste
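
The role of lookup-db can be sketched the same way in plain Clojure. Everything here is hypothetical: `datoms` is an in-memory stand-in shaped as `[entity attribute value tx]`, integer tx values stand in for time, and `up-to` and `window` only approximate what d/as-of and d/since plus d/as-of do to a history db:

```clojure
;; Hypothetical stand-in for history datoms: [entity attribute value tx].
(def datoms
  [[1 :prop/id     "a"     90]    ; prop 1 created before the window
   [1 :prop/status :closed 105]   ; prop 1 changed inside the window
   [2 :prop/id     "b"     103]   ; prop 2 created inside the window
   [2 :prop/status :open   103]])

;; Rough analogue of lookup-db: everything up to and including t2.
(defn up-to [datoms t2]
  (filter (fn [[_ _ _ tx]] (<= tx t2)) datoms))

;; Rough analogue of change-db: only datoms with t1 <= tx <= t2.
(defn window [datoms t1 t2]
  (filter (fn [[_ _ _ tx]] (<= t1 tx t2)) datoms))

;; Entities that have a :prop/id datom in the given collection.
(defn prop-ids [datoms]
  (set (for [[e a] datoms :when (= a :prop/id)] e)))

;; [prop tx] pairs: props found via `lookup`, changes found via `change`.
(defn changed-pairs [lookup change]
  (let [props (prop-ids lookup)]
    (set (for [[e _ _ tx] change :when (contains? props e)] [e tx]))))

(def result
  (let [change (window datoms 100 110)
        lookup (up-to datoms 110)]
    {:change-only (changed-pairs change change)
     :with-lookup (changed-pairs lookup change)}))

result
;; => {:change-only #{[2 103]}, :with-lookup #{[1 105] [2 103]}}
```

Using only the windowed data misses prop 1, because its :prop/id datom was recorded before the window even though its status changed inside it.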

Loic 05:03:48

I understand now, thank you!