In Datomic Cloud, given a database value, is it possible to get the txInstant of the last transaction without multiple API calls?
Reverse index-pull :db/txInstant?
I'm going to need an E value... right? I can get the basis-t, but then how to get the transaction eid?
Maybe I can omit the E value and just provide A. With a :limit that might work....
Yes, it does. Here's the snippet:
(first (d/index-pull db {:index :avet :start [:db/txInstant] :reverse true :limit 1 :selector '[:db/txInstant]}))

Datomic beginner here. I intend to run Datomic on top of Postgres, and while reading through a guide I was wondering whether bulk uploads and downloads are as well supported by Datomic as they are by Postgres. That was a big requirement at my last workplace, which used Postgres.
what do you mean by bulk uploads? like ingesting large amounts of data?
yes, from csvs
afaict, there is no other mechanism apart from datomic.api/transact so you would need to transform the CSV into tx data
who takes care of the compute in this case? in case of pure postgres, it is postgres. but i am guessing in this case it is where transactor is running and not postgres. i am asking so i have an idea if i can run other computing tasks where datomic transactor is running.
any peer can do the CSV->tx-data conversion, but the transactor needs to do the transaction
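A minimal sketch of that CSV -> tx-data step, in case it helps. The attribute names (:person/name, :person/email), the two-column layout, and the batch size are assumptions for illustration only, and the naive comma split does not handle quoted fields. The schema is assumed to be installed already.

```clojure
(require '[clojure.string :as str])

;; Hypothetical attributes, one per CSV column, in column order.
(def columns [:person/name :person/email])

(defn line->entity
  "Turns one CSV line into an entity map. Naive split: no quoted fields."
  [line]
  (zipmap columns (map str/trim (str/split line #","))))

(defn csv->tx-batches
  "Skips the header row and groups entity maps into batches of batch-size."
  [csv-text batch-size]
  (->> (str/split-lines csv-text)
       rest
       (remove str/blank?)
       (map line->entity)
       (partition-all batch-size)))

(comment
  ;; This part runs on the peer. Each batch is then sent to the transactor,
  ;; e.g. with the peer API (conn and the file name are assumed here):
  (doseq [batch (csv->tx-batches (slurp "people.csv") 1000)]
    @(d/transact conn batch)))
```

Batching keeps individual transactions small; the right batch size depends on the data and would need to be measured.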
@vaibhawc we recommend process isolation (https://docs.datomic.com/operation/deployment.html#process-isolation) for production. Many folks "undistribute" Datomic during development by running more than one process on the same machine. But if this batch import is going to be an ongoing process on your system, I recommend not doing that, so you can get the tuning right without processes competing for resources.
so which process will be participating the most for batch processing? transactor or storage?
The transactor.
Only in cases where you have provisioned storage (like AWS DDB) would you likely need to concern yourself with capacity tuning at the storage level.
Got it. Thanks!
I’d point out here that the majority of compute is actually spent by the transactor during indexing, depending on the size of the dataset of course
(at least that’s observable for us)
so should I run the transactor on a separate node instead? what is the best practice?
I would say it depends on the specifics of your dataset and workload