#datomic
2022-04-06
Shuky Badeer11:04:16

Hi guys! A few days ago I asked a question about Datomic here and someone brought up XTDB. I had a chance to look into it and it seems pretty cool. Does anyone have a good guide for how to set XTDB up in the cloud, e.g. on GCP or AWS? Thanks a lot!

tatut11:04:17

there's a channel for #xtdb

timo15:04:33

I took over a Datomic job and am still a bit new to it. Is it bad for Datomic to transact huge amounts of unchanged data every night? I read https://ag91.github.io/blog/2022/03/13/datomic-a-little-snippet-to-analyze-what-attributes-your-transactions-change-most-often/ and found out that there are more than 2 million transactions every night that contain only a :db/txInstant, and I have the feeling that this is not good... Do I need to filter what I transact in my code, or is there some kind of trick to it?

emccue15:04:10

As long as you can pay for the storage, conceptually it's sound

emccue15:04:22

you are reasserting facts

emccue15:04:42

someone else can probably tell you more accurately if it'll be a problem

favila15:04:46

well, if they’re empty transactions, it’s not known what facts are being asserted

favila15:04:08

these could actually be submitted as empty transactions.

favila15:04:43

The presence of the transaction itself could be a signal that a job was done, but for that to be a useful signal the transaction would need some other metadata on it, not be completely empty.

favila15:04:13

I would say this is possibly a code smell, but operationally it’s not a problem in itself

emccue15:04:21

i read it as re-transacting a bunch of data and the only difference was the timestamp

timo15:04:53

ok, thanks. so it is a problem for the underlying storage (e.g. a SQL db) but not for Datomic itself?! The storage is growing fast and needs to be contained.

favila15:04:20

well, is it? 2 million empty transactions is not going to take much space

favila15:04:31

it’s only going to take log space, and won’t take any index space

timo15:04:57

yeah, it is every night and the underlying Oracle db is more than 2 TB already and growing fast

favila15:04:02

> i read it as re-transacting a bunch of data and the only difference was the timestamp

From the tx log, you can’t tell the difference between this and (d/transact conn [])
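
A minimal Clojure sketch of how such empty transactions could be counted. It assumes log entries shaped like the maps returned by the peer API's `d/tx-range`, i.e. `{:t ... :data [datoms]}`. The sample entries and entity ids below are hand-written stand-ins, and the helper names are hypothetical. Since every transaction records its own :db/txInstant, an entry with exactly one datom asserted nothing else.

```clojure
(defn only-tx-instant?
  "True when a tx-log entry carries a single datom, which can only be
  the transaction's own :db/txInstant."
  [tx]
  (= 1 (count (:data tx))))

(defn count-empty-txes
  "Counts the log entries that carry nothing but their :db/txInstant."
  [txes]
  (count (filter only-tx-instant? txes)))

;; Hand-written stand-ins for log entries; each datom is [e a v tx added?].
;; Real datoms reference attributes by entity id, keywords are used here
;; only for readability.
(def sample-log
  [{:t 1001
    :data [[9001 :db/txInstant #inst "2022-04-06T01:00:00.000-00:00" 9001 true]]}
   {:t 1002
    :data [[9002 :db/txInstant #inst "2022-04-06T01:00:01.000-00:00" 9002 true]
           [42 :item/price 12 9002 true]]}])

(count-empty-txes sample-log)
;; => 1
```

Against a live peer connection the same function could presumably be fed `(d/tx-range (d/log conn) start end)`, where `start` and `end` bound the nightly job.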

timo15:04:37

right, I am already checking for empty now

Linus Ericsson06:04:14

If you want to avoid doing a lot of entirely empty transactions (which don't really give you much in terms of traceability), you should look in the db (d/db conn) for the data you are trying to upsert. If it is already there, you don't have to transact anything. But maybe you should instead create a transaction tagged with data that makes it apparent the system has made the integrity check of the data. This probably won't need 2 million transactions per night, though...

👍 1
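
The check-before-transact idea could be sketched roughly as follows. This is only an illustration under assumptions: `current` stands for the attribute map already read from `(d/db conn)` (for example via a pull), `incoming` is the nightly record, and the names `changed-attrs`, `upsert-tx-data` and the `:item/...` attributes are hypothetical.

```clojure
(defn changed-attrs
  "Returns the entries of `incoming` whose values differ from `current`.
  A nil `current` (entity not in the db yet) keeps every entry."
  [current incoming]
  (into {}
        (remove (fn [[k v]] (= v (get current k ::missing))))
        incoming))

(defn upsert-tx-data
  "Builds tx-data holding only the changed attributes plus the identity
  attribute `id-attr`. Returns nil when nothing changed, so the caller
  can skip the transaction entirely."
  [id-attr current incoming]
  (let [delta (changed-attrs current (dissoc incoming id-attr))]
    (when (seq delta)
      [(assoc delta id-attr (get incoming id-attr))])))

(def current {:item/sku "A1" :item/name "Widget" :item/price 10})

;; Unchanged record: nothing to transact.
(upsert-tx-data :item/sku current
                {:item/sku "A1" :item/name "Widget" :item/price 10})
;; => nil

;; Changed price: only the delta is submitted.
(upsert-tx-data :item/sku current
                {:item/sku "A1" :item/name "Widget" :item/price 12})
;; => [{:item/price 12, :item/sku "A1"}]
```

A caller would then invoke `d/transact` only when the result is non-nil, and could add one tagged transaction per nightly run to record that the check was performed.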
timo11:04:57

Does it make history queries slower when unchanged data is re-transacted every night?

favila11:04:10

no, because there are no datoms

favila11:04:22

the only datoms are the :db/txInstant datoms