#datomic
2018-03-10
hmaurer10:03:18

Hello. Quick question: I have read about a number of soft restrictions on the amount of data that can be stored in a Datomic database. A number that often pops up is “10 billion datoms”. Would Datomic be suitable for large datasets (on the order of multiple terabytes), and can I expect common queries to be efficient (sub-50 ms) over that sort of dataset?

hmaurer21:03:38

@marshall would you have an idea?

marshall21:03:46

It would depend a lot on the type of data. Terabytes might be pushing it

hmaurer13:03:19

@marshall I see. The 10 billion datoms soft limit seems quite low though. Assuming entities have ~1,000 datoms on average, that’s about 10,000,000 entities in the system.

marshall13:03:42

The 10 billion number you mentioned is definitely not a hard limit. There is no hard limit on the # of datoms - the use case, schema, and data access pattern will all affect how Datomic behaves with a large dataset. In practice we have very rarely seen people exceed, or even come close to, the 10B datom size for systems that are a good fit for other attributes of Datomic. By that I mean most systems that need fully ACID transactional semantics are not also true ‘write scale’ systems. Obviously that’s a generalization, and there are some workloads/systems that may not be a great fit or that would benefit from a sharded (multiple txors) or hybrid (i.e. Datomic + a write-scale store) approach.

denik14:03:03

@marshall we printed the key, secret and region env vars and they were correct. This also worked before and we didn’t change anything around permissions. I’m not sure this has anything to do with it, but I reran the CloudFormation templates to shut down the bastion.

Datomic Platonic15:03:36

Is there a cloud client that has the same semantics as the datomic-free version? Otherwise, we’ll have to have different code in dev and production (e.g., adding {:tx-data} to a transact operation).

souenzzo13:03:54

Datomic-free uses the peer API

Datomic Platonic15:03:31

If not, are people solving this issue with multimethods or is there a more straightforward approach?

JJ17:03:22

Is choosing a bastion server obligatory? I would like to create my own EC2 instance inside the VPC for access.