#xtdb
2021-08-28
emccue04:08:18

This might be a really dumb and common question

emccue04:08:34

But what is the difference between datomic and crux?

emccue04:08:18

The biggest thing making me uneasy is that with datomic it's all "relational" and "denormalized" (I think)

emccue04:08:01

And I associate "document db" very strongly with "inflexible data model made when you knew all the use cases up front"

dominicm08:08:51

> So I could continue to model all of my data de-normalized and that wouldn't be the "wrong" decision? Yep!

dominicm08:08:51

> the other associations i have are "kafka=high cost" A lot of the maintenance pain of Kafka is "legacy" at this point. A major release of ZooKeeper had a series of bugs in subsequent releases which Kafka didn't work well with. Those issues are largely fixed at this point. With the addition of https://www.confluent.io/confluent-cloud/ (which runs inside your cloud) and also cloud-vendor services like https://aws.amazon.com/msk/ the Kafka cost is not so high.

dominicm08:08:19

Oh, maybe you meant actual cost since you looked at rocksdb as high maintenance. https://aws.amazon.com/msk/pricing/ vs https://aws.amazon.com/rds/postgresql/pricing/?pg=pr&loc=3 shows that pricing is a bit higher with Kafka, but not by much. I've heard Confluent cloud is cheaper, but that may only be "at scale".

dominicm08:08:38

> "rocksdb (or any other db not provided by cloud provider) = high maintenance" Comparing rocksdb to, e.g. rds, dynamo, mongo is a non sequitur. rocksdb is more akin to sqlite in terms of management/hosting, and in terms of choice (e.g. lmdb, rocksdb) is a matter of performance comparison that your average user doesn't need to worry about for a project until they hit a certain amount of scale. In crux, the maintenance of rocksdb is very close to zero beyond ensuring you have some disk for it to persist to.

emccue16:08:07

Thanks for the added context. We have some features for which we kinda desperately want time-based features and it's really hard to choose between datomic and crux

emccue16:08:13

crux has far better documentation and is open source, but i feel like i understand datomic's model better and i can just set it up in a click on AWS with paid support

emccue16:08:32

i'm really just not understanding the document stuff as it relates to AEVT + valid time

emccue16:08:11

why can't i say "here is this one fact i learned about an entity" instead of reprocessing a whole document?

Aleed20:08:06

@U09LZR36F saw you recommended The Impedance Mismatch talk by Stuart Halloway in an off-topic convo, very informative, thank you. may be relevant to this convo too. https://www.infoq.com/presentations/Impedance-Mismatch/ i wonder how crux compares to decision matrix Stuart presented in above video. seems that difference is processing and structure being limited to documents (which Stuart warns against but I imagine Crux’s design works around majority of drawbacks)

👀 1
dominicm21:08:02

To see the differences, you may need to add some more criteria. They're very similar on that criteria set imo.

emccue03:08:52

I think from my reading so far that maybe "document DB" might be too overloaded

emccue03:08:35

crux seems to want to be a "record db" - facts that are tied directly to an identity

emccue03:08:24

whereas what I think of as a "document db" is focused on denormalizing data for query efficiency + lying about it to get investors if you are mongodb

emccue03:08:45

and the whole "schemaless" thing isn't so much about "do what you want!" for crux but more that "appropriately making a schema for potentially nested data is unsolved so we don't try to do it in the db itself"

💯 1
refset16:08:09

Sorry to chime in late - this message covers several messages/subthreads...
> why can't i say "here is this one fact i learned about an entity" instead of reprocessing a whole document?
You can always write a transaction function to emulate the full range of datom semantics. I started an attempt here once: https://gist.github.com/refset/a00be06443bc03ccc84a2874af3cdb8a I think the biggest issue that prevents it being ~trivial (ignoring the assumptions/inefficiencies baked into the code/architecture about documents) is deciding how to handle cardinality in the absence of a schema. For instance, should a new EAV always replace the existing EAV (as in same entity & attribute)?
> I think from my reading so far that maybe "document DB" might be too overloaded
It's true that "document" is not a very meaningful term, but we chose it because it's (1) somewhat accurate (more so than "graph" or "relational") and (2) likely to attract more attention from people who are already looking beyond the world of SQL than anything else we could think of. Thanks for chiming in with feedback about "record db" - we are definitely favouring that term at the moment also (https://opencrux.com/articles/strength-of-the-record.html) and may yet pivot :)
> the whole "schemaless" thing isn't so much about "do what you want!" for crux but more that "appropriately making a schema for potentially nested data is unsolved so we don't try to do it in the db itself"
That's a very good description - we may have to borrow it!
> i wonder how crux compares to decision matrix Stuart presented in above video. seems that difference is processing and structure being limited to documents
I think Crux's similarly graph-like indexing over documents more or less satisfies all the commentary about processing and structure mentioned in the talk (including all of dictionary / rectangle / column / graph / entity representations).
Crux's documents provide a default escape hatch for arbitrary data, but otherwise, when used with intention, broadly behave like an atomic group of datoms.
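
The cardinality question raised above can be illustrated with a rough sketch. This is not Crux's API and not the code from the linked gist; it is plain Python over an in-memory dict, and the names `assert_fact`, `transact_fact`, and `many_attrs` are hypothetical. It assumes the caller declares which attributes are multi-valued, which is exactly the decision a schemaless store cannot make on its own.

```python
from typing import Any, Dict, FrozenSet, List

Document = Dict[str, Any]


def assert_fact(doc: Document, attribute: str, value: Any,
                many_attrs: FrozenSet[str] = frozenset()) -> Document:
    """Return a new document version with one (attribute, value) fact applied.

    Hypothetical rule: attributes listed in many_attrs accumulate values,
    every other attribute is replaced by the new value.
    """
    updated = dict(doc)  # copy, so earlier versions stay untouched
    if attribute in many_attrs:
        existing = updated.get(attribute, [])
        if value not in existing:
            updated[attribute] = [*existing, value]
    else:
        updated[attribute] = value
    return updated


def transact_fact(store: Dict[str, List[Document]], eid: str, attribute: str,
                  value: Any,
                  many_attrs: FrozenSet[str] = frozenset()) -> Document:
    """Read the latest document for eid, apply one fact, append a new version."""
    history = store.setdefault(eid, [{"id": eid}])
    new_doc = assert_fact(history[-1], attribute, value, many_attrs)
    history.append(new_doc)
    return new_doc


# Usage: "name" is treated as single-valued, "email" as multi-valued.
store: Dict[str, List[Document]] = {}
transact_fact(store, "ivan", "name", "Ivan")
transact_fact(store, "ivan", "email", "ivan@example.com", frozenset({"email"}))
```

Each call still writes a whole new document version under the hood; the single-fact interface is only a convenience layered on top, and the replace-or-accumulate choice has to come from somewhere outside the data itself.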

👍 1
emccue04:08:31

And I haven't found any articles or blogs that clearly lay out what I should be thinking about this