#xtdb
2020-09-09
Janne Sauvala 09:09:51

Hi 👋 I know Crux is a schemaless document database, but is there a way to enforce a schema during writes? Or is the solution to use something like the Kafka schema registry and validate the message schema before it gets stored?

refset 10:09:44

Hey! There is a spectrum of options here. The simplest option, from a development and ops perspective, is to use transaction functions. The most efficient option would be to funnel all writes through a single node that acts as a "transactor" (although failing over to a standby transactor is then an ops challenge). Another approach is a two-phase write process, where docs are written schemaless and some later process validates that each doc conforms to a schema. Using Kafka's schema registry might well be possible and useful, but I'm not sure what the benefits would be over wrapping the regular submit-tx (with e.g. spec).

👍 3
refset 10:09:16

Are you looking to enforce schema between documents or just the internal schema of a single document at a time?

Janne Sauvala 08:09:27

Right, that is true: I could just use spec to validate the document before submitting it. Good call. Sorry, I didn't quite get what you mean by enforcing schema between documents. I assume that by "the internal schema of a single document at a time" you meant that I check the document I'm about to submit with spec, so that it has a certain schema.
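A minimal sketch of the "validate with spec before submit" idea discussed above. The `:example/user` spec, the `:user/*` attributes and the `validated-put-ops` helper are hypothetical names chosen for illustration; only `clojure.spec.alpha` and the `[:crux.tx/put doc]` operation shape are taken as given.

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical schema for a "user" document; attribute names are illustrative.
(s/def :user/name string?)
(s/def :user/email (s/and string? #(re-find #"@" %)))
(s/def :example/user (s/keys :req [:crux.db/id :user/name :user/email]))

(defn validated-put-ops
  "Returns a vector of Crux put operations for docs.
  Throws ex-info if any doc does not conform to spec."
  [spec docs]
  (doseq [doc docs]
    (when-not (s/valid? spec doc)
      (throw (ex-info "Document failed schema validation"
                      {:doc doc
                       :explain (s/explain-data spec doc)}))))
  (mapv (fn [doc] [:crux.tx/put doc]) docs))

(comment
  ;; Assumes `node` is an already-started Crux node and
  ;; crux.api is required as `crux`.
  (crux/submit-tx node
                  (validated-put-ops :example/user
                                     [{:crux.db/id  :user-1
                                       :user/name   "Janne"
                                       :user/email  "janne@example.com"}])))
```

Note that this only guards writes that go through the wrapper; any other client submitting to the same transaction log bypasses the check, which is where transaction functions come in.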

refset 10:09:01

Cool 🙂 In terms of between documents, I mean something like, "this attribute ref must only point at a specific number of documents of some other type".

Janne Sauvala 10:09:51

Thanks, now I got it 🙂 When I asked the original question I was mostly wondering about enforcing the schema of a single document at a time.

🙏 3
Janne Sauvala 10:09:32

That schema between documents sounds like the foreign key concept in the relational DB world. Out of curiosity, what should I do to achieve this?

refset 13:09:38

That's when you need to reach for transaction functions. In the simplest case you might want to ensure that all references to a deleted document are removed at the same time the document is deleted, which is possible as shown here with retract-entity: https://gist.github.com/refset/a00be06443bc03ccc84a2874af3cdb8a

refset 13:09:41

For more advanced constraints you may want to use combinations of with-tx and Datalog queries inside the transaction functions, as the "language" for declaring the invariants. Sadly I don't have a ready-made example to hand for that.
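A rough sketch of what such a transaction function might look like, not a tested or official example. It assumes `crux.api/db`, `crux.api/with-tx` and `crux.api/q` are callable inside the function body, and that returning `false` aborts the transaction; the `:put-order` function id and the `:order/customer` and `:customer/name` attributes are hypothetical. The invariant, "every order must reference an existing customer", is written as a Datalog query over the speculative database.

```clojure
;; The transaction function is itself just a document holding quoted code.
(def put-order-fn
  {:crux.db/id :put-order
   :crux.db/fn
   '(fn [ctx order]
      (let [db       (crux.api/db ctx)
            ;; Speculatively apply the write without committing it.
            next-db  (crux.api/with-tx db [[:crux.tx/put order]])
            ;; Invariant as Datalog: find orders whose customer ref
            ;; does not point at a document with a :customer/name.
            dangling (crux.api/q next-db
                                 '{:find [o]
                                   :where [[o :order/customer c]
                                           (not [c :customer/name])]})]
        (if (empty? dangling)
          [[:crux.tx/put order]]   ; invariant holds, emit the real write
          false)))})               ; assumed to abort the transaction

(comment
  ;; Assumes `node` is an already-started Crux node and
  ;; crux.api is required as `crux`.
  ;; Install the function once, then route order writes through it.
  (crux/submit-tx node [[:crux.tx/put put-order-fn]])
  (crux/submit-tx node [[:crux.tx/fn :put-order
                         {:crux.db/id     :order-1
                          :order/customer :customer-1}]]))
```

Because the function runs on every node as the transaction is indexed, the check applies no matter which client submitted the write. Querying all orders on each call is the simplest formulation; a real constraint would probably scope the query to the document being written.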

Janne Sauvala 15:09:17

Now it makes sense. Thanks a lot, Jeremy. This was very helpful 👍

🙂 3