#xtdb
2021-08-18
tatut05:08:22

what is the "cost" of updating a document? I'm wondering whether refs should be modeled in a way that minimizes how many documents you need to update in a tx (e.g., a child referring to its parent instead of the parent having a vector of child ids)

nivekuil05:08:52

I think you should always prefer back references unless you need ordering and can't model that ordering outside of crux

refset11:08:45

^ preferring back references is usually good advice. The main cost of generating many updates to the same document is the storage cost due to full duplication within the document store (which doesn't currently benefit from any structural sharing, unlike the local indexes). There is also some performance cost when processing larger documents with massive numbers of forward references
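
To make the trade-off concrete, here is a minimal sketch of the two modeling styles as plain Clojure data. The attribute names (`:parent/children`, `:child/parent`), the ids, and the helper functions are hypothetical; only `:crux.db/id` and the `[:crux.tx/put doc]` operation shape are taken from Crux itself.

```clojure
;; Hypothetical documents; every attribute except :crux.db/id is an assumption.

;; Forward references: the parent holds a vector of child ids, so each new
;; child means re-putting a full new version of the parent document.
(def parent-with-forward-refs
  {:crux.db/id :parent-1
   :parent/children [:child-1 :child-2]})

;; Back references: the parent document does not mention its children.
(def parent {:crux.db/id :parent-1})

(defn child-doc
  "Builds a child document that refers back to its parent."
  [child-id parent-id]
  {:crux.db/id child-id
   :child/parent parent-id})

(defn add-child-tx
  "Transaction operations for adding one child: a single put of the new
  child document, and no new version of the parent."
  [child-id parent-id]
  [[:crux.tx/put (child-doc child-id parent-id)]])

;; Children are then found by querying the back reference.
(def children-of-parent-1
  '{:find [?child]
    :where [[?child :child/parent :parent-1]]})
```

With back references, `(add-child-tx :child-3 :parent-1)` touches only the new child document, so the document store never accumulates repeated full copies of a growing parent.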

tatut11:08:37

makes sense

🙂 3
tatut11:08:42

another newcomer question: in a clustered environment, how do the nodes coordinate who is doing writes? Is that affected by which storage you are using for the tx log?

refset11:08:03

all nodes can submit transaction writes as equal citizens, but the transactional (ACID!) ordering of the writes is governed by the tx-log backend, which is a single writer regardless of which tx-log backend storage you choose

tatut12:08:40

so is there some locking in the tx-log protocol?

tatut12:08:38

also, "submit" to whom? Who does the storing and index updating? The node doing the submit?

refset13:08:37

any form of locking happens behind the scenes in the implementation. submit-tx on the db/TxLog protocol simply expects a Delay which is deref'd (blocking) as part of the user-facing crux.api/submit-tx API

refset13:08:40

each node will put documents to the doc store and submit its own transactions to the tx-log as part of crux.api/submit-tx

refset13:08:32

later, irrespective of which node did the submitting, each node will read and locally index the transactions (and pull down the referenced documents from the doc-store for processing)
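
The flow described above can be pictured with a toy model in plain Clojure. This is not how Crux is implemented: the atoms standing in for the doc store and tx-log, and the names `submit!`, `new-node` and `catch-up!`, are all assumptions made purely for illustration.

```clojure
;; Toy model only: plain atoms stand in for the shared doc store and tx-log.
(def doc-store (atom {}))   ; shared: content hash -> document
(def tx-log (atom []))      ; shared, append-only: ordered content hashes

(defn submit!
  "Any node may call this: store the document, then append its hash to the
  log. The log alone decides the ordering. Returns the position in the log."
  [doc]
  (let [k (hash doc)]
    (swap! doc-store assoc k doc)
    (dec (count (swap! tx-log conj k)))))

(defn new-node
  "A node owns only its local index and its own position in the log."
  []
  (atom {:indexed 0 :index {}}))

(defn catch-up!
  "Replays the log entries this node has not indexed yet, pulling each
  referenced document from the shared doc store into the local index."
  [node]
  (let [log @tx-log
        docs @doc-store]
    (swap! node
           (fn [{:keys [indexed index]}]
             {:indexed (count log)
              :index (reduce (fn [idx k]
                               (let [doc (get docs k)]
                                 (assoc idx (:crux.db/id doc) doc)))
                             index
                             (subvec log indexed))}))))

;; Two independent nodes replay the same log and converge on the same index.
(def node-a (new-node))
(def node-b (new-node))
(submit! {:crux.db/id :doc-1 :v 1})
(submit! {:crux.db/id :doc-1 :v 2})
(catch-up! node-a)
(catch-up! node-b)
```

The nodes never talk to each other here; they agree only because they replay the same ordered log.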

tatut13:08:45

ok, so there's no global store of index segments? each node will have its own

✔️ 3
tatut13:08:50

oh, the lmdb, rocksdb and xodus state that they can be index stores

tatut13:08:18

so will the first node to locally index the tx also update index stores?

tatut13:08:41

nvm, those aren't networked servers... so I guess the above still stands, that each node will have their own indexes

tatut13:08:20

hopefully I understood it 😛

💯 3
jarohen13:08:40

sounds right to me 🙂 yep, the index stores are all local to each node, and there's no co-ordination between each node, except via the central TxLog (Kafka or equivalent)

✔️ 3
mac22:08:33

Is it a bug that I am not able to use (instance? java.time.LocalDate ?t) in a where clause? I get "Clause refers to unknown variable: java.time.LocalDate...

refset22:08:04

Not as such. The queries are strictly edn-only, so Java types aren't recognised. You can work around this using a custom function and referencing it via a fully-qualified name, e.g. [(? ?t)]
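
A minimal sketch of that workaround, assuming a hypothetical namespace `my.app.preds`, a hypothetical predicate `local-date?`, and a hypothetical `:event/date` attribute (none of these names come from the conversation):

```clojure
(ns my.app.preds)   ; hypothetical namespace

(defn local-date?
  "True when x is a java.time.LocalDate."
  [x]
  (instance? java.time.LocalDate x))

;; The query stays plain edn: the predicate is just a fully-qualified symbol,
;; so no Java class name appears inside the query itself.
(def dated-events-query
  '{:find [?e ?t]
    :where [[?e :event/date ?t]                 ; :event/date is hypothetical
            [(my.app.preds/local-date? ?t)]]})
```

The namespace defining the predicate needs to be loaded on the node that runs the query, so that the symbol can be resolved there.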

mac22:08:47

OK, that works. Thanks.

🙌 3