#xtdb
2022-03-24
J16:03:21

Hi guys 🙂 , I would like clarification on the internal workings of xtdb, especially when there are multiple nodes. My config is simple: the document store and the tx log are managed by Postgres. I leave the index store in memory (RAM). Here is the code example:

(def xtdb-node ...)
(def xtdb-node-2 ...)
;; Submit a doc with the xtdb-node
(xt/submit-tx xtdb-node [[::xt/put {:xt/id 1 :user/name "John"}]])
;; Submit a doc with the xtdb-node-2
(xt/submit-tx xtdb-node-2 [[::xt/put {:xt/id 2 :user/name "Sam"}]])
;; Pull the doc (created with xtdb-node-2) with xtdb-node
(xt/pull (xt/db xtdb-node) [:user/name] 2)
=> #:user{:name "Sam"}
;; Pull the doc (created with xtdb-node) with xtdb-node-2
(xt/pull (xt/db xtdb-node-2) [:user/name] 1)
=> #:user{:name "John"}
How can xtdb-node retrieve a doc created with xtdb-node-2? Does xtdb refresh all nodes after a transaction?

Hukka18:03:30

You need to call sync, which I think with postgres means that it will block until the normal refresh is done. So the delay can be as long as the transaction poll delay you have set (100ms by default)

Hukka18:03:09

Do note though that even the same node needs to call sync in code like that. Otherwise the pull is happening before the transaction has had time to process
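A minimal sketch of the point above, assuming an in-memory node started with (xt/start-node {}) instead of the Postgres-backed config from the question. The helper name put-and-read is hypothetical. It uses xt/await-tx, which blocks until the node has indexed the given transaction; xt/sync would work here too, but it waits for the latest transaction the node knows about rather than a specific one.

```clojure
(require '[xtdb.api :as xt])

(defn put-and-read
  "Hypothetical helper: submits doc, waits until this node has indexed
   the resulting transaction, then pulls the doc back."
  [node doc pattern]
  (let [tx (xt/submit-tx node [[::xt/put doc]])]
    ;; Without this wait, the pull below may run before the
    ;; transaction has been indexed and return nil.
    (xt/await-tx node tx)
    (xt/pull (xt/db node) pattern (:xt/id doc))))

;; Usage, with an in-memory node (an assumption, not the poster's config):
(with-open [node (xt/start-node {})]
  (put-and-read node {:xt/id 1 :user/name "John"} [:user/name]))
```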

J20:03:40

Thanks @U8ZQ1J1RR for your answer. On a multi-instance service, what is the best way to sync a node? Run the sync in a dedicated thread?

refset21:03:52

"best" really depends on what you're trying to achieve. The nodes automatically keep up to date, so you don't have to manually sync. That's only needed when you are trying to achieve a particular read-your-writes or strictly serialized consistency level with load balancing, etc.
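One way the read-your-writes case might look across two nodes, as a sketch only: the transaction receipt returned by xt/submit-tx on the writing node is handed to the reading node, which waits on it with xt/await-tx before querying. The function name write-then-read-elsewhere is hypothetical, and both nodes are assumed to share the same tx log and document store, as in the question. In a real service the receipt would have to travel with the request (for example in a response header or session), which is not shown here.

```clojure
(require '[xtdb.api :as xt])

(defn write-then-read-elsewhere
  "Hypothetical helper: writes doc via write-node, then reads it via
   read-node only after read-node has indexed that transaction."
  [write-node read-node doc pattern]
  (let [tx (xt/submit-tx write-node [[::xt/put doc]])]
    ;; Blocks until read-node has caught up to (at least) tx.
    (xt/await-tx read-node tx)
    (xt/pull (xt/db read-node) pattern (:xt/id doc))))
```

No dedicated sync thread is needed for this: the wait happens only on the request that needs to see its own write, while other reads use whatever the node has already indexed.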

J06:03:38

Thanks @U899JBRPF for your answer. What do you mean by strictly serialized consistency level with load balancing?

Hukka06:03:00

load balancing = requests are going to multiple places without the caller seeing that, strictly serialized = things never appear to be out of order from callers perspective

Hukka06:03:47

Well, I shouldn't try to explain things that Jepsen has already explained: https://jepsen.io/consistency/models/strict-serializable

Hukka06:03:25

It's not just one callers perspective, it's global. So if a slow transaction is created first on node A, then a fast transaction on node B, they should still end up in A→B order, which is trivial on a single node (i.e. the node always processes the transactions in order), but not trivial on multiple nodes

J06:03:16

Thanks @U8ZQ1J1RR 👍

Hukka06:03:52

Because I can't let go, the case for a single callers' (notice how I had missed where the ' goes in my earlier comment ; ) is often easily solved by using sticky sessions on the load balancer