#xtdb
2024-03-08
Hukka11:03:55

Does the v2 local storage flush when the node is closed, or only after enough rows have been submitted (or enough time has passed)? I didn't see any obvious way to make it persist the data at shutdown

jarohen11:03:45

no, only after either enough rows or time, as you say. It'll persist any submitted transactions to the tx-log, so they'll be durable, but you won't necessarily get anything in the object-store just from a node closing

jarohen11:03:35

reason being that the blocks in the object-store need to be at deterministic intervals so that when you have multiple nodes, one node closing doesn't affect the others

Hukka11:03:42

Hm, then I suppose that explains the problems I had with flushed txs not getting read after (another) node starts again

jarohen11:03:07

the way to check this one: if you submit a transaction (even an empty one) on the new node when it starts up, do you see subsequent queries returning data?

Hukka11:03:52

Indeed, that's what happens: empty transaction forces a sync
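(As a minimal sketch of that workaround, assuming the early-access v2 Clojure API in xtdb.api - exact names may differ between builds, and node is your already-running node:)

(require '[xtdb.api :as xt])

;; submitting a transaction - even an empty one - makes the freshly started
;; node aware of the latest position in the tx-log, so subsequent queries
;; against it return the previously submitted data
(xt/submit-tx node [])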

Hukka11:03:37

Or just a tiny sleep

jarohen11:03:07

cool, ok 🙂 openly, haven't committed to the behaviour of this one yet (opinions welcome!)
• we could get every node to sync on startup (this'd increase startup time)
• we could find a way for the node to store its own transient state locally (although this wouldn't apply in the 'immutable architecture' case of starting nodes completely fresh)
• others?

jarohen11:03:07

this won't affect a client-server use-case - the clients all keep track of the latest transaction that they've submitted and ask every server they talk to to ensure that it has at least that one (so read-after-write consistency is preserved)

Hukka11:03:24

I'll have to see how the client asks for it - that would be quite sufficient. I'm used to calling https://v1-docs.xtdb.com/clients/1.24.3/clojure/#_sync in v1 too, I just didn't see an API for that in v2

jarohen11:03:43

yeah - that's the one that is now each client's responsibility

jarohen11:03:12

the XT language clients do this for you automatically, so shouldn't be noticeable there

jarohen11:03:11

they intercept the return of submit-tx before passing it back, and then supply the latest one they've seen as the after-tx option to any queries
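(Roughly, from Clojure - a sketch assuming the v2 xtdb.api names; the :docs table and tx-ops are placeholders:)

(require '[xtdb.api :as xt])

;; keep the receipt that submit-tx returns...
(def last-tx (xt/submit-tx node tx-ops)) ; tx-ops = whatever ops you're writing

;; ...and hand it back as the :after-tx query option, so the query waits
;; until the node has indexed at least that transaction (read-after-write)
(xt/q node '(from :docs [xt/id]) {:after-tx last-tx})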

Hukka11:03:53

Ok, so I can get the latest from status, and use that then

jarohen11:03:02

what's the client btw, in your case? i.e. is it an ongoing process/browser session or similar?

Hukka11:03:25

And also repl 😉

jarohen11:03:35

naturally ☺️

jarohen11:03:21

so the CLI's starting a new process every time, I'd guess - a bit trickier to keep that state

Hukka11:03:43

Found a nifty little helper in xtdb.time called after-latest-submitted-tx

jarohen11:03:21

REPL - if your REPL connection is longer-lived than your XT node, I might be tempted to keep some sort of submit-tx/q wrapper which preserves that state
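(A sketch of that kind of wrapper, assuming the same xtdb.api names as above:)

(require '[xtdb.api :as xt])

(defonce !last-tx (atom nil)) ; latest tx receipt seen from this REPL

(defn submit-tx! [node tx-ops]
  (reset! !last-tx (xt/submit-tx node tx-ops)))

(defn q [node query & [opts]]
  ;; every query waits for at least the last tx this REPL submitted
  (xt/q node query (cond-> (or opts {})
                     @!last-tx (assoc :after-tx @!last-tx))))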

Hukka11:03:14

Definitely, and keeping the node around is not hard either. It's just the first use that goes off the rails

jarohen11:03:23

otherwise, empty transaction is probably the easiest thing to do - at least until we've figured out how peer nodes should behave at first startup 🙂

jarohen11:03:14

I wonder if your CLI could store its latest-submitted-tx in a local file?
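(One possible shape for that, as a sketch - the .last-tx.edn file name is made up, and it assumes the tx receipt exposes a :tx-id field and that a map carrying it is accepted back as :after-tx; check what your build expects:)

(require '[clojure.edn :as edn]
         '[clojure.java.io :as io]
         '[xtdb.api :as xt])

(defn submit-and-remember! [node tx-ops]
  (let [tx (xt/submit-tx node tx-ops)]
    ;; persist just the tx-id between CLI runs as plain EDN
    (spit ".last-tx.edn" (pr-str {:tx-id (:tx-id tx)}))
    tx))

(defn last-submitted-tx []
  (when (.exists (io/file ".last-tx.edn"))
    (edn/read-string (slurp ".last-tx.edn"))))

;; on startup, pass (last-submitted-tx) as the :after-tx option to the first query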

Hukka11:03:39

Sure, but sounds more complicated

jarohen11:03:23

would you mind if the XT node sync'd on startup, btw?

jarohen11:03:45

even if it meant waiting for longer

Hukka11:03:54

I would often want it to sync, but then again I can foresee wanting to just start submitting txs and not caring about the sync

👌 1
🙏 1
jarohen11:03:28

thanks - a good data point 🙂

Hukka12:03:37

Hmh, got an:

; (err) Execution error (IllegalStateException) at org.apache.arrow.memory.BaseAllocator/close (BaseAllocator.java:477).
; (err) Memory was leaked by query. Memory leaked: (362588928)
; (err) Allocator(live-index) 0/362588928/1344965720/9223372036854775807 (res/actual/peak/limit)
; (err) 
I guess I should try with an older Java than 22 🙄

jarohen12:03:59

ah - mind sharing a stack trace and/or the surrounding code, if you can?

Hukka12:03:54

Unfortunately it didn't output one, it was just a print. I was loading a bunch of data in batches of 10k. Looks like it got to 264070, with just 33 docs missing from what was in the dataset

Hukka12:03:25

I'll change things a bit so I can catch it

Hukka12:03:04

Running out of time though, so I have to get back to that next week

jarohen12:03:59

no worries - give us a shout 🙂