This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2022-10-04
Channels
- # aleph (23)
- # announcements (1)
- # babashka (21)
- # beginners (70)
- # biff (3)
- # cider (8)
- # clj-kondo (45)
- # clj-yaml (9)
- # clojure (69)
- # clojure-europe (82)
- # clojure-nl (1)
- # clojure-norway (2)
- # clojurescript (34)
- # conjure (19)
- # core-typed (6)
- # cursive (2)
- # events (5)
- # fulcro (55)
- # honeysql (1)
- # integrant (18)
- # jobs (1)
- # lsp (124)
- # malli (10)
- # meander (1)
- # off-topic (26)
- # polylith (8)
- # reagent (7)
- # releases (1)
- # remote-jobs (1)
- # sci (2)
- # shadow-cljs (19)
- # squint (5)
- # vim (17)
- # xtdb (31)
Hi, general question: is there any guide to indexing in xtdb? I'm not familiar with the concept in general, so please assume I'm a complete newb and point me to some resources if needed. Specifically, my worry came up when I wanted to add an indexed field whose value can be a large set of strings.
https://docs.xtdb.com/language-reference/datalog-queries/ right here at the top
https://docs.xtdb.com/language-reference/datalog-queries/#_maps_and_vectors_in_data has more info on nested structures
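For anyone reading the log later, a minimal sketch of what that doc section covers, assuming XTDB 1.x with an in-memory node; the document id and the :tags attribute are made up for illustration, and I believe the element-wise indexing described for vectors applies to sets too:
```clojure
(require '[xtdb.api :as xt])

;; Top-level collection values are decomposed at index time, so each string in
;; the set gets its own index entry and can be matched directly in a clause.
(with-open [node (xt/start-node {})]
  (xt/submit-tx node [[::xt/put {:xt/id :doc-1
                                 :tags #{"alpha" "beta" "gamma"}}]])
  (xt/sync node)
  (xt/q (xt/db node)
        '{:find  [?e]
          :where [[?e :tags "beta"]]}))
;; => #{[:doc-1]}
```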
Caused by: java.lang.AssertionError: Assert failed: LMDB write lock timeout
(pos? stamp)
at xtdb.lmdb$acquire_write_lock.invokeStatic(lmdb.clj:38)
at xtdb.lmdb$acquire_write_lock.invokePrim(lmdb.clj)
at xtdb.lmdb$increase_mapsize.invokeStatic(lmdb.clj:46)
at xtdb.lmdb$increase_mapsize.invokePrim(lmdb.clj)
at xtdb.lmdb.LMDBKv$fn__5751.invoke(lmdb.clj:232)
at xtdb.lmdb.LMDBKv.store(lmdb.clj:229)
at xtdb.kv.index_store.KvIndexStoreTx.commit_index_tx(index_store.clj:1074)
at xtdb.tx.InFlightTx.commit(tx.clj:350)
at xtdb.tx$__GT_tx_ingester$fn__45194.invoke(tx.clj:511)
we tried starting a new ECS node, the ingester aborted again while replaying tx log after restoring from snapshot
we checked the machine, plenty of disk space so the file has room to grow in the filesystem it is on
then we doubled the memory and started a new ECS node again, that seems to run ok now
I have very little operations knowledge about LMDB so this is a mystery to me… how much memory should be left for it?
hey Tatu, sorry to hear the LMDB issues are continuing. as far as I know this was the issue that we were trying to resolve with the fix in 1.22.0 - I don't think this was related to running out of space, more an issue with the lock itself
shouldn’t the ingester be a single thread, I don’t really understand the locking stuff
I’m not that familiar with StampedLock, but it looks like readers block tryWriteLock, so could constant query pressure just prevent it from ever acquiring the lock?
If you are able to attempt to use RocksDB as the index-store in the meantime it should be a drop-in replacement
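A sketch of the suggested node-config change (the :db-dir paths are placeholders, and the RocksDB module needs the com.xtdb/xtdb-rocksdb dependency on the classpath); only the index-store KV module changes, the tx-log and document-store config stay as they were:
```clojure
;; Before (LMDB):
;; {:xtdb/index-store {:kv-store {:xtdb/module 'xtdb.lmdb/->kv-store
;;                                :db-dir "data/index-store"}}
;;  ...}

;; After (RocksDB):
{:xtdb/index-store {:kv-store {:xtdb/module 'xtdb.rocksdb/->kv-store
                               :db-dir "data/index-store"}}
 ;; tx-log and document-store config unchanged
 }
```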
I think I should put a check for this in my health endpoint so ECS will consider the node unhealthy if the ingester has aborted… should I just compare latest completed vs submitted or is there a better way to determine ingester status
or looking at what await-tx does, can I just call (xtdb.db/ingester-error (:tx-ingester node))
in my health endpoint to check for this
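A sketch of the compare-latest-completed-vs-submitted approach using only the public API; the function name and lag threshold are made up, and note it only detects lag rather than directly exposing an aborted ingester (the internal xtdb.db/ingester-error call mentioned above would be more direct):
```clojure
(require '[xtdb.api :as xt])

;; Hypothetical health-check helper: flag the node as unhealthy when completed
;; transactions fall far behind submitted ones. The threshold of 100 is an
;; arbitrary example value, not anything XTDB prescribes.
(defn ingester-healthy? [node]
  (let [submitted (::xt/tx-id (xt/latest-submitted-tx node) 0)
        completed (::xt/tx-id (xt/latest-completed-tx node) 0)]
    (<= (- submitted completed) 100)))
```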
perhaps some evidence for query pressure… I added more nodes to our test environment (from 1 to 3) and they all work ok, no ingester aborted errors so far
> shouldn’t the ingester be a single thread, I don’t really understand the locking stuff
@U11SJ6Q0K yes it is - unfortunately when we resize the LMDB map we have to lock out both readers and writers - so the readers and writers take a read-lock on the increase-mapsize lock; the ingester takes a write lock when it actually needs to increase the size
> perhaps some evidence for query pressure…
so query pressure would make sense, yes - I don't think the StampedLock that we use provides any fairness guarantees, so it may be that the readers continuously hold the lock and the writer can't get in
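A rough Clojure interop sketch of the locking shape being described (not XTDB's actual code, just an illustration of how a timed tryWriteLock can starve under constant read pressure; the 2-minute timeout matches the one mentioned later in the thread):
```clojure
(import '(java.util.concurrent.locks StampedLock)
        '(java.util.concurrent TimeUnit))

(def mapsize-lock (StampedLock.))

;; Queries / normal KV writes: shared access while they run.
(defn with-read-lock [f]
  (let [stamp (.readLock mapsize-lock)]
    (try (f)
         (finally (.unlockRead mapsize-lock stamp)))))

;; Ingester resizing the LMDB map: needs exclusive access. tryWriteLock
;; returns 0 if the timeout elapses before the lock is acquired, which is
;; what the (pos? stamp) assertion in the stack trace above is checking.
(defn resize-map! [grow!]
  (let [stamp (.tryWriteLock mapsize-lock 2 TimeUnit/MINUTES)]
    (assert (pos? stamp) "LMDB write lock timeout")
    (try (grow!)
         (finally (.unlockWrite mapsize-lock stamp)))))
```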
is there anything that can be done? it looks like it stays in failure mode indefinitely after the first time it fails to acquire lock
Hey @tatut 👋, just to say I've picked this up and am trying to find the right solution at the moment. Having done some testing, I think the problem is actually that in 1.21.0 (despite the map resize race) the underlying write lock used by the ingester had no timeout (LMDB controlled the blocking/locking before), whereas with the fix we opted all writes in to the 2 minute timeout. So trading one problem for another, as is often the way. I'll update you once I have identified a decent solution for test (I have a few ideas)