#xtdb
2020-04-13
dvingo06:04:55

I have a rocks db standalone topology. It seems like some transactions are not being processed:

(crux/latest-completed-tx crux-node)
=> #:crux.tx{:tx-time #inst"2020-04-13T06:12:57.689-00:00", :tx-id 28}

(crux/await-tx crux-node (crux/submit-tx crux-node [[:crux.tx/put {:crux.db/id :hi :text "new"}]]))
=> #:crux.tx{:tx-time #inst"2020-04-13T06:12:57.689-00:00", :tx-id 28}

(crux/await-tx crux-node (crux/submit-tx crux-node [[:crux.tx/put {:crux.db/id :hi2 :text "new"}]]))
=> #:crux.tx{:tx-time #inst"2020-04-13T06:12:57.689-00:00", :tx-id 28}

(crux/latest-completed-tx crux-node)
=> #:crux.tx{:tx-time #inst"2020-04-13T06:12:57.689-00:00", :tx-id 28}

(crux/sync crux-node (Duration/ofSeconds 10))
=> #inst"2020-04-13T06:12:57.689-00:00"

(crux/latest-completed-tx crux-node)
=> #:crux.tx{:tx-time #inst"2020-04-13T06:12:57.689-00:00", :tx-id 28}
Using mount, this is how I construct the node:
(defn start-crux-node ^ICruxAPI [storage-dir]
  (crux/start-node {:crux.node/topology '[crux.standalone/topology
                                          crux.kv.rocksdb/kv-store]
                    :crux.kv/db-dir     (str (io/file storage-dir "db"))
                    :crux.kv/sync?      true}))

(defstate crux-node
  :start (start-crux-node "crux-store")
  :stop (.close crux-node))

jarohen16:04:31

hey @U051V5LLP, will see if I can repro this

jarohen16:04:19

this is pretty much the setup we use in Crux's user.clj - I've run the transactions you've provided and they went through fine 😕 were there any other messages in the logs? also, could you post the return from the submit-tx calls too?

dvingo18:04:30

Hey @U050V1N74 thanks for looking into it. I wasn't sure what else to try, so ended up deleting the crux-storage directory and started over. I am pretty new to crux, so I'm not sure if I have to enable logging, but I didn't see anything in the db directory that looked suspicious. I noticed some other examples explicitly specify the event-log-dir as well, so I'm trying that.

{:crux.node/topology                 '[crux.standalone/topology
                                         crux.kv.rocksdb/kv-store]
   :crux.kv/db-dir                     (str (io/file data-dir "db"))
   :crux.standalone/event-log-dir      (str (io/file data-dir "eventlog"))
   :crux.standalone/event-log-kv-store 'crux.kv.rocksdb/kv}
If it starts happening again I'll post back here.

dvingo19:04:34

I noticed with my original config:

(defn rocks-config [data-dir]
  {:crux.node/topology '[crux.standalone/topology
                         crux.kv.rocksdb/kv-store]
   :crux.kv/db-dir     (str (io/file data-dir "db"))
   :crux.kv/sync?      true})
When I start the node I get the following in the terminal:
WARN  crux.kv.memdb  - Using sync? on MemKv has no effect. Persistence is disabled.
WARN  crux.kv.memdb  - Using fsync on MemKv has no effect.
And when I use:
(defn rocks-config [data-dir]
  {:crux.node/topology                 '[crux.standalone/topology
                                         crux.kv.rocksdb/kv-store]
   :crux.kv/db-dir                     (str (io/file data-dir "db"))
   :crux.standalone/event-log-dir      (str (io/file data-dir "eventlog"))
   :crux.standalone/event-log-kv-store 'crux.kv.rocksdb/kv
   :crux.kv/sync?                      true})
those warnings are gone. Perhaps this config with the event-log-dir and event-log-kv-store can be added to the documentation here https://opencrux.com/docs#config-rocksdb

Jorin11:04:25

Oh wow! this is a really good point! This also got me with rocksdb: Just included it in the topology and was hoping it does something. Now if it is actually enabled, things suddenly work (and it goes at twice the speed 🐎 )

jarohen11:04:47

ah, yes - it's not obvious, but these are two separate KV stores - one to store all of Crux's indices (set up by including Rocks in the topology) and one to back the event-log of the standalone tx-log.
reason why we don't make Rocks the default in standalone is that we'd like to keep standalone completely, well, standalone - not depending on any other Crux modules.
but yes, this could definitely do with being clearer in the docs. cc'ing @U899JBRPF - he's on a docs bash atm 🙂
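
For anyone skimming the thread, here is a rough annotated sketch of how those two stores map onto the config keys posted above. It uses only the keys already shown in this thread; `data-dir` is whatever directory you pass in, and the comments are a reading of the explanation above rather than official documentation.

```clojure
(require '[clojure.java.io :as io])

;; Sketch only: same keys as the config posted earlier in this thread.
(defn rocks-config [data-dir]
  {;; Store 1, the indices: including crux.kv.rocksdb/kv-store in the
   ;; topology puts Crux's indices in RocksDB under :crux.kv/db-dir.
   :crux.node/topology                 '[crux.standalone/topology
                                         crux.kv.rocksdb/kv-store]
   :crux.kv/db-dir                     (str (io/file data-dir "db"))
   ;; Store 2, the event log of the standalone tx-log: configured
   ;; separately. Judging by the MemKv warnings reported above, leaving
   ;; these two keys out meant the event log was not backed by RocksDB.
   :crux.standalone/event-log-dir      (str (io/file data-dir "eventlog"))
   :crux.standalone/event-log-kv-store 'crux.kv.rocksdb/kv
   :crux.kv/sync?                      true})
```

Calling `(rocks-config "crux-store")` just builds the map, so it can be inspected at the REPL before being handed to `crux/start-node`.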

Jorin11:04:47

That would be awesome, thanks 🙂 The thing is I just checked the docs again and it says this 😅

Jorin11:04:35

I would read that like rocksdb is the default or am I confusing something?

refset12:04:04

@U8ZN5EHGU looking at the commit history I think this got missed when we made the change to the default, so it's stale, sorry!

jbrown22:04:50

I got an error recently and now crux will not complete any queries (they never return, no errors). Using rocksdb and crux version 20.01-1.6.2-alpha.
Error:

Exception in thread "async-dispatch-1" java.lang.Error: java.util.concurrent.TimeoutException: Timed out waiting for index to catch up, lag is: 4653056
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1155)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.TimeoutException: Timed out waiting for index to catch up, lag is: 4653056
	at crux.tx$await_no_consumer_lag.invokeStatic(tx.clj:419)
	at crux.tx$await_no_consumer_lag.invoke(tx.clj:406)
	at crux.node.CruxNode.sync(node.clj:141)
	at crux.api$eval22834$fn__22843.invoke(api.clj:196)
	at crux.api$eval22608$fn__22748$G__22589__22761.invoke(api.clj:61)
	at internal_analytics.sync$sync_db_logs$fn__23753$state_machine__5409__auto____23762$fn__23767.invoke(sync.clj:118)
	at internal_analytics.sync$sync_db_logs$fn__23753$state_machine__5409__auto____23762.invoke(sync.clj:118)
	at clojure.core.async.impl.ioc_macros$run_state_machine.invokeStatic(ioc_macros.clj:973)
	at clojure.core.async.impl.ioc_macros$run_state_machine.invoke(ioc_macros.clj:972)
	at clojure.core.async.impl.ioc_macros$run_state_machine_wrapped.invokeStatic(ioc_macros.clj:977)
	at clojure.core.async.impl.ioc_macros$run_state_machine_wrapped.invoke(ioc_macros.clj:975)
	at clojure.core.async.impl.ioc_macros$take_BANG_$fn__5427.invoke(ioc_macros.clj:986)
	at clojure.core.async.impl.channels.ManyToManyChannel$fn__498.invoke(channels.clj:135)
	at clojure.lang.AFn.run(AFn.java:22)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	... 2 more
Any ideas what I can do to fix this?