
thank you for Crux. starting to look at it today ) I have a couple of questions:


Hey, cool! I'll respond here as the answers are short and turnaround will be quicker.

Question 1 - correct, there can only be one version of a document per #inst (/range) per transaction.

Question 2 - there is no built-in way to aggregate across historical versions currently (this is something we are thinking about though). It might be worth considering splitting these "attributes" into separate documents with distinct IDs, then there is no merge/roll-up necessary and you can put as many as you like in the same transaction at the same point in time. Or have you already discounted that?
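To illustrate the separate-documents idea (a hypothetical sketch, assuming a started node bound to `node`, `crux.api` aliased as `cx`, and made-up sensor IDs/attribute names), each "attribute" gets its own document ID, so several can be put at the same valid time without any merging:

```clojure
;; each "attribute" lives in its own document with its own ID,
;; so many can be transacted at the same point in time
(cx/submit-tx node
  [[:crux.tx/put
    {:crux.db/id :sensor-1/temperature :reading/value 21.5}
    #inst "2020-08-22"]
   [:crux.tx/put
    {:crux.db/id :sensor-1/humidity :reading/value 0.4}
    #inst "2020-08-22"]])
```

The history of each attribute can then be queried independently via its own entity ID.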


great, thank you. both answers make sense ) I did not expect you would answer that quickly on a Saturday, so I just peeked back. I added a 3rd question, about tx log replay


no problem! and yeah...I probably need more hobbies 😅


For Q3, have you added rocksdb to your topology? If the indexes are persisted then the startup time should be nearly instant, and the Crux node will only have to catch up on transactions it might have missed (assuming there are other nodes still online submitting transactions while the node in question is offline)


replaying from the beginning of the log should only be necessary when performing an upgrade of Crux


> For Q3, have you added rocksdb to your topology?

no, it's "app => crux lib => postgres". is there anything specific I should do (start the node in a certain way) to make sure the indices are persisted?


here is how I start the node:

(cx/start-node {:crux.node/topology '[crux.jdbc/topology]
                :crux.jdbc/dbtype adapter
                :crux.jdbc/dbname dbname
                :crux.jdbc/host host
                :crux.jdbc/user username
                :crux.jdbc/password password})
stopping it as:
(.close node)
if I start it again after the (.close node) I see:
2020-08-22T13:17:11,593 [crux-polling-tx-consumer] DEBUG crux.tx - Indexing tx-id: 2
2020-08-22T13:17:11,840 [crux-polling-tx-consumer] DEBUG crux.tx - Indexing tx-id: 4
2020-08-22T13:17:11,879 [crux-polling-tx-consumer] DEBUG crux.tx - Indexing tx-id: 6
2020-08-22T13:17:11,924 [crux-polling-tx-consumer] DEBUG crux.tx - Indexing tx-id: 8
2020-08-22T13:17:11,967 [crux-polling-tx-consumer] DEBUG crux.tx - Indexing tx-id: 10
2020-08-22T13:17:12,008 [crux-polling-tx-consumer] DEBUG crux.tx - Indexing tx-id: 12
2020-08-22T13:17:12,050 [crux-polling-tx-consumer] DEBUG crux.tx - Indexing tx-id: 14


The instructions here show the standalone topology (whereas you're using jdbc) but the idea is the same:


so you really just need to add crux.kv.rocksdb/kv-store to your topology vector and add a location under :crux.kv/db-dir


ah.. interesting. something like this?

(cx/start-node {:crux.node/topology '[crux.jdbc/topology
                                      crux.kv.rocksdb/kv-store]
                :crux.jdbc/dbtype adapter
                :crux.jdbc/dbname dbname
                :crux.jdbc/host host
                :crux.jdbc/user username
                :crux.jdbc/password password
                :crux.kv/db-dir (str (io/file "/anydir/to/keep/indices" "indexes"))})
would it keep the transactions in RocksDB as well? :crux.kv/db-dir is what confused me a bit, since my Postgres is the actual DB in my case

👌 1

no, transactions don't get stored in Rocks as well, when used like this. Rocks only holds the indexes. In this sense "db-dir" is misleading...because it's not referring to a Crux db instance, but a RocksDB "db"


got it. I am driving, so it's hard to try right away ) but thanks a lot! this complicates the deployment a bit because our apps run in Nomad on multiple hosts, but I'll try to cook something up

as a follow-up (wishful) question: can I store indices in Postgres instead? I understand that this is probably not supported at the moment. just curious if this is something that may appear later )


We don't offer any non-in-process KV stores today. Due to how the query engine works, the indexes need to reside as close as possible to the node, but there's certainly a spectrum to be explored beyond in-process-RocksDB-on-local-disk.


yeah, that option would be great. there are use cases where the goal is not to be super performant but to get the temporal benefits while reusing all existing infra. so if there is any place that collects +1s.. )

👍 1

It's slightly more specific, but feel free to thumb-up this one: