#xtdb
2021-01-10
Christopher Thonfeld-Guckes 12:01:19

Hi, I want to use crux in production on a kafka cluster. Is there a best practice to configure the topics? Do I need infinite retention for all topics? I can't find any wiki page for this.

refset 18:01:42

Hi 🙂 Crux automatically creates its own topics with all the appropriate configs. It would definitely be best to let Crux do this, as there are (currently) no checks to confirm that any manually configured topics are valid. However, if for some reason you do need to set up your topics manually, you can see how it's all managed in this ns: https://github.com/juxt/crux/blob/master/crux-kafka/src/crux/kafka.clj (the key items being: infinite retention on both, partitions=1 for the tx log, and cleanup.policy=compact for the doc log)
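
Since Crux does not validate manually created topics, a small pre-flight check can catch the mistakes described in this thread. The following is a minimal sketch, not part of Crux: the function name and the `"tx"`/`"doc"` labels are hypothetical, and it assumes infinite retention is expressed with the Kafka topic setting `retention.ms=-1`. You would feed it the partition count and config map read from your own cluster.

```python
# Expected settings for manually created Crux topics, as listed in the message above.
# Assumption: "infinite retention" is expressed as retention.ms = -1.
EXPECTED = {
    # tx log: exactly one partition, never compacted
    "tx": {"partitions": 1, "config": {"retention.ms": "-1"}},
    # doc log: compacted; no partition count is stated in the thread, so none is enforced
    "doc": {"partitions": None,
            "config": {"retention.ms": "-1", "cleanup.policy": "compact"}},
}


def check_topic(kind, partitions, config):
    """Return a list of problems; an empty list means the topic matches."""
    expected = EXPECTED[kind]
    problems = []
    if expected["partitions"] is not None and partitions != expected["partitions"]:
        problems.append(
            f"partitions should be {expected['partitions']}, got {partitions}")
    for key, want in expected["config"].items():
        got = config.get(key)
        if got != want:
            problems.append(f"{key} should be {want}, got {got}")
    # Compaction on the tx log is the misconfiguration reported later in this thread.
    if kind == "tx" and "compact" in config.get("cleanup.policy", "delete"):
        problems.append("cleanup.policy must not include compact for the tx log")
    return problems
```

For example, `check_topic("tx", 3, {"retention.ms": "-1", "cleanup.policy": "compact"})` reports both the partition count and the compaction setting as problems.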

Christopher Thonfeld-Guckes 21:01:15

Thanks for the info. I tried configuring it manually and it threw exceptions until I configured the number of partitions correctly. At first I had set the transaction log to compact as well, which threw exceptions on write operations. I tried starting crux as a client on confluent kafka and it didn't create its topics. That's why I created them manually.

👍 3
refset 21:01:13

Interesting, I can't think why it wouldn't have created the topics automatically 🤔 Can you tell me which version that happened with please?

refset 21:01:45

Were you following these steps? https://opencrux.com/blog/crux-confluent-cloud.html (or the ones on the http://juxt.pro site?)

Christopher Thonfeld-Guckes 23:01:34

I adapted the older version of the tutorial from the juxt blog. I used the current version of crux from clojars. It's very possible that I did something wrong though, probably not an issue with the code.

Christopher Thonfeld-Guckes 00:01:15

Is there a way to enable compaction on a topic that has already been created? The retention policy is set to infinite.

Christopher Thonfeld-Guckes 00:01:03

Alternatively, is there a way to dump and restore the complete crux database?

refset 08:01:31

I'm fairly sure you can change the compaction setting easily enough for an existing topic - do you have access to a UI?
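
Without a UI, the same change can usually be made with Kafka's own `kafka-configs` tool. The sketch below only builds the command line and does not run it. The topic name `crux-docs` and the broker address are hypothetical placeholders, and it assumes a Kafka version recent enough to accept `--bootstrap-server` for topic configs (older releases used `--zookeeper`). On a hosted cluster you would likely also need a `--command-config` file with credentials.

```python
import shlex


def alter_cleanup_policy_cmd(topic, bootstrap="localhost:9092", policy="compact"):
    """Build the kafka-configs argv that sets cleanup.policy on an existing topic."""
    return [
        "kafka-configs.sh",          # named `kafka-configs` in some distributions
        "--bootstrap-server", bootstrap,
        "--alter",
        "--entity-type", "topics",
        "--entity-name", topic,      # hypothetical topic name supplied by the caller
        "--add-config", f"cleanup.policy={policy}",
    ]


# Print a copy-pasteable command for a hypothetical doc topic.
print(shlex.join(alter_cleanup_policy_cmd("crux-docs")))
```

Per the earlier messages, this would only make sense for the doc topic; the tx log should stay uncompacted.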

refset 08:01:14

> alternatively, is there a way to dump and restore the complete crux database?

Crux doesn't provide any specific utility for this right now. The best option is to use a regular Kafka-compatible backup tool (e.g. https://stackoverflow.com/questions/47791039/backup-restore-kafka-and-zookeeper)

Toyam Cox 19:01:58

If I want to run crux in prod using sqlite3 as my backing store (kafka overhead is a bit much for me, all that stateful stuff all over), can I tell crux to use the same database for documents and transactions? I'm worried that backing up the document store and then the transaction log in separate processes could leave the two out of sync, making it impossible to restore from a given backup (race conditions).

refset 21:01:24

In general I'd avoid recommending sqlite3 in prod for Crux. You may as well use a single Rocks instance for docs+tx instead (+ a second Rocks instance for the index), as it will give higher throughput. However, production should probably be using something with a distributed backend anyway, for high availability & durability... unless you can afford to risk having unhappy customers! You'll probably have a better time with a single Postgres for docs+tx 🙂

Toyam Cox 21:01:15

I get the purpose of distributed backends, but the only one available seems to be kafka, and that has the zookeeper dependency + all that state in weird places... it's much more "backbone" than stateless apps would like.

refset 23:01:25

Postgres counts as distributed too (for my definition, at least) as long as you have proper clustering enabled