Xiaomin Wang 00:04:39

Hi! I’m trying to set up xtdb on AWS ECS. Planning to use rocksdb on EFS for index store. For doc store and tx logs, I tried both mysql rds and rocksdb. Just wondering what are some considerations for choosing between the two? Any tips generally for choosing storage solutions would be greatly appreciated! Any data on performance, scalability, reliability would be helpful!


if you have multiple nodes, you will want to have a shared place (for example RDS) for the tx-log and doc-store


we are using rds postgresql for tx-log and doc-store, then rocksdb for local indexes (with checkpoints in s3)
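For anyone reading along, a setup like the one just described might look roughly like the config map below. This is only a sketch, assuming XTDB 1.x module names and the `xtdb-jdbc`, `xtdb-rocksdb` and `xtdb-s3` modules plus a PostgreSQL driver on the classpath; it is not the poster's actual config. The host, database name, credentials, index directory, bucket and prefix are all hypothetical placeholders, and the 6 hour checkpoint frequency is an arbitrary choice.

```clojure
(ns example.xtdb-node
  (:require [clojure.java.io :as io]
            [xtdb.api :as xt])
  (:import (java.time Duration)))

(def node-config
  {;; One JDBC connection pool shared by the tx-log and the doc-store.
   ;; All :db-spec values are hypothetical placeholders.
   :xtdb.jdbc/connection-pool
   {:dialect {:xtdb/module 'xtdb.jdbc.psql/->dialect}
    :db-spec {:host     "my-rds-host.example.com"
              :dbname   "xtdb"
              :user     "xtdb"
              :password "change-me"}}

   ;; Shared storage: every node points at the same Postgres database.
   :xtdb/tx-log
   {:xtdb/module     'xtdb.jdbc/->tx-log
    :connection-pool :xtdb.jdbc/connection-pool}

   :xtdb/document-store
   {:xtdb/module     'xtdb.jdbc/->document-store
    :connection-pool :xtdb.jdbc/connection-pool}

   ;; Local storage: each node keeps its own RocksDB index on its own disk
   ;; and periodically uploads a checkpoint to S3.
   :xtdb/index-store
   {:kv-store
    {:xtdb/module  'xtdb.rocksdb/->kv-store
     :db-dir       (io/file "/var/lib/xtdb/indexes")   ; hypothetical path
     :checkpointer {:xtdb/module      'xtdb.checkpoint/->checkpointer
                    :store            {:xtdb/module 'xtdb.s3.checkpoint/->cp-store
                                       :bucket      "my-xtdb-checkpoints"  ; hypothetical
                                       :prefix      "indexes"}             ; hypothetical
                    :approx-frequency (Duration/ofHours 6)}}}})

;; Starting a node with this config (needs a reachable database and bucket):
;; (def node (xt/start-node node-config))
```

With this shape, a fresh node would restore its index from the latest S3 checkpoint if one exists and then replay only the newer part of the tx-log, rather than re-indexing from scratch.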

Xiaomin Wang 15:04:12

@U11SJ6Q0K Thanks so much for the pointers! Rocksdb with checkpoints in s3 sounds like a good setup. Are you using EC2's instance storage for the local index? Curious what’s your experience with multi-node setup? Does that require something extra to keep the nodes’ indices in sync? Or pointing all xtdb tasks to the same rds db just works? The docs recommend using kafka to connect the nodes. Is that in addition to doc and tx store or just using kafka for storage? Thanks again!!


Yes each node has the rocksdb local index on their own disk

:gratitude-thank-you: 2

I have some general feedback on the XTDB docs. In general, the documentation for XTDB does not provide enough context, and the examples are not standalone and working. It would be better if every page in the doc had working standalone examples, then followed by an exhaustive list of all options available. For example, I’m looking to set up checkpointing for my configuration, so I consult the docs here: and then I want to use S3 checkpointing so I click through to here: If you notice on either of these pages, there’s lots of ... inside the config maps, that you then have to go elsewhere in the docs and hunt for the correct values. So what could’ve been a 10 second process: Visit the docs, copy paste, change the configs to match what I need and go - instead becomes a 30 minute job, where I have to carefully read through everything, understand the code fully, write my own S3Configurator (or hunt for an example elsewhere). I would have liked to see instead a working standalone namespace, with requires/imports and all that I could just paste in. This is true all over the XTDB docs, from queries, to configs and modules. Just my 2 cents 🙂


Hey @UAEFFG05B thanks for sharing your reflections & taking the time to write this up. XT's approach to configurability is certainly a 'double-edged sword' and the challenges of developing and maintaining comprehensive examples fall under that description. Out of interest, what is your favourite example of well-documented complex software?


I remember a vexing process of discovery related to the "..." style of documentation and trying to use JDBC storage. Xtdb might make it easier on the reader by making storages truly mix-and-match and equally-capable, or by introducing a layer of abstraction that offers only the allowed combinations, or by documenting use of storage without reliance on the wide-open "..." that admits unsupported combinations.

📝 2

I use XTDB every day, and I LOVE it, but if it hadn't been included in Biff (which has a smaller codebase, but a larger surface-area, and great documentation), I might well have moved on to something else.

❤️ 2
😎 2

I suppose part of why I call Biff's docs 'great' is that it spits out a template project, and the docs are written in that context. That might not be entirely applicable 'advice' for XTDB, but I do agree in general with @UAEFFG05B that eliding context in the name of brevity (or generality) creates work for someone doing initial exploration.

👍 2

> Out of interest, what is your favourite example of well-documented complex software?
I think that the golang docs are generally very very good. There’s a good mix of tutorials and use cases, and the reference is good too. The purpose of every function is explained, each input is explained, the output and side effects are explained, and they also explain the behaviour in case of an error or invalid data. Pretty much every non-trivial function or type also has a compiling example showing its use. Random example: