Are there options/best practices for implementing something like live mirrors/replicas of XTDB instances? Also, wondering if using Kafka for logging makes it possible to monitor DB events. (I asked a question about the v1 feature that fired events when something changed, which is not present in v2 per se, but could it not be achieved leveraging Kafka instead?)
@taylor.jeremydavid Yes, and moreover, I'd like to talk about being a design partner.
@taylor.jeremydavid I may be required to put my solution in AWS Gov, using Red Hat OpenShift for AWS (ROSA) to improve chances of obtaining FedRAMP Moderate certification for my solution more quickly. 1. Have ye any customers running on AWS that way, and 2. Are the instructions for plain XTDB on plain vanilla AWS still valid? (Strictly speaking, the replication issue is now moot, as they clearly don't want an on-prem, or partially on-prem, solution.)
Thanks for the update. Our current set of v2 design partners (who we work with closely) are all running on Azure, for now, but AWS is undoubtedly a high priority for us too - we're ready to support 🙂 https://aws.amazon.com/marketplace/pp/prodview-5o444yahk6nhu Kubernetes is doing the heavy lifting to keep v2 multi-cloud friendly, and the sample terraform + helm setup scripts should be fully up to date. Worth us setting up a call to chat it through?
Hey @bobcalco (answers assume you're looking at v2 here) all XTDB nodes are effectively replicas, so long as they have the same Kafka and object store bucket configured. If you're trying to mirror the Kafka or object storage itself (e.g. for cross-region HA) then it gets more complicated. What's the goal? As it happens, we're working on writing out all tx events to a "something changed" Kafka topic right now - see https://github.com/xtdb/xtdb/issues/4857 This is then expected to evolve into a live mode, and then full CDC support (Debezium-style), possibly even reactive querying. Stay tuned 🙂
Re: replication goals: Our potential federal partner may require an (effectively) on-prem/FedRAMP High replica of a cluster running in FedRAMP Moderate AWS Gov. Things can be wired from those environments (as I understand it), but I need to understand how to ask the right questions about any potential replication scenario. Re: issue 4857: W00t!
Well, I think the main rule with more fanciful replication strategies of the upstream components (e.g. MirrorMaker for Kafka or some bucket<->bucket sync tool) is to make sure you're only ever submitting transactions to one source. Don't attempt bidirectional replication. And when you do point the transactions at a replica (e.g. in a failover scenario), make sure the switch is atomic and that you are careful to not let things start diverging. XT doesn't have sufficient guardrails to detect split brain scenarios of the upstream components.
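To make the "one source only, atomic switch" rule concrete, here's a rough Python sketch of a single-writer guard. Everything in it is hypothetical (`SingleWriterRouter`, `lease`, `fail_over`, and the `send` callback are made-up names, not XTDB or Kafka APIs), and it only protects writers inside one process. Coordinating several processes would need an external lock or coordinator, which this doesn't attempt.

```python
import threading


class SingleWriterRouter:
    """Hypothetical helper: allows submissions to exactly one upstream at a time."""

    def __init__(self, primary):
        self._lock = threading.Lock()
        self._active = primary  # the only target allowed to receive transactions
        self._epoch = 0         # bumped on every failover to fence stale writers

    def lease(self):
        """Return the current (target, epoch) pair a writer must present."""
        with self._lock:
            return self._active, self._epoch

    def fail_over(self, new_target):
        """Switch the write target and invalidate all older leases in one step."""
        with self._lock:
            self._epoch += 1
            self._active = new_target
            return self._epoch

    def submit(self, target, epoch, tx, send):
        """Send tx only if the caller's lease is still current.

        The lock is held across the check and the send, so a failover
        cannot slip in between them.
        """
        with self._lock:
            if target != self._active or epoch != self._epoch:
                raise RuntimeError("stale lease: refusing to write to " + str(target))
            return send(target, tx)
```

A writer calls `lease()` once, then passes that target and epoch to every `submit`. After `fail_over("replica")`, anything still holding the old lease gets an error instead of quietly writing to the old side, which is the divergence you're trying to avoid.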