#xtdb
2021-08-26
tatut 07:08:16

any recommendations for which storage components to use in AWS for a relatively low-write system? Is it simplest to just use RDS PostgreSQL for both doc and tx data?

👀 3
tatut 07:08:21

As performance requirements aren't very high, I'm thinking about what's a good solution with regards to price and number of moving parts

tatut 07:08:49

setting up Kafka looks a lot more complicated than RDS

jarohen 08:08:01

Yep, RDS is perfectly reasonable 🙂 If you did want to use Kafka, we've used Confluent Cloud Kafka (https://www.confluent.co.uk/aws/) for a couple of internal projects which was both pretty click-and-go and cheap (although I think they've changed their pricing model since). I haven't personally used https://aws.amazon.com/msk/

Tuomas 09:08:02

@U050V1N74 did you use confluent cloud kafka for docs too or just tx?

jarohen 09:08:45

both, yep. since then we've come to prefer not using Kafka for docs, because of the requirement to replicate it in a local store for random-access lookup (Crux takes care of this for you, but it's more disk space), but it'll probably be sufficient for a small workload
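
To illustrate the point about replicating a Kafka doc topic locally: a log is read sequentially, so fetching one document by key needs a local index built by replaying the log. Below is a minimal Python sketch of that idea. It is not Crux's actual implementation; `replicate`, `doc_log`, and the `(doc_id, doc)` record shape are hypothetical names chosen for illustration.

```python
# Illustrative only: hypothetical names, not Crux/XTDB code.
# A log supports sequential reads; random-access lookup needs a local
# key-value copy built by replaying every record.

def replicate(log, local_store):
    """Replay (doc_id, doc) records from an append-only log into a local store."""
    for doc_id, doc in log:
        # Later records for the same id overwrite earlier ones.
        local_store[doc_id] = doc
    return local_store


# Stand-in for a document topic: an ordered list of records.
doc_log = [
    ("a", {"name": "Ada"}),
    ("b", {"name": "Bob"}),
    ("a", {"name": "Ada L."}),
]

# The local copy is the extra disk space mentioned above.
local = replicate(doc_log, {})
```

After the replay, `local["a"]` is a direct key lookup, at the cost of holding a second copy of every live document on each node.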

jarohen 09:08:02

given a choice I'd go Kafka for txs and Postgres for docs - it's what they're good at - but pragmatically it's fine to use either Postgres for txs or Kafka for docs if you want everything in one store

tatut 09:08:13

how about S3 for docs? any perf issues with that?

jarohen 09:08:04

none that we're aware of, at least 🙂 billing model is obv quite different. if you're not using many nodes and can give each node a decent amount of cache space (if it's a small instance you may even get the whole database locally cached) you probably won't be hitting S3 a lot
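
The caching argument can be sketched as a read-through LRU cache in front of a remote store. This is a rough Python illustration under stated assumptions, not how XTDB implements its document cache: `ReadThroughCache`, `fetch_remote`, and the in-memory `remote` dict standing in for S3 are all hypothetical.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Hypothetical read-through LRU cache; counts how often the remote store is hit."""

    def __init__(self, fetch_remote, capacity):
        self.fetch_remote = fetch_remote  # callable: key -> document
        self.capacity = capacity          # max number of cached documents
        self.entries = OrderedDict()      # least recently used entry first
        self.remote_reads = 0

    def get(self, key):
        if key in self.entries:
            # Cache hit: mark as most recently used, no remote request.
            self.entries.move_to_end(key)
            return self.entries[key]
        # Cache miss: one (billable) remote request.
        value = self.fetch_remote(key)
        self.remote_reads += 1
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        return value


# Stand-in for an S3 bucket holding five documents.
remote = {f"doc-{i}": {"n": i} for i in range(5)}

# The cache is large enough for the whole database, so repeated reads stay local.
cache = ReadThroughCache(remote.__getitem__, capacity=5)
for _ in range(3):
    for key in remote:
        cache.get(key)
```

Here 15 lookups cost only 5 remote reads, one per document. With a capacity smaller than the working set, the remote read count (and the per-request bill) would grow accordingly.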

tatut 09:08:58

good point about the billing model

jarohen 09:08:23

from a code point of view it's easy enough to switch between them later (ideally before you go live) if after testing you decide you want the other one

tatut 09:08:35

I think I'll go with RDS for this project, as PostgreSQL is a tried-and-true, very reliable workhorse... and I have lots of experience with it and almost none with Kafka 😛

jarohen 09:08:08

familiarity's certainly something to take into account 🙂

tatut 09:08:46

and having all the golden stores in the same place is good for offsite backups, and ops people are comfortable backing up PostgreSQL

💯 9
richiardiandrea 15:08:45

@U11SJ6Q0K we just picked Postgres for both for the same reason - the DevOps team is a bit worried about the table growing and growing, and we will probably have to handle that at some point, but they gave us the green light

tatut 06:08:27

I was thinking that if you are using checkpoints, then could old events be moved to another table (or removed altogether if you have them in a backup file)... idk if that's feasible

tatut 06:08:00

but things would need to get pretty big before that causes problems if you're using a BRIN index