#crux
2020-11-11
nivekuil23:11:21

the new tiered storage stuff in confluent cloud looks interesting. I gather crux could use something S3-like as the doc store, plus kafka with tiered storage to that same doc store, and it'd be pretty thrifty? Does every index rebuild/new crux node incur significant egress costs from confluent?

refset00:11:46

Yeah Confluent have been pushing their stack forward at an impressive pace. We're big fans! I haven't crunched their cloud prices for a few months but I know the entry costs go up a fair bit if you need HA. There's a big spectrum of options for trading off performance against cost - throwing S3 into the mix would probably shave off storage cost but will likely increase ingestion latency. Maybe that's okay 🙂 In the upcoming release (~2 weeks) you can configure the checkpointing feature to minimise rebuilding and egress from Confluent: https://opencrux.com/reference/checkpointing.html

nivekuil00:11:08

Oh, nice. The next release looks huge! B2 is 20x cheaper per byte than Confluent storage so offloading there sounds nice, but yeah the 99.5% SLA is not so great on the basic plan. I think Scribe, the Kafka equivalent at Facebook, had 5 9s
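
For a rough sense of the gap between a 99.5% SLA and "5 9s" (99.999%), the allowed downtime per year can be worked out directly. This is only a back-of-the-envelope sketch: the function name is made up for illustration, and it assumes a 365-day year and that the SLA is measured over the whole year, which may not match how any given provider actually accounts for it.

```python
# Back-of-the-envelope sketch: yearly downtime budget implied by an SLA.
# Assumption: a 365-day year, with availability measured over the full year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Minutes per year a service may be down while still meeting the SLA."""
    return MINUTES_PER_YEAR * (100.0 - availability_percent) / 100.0


basic_plan = allowed_downtime_minutes(99.5)    # the 99.5% figure mentioned above
five_nines = allowed_downtime_minutes(99.999)  # "5 9s"

print(f"99.5%:   {basic_plan / 60:.1f} hours per year")
print(f"99.999%: {five_nines:.2f} minutes per year")
```

Under those assumptions, 99.5% allows roughly 43.8 hours of downtime per year, against a little over 5 minutes for five nines.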

nivekuil00:11:19

I'm sure it's not actually that bad in practice and they just don't want cheapos to complain, but how well does crux deal with unreliable backends?

ordnungswidrig08:11:54

Oh nice with checkpoints. Would that allow to have more ephemeral nodes (think AWS lambdas) to have a way to quicker bootstrap the index?

refset16:11:20

> Would that allow to have more ephemeral nodes (think AWS lambdas) to have a way to quicker bootstrap the index?

Essentially yes. Transferring an exact binary copy of an index checkpoint over the network could be many orders of magnitude faster than rebuilding indexes from scratch. I imagine doing such a thing with AWS Lambda is probably only realistic for <1-2GB sized databases though (given that Lambdas are so ephemeral), but it could still be a useful model for many people.

> how well does crux deal with unreliable backends?

In theory it should cope just fine, even with eventually consistent document stores, as the ingestion process now blocks until it successfully retrieves a given document. If there's any issue with the tx-log losing writes (or somehow reordering writes) then all bets are off, however 🙂
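
The "blocks until it successfully retrieves a given document" behaviour can be pictured as a retry loop with backoff. The sketch below is not Crux's implementation: `fetch_blocking`, `LaggyStore`, and the delay values are all hypothetical names and numbers, chosen only to show why an eventually consistent document store delays ingestion rather than breaking it.

```python
import time
from typing import Callable, Optional


def fetch_blocking(fetch: Callable[[str], Optional[dict]],
                   doc_id: str,
                   initial_delay: float = 0.05,
                   max_delay: float = 2.0,
                   sleep: Callable[[float], None] = time.sleep) -> dict:
    """Keep asking the store for doc_id until it is visible, then return it.

    Hypothetical helper: waits between attempts, doubling the wait up to
    max_delay. It never gives up, so a document that never arrives blocks
    the caller forever.
    """
    delay = initial_delay
    while True:
        doc = fetch(doc_id)
        if doc is not None:
            return doc
        sleep(delay)
        delay = min(delay * 2, max_delay)


class LaggyStore:
    """Toy stand-in for an eventually consistent document store:
    the first `lag_reads` reads return None, as if the write is not yet visible."""

    def __init__(self, docs: dict, lag_reads: int) -> None:
        self._docs = docs
        self._remaining_misses = lag_reads

    def get(self, doc_id: str) -> Optional[dict]:
        if self._remaining_misses > 0:
            self._remaining_misses -= 1
            return None
        return self._docs.get(doc_id)


store = LaggyStore({"doc-1": {"name": "example"}}, lag_reads=3)
waits = []  # record the waits instead of really sleeping
doc = fetch_blocking(store.get, "doc-1", sleep=waits.append)
print(doc, "after", len(waits), "retries")
```

A loop like this only covers late-arriving documents. It does nothing for the tx-log case, where lost or reordered writes cannot be retried away.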

ordnungswidrig16:11:46

ECS autoscaling is also a use case where I can imagine faster bootstrapping helping a lot

💯 1
dominicm19:11:12

@nivekuil I got pretty poor latency with B2, I wasn't happy with it

nivekuil00:11:41

@dominicm poor UX from the latency too? Think it would be a problem to have the golden stores in a separate DC/network from the crux nodes in general?

dominicm03:11:40

Didn't get that far :) I was just very concerned with ingestion speed