#xtdb
2023-01-11
markaddleman16:01:52

I’m curious about https://github.com/xtdb-labs/crux-redis - It hasn’t been updated in a while. Does that mean the experiment did not yield good results?

refset17:01:52

Hey @U2845S9KL great question 🙂 it was definitely interesting as a basic experiment to simply see what would happen and whether the data model and ZRANGEBYLEX API were really sufficient to implement XT's KV protocol - which they were! There are various unresolved issues with the approach though, like how multiple XT nodes can safely coordinate on indexing, or whether there actually needs to be a 'leader' + HA setup. The performance seemed usable enough on my laptop with a local Redis, although the protocol/serialization overheads were apparent compared to Rocks (at least 3x slower IIRC)...but for some use-cases that may not be a show stopper. I didn't attempt to run anything over a network, and complex queries could suffer dramatically in that setup. Since that experiment, however, AWS released https://aws.amazon.com/memorydb/, which makes the whole idea much more tempting to re-open (note also the existence of https://upstash.com/). Spinning up ~https://aws.amazon.com/blogs/database/access-amazon-memorydb-for-redis-from-aws-lambda/ XT nodes with a shared Redis-backed index-store doesn't feel completely mad 😅
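
To illustrate why a lexicographic range command such as ZRANGEBYLEX can back an ordered KV "seek", here is a minimal Python sketch that simulates the command's semantics in memory. It is not the crux-redis code and does not talk to Redis; the names `LexSortedSet` and `SketchKv` are hypothetical, and storing values in a separate map (standing in for a Redis hash) is an assumption made only for this example.

```python
import bisect


class LexSortedSet:
    """In-memory stand-in for a Redis sorted set whose members share one score.

    With equal scores Redis orders members bytewise, which is the ordering
    ZRANGEBYLEX relies on. Python compares bytes objects the same way.
    """

    def __init__(self):
        self._members = []  # always kept in bytewise order

    def zadd(self, member: bytes) -> None:
        i = bisect.bisect_left(self._members, member)
        if i == len(self._members) or self._members[i] != member:
            self._members.insert(i, member)

    def zrangebylex_from(self, start: bytes, inclusive: bool = True, count: int = 1):
        """Mimic `ZRANGEBYLEX key [start + LIMIT 0 count` (or `(start` if exclusive)."""
        find = bisect.bisect_left if inclusive else bisect.bisect_right
        i = find(self._members, start)
        return self._members[i:i + count]


class SketchKv:
    """Hypothetical ordered KV store built on the lex-ordered set above."""

    def __init__(self):
        self._index = LexSortedSet()
        self._values = {}  # assumption: stands in for a Redis hash of key -> value

    def put(self, k: bytes, v: bytes) -> None:
        self._index.zadd(k)
        self._values[k] = v

    def get(self, k: bytes):
        return self._values.get(k)

    def seek(self, k: bytes):
        """First key >= k, or None if there is none."""
        hits = self._index.zrangebylex_from(k, inclusive=True)
        return hits[0] if hits else None

    def next(self, k: bytes):
        """First key strictly greater than k, or None if there is none."""
        hits = self._index.zrangebylex_from(k, inclusive=False)
        return hits[0] if hits else None
```

In a real deployment every `seek` or `next` call would be a network round trip, which is consistent with the concern above that complex queries could suffer once Redis is no longer local.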

J17:01:34

There is also https://github.com/dragonflydb/dragonfly which can be interesting.

👍 2
markaddleman17:01:54

🙂 You are reading my mind. If you remember, I’m in the world of streaming click events. Running locally using RocksDB for everything, I can ingest and index roughly 2000 events per second. Ingestion by itself is about double that but ingestion without indexing is just Write-Only-Memory 😉 That led me to wonder about keeping the index in Redis (or some other kv memory store).

refset17:01:41

Dragonfly looks cool:
> After we built the foundation for Dragonfly and https://github.com/dragonflydb/dragonfly#benchmarks, we went on to implement the Redis and Memcached functionality. By now, we have implemented ~185 Redis commands
Part of the beauty of relying on the Redis protocol ("RESP") is that all these options are effectively drop-in. Maybe MemoryDB will switch to Dragonfly under the hood before long!

refset17:01:24

> That led me to wonder about keeping the index in Redis (or some other kv memory store).
Interesting, so the whole LSM-based KV-store model of Rocks is definitely optimized around durable storage devices. I haven't previously considered the potential role of a memory-only/non-transactional KV store specifically for improving indexing throughput in certain use-cases 🤔 My main impetus for looking at Redis back then was to mitigate the operational challenge of having nodes that maintain their own full local copies of the indexes.

markaddleman17:01:52

Yep, there are big operational advantages to a centralized index shared by a bunch of (potentially ephemeral) query nodes. Datomic’s architecture is like this. In my use case, I don’t really care if the index is centralized. I’m looking to squeeze as much indexing performance out of the system as possible and I’m willing to sacrifice transactional integrity to achieve it (especially if a broken index can be recreated quickly using checkpoints).

markaddleman17:01:28

On the subject of LSM and Rocks, I’d like to play around with DBOptions to see if I can juice more performance but I can’t figure out how to successfully pass my DBOptions instance to xtdb. When I supply it as a value to the :db-options key, I get a ClassCastException because XTDB is expecting a Rocks Options object. But, if I supply an Options object, the module fails to load due to a spec error. Any thoughts?

refset17:01:04

Huh, we may have a bug, I think Options https://github.com/xtdb/xtdb/blob/20cb5623f5338bcc1b08a70c02a9a90d279419ad/modules/rocksdb/src/xtdb/rocksdb.clj#L280 wants to be DBOptions - I'll investigate, thanks for mentioning that!

🙏 2
markaddleman16:01:28

Congrats to the team on 1.23! Results of ingesting & indexing 1 million of our payloads (which translates to approximately 5 million XTDB docs):
• 1.21.0.1 - 423 inserts per sec
• 1.23.0 - 1949 inserts per sec
This is running on my SSD laptop using RocksDB for all stores and the new filtering option enabled

🎉 2
refset18:01:54

Wow, awesome 🙂 thanks for sharing that anecdote! (assuming you meant the second version number to be 1.23 😅) /cc @U0GE2S1NH @U050DD55V

markaddleman20:01:33

doh! yes, edited