xtdb 2023-12-30 | Slack Archive

akis00:12:14

https://xtdb.com/v2 mentions: "rather than individual documents, this now stores large, compacted, columnar Apache Arrow index files." I'm curious what kind of compaction rate you observed?

Martynas Maciulevičius09:12:41

If you haven't watched the XTDB videos, you may want to see this short one about columnar DBs: https://youtube.com/watch?v=8KGVFB3kVHQ. The info you're after is at 2:30. If you want exact numbers, you'll have to wait for the official response. But the size of your database will depend on how similar the values within each column are to each other.

👍 1
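
To make the "similar values in a column" point concrete, here is a rough sketch. It is not XTDB's storage code and not Arrow's encoding: it uses zlib from the Python standard library as a stand-in compressor, and the `status` and `ids` columns are made-up example data. It only illustrates that a column of repetitive values shrinks far more than a column of unrelated values.

```python
import random
import zlib


def compressed_ratio(values):
    """Return compressed size / raw size for a column stored as newline-joined text."""
    raw = "\n".join(values).encode("utf-8")
    return len(zlib.compress(raw)) / len(raw)


rng = random.Random(0)  # fixed seed so repeated runs use the same data
n = 10_000

# Hypothetical low-cardinality column: only three distinct values.
status = [rng.choice(["active", "pending", "closed"]) for _ in range(n)]

# Hypothetical high-cardinality column: random 128-bit ids rendered as hex.
ids = ["%032x" % rng.getrandbits(128) for _ in range(n)]

similar_ratio = compressed_ratio(status)
distinct_ratio = compressed_ratio(ids)

print(f"status column: compressed to {similar_ratio:.1%} of raw size")
print(f"ids column:    compressed to {distinct_ratio:.1%} of raw size")
```

The repetitive column should come out much smaller relative to its raw size than the random one. Real numbers for XTDB would depend on Arrow's own encodings and on your data, so treat this only as an illustration of the principle.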

jarohen07:12:14

Hey @UKDLTFSE4 👋 This isn't something we've spent a lot of time fine-tuning or measuring yet, and it will depend mostly on the nature of your data and its update characteristics. That said, we (deliberately) store relatively idiomatic Arrow files so that we can lean on as much of the available Arrow compression as possible. In theory, we should see similar numbers to other Arrow use cases.
