xtdb

bobcalco 2025-11-17T22:57:32.577899Z

Does anyone have any rules of thumb for estimating XTDB 2 database size and cost in a cloud deployment (in my case, using AWS)?

refset 2025-11-18T08:22:27.375689Z

Logs on Kafka will get truncated routinely (every 4h by default). Object storage isn't compressed at all right now, but we'll be shipping compression at some point to help there. It will probably also contain garbage from index compaction that can be cleaned up in future. Until you hit TB scale I would expect the baseline Kafka costs to be higher than your S3 bill. I would also recommend having 2 or 3 large machines rather than trying to scale out, if you're looking to limit costs (density is more efficient). Happy to help with more detailed estimates, but the best advice is to experiment for your specific workload :)

2025-11-17T23:11:25.477909Z

In my app, if I serialize all the records to nippy files, it takes about 12GB. There are about 18 million records. In XTDB 2, when using a local tx log and local storage, the log takes about 14GB and storage is about 79GB. I'm assuming that when using remote storage the size of the data stored will be the same, though maybe someone on the XT team could confirm.
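
As a rough sketch of the arithmetic on those numbers (the map keys and function name are made up for illustration, and 1GB is taken as 1e9 bytes): local storage comes out at about 6.6x the nippy size, the log at about 1.2x, and storage at roughly 4.4KB per record. These ratios are one workload's measurements, not a general rule.

```clojure
;; Figures quoted above: 12GB of nippy files, 18 million records,
;; 14GB local tx log, 79GB local storage.
(def observed
  {:records    18e6
   :nippy-gb   12.0
   :log-gb     14.0
   :storage-gb 79.0})

(defn size-ratios
  "Returns storage and log size relative to the raw nippy size, plus
  approximate bytes of storage per record (assuming 1GB = 1e9 bytes)."
  [{:keys [records nippy-gb log-gb storage-gb]}]
  {:storage-vs-nippy         (/ storage-gb nippy-gb)
   :log-vs-nippy             (/ log-gb nippy-gb)
   :storage-bytes-per-record (/ (* storage-gb 1e9) records)})

(size-ratios observed)
```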

2025-11-17T23:12:21.913329Z

I don't know about AWS pricing, but DigitalOcean's S3-compatible storage is $5/250GB/month. And then of course you'll need Kafka or a Kafka-compatible service.
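
A minimal sketch of that storage estimate, assuming the quoted price can be treated as a linear per-GB rate (real plans may bill in flat tiers and charge separately for requests and egress). The function and key names are hypothetical. For the 79GB storage figure mentioned earlier in the thread it works out to about $1.58/month, which is why the Kafka side is likely to dominate at this scale.

```clojure
(defn monthly-storage-cost
  "Flat per-GB estimate: usd-per-block dollars for every gb-per-block
  stored. Ignores request, egress and minimum-plan charges."
  [stored-gb {:keys [usd-per-block gb-per-block]}]
  (* stored-gb (/ usd-per-block gb-per-block)))

;; 79GB stored at $5 per 250GB per month
(monthly-storage-cost 79.0 {:usd-per-block 5.0 :gb-per-block 250.0})
```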

nivekuil 2025-11-17T06:17:18.585419Z

hello 🙂 quick bug repro. xtdb seems to be messing with stored binary values under certain conditions? xtdb 2.0.0, nippy 3.6.0, java 21

;; assumed requires for the aliases used below
(require '[xtdb.api :as xt]
         '[xtdb.node :as xtn]
         '[taoensso.nippy :as nippy])

(def node (xtn/start-node {}))
(def bytes (nippy/fast-freeze {:unit {:after "network-online.target"}
                               :service {:type "oneshot"
                                         :remain-after-exit true
                                         :exec-start [(str "sh -c 'ip link add hatchery-shim link ""eth0"" type macvlan mode bridge || true'")
                                                      (str "sh -c 'ip addr replace $(ip --oneline route get ""127.0.0.1"
                                                           " | sed \\'s/.* src //g;s/ .*//g\\') dev hatchery-shim'")
                                                      "ip link set dev hatchery-shim up"]
                                         :exec-stop "ip link del hatchery-shim"}}))
(xt/submit-tx node [[:put-docs :test {:xt/id 12345 :payload bytes}]])
(def doc (first (xt/q node '(from :test [{:xt/id 12345} *]))))
(nippy/fast-thaw (:payload doc)) ; throws an exception
(nippy/fast-thaw bytes)          ; works
(count (:payload doc))           ; 355
(count bytes)                    ; 361
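
One way to narrow a mismatch like this down is to find the first index where the two byte arrays diverge. The helper below is a plain-Clojure sketch (the name `first-mismatch` is made up here, not part of XTDB or nippy). It only locates the divergence and says nothing about the cause.

```clojure
(defn first-mismatch
  "Returns the index of the first position where byte arrays a and b
  differ, the shorter length if one is a prefix of the other, or nil
  if they are identical."
  [a b]
  (let [n (min (count a) (count b))]
    (or (first (remove #(= (nth a %) (nth b %)) (range n)))
        (when (not= (count a) (count b)) n))))

;; With the repro above, assuming `bytes` and `doc` are defined:
;; (first-mismatch bytes (:payload doc))
```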

refset 2025-11-17T08:04:46.526629Z

Hey @kevin842 (would be great to catch up!) thanks for the repro, I will take a look 🙏

refset 2025-11-17T22:04:09.878549Z

My repro against main passed 🤔 https://github.com/refset/xtdb/commit/9a5a4024c036a708703c562f3a566c96f4245162 If you get a chance to test again using our latest Maven snapshot (a preview of the upcoming 2.1.0), that would be very welcome. If not, I'll dig in again soon.

nivekuil 2025-11-17T22:24:29.009469Z

ah, 2.0.0-SNAPSHOT works. should have tested that, been focused on writing and not firing up a repl much 🙂

refset 2025-11-17T23:57:08.117409Z

Cool, not sure how it got fixed but I'm glad it did!

nivekuil 2025-11-18T01:03:07.367529Z

maybe https://github.com/xtdb/xtdb/pull/4869 because of the backslash
