xtdb

eighttrigrams 2026-01-07T15:54:43.528989Z

Hello, I have been trying to get XTDB 2.1 deployed for a week. The behaviour I get is that every time after a compaction, things go bad with SIGSEGV ---- J 17735 c1 org.apache.arrow.memory.ArrowBuf.getInt(J)I (17 bytes). I first tried on-disk storage on Fly.io, then switched Docker images, then switched to Railway for hosting, with S3 storage and logs on a local volume. Can anyone think of what is going wrong here?

Oliver Marshall 2026-01-08T08:38:48.223379Z

Hey! Would you be able to get a full stack trace? Sounds like your config should be fine 🤔

eighttrigrams 2026-01-08T11:48:00.688869Z

Mounting volume on: /var/lib/containers/railwayapp/bind-mounts/8f8b9213-12c5-40a3-84e0-07ec449f77e6/vol_86szqxvoouhydyqo
#
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /app/hs_err_pid1.log
#
[60821.256s][warning][os] Loading hsdis library failed
# A fatal error has been detected by the Java Runtime Environment:
#
#
# If you would like to submit a bug report, please visit:
#   
#  SIGSEGV (0xb) at pc=0x00007f08a0598f98, pid=1, tid=622
#
#
[error occurred during error reporting (), id 0xb, SIGSEGV (0xb) at pc=0x00007f08b07639a2]
# JRE version: OpenJDK Runtime Environment Temurin-21.0.9+10 (21.0.9+10) (build 21.0.9+10-LTS)
# Java VM: OpenJDK 64-Bit Server VM Temurin-21.0.9+10 (21.0.9+10-LTS, mixed mode, emulated-client, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# J 12946 c1 xtdb.arrow.ArrowUtil.arrowBufToRecordBatch(Lorg/apache/arrow/memory/ArrowBuf;JIJLjava/lang/String;)Lorg/apache/arrow/vector/ipc/message/ArrowRecordBatch; (178 bytes) @ 0x00007f08a0598f98 [0x00007f08a0598e40+0x0000000000000158]
I can try to find out more, but this is basically all I've got in the logs. What happens is that I call the app at https://personalist-production.up.railway.app/rovwes-posdeb after not having done so for, say, an hour, and hit the "more" button a lot of times (10 or so). At some point I start getting 502s, because the app has crashed in the meantime. I only observe this behaviour after the first time the storage has actually been written to. What is maybe a little suspicious is the volume mount in the first line. Maybe that means it is trying to read from storage and, instead of accessing S3, hits a local volume. I have also observed the same on Fly.io: it crashes with local volumes and on-disk storage.

eighttrigrams 2026-01-08T11:50:53.753389Z

This is an AI-maintained report, so it's only worth a skim: https://github.com/eighttrigrams/personalist/blob/main/plans/XTDB2_SIGSEGV.md

eighttrigrams 2026-01-08T11:51:38.555949Z

Here's my init-conn https://github.com/eighttrigrams/personalist/blob/main/src/clj/et/pe/ds.clj#L19

eighttrigrams 2026-01-08T11:51:51.178259Z

configs come from env vars and https://github.com/eighttrigrams/personalist/blob/main/config.prod.edn

eighttrigrams 2026-01-08T11:52:36.944559Z

And I'm sure it loads S3 (as per my logs: Initializing XTDB connection with type: :s3)