set the channel topic: Everything about the Datalevin database https://github.com/datalevin/datalevin
I'm experiencing SIGSEGV errors when using many Datalevin databases to isolate data. It also happens in tests.
A SIGSEGV occurs in libdtlv.so's _mdb_txn_commit function, caused by a race condition between Datalevin's async sampling/analysis worker and database cleanup.
SIGSEGV (0xb) at pc=0x..., pid=..., tid=...
Problematic frame:
C [libdtlv.so+0x1c717] _mdb_txn_commit+0xe37
Signal: SEGV_MAPERR (accessing unmapped memory)
datalevin.dtlvnative.DTLV.mdb_txn_commit
datalevin.binding.cpp.CppLMDB.transact_kv
datalevin.storage.Store.actual_cardinality
datalevin.storage$sampling.invoke
datalevin.async$event_handler.invoke
(in ForkJoinPool worker thread)
Environment
Datalevin 0.9.27
Clojure 1.12.0
JVM GraalVM 25.0.1 (also on OpenJDK 21+)
OS Ubuntu 24.04 / Linux
Raw stacktrace:
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x000071904d063717, pid=252154, tid=253276
#
# JRE version: Java(TM) SE Runtime Environment Oracle GraalVM 25.0.1+8.1 (25.0.1+8) (build 25.0.1+8-LTS-jvmci-b01)
# Java VM: Java HotSpot(TM) 64-Bit Server VM Oracle GraalVM 25.0.1+8.1 (25.0.1+8-LTS-jvmci-b01, mixed mode, sharing, tiered, jvmci, jvmci compiler, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# C [libdtlv.so+0x1c717] _mdb_txn_commit+0xe37
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -F%F -- %E" (or dumping to /root/unbound-clojure/core.252154)
#
# If you would like to submit a bug report, please visit:
#
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
--------------- S U M M A R Y ------------
Command Line: -XX:ThreadPriorityPolicy=1 -XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCIProduct -XX:+EnableJVMCI -XX:-UnlockExperimentalVMOptions -XX:-OmitStackTraceInFastThrow --enable-native-access=ALL-UNNAMED -Djdk.attach.allowAttachSelf --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED -Dguardrails.enabled=false -Dunbound.logging=disabled -Dguardrails.enabled=false -Dunbound.logging=disabled -Dclojure.basis=.cpcache/3300949091.basis clojure.main -e (require 'lazytest.core) -e (require 'lazytest.main) -e (System/exit (if (lazytest.main/run ["--watch" "false" "--output" "nested" "--dir" "test/clj"]) 0 1))
Host: Intel Xeon Processor (Skylake, IBRS, no TSX), 8 cores, 15G, Ubuntu 24.04.3 LTS
Time: Tue Jan 20 03:34:17 2026 UTC elapsed time: 112.764949 seconds (0d 0h 1m 52s)
--------------- T H R E A D ---------------
Current thread (0x00007187a81bd8f0): JavaThread "ForkJoinPool-25-worker-1" daemon [_thread_in_native, id=253276, stack(0x000071904d0f8000,0x000071904d1f8000) (1024K)]
Stack: [0x000071904d0f8000,0x000071904d1f8000], sp=0x000071904d1f5bb0, free space=1014k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C [libdtlv.so+0x1c717] _mdb_txn_commit+0xe37
Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
J 12368 datalevin.dtlvnative.DTLV.mdb_txn_commit(Ldatalevin/dtlvnative/DTLV$MDB_txn;)I (0 bytes) @ 0x000071906aee11d7 [0x000071906aee1180+0x0000000000000057]
J 12686 c1 datalevin.binding.cpp.CppLMDB.transact_kv(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object; (526 bytes) @ 0x0000719064b247cc [0x0000719064b22200+0x00000000000025cc]
J 12685 c1 datalevin.binding.cpp.CppLMDB.transact_kv(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object; (22 bytes) @ 0x0000719064b21124 [0x0000719064b21080+0x00000000000000a4]
J 12748 c1 datalevin.binding.cpp.CppLMDB.transact_kv(Ljava/lang/Object;)Ljava/lang/Object; (14 bytes) @ 0x0000719064b52d9c [0x0000719064b52d00+0x000000000000009c]
j datalevin.storage.Store.actual_a_size(Ljava/lang/Object;)Ljava/lang/Object;+400
j datalevin.storage.Store.a_size(Ljava/lang/Object;)Ljava/lang/Object;+258
j datalevin.storage$sampling.invokeStatic(Ljava/lang/Object;)Ljava/lang/Object;+257
j datalevin.storage$sampling.invoke(Ljava/lang/Object;)Ljava/lang/Object;+3
j datalevin.storage.SamplingWork.do_work()Ljava/lang/Object;+76
j datalevin.async$do_work_STAR_.invokeStatic(Ljava/lang/Object;)Ljava/lang/Object;+48
j datalevin.async$do_work_STAR_.invoke(Ljava/lang/Object;)Ljava/lang/Object;+3
j datalevin.async$individual_work.invokeStatic(Ljava/lang/Object;)Ljava/lang/Object;+59
j datalevin.async$individual_work.invoke(Ljava/lang/Object;)Ljava/lang/Object;+3
j datalevin.async$event_handler.invokeStatic(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;+165
j datalevin.async$event_handler.invoke(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;+6
j datalevin.async.AsyncExecutor$event_loop__37122$fn__37123.invoke()Ljava/lang/Object;+19
These should be fixed in master branch.
The next release will have a more robust solution to this problem.
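For readers unfamiliar with this class of bug, here is a minimal sketch of the general pattern: a background worker and a close operation touching the same resource. It is plain Clojure and does not use Datalevin. The names make-store, background-commit! and close-store! are hypothetical, and this is not a description of how Datalevin fixed or will fix the issue. It only illustrates one common way to keep a worker from touching a resource after it has been closed, by taking the same lock in both paths and checking a closed flag.

```clojure
;; Hypothetical stand-in for a native-backed store; NOT Datalevin's API.
(defn make-store []
  {:closed? (atom false)
   :lock    (Object.)})

(defn background-commit!
  "Simulates an async worker. It does its work only while holding the lock
  and only if the store is still open. Returns :committed or :skipped."
  [{:keys [closed? lock]}]
  (locking lock
    (if @closed?
      :skipped
      (do (Thread/sleep 10) ; pretend to commit a transaction
          :committed))))

(defn close-store!
  "Takes the same lock, so it waits for any in-flight commit before
  marking the store as closed."
  [{:keys [closed? lock]}]
  (locking lock
    (reset! closed? true)))

;; Usage: the worker either finishes before the close or is skipped,
;; but it never runs against a closed store.
(let [store  (make-store)
      worker (future (background-commit! store))]
  (close-store! store)
  (println "worker result:" @worker)
  (println "after close:" (background-commit! store)))

(shutdown-agents)
```

Without the shared lock and flag, the worker could be in the middle of its commit when the resource is released, which is the kind of interleaving the crash report above describes.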
Thank you. When do you plan to release the next version of datalevin?
Soon. I'm trying to fix the standing issues labeled Bug, plus a few other low-hanging enhancements.
I can see that a connection atom contains sets such as :eavt for the indexes, but they're empty even after transacting data. Based on my experience with Datomic and Datascript, I expected these to contain the data after a transaction. What are they used for in Datalevin?
These are implementation details of how the transaction process works and are subject to change. We use the same process as Datascript for turning transaction data into datoms, so these sets serve that purpose. The datoms are then persisted in storage, and after that there's no need to keep them around. Otherwise, memory would be exhausted very quickly. A future use for these is to support WAL mode, as the in-memory overlay that serves queries before the WAL is checkpointed.
Datomic (presumably) and Datascript directly serve queries from these structures, whereas we serve queries from storage. That's the difference.
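A minimal sketch of what reading from storage looks like in practice, for anyone inspecting a database from the outside. It uses the public datalevin.core functions get-conn, transact!, db, datoms, q and close. The directory, the :name attribute and the helper name names-from-storage are made up for illustration, and exact behavior may differ between Datalevin versions.

```clojure
(require '[datalevin.core :as d])

(defn names-from-storage
  "Opens a database in dir, transacts two entities, reads them back through
  the public API, closes the connection and returns the set of names.
  dir and the :name attribute are hypothetical."
  [dir]
  (let [conn (d/get-conn dir {:name {:db/valueType :db.type/string}})]
    (try
      (d/transact! conn [{:name "Alice"} {:name "Bob"}])
      ;; Walk the persisted :eav index instead of the in-memory sets.
      (doseq [datom (d/datoms (d/db conn) :eav)]
        (println datom))
      ;; Datalog queries are answered from storage as well.
      (set (map first (d/q '[:find ?n :where [_ :name ?n]] (d/db conn))))
      (finally
        (d/close conn)))))

;; Usage, with a fresh directory so earlier runs do not add extra datoms:
(def example-dir
  (str (System/getProperty "java.io.tmpdir")
       "/dtlv-example-" (java.util.UUID/randomUUID)))

(println (names-from-storage example-dir))
```

An inspection tool would go through calls like datoms and q rather than the index sets on the connection atom, since those sets are transient.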
I see. I'm working on inspecting Datalevin databases with Dataspex, and I'm trying to replicate as much of the Datomic and Datascript feature set as possible, which is why I'm asking.