datalevin

Huahai 2025-11-11T00:24:53.354159Z

Update: I am working on integrating the vector DB feature more tightly with the LMDB transaction process (storing the vector index in LMDB), so that it is ACID compliant.

👍 7
sg-qwt 2025-11-11T18:56:54.249339Z

Are those external dependencies only required by the vector feature? Is there any way to opt out of that feature so I can use Datalevin as a pure library like before, without worrying about external dependencies?

👍 1
Huahai 2025-11-12T17:37:06.809629Z

External dependencies were always needed. For example, the correct version of libc has been needed since the beginning. So there isn't any difference from before: we expect some common dependencies to exist on the host machine. It's the user's responsibility to ensure that they do. This problem is what motivated Java to begin with. Nowadays, people don't seem to care too much about this problem anymore, because there are only a handful of platforms and virtualization has become the norm.

Huahai 2025-11-12T17:39:35.179909Z

Again, this is open source: if you need a feature, work on it. Since I don't see the need for this feature, I am not going to work on it, understandably.

Huahai 2025-11-12T17:42:08.075809Z

I personally see the vector feature as essential, so I am not going to work on making it optional. I am always open to PRs, though.

Huahai 2025-11-12T17:46:02.797649Z

"My goal is not just to compete with existing databases, but also to lay the foundation for advanced artificial intelligence of the future: Just like memory is the center of human cognition, databases ought to serve the same purpose in artificial intelligence." https://huahaiy.github.io/projects/ This is the reason for me to work on Datalevin.

sg-qwt 2025-11-12T18:01:28.118149Z

I understand your point. My use case for Datalevin is a simple KV DB, and it works well. But after bumping to 0.9.22, things stopped working due to external dependencies. I have no idea which version of libc or the like is required. Sure, in my case, after some research, I installed gcc on the host to fix it. But that kind of external dependency will complicate the build/deploy process, etc. That's why I asked whether it is an optional feature, so I can skip the gcc install.

Huahai 2025-11-12T18:08:34.842449Z

I have mentioned this before: you were just lucky that the machines you were on had the right version of libc. Not everyone was that lucky. "No dependency" was just an illusion. There are a few things we can do better. One option is to compile all dependencies statically and bundle them all. Another option is to make everything dynamic and optional, and load each piece on an as-needed basis. Personally, I think option 1 is simpler. We have a ticket for it, https://github.com/juji-io/dtlvnative/issues/7 PR welcome.

👍 1
Huahai 2025-11-12T18:10:36.583779Z

The musl-libc approach is what Zig takes, which enables its cross-compilation.

Huahai 2025-11-12T18:15:10.005189Z

Although people say they prefer the approach of making things optional, in reality, most successful engineering projects tend to pile on features. It's just simpler to maintain that way, ironically.

👍 1
littleli 2025-11-20T11:58:52.310579Z

Just a note: is static linking even an option? To link with Java, it has to be a shared object of some sort, no? 🤔

littleli 2025-11-20T12:01:32.639829Z

Anyway, sorry, I don't understand this topic that much. I was just a bit surprised.

littleli 2025-11-20T12:02:40.239109Z

What do you think about the new FFM APIs? It should be more straightforward to use native libraries thanks to these advancements in Java.

Huahai 2025-11-21T04:59:47.168009Z

The idea is to have a single native library with everything included. The new FFM API is not universally available yet. It doesn’t seem to be more convenient to use than JNI, as you still need to pass JVM options to open things up. However, we will be forced to switch at some point, when they disable Unsafe. The performance is a lot worse without Unsafe.

Huahai 2025-11-21T05:15:37.002519Z

E.g. for a single read/write, the new FFM API is 3 times slower than using Unsafe. So we will stick with the current setup as long as we can.

Huahai 2025-11-21T05:18:21.805169Z

The driver of the new FFM API is safety, not performance. That is not very meaningful for our use case, as safety is already handled in the Dlmdb layer.
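
(Editor's illustration, not from the thread.) For readers unfamiliar with the two APIs being compared, here is a minimal Java sketch of a single off-heap write followed by a read, once through `sun.misc.Unsafe` and once through the FFM API (`java.lang.foreign`). It assumes JDK 22 or later, where FFM is final; the class and method names are hypothetical, it is not Datalevin's code, and it does not measure or reproduce the performance figure quoted above. Recent JDKs may emit deprecation warnings for the Unsafe memory-access methods.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Hypothetical sketch class; names are illustrative only.
final class OffHeapAccessSketch {

    // Obtain the Unsafe singleton reflectively, the usual workaround
    // because Unsafe.getUnsafe() rejects ordinary application callers.
    static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Unsafe is not accessible", e);
        }
    }

    // Write then read one long at a raw address: no bounds, liveness,
    // or thread-confinement checks are performed.
    static long roundTripUnsafe(long value) {
        Unsafe unsafe = loadUnsafe();
        long addr = unsafe.allocateMemory(Long.BYTES);
        try {
            unsafe.putLong(addr, value);
            return unsafe.getLong(addr);
        } finally {
            unsafe.freeMemory(addr);
        }
    }

    // The same round trip through FFM: the segment carries its size and
    // lifetime, so each access is checked, and the arena frees the memory.
    static long roundTripFfm(long value) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment seg = arena.allocate(ValueLayout.JAVA_LONG);
            seg.set(ValueLayout.JAVA_LONG, 0, value);
            return seg.get(ValueLayout.JAVA_LONG, 0);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTripUnsafe(42L));
        System.out.println(roundTripFfm(42L));
    }
}
```

The FFM version trades raw addresses for segments that know their bounds and lifetime, which is the safety motivation mentioned above. Whether those checks cost anything measurable depends on the JDK version and access pattern, so any comparison should be benchmarked on the target setup.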