#datalevin
2022-04-12
Josh 20:04:43

Hi all, I am trying out Datalevin and testing how many Datalevin connections I can have in a single process. Here is the code:

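(require '[datalevin.core :as dl])

;; Holds every open connection so none of them gets closed or collected.
(def *dbs (atom []))

(doseq [i (range 1000)]
  (prn "Creating" i)
  ;; Each connection points at its own fresh directory under /tmp.
  (swap! *dbs conj (dl/create-conn (str "/tmp/dl" (java.util.UUID/randomUUID)))))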
After creating 497 connections, it errors out with:
Execution error (ExceptionInfo) at datalevin.binding.java/eval27406$fn (java.clj:608).
Fail to open database: "Platform constant error code: EAGAIN Resource temporarily unavailable (35)"
At first I thought this might be a file descriptor limit, but JConsole is saying OpenFileDescriptorCount is 2681 and the MaxFileDescriptorCount is 10420 so that doesn’t appear to be the issue. Does anyone know what could be causing this / how to increase the limit?
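The two counters read from JConsole above can also be checked from the REPL. This is a minimal sketch, assuming a Unix-like OS (macOS or Linux) and a HotSpot-style JVM where the operating system MXBean implements `com.sun.management.UnixOperatingSystemMXBean`; the function name `fd-usage` is made up for illustration.

```clojure
(import '(java.lang.management ManagementFactory))

(defn fd-usage
  "Returns the open and maximum file descriptor counts for this JVM.
  Assumption: the OS MXBean is a UnixOperatingSystemMXBean; on other
  platforms the type-hinted calls throw a ClassCastException."
  []
  (let [^com.sun.management.UnixOperatingSystemMXBean os
        (ManagementFactory/getOperatingSystemMXBean)]
    {:open (.getOpenFileDescriptorCount os)
     :max  (.getMaxFileDescriptorCount os)}))

(fd-usage)
```

Calling `(fd-usage)` before and after the `doseq` loop shows how many descriptors the connections actually consume.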

Norman Kabir 20:04:09

Is there a use case you're trying to exercise? The documentation recommends reusing the connection:

> Please note that the connection should be managed like a stateful resource. Application should hold on to the same connection rather than opening multiple connections to the same database in the same process.

https://juji-io.github.io/datalevin/datalevin.core.html#var-create-conn

The underlying database is LMDB:

> There should only ever be one environment open per database-file per process. Usually there is no good reason to close an environment once it’s open, so you can just leave it upon until the process ends.

https://blogs.kolabnow.com/2018/06/07/a-short-guide-to-lmdb

Josh 21:04:10

Yes, the idea is to have one Datalevin database per user. Each connection is opened on a different path, so each is a different database, which I think means there is a separate LMDB database for each as well.

Huahai 02:04:59

This code (open 1000 connections) works fine on my Ubuntu Linux. What platform are you running this on?

Huahai 03:04:22

When I tried to do 2000, I hit the limit at 1021.

Huahai 03:04:46

I think the default open file limit for a process is 1024; ulimit -Sn shows 1024 for me. Each LMDB DB opens two files, data.mdb and lock.mdb. Add the library jar files that Datalevin opens, and your 497 number would be about right to hit that limit.
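The back-of-the-envelope arithmetic behind that estimate can be written out as follows. It is only a sketch of the reasoning in this message: it assumes the 1024 soft limit reported on Linux also applied to the original report, which was not confirmed at this point in the thread.

```clojure
(def files-per-db 2)          ; data.mdb and lock.mdb, per the message above
(def soft-limit 1024)         ; the `ulimit -Sn` value quoted above (assumed to apply)
(def conns-before-error 497)  ; from the original report

;; Descriptors consumed by the LMDB files alone.
(def lmdb-fds (* files-per-db conns-before-error))

;; Descriptors left over for jars, stdio, sockets, and so on.
(def remaining (- soft-limit lmdb-fds))

[lmdb-fds remaining]
```

The result is 994 descriptors for the databases, which leaves 30 for everything else the JVM keeps open.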

Josh 15:04:15

I am running on OSX

Josh 15:04:57

Interesting, I must have missed something with the file descriptors. I changed my limit and ulimit -Sn now shows 64000, but I can still only create 497.

Josh 19:04:33

If it were the file descriptor limit, I shouldn’t be able to open any more files, right? However, I am able to run the code to create the databases and then run the following (after creating this file):

(require '[clojure.java.io :as io])

(def fd1 (io/reader "/tmp/foo.txt"))

(line-seq fd1)
with no issues. Is there some other system resource that Datalevin / LMDB uses whose limit I could be running into?

Huahai 04:04:02

You are right, it’s not the OS limit, nor the JVM ones, I think it is JNR’s limit.

Huahai 04:04:29

We are using LMDBJava, which uses JNR. Another reason to get rid of LMDBJava eventually: https://github.com/juji-io/datalevin/issues/35

Huahai 05:04:15

Hmm, it seems that I can still only get to 1024 even with dtlv, which does not use JNR at all.

Huahai 05:04:12

On mac, the limit for me is 500. These neat numbers (500, 1024) make me think this is some kind of system limit that I have yet to figure out.

Josh 16:04:58

The limit appears to be per process. I was able to start 3 different JVMs at the same time, and each was able to create 497 graphs, for a total of 1491 between them.

Huahai 01:04:24

Yes, this is a per-process limit. The error seems to be related to the mmap call to lock the file or memory. Further investigation will be needed once I am back from vacation.

Norman Kabir 23:04:08

I'm using Datalevin with Babashka and wanted to know if the following are equivalent:

{":color" "red"}
and
{:color "red"}
or should the quotes for keywords always be stripped? The keywords are often quoted when importing information from scripts.

Huahai 02:04:59

They shouldn’t be equivalent. The quotes should be stripped.
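One way to strip the quotes before handing a map to Datalevin is to convert colon-prefixed string keys into keywords. The sketch below uses only `clojure.string`, which is also available in Babashka; the helper names `unquote-key` and `unquote-keys` are hypothetical and not part of Datalevin.

```clojure
(require '[clojure.string :as str])

(defn unquote-key
  "Turns a string such as \":color\" into the keyword :color.
  Any other key is returned unchanged."
  [k]
  (if (and (string? k) (str/starts-with? k ":"))
    (keyword (subs k 1))
    k))

(defn unquote-keys
  "Applies unquote-key to every top-level key of the map m."
  [m]
  (into {} (map (fn [[k v]] [(unquote-key k) v])) m))

(unquote-keys {":color" "red"})
;; => {:color "red"}
```

The sketch only rewrites top-level keys; nested maps would need a recursive walk.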