Possible noob question: I'm building a simple CRUD web app with Datalevin and http-kit. I'm the only user, running one query or transaction at a time, but I invariably get the MDB_READERS_FULL error after enough use. I am using the same DB connection for everything. I don't have a simple test case, but in the event I'm not just being silly, I will investigate further.
I have experienced this exact problem, using http-kit, and was about to create a similar thread.
Curious how the entity object is not GC’ed.
Please share if you learn more @spnw, I will investigate further myself in the next days.
A solution is to not use entity?
Actually I was wrong in thinking entity was the problem, because using transact! causes the same result.
I'd like to investigate this further, but as a stopgap I recommend using a different server (e.g. Jetty).
probably due to http-kit's websocket support, it's keeping the ring-handler from being GC'ed?
Excuse the ignorant question, but are we always reliant on the GC to avoid running out of readers? E.g. there's no way to explicitly free them?
We could add a function to do that, if that's something people want.
Perhaps it's only necessary in pathological cases, I don't know. I'm always a bit leery of relying on GC for resource management though.
Readers are kept open to be renewed and reused for performance. So the system does not close readers, except that dead readers are cleaned up every 5 minutes (configurable), when the DB file is force-synced. However, if there are still references to them, these readers are not considered dead.
I see. I suppose that doesn't change the situation (the readers don't need to be freed, they just need to not be tied up unnecessarily)
Can you try the master branch to see if the transact! problem goes away?
Unfortunately still happens on master. Same test as before, with this handler:
(fn [_]
  (d/transact! conn [{:db/id 1 :my/attr 42}])
  {:status 200
   :body "test\n"})
It turns out that http-kit by default just starts a new (virtual?) thread for each request, instead of using a thread pool. The way to fix it is to set a :worker-pool when you start the server and supply a thread pool of your own. Anything will work, e.g. (Executors/newCachedThreadPool). Even though it's an unbounded pool, the caching will keep it to a small number of threads if your threads finish running quickly enough.
So if you want to keep using http-kit, setting :worker-pool is the way to go.
Note that (Executors/newVirtualThreadPerTaskExecutor) will not work, since that's the default http-kit behavior.
hope this helps
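For anyone landing here later, a minimal sketch of the :worker-pool setup described above. The handler, the port, and the start-server! name are placeholders rather than anything from the original app; only :worker-pool and (Executors/newCachedThreadPool) come from this thread.

```clojure
(ns example.server
  (:require [org.httpkit.server :as http])
  (:import (java.util.concurrent Executors)))

(defn handler
  "Placeholder Ring handler; the real app would talk to Datalevin here."
  [_request]
  {:status 200
   :body   "test\n"})

(defn start-server!
  "Starts http-kit with an explicit worker pool, so requests run on a
  small set of reused threads instead of a fresh thread per request.
  Returns whatever run-server returns (by default, a stop function)."
  []
  (http/run-server handler
                   {:port        8080 ; assumed port
                    :worker-pool (Executors/newCachedThreadPool)}))
```

A bounded pool such as (Executors/newFixedThreadPool n) should work the same way if a hard upper limit on threads is preferred.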
Maybe I can add an option :reuse-reader? false to Datalevin, for cases where one wants to use an unlimited number of threads for reads, e.g. when using (Executors/newVirtualThreadPerTaskExecutor). It would close the reader after each use. It will have some performance implications.
Let me figure out something. Meanwhile, setting a :worker-pool in http-kit is the way to go.
Excellent. Thank you very much for your work on this!
Set a bigger :max-readers
Which version are you using?
I'm using 0.9.22.
Set a larger :max-readers then
Bumped to 512 and I'm getting the same issue. My app rarely does more than one read at a time, so it's all very odd.
How about bumping it more?
You can also check how many threads your app is using
My guess is it's quite a lot
36 threads (including REPLs and stuff). Haven't noticed a correlation between the :max-readers count and the amount of time it takes to trigger the issue, even with a very low count. I can experiment with higher values just to see if anything changes.
What's the spec of your machine?
6-core i5 w/ 16 GB RAM. Running macOS if that makes any difference.
can you clone datalevin and see if you can pass all the tests?
lein test
All tests are passing.
it is likely a problem with the app then
36 is not a lot of threads, but how did you check it?
(count (Thread/getAllStackTraces))
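A bare count doesn't say which threads those are. A small sketch that groups live threads by name prefix, which can make a growing family of worker threads easier to spot; thread-name-counts is a made-up helper name and uses only the JDK.

```clojure
(require '[clojure.string :as str])

(defn thread-name-counts
  "Returns a map from thread-name prefix (the name with trailing digits,
  hyphens, '#' and whitespace stripped) to the number of live threads."
  []
  (->> (Thread/getAllStackTraces)
       keys
       (map #(.getName ^Thread %))
       (map #(str/replace % #"[-#\s\d]+$" ""))
       frequencies))

(comment
  ;; Largest groups first.
  (sort-by val > (thread-name-counts)))
```

Note that this only sees threads alive at the moment of the call, so short-lived per-request threads may not show up in it at all.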
since you are on Mac, activity monitor will just show you
The app is just a web server and little else. As far as I know, http-kit uses a small thread pool and doesn't allocate per-request threads at all.
Activity Monitor is currently showing 38.
When does the READERS_FULL happen?
how did you increase :max-readers?
It's generally after I update an entity in the app and then query the DB to build the response.
(defonce conn
  (d/create-conn "/path/to/my/db" schema {:kv-opts {:max-readers 512}}))
Right, bump to a few thousand?
Can do.
if you can create a minimal reproducible repo, I can take a look
Currently running a test with :max-readers 8192 (could take a while...). If that fails or I'm able to get a minimal reproduction I'll get back to you. Thank you!
Indeed, the error triggered after roughly 8000 update/read cycles. This would seem to support my suspicion that each web request is grabbing a reader which is never released.
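A toy model of why the failure point would track :max-readers. It involves no Datalevin at all and rests on an assumption: that each thread touching the DB ends up holding its own reader, as the thread-per-request explanation elsewhere in this thread suggests. Under that assumption, the number of distinct threads is what counts against the limit. All function names here are made up for illustration.

```clojure
(import '(java.util.concurrent Executors ExecutorService))

(defn run-on-fresh-thread
  "Runs f on a brand-new thread and waits for it to finish."
  [^Runnable f]
  (doto (Thread. f) (.start) (.join)))

(defn run-on-pool
  "Runs f on the given pool and waits for it to finish."
  [^ExecutorService pool ^Runnable f]
  (.get (.submit pool f)))

(defn threads-used
  "Calls (runner task) n times, one after another, and returns the
  number of distinct threads that ended up executing the task."
  [runner n]
  (let [seen (atom #{})]
    (dotimes [_ n]
      (runner #(swap! seen conj (Thread/currentThread))))
    (count @seen)))

(comment
  ;; One new thread per simulated request: 100 distinct threads.
  (threads-used run-on-fresh-thread 100)

  ;; Same 100 requests on a fixed pool: at most 4 distinct threads.
  (let [pool (Executors/newFixedThreadPool 4)]
    (try
      (threads-used (partial run-on-pool pool) 100)
      (finally (.shutdown pool)))))
```

If the assumption holds, the first case would need one reader per request, while the second would never need more than the pool size.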
I'll get to work on that repo later.
you probably created a connection each time
get-conn would be safer. It will create a connection if one doesn't exist, and get the existing one if it does.
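A minimal sketch of that suggestion, assuming get-conn accepts the same directory, schema, and options arguments as the create-conn call shown earlier. The schema is a hypothetical stand-in, and the path is the one used by the reproduction in this thread.

```clojure
(ns example.db
  (:require [datalevin.core :as d]))

;; Hypothetical schema; replace with the app's own.
(def schema
  {:my/attr {:db/valueType :db.type/long}})

;; defonce keeps a REPL reload from evaluating this a second time;
;; get-conn itself is described above as creating a connection only
;; if one doesn't exist yet.
(defonce conn
  (d/get-conn "/tmp/minimal-db" schema {:kv-opts {:max-readers 512}}))
```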
My code only ever ran create-conn once; I was very careful about that. Testing with get-conn didn't change anything.
I've managed to reproduce the issue in a couple dozen lines of code. I got a strange result though: at the last moment I decided to try swapping in Jetty as the server, and the problem went away entirely. I don't know if that makes it a http-kit issue, or just a strange interaction between it and Datalevin.
Interesting. Can you share the code?
Yeah sorry, just wanted to check a few other things. I am pretty sure my problem involves using entity instead of q.
entity does have a reference to the DB instance
Yeah I thought that might be the case. I don't know why http-kit would be caching them though.
Run with clojure -M -m core. It makes a DB at /tmp/minimal-db and a server at localhost:8080. I just run this command until it starts spitting errors: while true; do curl
Btw if you think it's a server problem and not a Datalevin problem then I'm content to drop it. You've been helpful :)
yeah, probably some caching is going on with http-kit. OTOH entity creates an entity object, so it isn't really a query function. From your code, it should be GC'ed, but somehow wasn't
That makes sense to me. Thanks again for your time!
(My actual code had a bunch of q calls and just one random entity call tucked away in a middleware, probably that is why I never suspected it. Hah.)