This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2023-06-19
Would it make sense to have an API to open multiple DBIs at once? Right now it only opens a single DBI:
(open-dbi
[db dbi-name]
[db dbi-name opts]
"Open a named DBI (i.e. sub-db) in the LMDB env")
Maybe with this signature (not sure if it's compatible with the existing one):
[db opts & dbi-name]
It's something I noticed when I had this:
(d/open-dbi lmdb docsearch-table)
(d/open-dbi lmdb semmed-table)
(d/open-dbi lmdb processed-rel-name)
(d/open-dbi lmdb rel-name)
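Independent of any API change, the repeated calls above can be collapsed with a small wrapper. A minimal sketch, assuming a hypothetical helper named open-dbis (not part of Datalevin) that takes the opening function as an argument, so the helper itself does not depend on the library:

```clojure
(defn open-dbis
  "Hypothetical helper, not part of Datalevin: calls `open-fn` once for
  each name in `dbi-names`, in order, and returns the names as a vector."
  [open-fn db dbi-names]
  (mapv (fn [dbi-name]
          (open-fn db dbi-name)
          dbi-name)
        dbi-names))
```

With the names from the snippet above, this would be called as (open-dbis d/open-dbi lmdb [docsearch-table semmed-table processed-rel-name rel-name]).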
I have a Datalevin db and I'm trying to scan the index. I thought using seek-datoms would return a result relatively quickly, but the following starts consuming memory rapidly and takes quite a while to return.
> (time
    (->> (d/seek-datoms
           (d/db conn)
           :ave
           ::basis
           "/fo")
         first))
"Elapsed time: 66873.939042 msecs"
If I make a subsequent call to seek-datoms with a different value, I quickly hit the RAM limit for the process (currently 12g), at which point it appears to stop making progress.
> (time
    (->> (d/seek-datoms
           (d/db conn)
           :ave
           ::basis
           "/bar")
         first))
Am I using seek-datoms incorrectly or is there a better way to scan the index?
Calling seek-datoms 3 times with different start components throws an OutOfMemoryError.
If you know you are doing an operation that takes a lot of memory, you want to disable the cache.
I thought seeking to a particular spot in the index would be pretty fast. Is that an incorrect intuition or is there something else going on?
I think you misunderstood what seek-datoms does. It starts from a spot and returns everything after that.
I'll have to double check, but I think seek-datoms is lazy in Datomic.
It looks like datoms might be lazy?
datalevin.core/datoms
[db index]
[db index c1]
[db index c1 c2]
[db index c1 c2 c3]
[db index c1 c2 c3 c4]
Index lookup in Datalog db. Returns a sequence of datoms (lazy iterator over actual DB index) which components (e, a, v) match passed arguments.
I just tried the following, and it seems to work:
> (let [db (d/db conn)]
    (d/datalog-index-cache-limit db 0)
    (time
      (->> (d/datoms
             db
             :avet
             ::basis
             "/deps.edn")
           first)))
"Elapsed time: 1.436584 msecs"
It seems like index-range is also lazy. I think this query would scan my whole database otherwise:
> (let [db (d/db conn)]
    (d/datalog-index-cache-limit db 0)
    (time
      (->> (d/index-range
             db
             ::analysis-id
             "2023"
             nil)
           first)))
"Elapsed time: 4.952667 msecs"
As I said, right now these are not lazy. If your memory is large enough, everything is loaded into memory. If memory is not large enough, the results spill to disk, hence the slowness you saw. Disabling the cache was meant to save memory; otherwise, the results accumulate in memory.
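To make the lazy-versus-eager distinction concrete: if a call realizes its whole result before returning, wrapping it in first saves nothing. A toy sketch in plain Clojure (no Datalevin involved; first-of is a hypothetical name used only for illustration) that counts how many items get computed in each case:

```clojure
(defn first-of
  "Builds 1000 items with `build` (either `map` or `mapv`), takes the first,
  and returns how many items were actually computed."
  [build]
  (let [computed (atom 0)
        item     (fn [i] (swap! computed inc) i)]
    (first (build item (range 1000)))
    @computed))

(def eager-count (first-of mapv)) ; mapv realizes all 1000 items before `first` runs
(def lazy-count  (first-of map))  ; map is lazy, so only a small prefix is realized
```

The eager variant pays for every item even though only one is used, which is the behaviour described above for result sets that are loaded in full.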
Ok, I guess I'm just confused about why seek-datoms is so slow. Using index-range to scan all of the values of a property takes less than a second, but doing the same thing with seek-datoms takes over a minute:
(let [db (d/db conn)]
  (d/datalog-index-cache-limit db 0)
  (time
    (->> (d/index-range
           db
           ::analysis-id
           "2023"
           nil)
         first)))
vs
(let [db (d/db conn)]
  (d/datalog-index-cache-limit db 0)
  (time
    (->> (d/seek-datoms
           db
           :avet
           ::analysis-id
           "2023")
         first)))
Right, seek-datoms returns everything after that value, so it may have used more than 80% of memory, at which point spilling to disk is triggered.
Oh, seek-datoms also includes all of the other properties in the :avet index :face_palm:. I get it now.
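The difference can be modelled with a sorted set standing in for the index. This is a toy sketch in plain Clojure, not Datalevin's implementation; toy-avet, toy-seek, and toy-range are hypothetical names, and the tuples are made-up sample data:

```clojure
;; Toy stand-in for an :avet index: a sorted set of [attribute value entity]
;; tuples. This is plain Clojure, not Datalevin's actual storage format.
(def toy-avet
  (sorted-set
    [:analysis-id "2022-12" 1]
    [:analysis-id "2023-01" 2]
    [:analysis-id "2023-02" 3]
    [:basis "/deps.edn" 4]
    [:basis "/foo" 5]))

(defn toy-seek
  "Everything from [a v] to the END of the index, across attributes."
  [index a v]
  ;; Entity 0 sorts before the ids used above, so the start key is a lower bound.
  (subseq index >= [a v 0]))

(defn toy-range
  "Like toy-seek, but stops at the first tuple with a different attribute."
  [index a v]
  (take-while #(= a (first %)) (toy-seek index a v)))
```

Here (toy-seek toy-avet :analysis-id "2023") also returns the :basis tuples that sort after :analysis-id, while (toy-range toy-avet :analysis-id "2023") stops after the two matching :analysis-id tuples.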