#datalevin
2023-04-01
Eugen03:04:06

hello @huahaiy, I found some time and decided to try implementing the Map interface on top of the KV store: https://github.com/juji-io/datalevin/pull/200 . It turns out the basic functionality is quite simple and can be done outside of the datalevin library; I did it in the example code. I would like to flesh out the code over time and plan to use it in one of our internal projects to test it out. I pushed the code so it is out there (I was surprised how simple it is) and to allow for discussion on top of it.
• I think it is simple enough and useful enough to be part of datalevin, so people don't roll their own. Wdyt?
• I'm not sure yet how to implement clear / size efficiently. Probably get-range, but I don't want to load everything in memory. Recommendations are welcome.
• There are lots of Map APIs to implement later.

Huahai03:04:46

clear can use clear-dbi , size is just entries

Huahai03:04:20

This will be limited to data that do not need range queries, because the Map API cannot take a data type for keys.

Huahai03:04:37

Yes, it is trivial to implement a Map API, as it is actually a subset of the existing functionality.

Huahai03:04:21

get-range is fine, it returns a SpillableVector

Huahai04:04:08

we will have a key-range function in the next major version; that should take care of keySet. entrySet and values can be done with get-range, with [:all] as the key range

Huahai04:04:16

containsValue can use get-some

Huahai04:04:03

I think that’s all.
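The hints above can be sketched roughly as follows. This is only a minimal sketch, not the code from the PR: the helper names (`kv-put!`, `kv-get`, `kv-size`, `kv-clear!`) and the DBI name "a" are hypothetical, and it assumes the default `:data` type for keys and values.

```clojure
(require '[datalevin.core :as d])

;; Hypothetical DBI name, borrowed from the with-open example in this thread.
(def dbi "a")

;; Map.put: store one key/value pair (default :data key and value types).
(defn kv-put! [db k v]
  (d/transact-kv db [[:put dbi k v]]))

;; Map.get: look up a single key; returns nil when the key is absent.
(defn kv-get [db k]
  (d/get-value db dbi k))

;; Map.size: entries reports the number of key/value pairs in the DBI.
(defn kv-size [db]
  (d/entries db dbi))

;; Map.clear: clear-dbi removes the data but keeps the DBI itself.
(defn kv-clear! [db]
  (d/clear-dbi db dbi))

(comment
  ;; Assumed usage; the path is illustrative.
  (let [db (d/open-kv "/tmp/datalevin/map-sketch")]
    (d/open-dbi db dbi)
    (kv-put! db 1 "a")
    (println (kv-get db 1) (kv-size db))
    (kv-clear! db)
    (d/close-kv db)))
```

Because these helpers never pass a key type, keys are stored as `:data`, which is the range-query limitation mentioned earlier in the thread.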

Huahai04:04:10

If you would like to send a PR to include it as part of the datalevin, I can merge it. Please write some tests if you do. Thanks.

Eugen04:04:51

sure, I'm currently evolving the API. thanks for the hints

Eugen05:04:44

> This will only be limited to data that do not expect range queries on them, because the map API cannot take data type for keys.
hold my beer! 🙂 I think the API can be used in a stateful manner. The map implementation can have params passed to it when it's built, and we can mutate those params via a special interface. This might not be a canonical Map implementation, but as long as it works?!

Eugen05:04:24

I'm going to push an updated version that implements AutoCloseable. so we can write code like this:

(with-open [m (->map "/tmp/datalevin/map2" "a")]
    (println (.put m 1 "a")))

Eugen05:04:02

in what namespace should the code fit ? datalevin.core ?

Huahai05:04:23

lmdb.clj. We then create an alias in core

Eugen05:04:17

ok, thanks

Eugen05:04:03

pushed the update. Will try this in our project for a use case pretty soon and come back with more feedback. Then we can consider merging.

Eugen05:04:38

how far along is key-range ? I might need that

Huahai16:04:13

index branch has it

Huahai16:04:41

Also, you can use range-seq instead of get-range if you want to load the data lazily, since you are using with-open already

Huahai16:04:12

the same warning applies: keeping a read transaction open for a long time is not good
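A minimal sketch of the lazy variant, assuming `range-seq` yields `[k v]` pairs for the `[:all]` key range and that the returned seq is closed by `with-open`. The helper name `each-entry!` is hypothetical.

```clojure
(require '[datalevin.core :as d])

(defn each-entry!
  "Hypothetical helper: calls (f k v) for every pair in the DBI,
  reading lazily instead of loading the whole range into memory."
  [db dbi f]
  ;; with-open closes the seq, which ends the underlying read transaction,
  ;; so keep the work done inside the body short.
  (with-open [entries (d/range-seq db dbi [:all])]
    (doseq [[k v] entries]
      (f k v))))

(comment
  ;; Assumed usage with an already opened store `db` and DBI "a":
  (each-entry! db "a" (fn [k v] (println k v))))
```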

Eugen18:04:12

thanks. today I spent some time on a few places of our code that might benefit from lmdb/datalevin. we load a map in memory - so this might be a good fit. The maps use about 260MB of memory (measured with https://github.com/clojure-goes-fast/clj-memory-meter ). And that is just the start. It also does some processing that I believe we can cache in datalevin. Very excited about this 🙂

Eugen04:04:40

how can I check that I have an LMDB instance? Using

is-datalevin? (instance? datalevin.db.DB db-or-path)
db (if is-datalevin? db-or-path (d/open-kv db-or-path db-opts))
I get :
; Execution error (ClassCastException) at datalevin.db/close-db (db.cljc:426).
; class datalevin.binding.java.LMDB cannot be cast to class datalevin.db.DB (datalevin.binding.java.LMDB is in unnamed module of loader clojure.lang.DynamicClassLoader @55cd4d99; datalevin.db.DB is in unnamed module of loader clojure.lang.DynamicClassLoader @109c25f8)

Eugen04:04:15

stack trace is actually for close-db call

Eugen04:04:31

typo in my code
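For code that accepts either a path or an already opened KV store, one way to sidestep class checks altogether is to branch on whether the argument is a string. This is only a sketch under that assumption: `ensure-kv` is a hypothetical helper, not part of datalevin.

```clojure
(require '[datalevin.core :as d])

(defn ensure-kv
  "Hypothetical helper: returns [store opened-here?].
  A string is treated as a directory path and opened with d/open-kv;
  anything else is assumed to be an already opened KV store."
  [db-or-path]
  (if (string? db-or-path)
    [(d/open-kv db-or-path) true]
    [db-or-path false]))

(comment
  ;; Assumed usage: only close what this code opened itself, with close-kv.
  (let [[db opened-here?] (ensure-kv "/tmp/datalevin/map2")]
    (try
      (d/open-dbi db "a")
      (finally
        (when opened-here? (d/close-kv db))))))
```

Returning the flag alongside the store keeps the caller from closing a store it was handed by someone else.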