This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2023-05-10
Channels
- # aws (39)
- # babashka (4)
- # beginners (5)
- # biff (25)
- # cider (14)
- # clj-on-windows (40)
- # clojure-europe (36)
- # clojure-gamedev (1)
- # clojure-losangeles (4)
- # clojure-norway (51)
- # clojure-spec (5)
- # clojure-uk (2)
- # clojurescript (2)
- # clr (176)
- # data-science (10)
- # datalevin (17)
- # datomic (7)
- # deps-new (4)
- # docs (3)
- # emacs (12)
- # figwheel (3)
- # figwheel-main (5)
- # hyperfiddle (20)
- # instaparse (3)
- # introduce-yourself (8)
- # lsp (66)
- # malli (43)
- # off-topic (4)
- # rdf (11)
- # reagent (5)
- # releases (2)
- # sci (11)
- # shadow-cljs (24)
- # slack-help (2)
- # specter (7)
- # tools-deps (3)
- # xtdb (48)
hi @huahaiy: would it make sense for the datalevin server component to be able to manage multiple LMDB database files, the way PostgreSQL has a cluster of databases (where each db is isolated)? I would imagine this might be useful in a context where there are large datalevin stores and you would like to sync/back up at the individual LMDB file level
The server manages not just multiple databases, but also multiple users, with full RBAC.
I think in the next 2 weeks I will have time to work on datalevin pieces and finish the map/iterator PR
Not sure if it’s best to ask this here or in the datalog channel, but I’m using datalevin so I figured I’d try here first. 😅
I’m new to datalog and datalevin/datascript/datomic.
I’m working with a large dataset (millions of signatures) and I’m effectively trying to get the top N latest results for :signature, ordered by block-time (timestamp), but with the caveat that I only care about signatures that belong to a particular :program.
The only performant way I’ve found to do this is the following, but it feels a tad complicated:
(let [program-address "foo"
      db @db
      [{program-db-id :db/id}] (d/q '[:find [(pull ?p [:db/id])]
                                      :in $ ?program-id
                                      :where
                                      [?p :program/id ?program-id]]
                                    db
                                    program-address)]
  (->> (d/datoms db :ave :signature/block-time)
       (map first)
       (partition-all 10)
       (mapcat #(d/pull-many db [:signature/program] %))
       (filter (comp #{program-db-id} :db/id :signature/program))
       (take 1)))
Am I missing something obvious? In SQL I’d just add an index on block-time and do where program-id order-by block-time limit N to get the latest N values for a given program-id. Here I’ve had to partition the result to lazily pull-many in chunks and then filter by the program-db-id.
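For contrast, the straightforward unoptimized query (a sketch only; db, program-db-id, and n are assumed to be bound as in the snippet above) has to realize every matching tuple before sorting, since the query engine has no order-by/limit:

```clojure
;; Sketch: fetch all [signature block-time] pairs for the program,
;; then sort and limit in ordinary Clojure. On millions of signatures
;; this materializes the whole result set first, which is why the
;; index-walking version above was needed instead.
(->> (d/q '[:find ?s ?t
            :in $ ?p
            :where
            [?s :signature/program ?p]
            [?s :signature/block-time ?t]]
          db program-db-id)
     (sort-by second >)  ; newest block-time first
     (take n)
     (map first))
```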
Any help greatly appreciated. 😀
Right. There’s no query optimizer to speak of (except some obvious optimizations that we do) at this point, so effectively you have to do these optimizations on your own. My goal is to improve the query performance to the point that you can do what you do in SQL. That’s what I am talking about when I say my goal is to bring Datalevin performance on par with RDBMS. Stay tuned.
That’s great news! Datalevin has been a fantastic experience so far.
I believe this goal is achievable, because we can do whatever an RDBMS does behind the scenes. I have done the research and some of the work; I just need to finish it.
I’d normally use sqlite for this sort of stuff, but I’d still be writing migrations and working out the schema! 😆
right, one of the advantages of datalog is that the schema is about attributes, not entities, so it is much more flexible
That’s good to hear. I know ordering/sort-by type stuff is a challenge for Datomic-style dbs, at least that’s my understanding. Mainly wanted to make sure I wasn’t missing something obvious.