This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2023-09-06
I have a composite tuple (2 attrs) that is indexed, but index-range over it seems fairly slow to me: pulling around 700k datoms takes around 700ms, even when all are in the object cache. I profiled it and found almost all of the time is being spent in datomic.common$compare_ex. Is there any way to get insight into what I might be able to do to speed this query up?
How do you know all are in the object cache? index-range doesn’t provide io-stats. How does the index-range compare to just d/datoms over the same attr? e.g. (time (count (d/datoms db :avet attr)))
do you use index-range with an “end” parameter? Does it make a difference if you don’t?
Great questions.
• Object Cache: my actual query is something like [:find ?foo :in $ ?bar :where [(function-that-uses-index-range $ ?bar) ?foo]] and, pleasantly surprisingly, this seems to accurately report io stats. The underlying function only uses index-range. I am also running the profiling after the first execution of the query.
• I should've said that I'm sampling, not profiling, so it can be imprecise, but the time is reported as CPU time.
• For my use case, not using end would fetch way too much data. The composite tuple's schema is [:db.type/ref :db.type/instant], so my range would typically look something like this: start: [some-ent-id #inst "2020-01-01"], end: [some-ent-id #inst "2023-01-01"]
I’m asking about “end” and about d/datoms because maybe comparing values against the end of the range is what is slow. Both of these would remove that comparison.
d/index-range is lazy--is it also 700ms when you just run that function by itself with e.g. count to realize it without allocating all of it?
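A minimal sketch of the comparisons suggested so far: time the raw index-range call with and without an end value, and d/datoms over the same attribute, realizing each only through count. The helper name timed-count is made up for illustration, and db, attr and some-ent-id are placeholders rather than names from the real schema.

```clojure
(defn timed-count
  "Calls (thunk), walks the result with count, and returns the number of
  items plus the elapsed wall-clock time in milliseconds."
  [thunk]
  (let [t0 (System/nanoTime)
        n  (count (seq (thunk)))   ; seq so any Iterable result can be counted
        ms (/ (- (System/nanoTime) t0) 1e6)]
    {:count n :ms ms}))

(comment
  ;; Assumes (require '[datomic.api :as d]) and a db value bound to `db`.
  ;; `attr` and `some-ent-id` are placeholders.
  (let [start [some-ent-id #inst "2020-01-01"]
        end   [some-ent-id #inst "2023-01-01"]]
    {:with-end (timed-count #(d/index-range db attr start end))
     :no-end   (timed-count #(d/index-range db attr start nil))
     :datoms   (timed-count #(d/datoms db :avet attr))}))
```

Run each variant a few times so the first, cache-warming call does not skew the numbers.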
Another comparison you can make, if function-that-uses-index-range is simple to port to a query, is to see how the query does when expressed directly with [?e :attr ?v] [(<= ?start ?v)] [(< ?v ?end)]
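Spelled out as a full query, the direct form could look roughly like the sketch below. It only assembles the clauses from the message above; :attr stands in for the real composite tuple attribute, and whether the comparison predicates order tuple values the same way the index does is an assumption worth checking.

```clojure
;; The query is plain data, so it can be defined without a connection.
(def direct-range-query
  '[:find ?e ?v
    :in $ ?start ?end
    :where
    [?e :attr ?v]
    [(<= ?start ?v)]
    [(< ?v ?end)]])

(comment
  ;; Assumes (require '[datomic.api :as d]) and a db value bound to `db`.
  ;; `some-ent-id` is a placeholder.
  (time
    (count
      (d/q direct-range-query
           db
           [some-ent-id #inst "2020-01-01"]
           [some-ent-id #inst "2023-01-01"]))))
```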
Yeah, I introduced function-that-uses-index-range because the direct query's performance was much worse. Raw usage of index-range without realizing every ?foo is much faster (200ms). I think that's the clue I needed
If your output set is significantly smaller than the input, consider reducing over chunks of your index-range input and running the query for each chunk
I don’t really understand why this is better; I theorize that it’s all about memory pressure from large sets
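A rough sketch of that chunking idea, assuming the per-chunk query returns a collection of results that can be merged into a set. The helper reduce-chunked and the :some/attr attribute are hypothetical; the chunk size of 10000 is an arbitrary starting point, not a recommendation from the thread.

```clojure
(defn reduce-chunked
  "Splits `input` lazily into chunks of `chunk-size`, calls (run-chunk chunk)
  on each chunk, and merges the per-chunk results into one set."
  [run-chunk chunk-size input]
  (reduce (fn [acc chunk] (into acc (run-chunk chunk)))
          #{}
          (partition-all chunk-size input)))

(comment
  ;; Assumes (require '[datomic.api :as d]) and a db value bound to `db`.
  ;; `attr`, `start`, `end` and :some/attr are placeholders.
  (reduce-chunked
    (fn [datoms]
      (d/q '[:find ?foo
             :in $ [?e ...]
             :where [?e :some/attr ?foo]]
           db
           (mapv :e datoms)))
    10000
    (d/index-range db attr start end)))
```

Because partition-all is lazy, only one chunk of datoms and its query results need to be held at a time, which fits the memory-pressure theory above.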