2022-01-10
Sorry for another question about xtdb queries. See below for a simple query: the first query, which doesn't use or-join, is fast, but the ones using or / or-join timed out. Should I avoid the query shape used in the second/third query? I have 1 million records in total and I'm using rocksdb for the tx, doc and index stores:
user=> (xt/q (xt/db node) '{:find [(pull v? [*])] :where [[v? :x/type :v] [v? :v/vin "vin-1"] ] :limit 10} )
; got the expected 1 record instantly
user=> (xt/q (xt/db node) '{:find [(pull v? [*])] :where [[v? :x/type :v] (or [v? :v/vin "vin-1"] )] :limit 10} )
Execution error (TimeoutException) at xtdb.query.QueryDatasource/q_STAR_ (query.clj:1799).
Query timed out.
user=> (xt/q (xt/db node) '{:find [(pull v? [*])] :where [[v? :x/type :v] (or-join [v?] [v? :v/vin "vin-1"] )] :limit 10} )
Execution error (TimeoutException) at xtdb.query.QueryDatasource/q_STAR_ (query.clj:1799).
Query timed out.
As it happens I've been working on some detailed documentation over the past few days on how rule execution actually works 🙂 The short answer as to what's happening here is that both the `or` and `or-join` execute as inner-loop-join subqueries (whose results must be fully materialised before the outer query can continue), and at the same time the inner query is doing a full table scan, i.e. it's O(n^2) (I think that's right, anyway), and not the "filtering" behaviour that I think you are hoping for
As a workaround, instead of using `[v? :v/vin "vin-1"]` within the `or` (or `or-join`), you can use these two clauses to do the filtering (and it will still execute as an inner-loop-join):
[(get-attr v? :v/vin :nothing) [vvin-val]]
[(= "vin-1" vvin-val)]
I got the error below:
user=> (xt/q (xt/db node) '{:find [(pull v? [*])] :where [[v? :x/type :v] (or-join [v?] [(get-attr v? :v/vin :nothing) [vvin-val]] [(= "vin-1" vvin-val) ] )] :limit 10} )
Execution error (IllegalArgumentException) at xtdb.error/illegal-arg (error.clj:12).
Or join variable never used: v? {:args {:free-args [v?]}, :body [[:term [:pred {:pred {:pred-fn get-attr, :args [v? :v/vin :nothing]}, :return [:tuple [vvin-val]]}]] [:term [:range [[:val-sym {:op =, :val "vin-1", :sym vvin-val}]]]]]}
ah, you need an `and` in there, since otherwise the two clauses are treated as separate legs, instead of having them both in the same leg
user=> (xt/q (xt/db node) '{:find [(pull v? [*])] :where [[v? :x/type :v] (or-join [v?] (and [(get-attr v? :v/vin :nothing) [vvin-val]] [(= "vin-1" vvin-val) ]) )] :limit 10} )
Thanks! Tried that, still timed out though.
user=> (xt/q (xt/db node) '{:find [(pull v? [*])] :where [[v? :x/type :v] (or-join [v?] (and [(get-attr v? :v/vin :nothing) [vvin-val]] [(= "vin-1" vvin-val) ]) )] :limit 10} )
Execution error (TimeoutException) at xtdb.query.QueryDatasource/q_STAR_ (query.clj:1799).
Query timed out.
well, I suppose that may still take some time if it has to churn through 1 million records 🤔 (though 30s still sounds like far too much). Did you already try:
user=> (xt/q (xt/db node) '{:find [(pull v? [*])] :where [(or-join [v?] (and [v? :x/type :v][v? :v/vin "vin-1"]) )] :limit 10} )
ah, so having the `[v? :x/type :v]` clause outside, before the `or-join`, is what slows it down:
(xt/q (xt/db node) '{:find [(pull v? [*])] :where [[v? :x/type :v] (or-join [v?] (and [v? :v/vin "vin-1"]) )] :limit 10} )
yep, exactly 👍 I should have thought to link this issue to you already in my first reply: https://github.com/xtdb/xtdb/issues/1674
the examples in that issue demonstrate (towards the end) that there can still be cases where it makes better sense, in terms of performance, to have such clauses outside of the or-join
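For anyone wanting to compare the two query shapes from this thread on their own machine, here is a minimal sketch. It assumes XTDB 1.x with `xtdb-core` on the classpath and uses a throwaway in-memory node rather than the RocksDB setup described above; the sample documents and the names `outside-q` and `inside-q` are made up for illustration. Timings on a small in-memory dataset will not necessarily reflect what happens with 1 million records.

```clojure
(require '[xtdb.api :as xt])

;; Shape that timed out in the thread: the type clause sits outside the or-join.
(def outside-q
  '{:find [(pull v? [*])]
    :where [[v? :x/type :v]
            (or-join [v?] [v? :v/vin "vin-1"])]
    :limit 10})

;; Shape suggested in the thread: both clauses in one `and` leg inside the or-join.
(def inside-q
  '{:find [(pull v? [*])]
    :where [(or-join [v?]
                     (and [v? :x/type :v]
                          [v? :v/vin "vin-1"]))]
    :limit 10})

;; In-memory node for experimentation only (an assumption, not the thread's setup).
(with-open [node (xt/start-node {})]
  ;; Hypothetical sample data shaped like the documents queried above.
  (let [puts (vec (for [i (range 1000)]
                    [::xt/put {:xt/id (str "v-" i)
                               :x/type :v
                               :v/vin (str "vin-" i)}]))
        tx   (xt/submit-tx node puts)]
    ;; Wait until the transaction has been indexed before querying.
    (xt/await-tx node tx)
    (let [db (xt/db node)]
      (println "type clause outside or-join:")
      (time (xt/q db outside-q))
      (println "type clause inside or-join:")
      (time (xt/q db inside-q)))))
```

Increasing the `(range 1000)` bound should make any difference between the two shapes easier to see. As the linked issue notes, which placement is faster can depend on the query, so it is worth measuring both.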