This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2023-05-16
Channels
- # announcements (1)
- # babashka (13)
- # beginners (33)
- # calva (26)
- # cherry (33)
- # clerk (5)
- # clj-kondo (3)
- # clojure (40)
- # clojure-europe (24)
- # clojure-finland (2)
- # clojure-norway (29)
- # clojurescript (18)
- # cursive (2)
- # datomic (6)
- # docker (11)
- # emacs (12)
- # events (1)
- # fulcro (71)
- # graalvm (8)
- # hyperfiddle (2)
- # lsp (23)
- # meander (5)
- # off-topic (36)
- # polylith (4)
- # re-frame (6)
- # reitit (13)
- # shadow-cljs (87)
- # spacemacs (1)
- # tools-deps (19)
- # vim (5)
- # xtdb (57)
I’m struggling with the XTDB v2 query syntax:
(x/q node '{:find [pid person]
:where [($ :people {:xt/id pid, :xt/* person})]})
=> Execution error (IllegalArgumentException) at xtdb.expression/continue-read$fn (expression.clj:253).
No method in multimethod 'read-value-code' for dispatch value: :set
I’m trying to get all fields of a document because there is no x/entity in v2 and x/pull not supported yet. Docs show this example, but I don’t understand what uid means and why :xt/*
would work here but not in my simpler query above:
{:find [event-at event]
:in [uid]
:where [(union-join {:find [uid event-at event]
:where [($ :posts [uid {:xt/valid-from event-at, :xt/* event}])]}
{:find [uid event-at event]
:where [($ :comments [uid {:xt/valid-from event-at, :xt/* event}])]}
{:find [uid event-at event]
:where [($ :likes [uid {:xt/valid-from event-at, :xt/* event}])]})]
:order-by [[event-at :desc]]}
Hey 🙂 I think uid
is "user-id" in the example. I'm not sure what that multimethod error indicates offhand. What did the map look like that you submitted?
(def node (xt.node/start-node {}))
(x/submit-tx node [[:put :people {:xt/id :zahier
:name "Zahier"
:roles #{:admin}}]
[:put :people {:xt/id :petrus
:name "Petrus"
:roles #{:admin}}]
[:put :people {:xt/id :loekie
:name "Loekie"
:roles #{:customer}}]])
(x/q node
'{:find [pid name]
:where [(match :people {:xt/id pid})
[pid :name name]]})
=> [{:pid :zahier, :name "Zahier"} {:pid :petrus, :name "Petrus"} {:pid :loekie, :name "Loekie"}]
(x/q node
'{:find [pid]
:where [($ :people {:xt/id pid, :xt/* person})]})
=> Syntax error (IllegalArgumentException) compiling at (src/zworx/data.clj:28:3).
No method in multimethod 'read-value-code' for dispatch value: :set
Seems to be related to the sets. The query runs if the DB is empty or if I change the :roles set values to vectors.
@U051SPP9Z mind running the same query with :explain? true
?
The multimethod xtdb.expression/read-value-code
doesn’t handle :set yet
yep. we might be able to fall back to the :list
support there, given sets are represented as Arrow lists under the hood (with a little bit of metadata to remind us that they're sets, when required)
v2 Query syntax: How do I query for all records in a table where a set value contains a needle value I’m looking for? E.g. [:put :people {:xt/id :joe :roles #{:admin}]
<= how do I find all :people with :roles that contain :admin
?
we're looking at explicit unnesting in v2, because we've had a fair few people in v1 wanting to get the set/list out as a value, who've been caught out by the implicit unnesting in v1
How does imlipict unnesting look in v1 and how would explicit unnesting look in v2?
implicit unnesting in v1 is just {:find [user], :where [[user :roles :admin]]}
- the roles value is unnested into its individual values
in v2, it currently looks like this:
(xt/q node '{:find [user-id],
:where [($ :people {:xt/id user-id, :roles [role ...]})
[(= role :admin)]]})
i.e. the [role ...]
syntax is saying that 'the :roles
value is a collection, give me each value'
Ah, I see. Yeah implicit unnesting feels dodgy - how to ground an exact value of a seqable thing? Explicit unnesting feels right.
Would be nice to have:
(x/q node
{:find [person]
:in [[?role ...]]
:where [($ :people {:xt/id pid, :xt/* person, :roles [?role ...]}
#{:admin})
Or something like (there is probably official Datalog for binding coll values):
(x/q node
'{:find [person]
:in [?role ...]
:where [($ :people {:xt/id pid,
:xt/* person
:roles roles})
[roles [?role...]]]})
yep, that works:
(xt/q node ['{:find [user-id]
:in [[role ...]]
:where [($ :people {:xt/id user-id, :roles [role ...]})]}
#{:admin}])
(location of the argument depends on which snapshot version you're using - this version literally pushed yesterday)I have been quite tempted to take the unnesting out of the $
, though, as you suggest :thinking_face:
I pulled snapshot a few hours ago and am I’m getting:
(x/q node
'{:find [person]
:in [[role ...]]
:where [($ :people {:xt/id pid,
:xt/* person
:roles [role ...]})]}
#{:admin})
=> Execution error (IllegalArgumentException) at xtdb.api/q& (api.clj:108).
Don't know how to create ISeq from: clojure.lang.Keyword
(All stored :roles
values are vectors now.)Why is this considered malformed:
(x/q node
'{:find [person]
:where [($ :people {:xt/id pid,
:xt/* person
:roles roles})
[(= roles [:admin])]]}
{:explain? true})
=> Execution error (IllegalArgumentException) at xtdb.error/illegal-arg (error.clj:13).
Malformed query
How to find a person with an exact :roles value that is a vector?>
Don't know how to create ISeq from: clojure.lang.Keyword
yep - this is the breaking change of the argument location - these are now passed as (xt/q node [q & args])
, to make it consistent with how args are passed in SQL> How to find a person with an exact :roles value that is a vector? I think this one is just a 'not implemented yet' I'm afraid 🙂
Any possibility of getting the faster vectorized query engine being backported to v1? I have some simple queries that take way too long for ~13k entities. Equivalent Datomic queries ran instantly (obviously no bitemporality).
not likely, I'm afraid - it's quite a different codebase. any way we can help out with optimising the queries?
Is there a way to get :explain? in v1?
the nearest equivalent is (xtdb.query/query-plan-for db q)
, in which the :vars-in-join-order
key tells you the order the query engine'll resolve the different logic variables
Hmmk so :order-by
is what seems to be taking forever. It takes about as long as pulling all 12k matching records w/sort-by w/o :order-by clause.
It’s a double-entry accounting system, so the data model cannot change without losing power: every Transaction has 2+ Entries. Each Entry is either a Debit or a Credit. Transaction has :tx/date (LocalDate) & :tx/description (String). Every entry belongs to an account (:entry/account). Accounts are a tree, so each account has a parent.
Pretty much every query involves querying across :tx/date for tax periods and filters entries that belong to certain accounts with child rules.
Query:
{:find [?entry ?date],
:order-by [[?date :asc]],
:in [?needle parent-account _], ;; ?needle not used in this version.
:rules [[(child-of ?c ?p) [?c :account/parent ?p]] [(child-of ?c ?p) [?c :account/parent ?t] (child-of ?t ?p)]],
:where [[?tx :tx/date ?date]
[?entry :entry/tx ?tx]
[?entry :entry/account ?account]
(or [(= ?account parent-account)] (child-of ?account parent-account))]}
There are a total of 12,230 matching records.
Fetching the first 150 (:limit 150) takes 530ms locally with in-process node. Time does not seem to scale much as :limit goes up.
If I take out the :order-by, it takes 15ms.Taking out the parent account query clause does not make it any faster. Could this be slow because I’m storing LocalDate instead of #inst? I guess I can do tricks like storing ?year, ?month and ?day and cascade sorting for speed.
Meh…I tried adding :tx/year, :tx/month & :tx/day to all txes and cascaded sorting (`:order-by [[?year :asc] [?month :asc] [?day :asc]]`) is only about 25% faster (400ms vs 530ms).
> Time does not seem to scale much as :limit goes up
the :order-by
will always force the whole result set to be computed and sorted behind the scene, so in reality :limit
will only avoid some deserializing at the end of the process
Managed to make this query muuuch faster by putting the account tree query in a subquery.
glad to hear! do you mean instead of using rules altogether? I realise now that I definitely should have suggested to try adding bound variable hints to the rule heads (sorry!), as in:
{:find [?entry ?date],
:order-by [[?date :asc]],
:in [?needle parent-account _], ;; ?needle not used in this version.
:rules [[(child-of [?p] ?c) [?c :account/parent ?p]] [(child-of [?p] ?c) [?c :account/parent ?t] (child-of ?p ?t)]],
:where [[?tx :tx/date ?date]
[?entry :entry/tx ?tx]
[?entry :entry/account ?account]
(or [(= ?account parent-account)] (child-of parent-account ?account))]}
How do bound variables hints work? Is it documented somewhere? Did you mean to reverse the order of parameters in child-of? Is that required for it to work?
this section should help 🙂 https://docs.xtdb.com/language-reference/datalog-queries/#_bound_arguments
> reverse the order of parameters in child-of? Is that required for it to work?
in the sense that the bounding [ ]
braces only work in the first argument position, yes
As I understand, XT keeps an index (I assume a sorted set) of each attribute in a record. How can I read that index directly like with datomic.api/datoms
so that I can apply more performant subfilters from there, or do I need to resort to Datomic for this?
At the moment my queries take 500ms against only 13,000 accounting records.
There's no public, supported API for that kind of direct low-level index access, although the internal protocols are there if you really want to explore that route. Have you tried using simpler queries (which will be direct index scans...more or less) and relying on open-q
to stream the results into your own computations?
When I run the following query, order of ?date
is random:
(x/q (x/db node)
'[:find ?date ?tx
:where
[?tx :tx/date ?date]])
Also if I use open-q:
(with-open [res (x/open-q (x/db node)
'[:find ?date ?tx
:where
[?tx :tx/date ?date]])]
(into [] (iterator-seq res)))
Hnng, even this takes 410ms (?date is LocalDate):
(time
(do (with-open [res (x/open-q (x/db node)
'[:find ?date ?tx
:where
[?tx :tx/date ?date]])]
(into (sorted-set-by (fn [[a t1] [b t2]]
(.isBefore a b))) (iterator-seq res)))
nil))
"Elapsed time: 410.652663 msecs"
=> nil
a range predicate can be used to influence the returned order, iff it correlates with the 'natural' binary sorted order, see https://github.com/xtdb/xtdb/discussions/1514
Is there a “lazy sorting” technique to realize sorted entity IDs sooner? I am only ever displaying at most 150 entities on one screen, so I can pull them in in parallel, but I want to be able to paginate faster.
The notes/messages in that discussion cover this in more detail, but there's an essential difficulty to making generic paginated sorting easy
A lot of people just dump a copy of everything into Elasticsearch for that sort of UI work
Will v2 support prospective txes like with xt/with-tx
?
Eventually yes, you can watch this issue to track progress: https://github.com/xtdb/xtdb/issues/2353
It's unlikely to look like with-tx
- it will likely take the form of a ~stateless query parameter (because db
no longer exists)
I am going to miss db
because I felt safe knowing that unless I passed a node to a fn, nothing could get transacted.
I also wondered why db
will be taken away. Seems like such a natural fit for an immutable DBMS.
Hey @UDQE4G9B2 🙂 Thanks for the Q. There's a couple of reasons we want to remove db
. First is to make the API more data oriented - in v2, all of the temporal dimension is available for query as part of the query argument, and therefore can then be combined with other query clauses. The other reason is to focus on producing a narrower 'core' API, that others can build upon if they so wish. This is partly to make XT more remote-API friendly, but also lessons we've learnt from v1. The db
in v1, in truth, doesn't do a lot, other than represent bitemporal coordinates. It would be fairly straightforward to recreate this in user-space for v2, or to make a thin wrapper, that wraps the core API.
@U051SPP9Z You could hopefully feel safe to some degree, by supplying the transaction time as part of the query, to get read consistent guarantees