#xtdb
2021-03-02
Aleksander Rendtslev01:03:00

Given the following document:

{:crux.db/id #uuid "5a5f266c-8311-4e18-b6ce-c06f0aa127ea", :entry/id #uuid "5a5f266c-8311-4e18-b6ce-c06f0aa127ea", :entry/text "Test!", :entry/refs [], :entry/user #uuid "17774272-6ca5-40b4-80f5-88ca21c65c04"}
How come this works:
(repo/q *node*
    `{:find  [(eql/project ?el [:entry/refs :entry/id :entry/user :entry/text])]
      :where [[?el :entry/text ?text]
              [?el :entry/user ?user]]})
;; #{[#:entry{:id #uuid "5a5f266c-8311-4e18-b6ce-c06f0aa127ea", :text "Test!", :refs [], :user #uuid "17774272-6ca5-40b4-80f5-88ca21c65c04"}]}
But this doesn’t:
(repo/q *node*
    '{:find  [?el ?text ?refs ?user]
      :keys [entry/id entry/refs entry/text entry/user]
      :where [[?el :entry/refs ?refs]
              [?el :entry/user ?user]
              [?el :entry/text ?text]]})
;; #{}
The issue seems to be the [?el :entry/refs ?refs] in the where clause

refset10:03:22

Hi, this is because Crux breaks down vector values in documents into individual triples (a.k.a. cardinality-many), so an empty vector in a document is completely invisible to the query engine (aside from eql/project, which works with documents). That is, there are no values in the indexes for ?refs to bind to in [?el :entry/refs ?refs]. This is also the case for set values. There are a few examples in query_test.clj that illustrate this, e.g.
https://github.com/juxt/crux/blob/0f982be3bf57545e456007c0a57fcd5b14a66fdf/crux-test/test/crux/query_test.clj#L868-L883
https://github.com/juxt/crux/blob/0f982be3bf57545e456007c0a57fcd5b14a66fdf/crux-test/test/crux/query_test.clj#L2916-L2952

However, lists and all other Clojure collections are only indexed opaquely as "Objects", which are hashed and serialised into bytes using Nippy (and therefore they won't be lexicographically encoded and sorted in the indexes for efficient range scans).

As it happens, we have an in-flight PR that aims to make this clearer in the docs under a new "Model" section: https://github.com/juxt/crux/pull/1420/files#diff-cf084dcf0c8886e9638cab14dfb2a7341837cb7bc0613103b5d36384ed5466e0R45-R58 ...I'm sorry it wasn't ready in time to avoid confusion on this occasion 😅
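
To illustrate the point about empty vectors, here is a toy model in plain Clojure. It is only a sketch and not Crux's actual indexing code: `doc->triples`, `with-refs` and `without-refs` are hypothetical names, and the model ignores lists and other collection types entirely. It shows why a document with an empty `:entry/refs` leaves nothing for a `[?el :entry/refs ?refs]` clause to match.

```clojure
(defn doc->triples
  "Toy model (not Crux's real indexer): flattens a document into
  [entity attribute value] triples. Vectors and sets contribute one
  triple per element; any other value contributes a single triple.
  The :crux.db/id attribute itself is left out for brevity."
  [doc]
  (let [eid (:crux.db/id doc)]
    (for [[attr value] (dissoc doc :crux.db/id)
          v (if (or (vector? value) (set? value)) value [value])]
      [eid attr v])))

;; Hypothetical sample documents.
(def with-refs    {:crux.db/id :e1 :entry/text "Test!" :entry/refs [:a :b]})
(def without-refs {:crux.db/id :e2 :entry/text "Test!" :entry/refs []})

(set (doc->triples with-refs))
;; includes [:e1 :entry/refs :a] and [:e1 :entry/refs :b]

(set (doc->triples without-refs))
;; => #{[:e2 :entry/text "Test!"]}
;; no :entry/refs triple at all, so ?refs has nothing to bind to
```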

Aleksander Rendtslev11:03:45

I’ll read through that! Luckily eql/project is available; otherwise entries without refs would be inaccessible to me in Crux. One thing about project, by the way: the result is always a set of vectors of maps. I know the vectors are an artifact of the original :find API, but since you implemented the :keys property, I’m wondering if there could be a way to make projections skip the vector.

Aleksander Rendtslev11:03:50

Arhh, it makes a lot more sense with that model doc. Super helpful!!!

🙏 3
refset11:03:23

Thanks for the feedback 🙂
> it’s always projected into a set of vectors of maps
Do you mean specifically when joining in the projection?

Aleksander Rendtslev12:03:37

What I mean is:

(repo/q *node*
    `{:find  [(eql/project ?el [:entry/refs :entry/id :entry/user :entry/text])]
      :where [[?el :entry/text ?text]
              [?el :entry/user ?user]]})

;; OUTPUTS:
;; #{[#:entry{:id #uuid "5a5f266c-8311-4e18-b6ce-c06f0aa127ea", :text "Test!", :refs [], :user #uuid "17774272-6ca5-40b4-80f5-88ca21c65c04"}]}

;; WOULD PREFER
;; #{#:entry{:id #uuid "5a5f266c-8311-4e18-b6ce-c06f0aa127ea", :text "Test!", :refs [], :user #uuid "17774272-6ca5-40b4-80f5-88ca21c65c04"}}
The vector wrapping my eql/projection is almost always redundant for me. Right now I’ve created a repo function that always does the following:
(defn project
  "A query function for use with eql/project"
  ([db query] (project db query nil))
  ([db query coll]
    (->> (q db query coll)
         (map clojure.core/first)
         vec)))

refset11:03:30

Got it, thanks for the example! This is an ergonomic problem more generally, in that :find must currently always be a vector. Adding support for a single unwrapped find-arg as well would solve this: https://github.com/juxt/crux/blob/master/crux-core/src/crux/query.clj#L125
I'll add a note to the backlog for discussion soon 🙂

🙌 3
tianshu23:03:45

I just started learning Crux, and I really like its architecture and the great documentation. Coming from the SQL world, one thing confuses me: how can I get a unique constraint? Should I always use the ID with crux.tx/match to make something unique?

refset23:03:54

Hi 🙂 thanks for the nice words! A uniqueness constraint can be handled 3 ways:
1. Encode the unique value in the ID (e.g. store the value(s) in a string ID, or use a map ID) of an entity representing that value, with a reference to the other entity that currently "owns" the value.
2. Use :crux.tx/match; however, this takes time to confirm successful assertions, and when there is contention you need to implement retry logic at each client.
3. Use a transaction function https://opencrux.com/reference/21.02-1.15.0/transactions.html#transaction-functions - this is slower still, though it at least avoids the need to worry about contention and retrying. You can also express very complex constraints in these functions.
Do you think those are enough to cover the requirement?
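
Options 1 and 2 can be combined. The sketch below only builds the transaction operations as plain data, so it runs without a Crux node. The function name `claim-email-ops` and the attributes `:unique/email` and `:claim/owner` are hypothetical, chosen for illustration. It assumes that matching an ID against `nil` requires the entity not to exist yet, and that a failed match means none of the operations in the transaction are applied.

```clojure
(defn claim-email-ops
  "Builds tx ops that claim `email` for `user-id`.
  The claim entity uses a map ID that encodes the unique value
  (hypothetical attribute names)."
  [email user-id]
  (let [claim-id {:unique/email email}]
    [;; Assumption: a nil document means 'this entity must not exist yet'.
     [:crux.tx/match claim-id nil]
     ;; Only applied when the match above succeeds.
     [:crux.tx/put {:crux.db/id  claim-id
                    :claim/owner user-id}]]))

(claim-email-ops "a@example.com" :user-1)
;; => [[:crux.tx/match {:unique/email "a@example.com"} nil]
;;     [:crux.tx/put {:crux.db/id {:unique/email "a@example.com"}, :claim/owner :user-1}]]
```

The resulting vector would then be passed to submit-tx; a client that loses the race for a value still has to notice the failed transaction and decide whether to retry.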

tianshu13:03:47

Yes, that makes a lot of sense! Thank you very much!

🙏 3
tianshu06:03:37

After playing with these solutions, I find that using crux.db/id + crux.tx/match works well in my case. But how can I know whether a transaction was ignored? Do I have to query that entity again to confirm?

refset09:03:51

Cool, glad to hear that 🙂 check out the tx-committed? API - I think it does what you want: https://opencrux.com/reference/21.02-1.15.0/clojure-api.html#_tx_committed

tianshu09:03:33

Thank you! But I got crux.api.NodeOutOfSyncException even though the tx was successfully committed with submit-tx. Should I always use await-tx before tx-committed?? I saw there is another function, submit-tx-async, so I was thinking submit-tx was the synchronous version.

refset10:03:20

Ah, yes you need to use await-tx first since submit-tx is asynchronous...I hadn't realised that submit-tx-async adds confusion to this before 😬

refset10:03:48

I'll write something about this in the docs now. Sorry it wasn't clearer!

tianshu11:03:35

I see, so I should always use submit-tx and it is async. Thanks!

refset11:03:35

That's right 👍 I was considering adding this example function to the docs to help make things clearer:

;; assumes (require '[crux.api :as crux])
(defn transact
  "Synchronous transactions, reduced throughput"
  [node tx]
  (->> (crux/submit-tx node tx)
       (crux/await-tx node)
       (crux/tx-committed? node)))

tianshu13:03:59

Great! 👍

tianshu13:03:30

I also played with transaction fns; however, when I call submit-tx with a transaction fn, the document seems to disappear (I'm using the in-memory node). I quoted the function with a syntax quote instead of a simple quote. I'll experiment more to get more information.

refset17:03:20

hmm, things definitely shouldn't disappear 🙂 I find syntax quoting can make queries confusing, since all the symbols get namespaced
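
For reference, the namespacing behaviour mentioned above can be seen in plain Clojure, without Crux. This is a small sketch; `plain`, `qualified` and `text-query` are hypothetical names used only for illustration.

```clojure
;; A plain quote leaves symbols exactly as written.
(def plain '[?el :entry/text ?text])

;; A syntax quote qualifies each symbol with the current namespace,
;; e.g. ?el reads as user/?el when evaluated in the user namespace.
(def qualified `[?el :entry/text ?text])

(namespace (first plain))      ; => nil
(namespace (first qualified))  ; the current namespace's name, as a string

;; Inside a syntax quote, ~'sym yields an unqualified symbol
;; and ~expr splices in a local value.
(defn text-query [text]
  `{:find  [~'?el]
    :where [[~'?el :entry/text ~text]]})

(text-query "Test!")
;; => {:find [?el], :where [[?el :entry/text "Test!"]]}
```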

tianshu02:03:20

Okay, good to know that I should use a simple quote. The syntax quote usually works for writing queries, like

`{:where [[x :x/name ~name]]
  :find [(eql/project x ~query)]}
I'm not sure whether this is the proper way.