#xtdb
2022-09-06
Dmitri Akatov12:09:00

Hi, I’m trying to understand how the :xt/id values are stored, indexed, and looked up, in particular what happens when I use a map, which is allowed according to https://docs.xtdb.com/language-reference/datalog-transactions/#valid-ids. Judging by https://discuss.xtdb.com/t/entity-document-ids-with-nested-maps-and-equality-semantics/42 (about the caveat of using ints and longs as an :xt/id directly vs. inside a (nested) map) and https://github.com/xtdb/xtdb/blob/master/core/src/xtdb/codec.clj, these maps at some point end up getting converted:

(type (xtdb.codec/new-id {:some "map"}))
;; => xtdb.codec.Id 
Now https://xtdb.com/blog/dev-diary-may-22/ claims that
> Pillar #4: Dynamism
> XTDB 1.21.0 allows almost any data to be stored including deeply nested documents with arbitrary java.io.Serializable types (thanks to Nippy: https://github.com/ptaoussanis/nippy), however, SARGable (index-backed, see https://en.wikipedia.org/wiki/Sargable) querying is restricted to top-level values.
My questions are thus:
How do map values for :xt/id actually get stored, and how do they get looked up?
Is it by always running them through the xtdb.codec/new-id function for both storage and lookup?
And does the blog post describe storing other nested values rather than :xt/id values here?

refset12:09:23

Hey @Dmitri Akatov, IDs and values are handled pretty similarly, although the binary forms for the two use different type prefixes. The codec namespace is fairly self-contained and a good place to build up some foundational understanding, e.g. https://github.com/xtdb/xtdb/blob/065f41945daed67c6742a5c480864eaa24f6e48d/core/src/xtdb/codec.clj#L536-L618

Dmitri Akatov12:09:38

so does this mean that nested values still get indexed in a manner of speaking, but it’s like a hash of the entire structure that’s put in the index rather than any individual values?

refset12:09:20

that's exactly right, yep

🙌 1
refset12:09:03

~large values are always pulled from the document store (via a cache), because an actual hash cannot be reversibly decoded 🙂
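
A toy illustration of the point above, in plain Clojure: when only a hash of the whole structure is used as the index key, a lookup has to supply the whole value, and the nested pieces are not keys themselves. Everything here is an assumption made for illustration: `index-key`, `toy-index`, and `lookup` are hypothetical names, and Clojure's built-in `hash` is only a stand-in, not the encoding xtdb.codec actually uses.

```clojure
;; Stand-in for an index key: Clojure's built-in `hash`, NOT XTDB's real encoding.
(defn index-key [v]
  (hash v))

;; A toy index from hashed id to document (hypothetical data).
(def toy-index
  {(index-key {:user "alice" :n 1})
   {:xt/id {:user "alice" :n 1} :name "doc-1"}})

;; Lookup runs the id through the same function that was used for storage.
(defn lookup [index id]
  (get index (index-key id)))

;; An equal map hashes to the same key, regardless of key order.
(println (:name (lookup toy-index {:n 1 :user "alice"}))) ; prints doc-1

;; The nested value "alice" is not itself a key in the index.
(println (contains? toy-index "alice")) ; prints false
```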

Vinicius Vieira Tozzi22:09:33

Hello everyone, I am very new to xtdb and I am having a weird issue. When I run this query:

(xt/q (xt/db xtdb-node)
      '{:find [(pull ?e [*])]
        :where [[?e :xt/id 7]]})
I get one entity with id 7, which seems right to me, but if I put this in a function where I get the id as an argument, like this:
(defn get-todo [id]
  (xt/q (xt/db xtdb-node)
        '{:find [(pull ?e [*])]
          :where [[?e :xt/id id]]}))

(get-todo 7)
Then I get all the entities from the database. Am I missing something super obvious?

Alex Miller (Clojure team)22:09:11

in the query, id is inside a quoted form, so it's the symbol 'id, not 7

1
🙏 1
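
A minimal sketch of what the quoted form contains, in plain Clojure with no XTDB required. `quoted-query` and `third-of-first-clause` are hypothetical helper names introduced only for this illustration; the query map is the one from the question.

```clojure
;; Hypothetical helper mirroring the first get-todo from the question.
(defn quoted-query
  "Returns the quoted query map. The argument is never used, because
  quoting keeps `id` as a literal symbol instead of evaluating it."
  [id]
  '{:find [(pull ?e [*])]
    :where [[?e :xt/id id]]})

;; Hypothetical helper: the value position of the first :where clause.
(defn third-of-first-clause [query]
  (get-in query [:where 0 2]))

(println (third-of-first-clause (quoted-query 7)))           ; prints id
(println (symbol? (third-of-first-clause (quoted-query 7)))) ; prints true
```

Since the query only ever sees the symbol id, it is treated as an unbound logic variable, which is consistent with every entity being returned. Binding the value through an :in clause and passing it as an extra argument to xt/q keeps the query quoted while still supplying 7.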
Vinicius Vieira Tozzi22:09:40

Ah I was really missing something obvious, thanks!

tatut09:09:12

also xt/entity and as a general note, it’s better to pass in a db value to functions instead of accessing a global node

Vinicius Vieira Tozzi22:09:19

Yes, :in was really the correct way to solve that, thanks.

Vinicius Vieira Tozzi22:09:44

@tatut But I did not understand: what do you mean by passing a db value instead of accessing a global node? How can I do that? I thought xt/q would always get a db node as a parameter

Jacob O'Bryant22:09:34

He's saying that instead of a function like this:

(defn get-todo [id]
  (xt/q (xt/db xtdb-node)
        '{:find [(pull ?e [*])]
          :in [id]
          :where [[?e :xt/id id]]}
        id))
It's usually better to have a function like this:
(defn get-todo [db id]
  (xt/q db
        '{:find [(pull ?e [*])]
          :in [id]
          :where [[?e :xt/id id]]}
        id))
It's better functional style, since db is an immutable value while xtdb-node is not, so the second get-todo is a pure function. Also, in this specific case, what you actually want is probably just the xt/entity function. Instead of calling (ffirst (get-todo db id)) (using the function in the second example), you can call (xt/entity db id), which does the same thing. Another option would be (xt/pull db '[*] id), which can be better in some situations.

☝️ 1
Vinicius Vieira Tozzi08:09:49

Ok got it, yeah makes sense, thanks for the explanation!