#xtdb
2022-05-23
Paul09:05:24

Hi everyone, we are wondering if it is possible to “join by a key”. Our entries have a user UUID as key and we want to get the matching user data. As far as the XTDB documentation goes I wasn’t able to find anything fitting the problem. Current map

{#uuid-0 {...data}
 ...
 #uuid-x {...data}}
Desired map (or something similar if necessary)
{#uuid-0 {...data
          :user-name "Michael Scarn"}
 ...
 #uuid-x {...data
          :user-name "Tuna"}}
Thanks :)

tatut12:05:43

I'm not sure I understand the problem.. what is the current query code?

tatut12:05:07

You can return data from linked entities with pull

tatut12:05:51

(xt/q db '{:find [(pull ?e [:xt/id :attr1 {:link-field [:other-attr1]}])] :where [?e ...]})

Paul12:05:15

(-> (xt/db node)
    (xt/q '{:find  [(pull ?event [*])]
            :where [[?event :xt/id uid]]
            :in    [uid]}
          id))

Paul12:05:37

Currently it is pretty simple. We just take the first returned item.

tatut12:05:51

so you should be able to add the attributes of the linked document to the pull pattern

Paul12:05:27

[(pull ?event [* {:foo/bar [:xt/id :user/name]}])]
Is what we are doing for the case where we have the UUID as value. {:xt/id #uuid…} But in our case the key is the UUID, not the value.

Paul12:05:57

The join, as seen in the documentation, uses the value of the :user/profession and replaces it with the map including the string name.

[(pull ?user [:user/name {:user/profession [:profession/name]}])]
Which is the only obvious documentation on this kind of join I was able to find.

tatut12:05:20

so the value is a map {:xt/id ...}?

tatut12:05:22

I would advise using the :xt/id value directly, I don't know if that would work

tatut12:05:28

joins work directly with the entity id values
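
For reference, a minimal sketch of the kind of pull join discussed above. It assumes XTDB 1.x with `xtdb.api` on the classpath and an in-memory node started from an empty config map; the ids and values (`:u1`, `:doctor`, "Ivan", "Doctor") are made up for illustration.

```clojure
(require '[xtdb.api :as xt])

(def pull-join-result
  (with-open [node (xt/start-node {})] ; empty config = in-memory node
    ;; :user/profession holds the :xt/id of another document
    (xt/await-tx node
      (xt/submit-tx node [[::xt/put {:xt/id :doctor :profession/name "Doctor"}]
                          [::xt/put {:xt/id :u1
                                     :user/name "Ivan"
                                     :user/profession :doctor}]]))
    ;; the nested map in the pull pattern follows that id to the linked document
    (xt/q (xt/db node)
          '{:find  [(pull ?user [:user/name {:user/profession [:profession/name]}])]
            :where [[?user :user/name ?name]]})))
```

The join only works here because the attribute value is itself an entity id, which is the point tatut makes above.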

Paul12:05:47

{#uuid-0 "abc" {:foo 1
                :bar 2}
 ...
 #uuid-x "abd" {:foo 1
                :bar 2}}
Basically our maps look like this. And the reason is to have guaranteed uniqueness in the stored map.

Paul12:05:39

Okay thanks. So ‘join’ is not able to use the key as reference?

tatut12:05:31

maybe I'm misunderstanding, but that map doesn't look like a valid document

tatut12:05:44

UUID can't be an attribute name

Paul12:05:55

{"abc" {:foo 1
        :bar 2}
 ...
 "bcd" {:foo 1
        :bar 2}}
It is probably stored like this and the #uuid prefix is only printed when tabbing the value.

tatut12:05:35

A document is a map from keywords to values and each document must have an :xt/id key

tatut12:05:54

so string keys don't look right

Paul12:05:33

Ah, okay. Now I think I know what you mean. ^^ The whole entry looks like this

{:xt/id "…"
 :other/keys "foo"
 :user-list  {"abc" {:foo 1
                     :bar 2}
              ...
              "bcd" {:foo 1
                     :bar 2}}}
Sorry for the confusion.

Paul12:05:05

And the keys of this :user-list map are the :xt/id values (= UUIDs) I have as references, and those are what I want to get the join working for.

tatut12:05:46

ok, afaict you can't do joins from within deep map values... at least not efficiently in pull

✔️ 1
Paul12:05:43

Alright. Efficiency shouldn’t be a problem in this case as we only look at one result and the nested :user-list map only contains <1000 values (average estimate is 10-20).

Paul12:05:52

Anyways, thank you very much 🙂

tatut12:05:54

user> (xt/submit-tx node [[::xt/put {:xt/id "foo" :user-list {"bar" {:some-thing 1}
                                                              "baz" {:some-thing 2}}}]
                          [::xt/put {:xt/id "bar" :another-thing 11}]
                          [::xt/put {:xt/id "baz" :another-thing 22}]])

;; query by splitting user-list map to k/v
user> (xt/q (xt/db node) '{:find [(pull ?e [*]) ?ul-value (pull ?ul-key [*])]
                           :where [[?e :user-list ?ul]
                                   [(seq ?ul) [[?ul-key ?ul-value]]]]})

;; result
#{[{:user-list {"bar" {:some-thing 1}, "baz" {:some-thing 2}},
    :xt/id "foo"}
   {:some-thing 1}
   {:another-thing 11, :xt/id "bar"}]
  [{:user-list {"bar" {:some-thing 1}, "baz" {:some-thing 2}},
    :xt/id "foo"}
   {:some-thing 2}
   {:another-thing 22, :xt/id "baz"}]}

tatut12:05:49

you can do an ugly hack like that, calling clojure seq function on the map and binding that as keys and values
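
To see why that works, here is a small plain-Clojure sketch (no XTDB involved) of what `seq` does to a map. The var names `user-list`, `entries` and `pairs` are only for illustration.

```clojure
;; The nested map from the example document above.
(def user-list {"bar" {:some-thing 1}
                "baz" {:some-thing 2}})

;; seq turns a map into a sequence of [key value] entries.
(def entries (seq user-list))

;; Destructuring each entry gives one key/value pair per row, which is
;; roughly what the [[?ul-key ?ul-value]] binding does in the query.
(def pairs
  (for [[ul-key ul-value] entries]
    {:ul-key ul-key :ul-value ul-value}))
```

In the query, each `?ul-key` then happens to be a valid entity id, so `(pull ?ul-key [*])` can look it up.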

Paul13:05:14

Yes, looks not as nice and concise as the usual join.

Paul13:05:15

But it answers my questions either way. thx!

Steven Deobald12:05:33

Reminder that the JUXT virtual office is doing an open house again in ~90 minutes: https://twitter.com/juxtpro/status/1528698081637715968

Steven Deobald12:05:14

If it's anything like last time, there will probably be some xtdb conversations happening. 🙂

Martynas M13:05:43

My goodness.... why does gather.town have so many dependencies... It's only the front page. Is there a way to join without registration?

🙈 1
Steven Deobald13:05:20

@U028ART884X Sorry I missed this. The tweet had the Gather link directly last time around so you didn't need to register with LinkedIn. Though it looks like you found Gather itself? You don't need to create an account with Gather to join a space, if that's what you were asking. We'll run another in-office event in about a month.

Martynas M13:05:00

I just wanted to try it out. I don't know what it would've been

Martynas M13:05:28

Hey. Is there a way to have XTDB consume a Kafka queue but then at the same time have some entities circumvent the queue to be inserted directly? I want to have some ephemeral entities that would only live until the instance dies. I think about separating the instances but then I wouldn't be able to run tx functions in a consistent way. 🤔 I.e. I want to use XTDB to run tx functions and subscribe to updates of these updated ephemeral entities but I also want these entities not to be involved in the MQ. Actually these ephemeral entities would have their own transaction functions that wouldn't be run on persisted docs. But that would surely need to separate the instance. So it's... not the same as with-tx because I may have quite a lot of these ephemeral entities. And then I also want to run tx functions there.

jarohen16:05:17

hey @U028ART884X - this isn't possible I'm afraid. one of the properties of XT is that it's completely deterministic based on the content of the tx-log - this is useful because you can be sure that two queries running against different nodes (with the same tx basis) will always return the same results. entities inserted from the side like this would break that property

tatut16:05:54

you could have a separate in-memory xtdb node for the ephemeral things

tatut16:05:29

querying is of course more complicated, but at least there is clear separation between what is the shared state of the system and what is not
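
A rough sketch of that idea, assuming XTDB 1.x, where starting a node with an empty config map gives a purely in-memory node with no Kafka or other shared tx-log behind it. The `:session-1` document and its attributes are hypothetical.

```clojure
(require '[xtdb.api :as xt])

(def ephemeral-entity
  ;; This node is separate from the shared, Kafka-backed node and its
  ;; contents disappear when it is closed.
  (with-open [ephemeral-node (xt/start-node {})]
    (xt/await-tx ephemeral-node
      (xt/submit-tx ephemeral-node
                    [[::xt/put {:xt/id :session-1
                                :derived-from "foo"}]]))
    (xt/entity (xt/db ephemeral-node) :session-1)))
```

Queries that need both kinds of data would have to combine results from the two nodes in application code, which is the extra complexity mentioned above.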

Martynas M18:05:29

It's good that it's deterministic. And these ephemeral entities are a client-side thing. But I have to cache it somehow. I always query either the real data or those ephemeral entities (which are actually derived from real data). This is why I wasn't afraid to mix them with original docs. And I want to react from regular tx log and change those entities. But I'm not yet sure if I can do it without breaking consistency with the tx log. I think I could have some kind of a message queue that I would invoke in a transaction function but it would break consistency. Hmm. Not sure how to address it yet. I.e. having a second XTDB instance would break consistency between the log itself and this new instance because I want to have some transaction functions. Would it be possible to consume this log the second time and populate a second XTDB instance manually? For instance use some kind of core.async/tap function to hook into the stream of events. Blocking the tx ingestion thread would also be nice.

tatut04:05:29

I think using the XTDB tx log and tx functions for any other coordination things sounds like it would bring more trouble and complexity

Martynas M05:05:04

It does sound exactly like that to me as well. But I already have this tx function that takes regular docs and works with ephemeral docs (no other way around). And I think about how to make it more performant (it's not that it's bad but ephemeral entities could skip MQ and that should improve congestion). And this is how I ended up here. What would make my life easier is that I would be able to pass my own args or a second XTDB node into transaction function handler. This way I could handle two XTDB instances with one tx function.....? At this point I don't know what I talk about and I don't know if it can work. Also maybe valid-time is also enough and I could simply use it with in-memory XTDB instance from tx function in a blocking way. That may work too. (Yep, me... casually destroying the database design here xD) One additional way to do this would be to halt the Kafka Consumer and in the meantime calculate the ephemeral entity in the other DB. And then resume the consumer. And somehow react to txs.

Nikolas Pafitis15:05:58

Hi, is it possible to catch exceptions thrown by tx functions from application side?

Nikolas Pafitis15:05:10

Or, rather than just getting false from xt/tx-committed?, is there some other way to at least see which transaction operation failed?

jarohen16:05:54

hey @U0105D1EL4B, I'm afraid that isn't possible at the moment. we have a card to track it, if memory serves

Tomas Brejla08:05:29

I had a need for that as well in the past and I ended up saving the exception data as an xtdb entity, so basically something like this:
1. on the calling side, pass some sort of :correlation-id to the tx function alongside the payload you want to process
2. when an exception occurs in the tx fn, catch it and store it in xtdb; be sure not to throw any exception out of the tx fn
3. once finished, on the calling side you try to look up the "successful entity". If not found, try looking up the error entity using the correlation id. If found, use the data and optionally evict that error info entity.
So I was doing something like this in the tx function:

(catch Throwable t
             [[::xt/put {:xt/id (java.util.UUID/randomUUID)
                         :entity-type :order-placement-error
                         :error-type :throwable
                         :correlation-id (:correlation-id order)
                         :error-data {:throwable t}
                         :order-id (:xt/id order)}]])))}]]))
and then on the calling side, after the tx function got executed:
(if-let [created-order (ffirst (xt/q
                                    (xt/db xtdb-node)
                                    {:find '[(pull ?e [*])]
                                     :where [['?e :xt/id order-id]]}))]
      {:order created-order}
      (if-let [order-placement-error (ffirst (xt/q
                                              (xt/db xtdb-node)
                                              {:find '[(pull ?e [*])]
                                               :where [['?e :entity-type :order-placement-error]
                                                       ['?e :correlation-id order-id]]}))]
        (do
          (xt/submit-tx xtdb-node [[::xt/evict (:xt/id order-placement-error)]])
          {:error order-placement-error})
        {:error :unknown-error}))))
(disclaimer: the code was quickly hacked and just a POC really, so it could be definitely improved. And all this might be a bad idea, but at that time I believe it was a working way to go)

👍 1
🙂 1
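
An end-to-end sketch of the same pattern, assuming XTDB 1.x and an in-memory node. The `:place-order` function, the quantity check and the `:error-message` attribute are hypothetical stand-ins, not the code from the original project.

```clojure
(require '[xtdb.api :as xt])

;; Hypothetical tx function: returns the order put on success, or an
;; error document (instead of throwing) on failure.
(def place-order-fn
  '(fn [ctx order]
     (try
       (when-not (pos? (:quantity order))
         (throw (ex-info "quantity must be positive" {})))
       [[:xtdb.api/put order]]
       (catch Throwable t
         [[:xtdb.api/put {:xt/id (java.util.UUID/randomUUID)
                          :entity-type :order-placement-error
                          :correlation-id (:correlation-id order)
                          :error-message (ex-message t)
                          :order-id (:xt/id order)}]]))))

(def error-messages
  (with-open [node (xt/start-node {})]
    (xt/submit-tx node [[::xt/put {:xt/id :place-order :xt/fn place-order-fn}]])
    ;; An invalid order: the tx still commits, but stores an error entity.
    (xt/await-tx node
      (xt/submit-tx node [[::xt/fn :place-order {:xt/id :order-1
                                                 :correlation-id "c-1"
                                                 :quantity 0}]]))
    ;; Calling side: look the error up by correlation id.
    (xt/q (xt/db node)
          '{:find  [?msg]
            :where [[?e :correlation-id "c-1"]
                    [?e :error-message ?msg]]})))
```

Storing the message string rather than the Throwable itself is a choice made for this sketch only.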
Tomas Brejla08:05:32

btw here's a link to the previous discussion that led to this ^^^ https://clojurians.slack.com/archives/CG3AM2F7V/p1635806296075200

🙏 1
Nikolas Pafitis13:05:08

@U899JBRPF Is it necessary to do a speculative submit and if so why? I'd think it's not necessary since if an operation fails the whole transaction is reverted anyways?

Nikolas Pafitis13:05:21

@U899JBRPF Also in the gist you use this predicate

(dev/with-tx-failed? $ ops)
is this some kind of built-in predicate or do i have to add it to xtdb somehow?

refset09:05:31

Thanks for chiming in @U01LFP3LA6P! It's really great to hear about how you tackled this 🙂
> Is it necessary to do a speculative submit and if so why? I'd think it's not necessary since if an operation fails the whole transaction is reverted anyways?
In my example, with-tx is repeating the original work of the transaction that failed, which is somewhat inefficient but hopefully useful enough.
> is this some kind of built-in predicate or do i have to add it to xtdb somehow?
It's just a regular function that is available on the classpath (in the example it's simply in my dev ns), XT will resolve and execute it dynamically
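
A small sketch of how a speculative transaction can signal failure, assuming XTDB 1.x, where `xt/with-tx` returns a new db value, or nil if the ops would not commit (for example because of a failed match). This is not the code from the gist; the document `:a` and the var name `speculative-check` are made up.

```clojure
(require '[xtdb.api :as xt])

(def speculative-check
  (with-open [node (xt/start-node {})]
    (let [db     (xt/db node)
          ;; a plain put applies cleanly on the empty db
          ok     (xt/with-tx db [[::xt/put {:xt/id :a :v 1}]])
          ;; the match fails because :a does not exist in db,
          ;; so the whole speculative transaction is rejected
          failed (xt/with-tx db [[::xt/match :a {:xt/id :a :v 1}]
                                 [::xt/put {:xt/id :a :v 2}]])]
      {:ok?     (some? ok)
       :failed? (nil? failed)})))
```

Nothing is written to the node in either case; with-tx only produces a speculative db value.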