Fork me on GitHub
#xtdb
<
2024-03-19
>
kokonut20:03:53

I wonder if xtdb guarantees commit so the recorded data can be pulled/queried right after the recording transaction is committed. For sure I am doing something wrong, my tiny web app makes a query but cannot pull the data that was recorded prior to this step. It seems it can after a few milliseconds. My xtdb is on AWS postgres. Can someone please let me know what I am missing.

kokonut21:03:30

Oh, and plus I am using xtdb 1.

Jacob O'Bryant21:03:59

If you pass the result of xt/submit-tx to xt/await-tx and then get a new DB value, you should be guaranteed to have the transaction results in the DB. e.g.:

(let [tx-result (xt/submit-tx node ...)
      _ (xt/await-tx node tx-result)
      db (xt/db node)]
  ...)

👍 1
🙏 1
kokonut21:03:13

Awesome. Thanks.

👌 1
flefik22:03:07

This query times out despite there not being more than a few thousand records with :type :timecard. How can I optimise it? Are there any docs about how to speed up slow queries?

flefik22:03:15

(xt/q
         (xt/db node) '{:find [tc p1 user d value dd]
                        :where [[tc :type :timecard]
                                [d1 :project p1]
                                [p1 :value value]
                                [tc :email user]
                                [tc :duration d]
                                [tc :date dd]
                                [(clj-time.core/within? interval dd)]
                                ]
                        :order-by [[dd :desc]]
                        :in [interval]}
         interval)

Oliver Marshall09:03:09

Is one of d1, p1 or value supposed to join onto something from tc?

Oliver Marshall09:03:21

It's possible it's timing out because it will be returning every combination of tc and p1, for example:

(xt/submit-tx xtdb-node
              [[::xt/put {:xt/id :a :type :one}]
               [::xt/put {:xt/id :b :type :one}]
               [::xt/put {:xt/id :c :type :two}]
               [::xt/put {:xt/id :d :type :two}]])

(xt/q (xt/db xtdb-node)
      '{:find [id1 id2]
        :where [[id1 :type :one]
                [id2 :type :two]]})
; => #{[:a :d] [:b :d] [:a :c] [:b :c]}

flefik10:03:59

I’ve now rewritten this query like so, which is a lot quicker, but still not as fast as I’d like:

flefik10:03:01

(xt/q
         (xt/db node) '{:find [tc p1 user d value dd]
                        :where [[tc :deliverable d1]
                                [d1 :project p1]
                                [p1 :value value]
                                [tc :email user]
                                [tc :duration d]
                                [tc :date dd]
                                (matches tc interval)
                                ]
                        :rules [[(matches [tc] i)
                                 [tc :type :timecard]
                                 [tc :date dd]
                                 [(clj-time.core/within? i dd)]
                                 ]]
                        :order-by [[dd :desc]]
                        :in [interval]}
         interval)

flefik10:03:39

Yes, in the example I submitted I had missed [tc :deliverable d1]

flefik10:03:17

It was timing out in both versions (with tc :deliverable ,,, and without)

1
Oliver Marshall11:03:02

I tried replicating your issue with some generated data:

(let [projects
      (for [id (range 10)]
        {:xt/id (str "p-" id)
         :value (str "value-" id)})
      deliverables
      (for [id (range 100)]
        {:xt/id (str "d-" id)
         :project (str "p-" (rand-int 10))})
      timecards
      (for [id (range 10000)]
        {:xt/id (str "tc-" id)
         :type :timecard
         :deliverable (str "d-" (rand-int 100))
         :date (t/date-time (+ 2019 (rand-int 3)))
         :email (str id "@email.com")
         :duration :something})]
  (xt/submit-tx xtdb-node
                (->> (concat projects deliverables timecards)
                     (map (fn [doc] [::xt/put doc])))))

(time
  (xt/q (xt/db xtdb-node)
        '{:find [tc p1 user d value dd]
          :where [[tc :deliverable d1]
                  [d1 :project p1]
                  [p1 :value value]
                  [tc :email user]
                  [tc :duration d]
                  [tc :date dd]
                  (matches tc interval)]

          :rules [[(matches [tc] i)
                   [tc :type :timecard]
                   [tc :date dd]
                   [(clj-time.core/within? i dd)]]]

          :order-by [[dd :desc]]
          :in [interval]}
     (t/interval (t/date-time 2019) (t/date-time 2021))))
; (out) "Elapsed time: 479.920167 msecs"
But I can't seem to replicate it timing out With a bit of fiddling though just removing the [tc :type :timecard] give a nice speedup:
(time
  (xt/q (xt/db xtdb-node)
        '{:find [tc p1 user d value dd]
          :where [[tc :deliverable d1]
                  [d1 :project p1]
                  [p1 :value value]
                  [tc :email user]
                  [tc :duration d]
                  [tc :date dd]
                  (matches tc interval)]

          :rules [[(matches [tc] i)
                   ;[tc :type :timecard]
                   [tc :date dd]
                   [(clj-time.core/within? i dd)]]]

          :order-by [[dd :desc]]
          :in [interval]}
     (t/interval (t/date-time 2019) (t/date-time 2021))))
; (out) "Elapsed time: 301.287542 msecs"

Oliver Marshall11:03:30

Btw, this query is just as fast and more readable imo:

(xt/q (xt/db xtdb-node)
      '{:find [(pull tc [:email :duration :date 
                         {:deliverable [{:project [:value]}]}])
               dd]
        :where [[tc :date dd]
                [(clj-time.core/within? interval dd)]]
        :order-by [[dd :desc]]
        :in [interval]}
      (t/interval (t/date-time 2019) (t/date-time 2021)))

flefik11:03:50

That’s so interesting

flefik11:03:57

Because I’m still seeing it slow on my system

flefik11:03:58

Ah no, on a fresh database it’s very quick

flefik12:03:03

And I have way less than 10,000 timecards

Oliver Marshall12:03:20

Hmm, how is your node setup?