#xtdb
2022-06-16
flefik07:06:49

Any reason why my xt/await-tx call would block indefinitely?

com.xtdb/xtdb-core {:mvn/version "1.20.0"}
com.xtdb/xtdb-jdbc {:mvn/version "1.20.0"}
Code:
(defn- user-record [email]
  {:xt/id email
   :last-signin (java.util.Date.)
   :type :user})

,,,

(let [tx (xt/submit-tx node [[::xt/put (user-record email)]])]
  (xt/await-tx node tx))
(I checked, and the tx shows up in postgres immediately)
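
While debugging a hang like this, one option is to pass a timeout to await-tx so the call fails instead of blocking forever. The sketch below assumes the three-argument arity of xt/await-tx that takes a java.time.Duration, and assumes the timeout surfaces as a java.util.concurrent.TimeoutException. The namespace and the submit-and-await helper are hypothetical names used only for illustration.

```clojure
(ns example.await
  (:require [xtdb.api :as xt])
  (:import (java.time Duration)
           (java.util.concurrent TimeoutException)))

(defn submit-and-await
  "Submits a single put and waits at most `timeout-secs` seconds for the
   node to index it. Returns whatever await-tx returns, or nil on timeout.
   Hypothetical helper, not part of XTDB."
  [node doc timeout-secs]
  (let [tx (xt/submit-tx node [[::xt/put doc]])]
    (try
      (xt/await-tx node tx (Duration/ofSeconds timeout-secs))
      ;; Assumption: await-tx signals a timeout with TimeoutException.
      (catch TimeoutException _
        nil))))
```

A nil result here would mean the transaction reached the tx-log but the node did not index it within the timeout, which narrows the problem to ingestion rather than submission.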

refset07:06:21

Hi, can you try upgrading to 1.21? There were a few jdbc fixes that might unblock this

flefik08:06:03

okay thanks. I’ve just done that, and I’ve also migrated to rocksdb

refset10:06:55

cool, did it have any impact? still hanging?

flefik09:06:52

I don’t know yet. I had to roll back the rocksdb migration since it’s crashing the JVM

refset09:06:33

Oh, that sounds weird. Are you using an M1?

flefik19:06:37

No, amd64 Ubuntu

flefik19:06:20

The upgrade to 1.21 does not solve the problem. It works for a little while, but after a couple of minutes/hours it seems that new transactions aren’t being ingested by my node

flefik19:06:27

If I reboot it recovers
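
One way to confirm that the node has stopped ingesting, rather than the tx-log failing to accept writes, is to compare the latest submitted and latest completed transactions. This is a minimal sketch assuming xt/latest-submitted-tx and xt/latest-completed-tx each return a map containing ::xt/tx-id (or nil when there is none). The tx-lag and indexing-lag helpers are hypothetical names.

```clojure
(ns example.lag
  (:require [xtdb.api :as xt]))

(defn tx-lag
  "Difference between the submitted and completed tx-ids.
   A missing tx (nil) is treated as tx-id -1."
  [submitted completed]
  (- (get submitted ::xt/tx-id -1)
     (get completed ::xt/tx-id -1)))

(defn indexing-lag
  "Returns the latest submitted tx, the latest completed tx, and the gap
   between their tx-ids for the given node."
  [node]
  (let [submitted (xt/latest-submitted-tx node)
        completed (xt/latest-completed-tx node)]
    {:submitted submitted
     :completed completed
     :lag       (tx-lag submitted completed)}))
```

If :lag keeps growing while :completed stays fixed, the indexer has stalled. Note that the gap is a difference in tx-ids and not necessarily a count of pending transactions.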

refset20:06:05

hmm, and what does your RAM and CPU look like while this is happening?

refset20:06:44

Maybe there's some missing memory config

flefik20:06:49

(defstate node
  :start (xtdb/start-node
          {:xtdb.jdbc/connection-pool {:dialect {:xtdb/module 'xtdb.jdbc.psql/->dialect}
                                       :db-spec (:db (config))}
           :xtdb/tx-log               {:xtdb/module     'xtdb.jdbc/->tx-log
                                       :connection-pool :xtdb.jdbc/connection-pool}
           :xtdb/document-store       {:xtdb/module     'xtdb.jdbc/->document-store
                                       :connection-pool :xtdb.jdbc/connection-pool}
           ; :xtdb.http-server/server   {:port 5000}
           ;; :xtdb/index-store {:kv-store {:xtdb/module 'xtdb.rocksdb/->kv-store
           ;;                               :db-dir (io/file "/tmp/rocksdb/db")}}
           })
  :stop (.close @node))

flefik20:06:40

(:db (config)) is just a database url of the form :port/dbname

refset10:06:41

Sorry for the delay - thanks for sharing the extra info, although it's still not clear what might be happening :thinking_face:

refset10:06:20

What are your JVM memory configs (if any), i.e. -Xms -Xmx?

refset10:06:48

My next step would be to reach for a profiler to look for heap memory usage / GC pausing, using https://www.yourkit.com/java/profiler/features/ or similar

refset10:06:56

How much data are you loading in total? How many transactions? How many operations per transaction on average?

refset11:06:32

Do you know what is happening with the Postgres instance? Is it reporting that everything is healthy?

flefik08:06:39

Is there any documentation anywhere on how to improve the performance of queries?

flefik08:06:36

Here’s my slow query:

(defn active-deliverables [node]
  (->> (xtdb/q
        (xtdb/db node) '{:find [d cn (sum du) (pull d [*]) (distinct deps)]
                         :where [[d :type :deliverable]
                                 [d :project p]
                                 [p :code-name cn]
                                 (not [d :date nil])
                                 (not [d :status _])
                                 (or-join [du d]
                                          (and [tc :type :timecard]
                                               [tc :duration du]
                                               [tc :deliverable d])
                                          (and [tc :type :timecard]
                                               [tc :duration du]
                                               [tc :deliverable d2]
                                               [d2 :parent d])
                                          (and [(identity 0) du]
                                               [d :type :deliverable]))
                                 (or-join [deps d]
                                          (and
                                           [deps :type :deliverable]
                                           [deps :parent d])
                                          (and [(identity nil) deps]
                                               [d :type :deliverable]))]})
       (map (fn [[_ code-name ssf attr deps]] (merge attr
                                                     {:code-name code-name
                                                      :spent-so-far ssf
                                                      :dependencies deps})))))
It pulls in the effort spent on a task (deliverable), which parent tasks it has and the project codename the task belongs to. Example records from the schema are:
{:type :project :codename x :status "active"}
{:xt/id #uuid "..." :type :deliverable :project p :status "completed" :parent d2}
{:type :timecard :duration du :deliverable d}
There are 10000 timecards, 1000 deliverables and ~100 projects

tatut11:06:18

did you try (xtdb.query/query-plan-for db ...query...) to see if it gives anything interesting

refset12:06:01

> Is there any documentation anywhere on how to improve the performance of queries?
There's no central resource, but there have been a lot of discussions and issues that might hold clues. Looking at query-plan-for and the vars-in-join-order in particular is the best first step
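
As a rough sketch of that first step, the join order can be pulled out of the plan like this. It assumes the two-argument form of xtdb.query/query-plan-for mentioned earlier in the thread and assumes the returned plan is a map with a :vars-in-join-order key. The namespace and the join-order helper are hypothetical.

```clojure
(ns example.plan
  (:require [xtdb.api :as xt]
            [xtdb.query :as q]))

(defn join-order
  "Returns the order in which the query engine plans to join the logic
   variables of `query` against the current db of `node`.
   Assumption: the plan map exposes :vars-in-join-order."
  [node query]
  (:vars-in-join-order (q/query-plan-for (xt/db node) query)))
```

Calling (join-order node '{:find [d] :where [[d :type :deliverable]]}) and then repeating it with the full slow query should show which variables are bound first. A high-cardinality variable near the front is usually the thing to look at.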

refset12:06:37

I suspect the two (not ...) rule clauses are the source of the slowness though

refset12:06:19

You could try instead:

;; (not [d :date nil])
[d :date date]
[(some? date)]
And for
(not [d :status _])
...you could store explicit nil values also, if that is acceptable for your data model(?)

flefik09:06:17

Thank you @U11SJ6Q0K and @U899JBRPF I will explore these ideas!

🙏 1
citrouille08:06:39

Hello, does XTDB provide a mechanism to 'populate' a fresh new Kafka topic (using the Kafka document store) from a checkpoint or a local RocksDB instance? The tx-log is also managed by Kafka.

refset12:06:25

Hi @U026YA2AQSX we use the term 'migrating' to describe this. Given the large number of possible migration paths, we haven't yet built a generic tool for this, however you can attempt to use / draw inspiration from the last two commits here: https://github.com/xtdb/xtdb/tree/crux-migration-hack

refset12:06:43

If this is something urgent you need help with though, please feel free to DM me and I'm sure we can help

citrouille12:06:16

Thanks Jeremy! I'll take a look and try to make a POC with it. For the moment it's not really critical; we are spending some time on a strategy in case of Kafka failure, since the database is constantly growing :thumbsup::skin-tone-3:

🙏 1