#xtdb
2024-01-30
Namit Shah06:01:48

Hi. Can anyone help me with the following question? > Does using RocksDB for all stores consume less CPU overall (reading + writing) than using a combination of JDBC (document store) + Kafka (transaction store) + RocksDB (index store)?

refset13:01:55

Hey @U04DDKZJERF Rocks is definitely mechanically simpler than either a full SQL database (used in the case of a JDBC backend) or Kafka, so I expect it can be much more efficient than both for certain workloads. However, it's a bit of an apples-to-oranges comparison, since using Rocks as a document-store/tx-log will not give you a Highly Available system (and therefore you would need to migrate to Kafka/JDBC if you want HA later down the road)

Namit Shah04:01:10

Noted. The problem: we have a service running with a full RocksDB backend, and it works smoothly. We want to move to an HA setup, so we went with the combination of JDBC (document store) + Kafka (transaction log) + RocksDB (index store). The service's job is to read the DB events present in Kafka and build the documents in XTDB (this also includes running a few queries while writing). With this combination, CPU utilization increased and at some point reaches its maximum, which slows down the service; ultimately the consumer that is polling Kafka for events gets kicked out and it crashes. Is there any extra configuration I am missing that would prevent the high CPU utilization, or is something else going on here?
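
For reference, a node using the combination described above might be configured roughly as in the sketch below. This is not the poster's actual configuration: it assumes XTDB 1.x with the `xtdb-rocksdb`, `xtdb-jdbc` and `xtdb-kafka` modules on the classpath and a Postgres dialect, and every path, host name, credential and topic name is a hypothetical placeholder.

```clojure
(require '[clojure.java.io :as io]
         '[xtdb.api :as xt])

;; Hypothetical values throughout: adjust paths, hosts, credentials and topic name.
(def node-config
  {;; Query indexes stay local to each node, on RocksDB.
   :xtdb/index-store {:kv-store {:xtdb/module 'xtdb.rocksdb/->kv-store
                                 :db-dir (io/file "/var/lib/xtdb/indexes")}}
   ;; Shared document store in a SQL database via JDBC (Postgres assumed here).
   :xtdb/document-store {:xtdb/module 'xtdb.jdbc/->document-store
                         :connection-pool {:dialect {:xtdb/module 'xtdb.jdbc.psql/->dialect}
                                           :db-spec {:host "db.example.internal"
                                                     :dbname "xtdb"
                                                     :user "xtdb"
                                                     :password "change-me"}}}
   ;; Shared transaction log on Kafka.
   :xtdb/tx-log {:xtdb/module 'xtdb.kafka/->tx-log
                 :kafka-config {:bootstrap-servers "kafka.example.internal:9092"}
                 :tx-topic-opts {:topic-name "xtdb-tx-log"}}})

(comment
  ;; Starting the node needs the database and Kafka broker to be reachable.
  (def node (xt/start-node node-config)))
```

With a layout like this only the index store is local; the document store and transaction log are remote, so each node does network I/O and (de)serialization that an all-RocksDB node does not.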

refset12:01:36

> the CPU utilization
As in, for the server the XT node is using, specifically? If anything I would expect less CPU usage in that scenario :thinking_face:

refset12:01:07

it might be a good time to connect a profiler (Yourkit or whatever) for more clues

Namit Shah12:01:20

Will try to configure the profiler and see if I can find anything. Thanks for the help.

refset15:01:53

You're very welcome, I will be on standby 🙂

Namit Shah07:02:19

AFn.java:18 xtdb.tx$__GT_tx_ingester$fn__41300$submit_job_BANG___41301$fn__41302.invoke() 64353 91
I connected the profiler and monitored the CPU. I inspected the period where CPU usage was high, and the entry with the highest time was the one above. I am not sure what information this gives. Does this mean that the transaction logic is CPU intensive?

refset11:02:30

hmm, yes that does sound relevant, although it's still not clear why there should be a difference between Rocks and the other backends here

refset11:02:38

can you generate a flame graph?

refset11:02:29

I would be very surprised if that submit-job! call could be the problem itself

benny13:01:24

Maybe this is more a datalog question, but I'll ask here for now. I'm trying to model a durable command/event queue in xtdb. In the query that is giving me pause I'm looking for all commands that aren't referenced by an event.

(->>
 (xt/q (xt/db !xtdb)
       '{:find [?e (pull ?e [:command/name
                             :command/payload
                             {(:event/_parent {:as :handle-event})
                              [:event/name :xt/id]}])]
         :where [[?e :db/type :db.type/command]]})
 (filter (fn [[_eid doc]] (not (contains? doc :handle-event)))))
Is there a way to have this expressed in pure datalog?

Thomas Moerman15:01:50

try perhaps:

'{:find [?e (pull ?e [:command/name
                        :command/payload
                        {(:event/_parent {:as :handle-event})
                         [:event/name :xt/id]}])]
    :where [[?e :db/type :db.type/command]
            (not [?e :handle-event])]}

benny16:01:17

That doesn't work. I guess it can't because :handle-event is only created in the :find pull-syntax and doesn't exist. An event references a command but not vice-versa.

Thomas Moerman16:01:25

Ah i see what you mean now, sorry for shooting from the hip :face_with_cowboy_hat: 🔫

Thomas Moerman16:01:31

perhaps this?

'{:find [?cmd (pull ?cmd [:command/name
                          :command/payload
                          {(:event/_parent {:as :handle-event})
                           [:event/name :xt/id]}])]
  :where [[?cmd :db/type :db.type/command]
          (not-join [?cmd]
            [?evt :event/parent ?cmd])]}

benny16:01:17

Yeah! That works! Thanks. :-)
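
To see why the `not-join` version works, here is a minimal sketch against an in-memory XTDB 1.x node. The three documents and their ids (`:cmd-1`, `:cmd-2`, `:evt-1`) are made up for illustration, and the `pull` is dropped so that only the filtering logic remains. It assumes `xtdb-core` is on the classpath.

```clojure
(require '[xtdb.api :as xt])

(def unhandled
  ;; An empty config map starts a purely in-memory node.
  (with-open [node (xt/start-node {})]
    ;; Hypothetical data: two commands, and one event that points at :cmd-1.
    (xt/await-tx
     node
     (xt/submit-tx
      node
      [[::xt/put {:xt/id :cmd-1 :db/type :db.type/command :command/name "first"}]
       [::xt/put {:xt/id :cmd-2 :db/type :db.type/command :command/name "second"}]
       [::xt/put {:xt/id :evt-1 :db/type :db.type/event :event/parent :cmd-1}]]))
    ;; Keep each command for which no event document has it as :event/parent.
    ;; ?evt exists only inside the not-join; ?cmd is the shared variable.
    (xt/q (xt/db node)
          '{:find [?cmd]
            :where [[?cmd :db/type :db.type/command]
                    (not-join [?cmd]
                      [?evt :event/parent ?cmd])]})))

(println unhandled) ; prints #{[:cmd-2]}
```

The plain `(not [?e :handle-event])` attempt fails because `:handle-event` is only an alias produced by the pull expression at result time; no stored document carries that attribute, so the `:where` clause has to test the reverse reference from the event side instead.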
