#xtdb
2021-11-23
cjmurphy09:11:31

After tx-committed? has returned false, is there a corresponding function in PXtdb (or elsewhere) that will give the reason why the commit did not happen?

refset12:11:17

There's not currently, although this has been discussed here in the past few weeks: https://clojurians.slack.com/archives/CG3AM2F7V/p1635806296075200 (...which means it's about time for a proper entry in the docs 😅) The short answer is that we could store this kind of information, but currently don't. What is your use-case? Are you using match operations and want to observe contention? Or is this for development usage when working with transaction functions? etc.
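For reference, a minimal sketch of the flow being discussed (assumes an already-started node bound to node; the put is just a placeholder document):

(require '[xtdb.api :as xt])

;; submit a transaction, wait for it to be indexed, then check whether it committed
(let [tx (xt/submit-tx node [[::xt/put {:xt/id :example :value 1}]])]
  (xt/await-tx node tx)
  ;; true if the transaction was indexed and committed, false if it aborted;
  ;; as noted above, XTDB doesn't currently record *why* an abort happened
  (xt/tx-committed? node tx))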

cjmurphy13:11:44

Just for development when creating transaction functions. As it happens this time I was doing a count of an infinite sequence inside a transaction function, so there must have been a timeout. Speed of development is the use case!

👍 1
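For anyone following along, a minimal sketch of a transaction function (the :increment-counter id and document shape are illustrative, not from the thread): if the quoted body throws, or never terminates, the transaction aborts and tx-committed? returns false.

;; install the transaction function as a document whose :xt/fn is a quoted fn
(xt/submit-tx node
  [[::xt/put {:xt/id :increment-counter
              :xt/fn '(fn [ctx eid]
                        (let [db (xtdb.api/db ctx)
                              entity (xtdb.api/entity db eid)]
                          ;; must return a vector of tx ops
                          [[::xt/put (update entity :counter inc)]]))}]])

;; invoke it; an assert tripping (or an infinite sequence being realised) in the
;; body means the transaction aborts rather than commits
(let [tx (xt/submit-tx node [[::xt/fn :increment-counter :my-counter]])]
  (xt/await-tx node tx)
  (xt/tx-committed? node tx))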
refset13:11:34

Thanks, that's useful to know. I'll try to gather tips on developing with transaction functions somewhere

cjmurphy13:11:32

The infinite sequence wasn't the only problem. The next was an assert being tripped. I only found that with debug output between calls; once I got to the problem call, I wrote a test just for that function call.
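One sketch of making that kind of test cheap (all names here are illustrative): keep the real logic in a plain, fully-qualified function that returns tx ops, have the quoted :xt/fn delegate to it, and unit-test the plain function directly against a db value.

(ns my.app.orders
  (:require [xtdb.api :as xt]))

;; plain function: easy to call from a test with any db value
(defn place-order-ops [db order]
  (assert (:order/item order) "order must have an item")
  [[::xt/put order]])

;; the installed transaction function just delegates to it
(xt/submit-tx node
  [[::xt/put {:xt/id :place-order
              :xt/fn '(fn [ctx order]
                        (my.app.orders/place-order-ops (xtdb.api/db ctx) order))}]])

;; in a test, exercise the logic without going through submit-tx at all
(place-order-ops (xt/db node) {:xt/id 1 :order/item "widget"})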

Tomas Brejla11:11:13

regarding the thread that @U899JBRPF mentioned ^^^. I didn't need :match in my use case and it probably wouldn't be useful in a tx function. My use case required a transaction function anyway, and I was in control of throwing some "business error" inside that tx function myself. So instead of throwing an error, I simply started returning an "error document" with a correlation id that I provide to the function as an argument. The flow that I ended up with (for now) looks like:
1. Call a tx function that should place the order entity (if all prerequisites are met). Pass a correlation-id to that tx function.
2. submit-tx, await-tx.
3. If the tx didn't get committed, then something went terribly wrong and I have to check the logs etc. In normal conditions, the tx should get committed no matter what and either create the order itself or an error document.
4. Fetch the order (query by its pre-defined :xt/id) which was supposed to be created.
5. If the order wasn't found, try to fetch the error document by correlation id.
6. Alternatively, also delete the error document if it's no longer needed.
It is a bit complex, but it has worked fine for me so far. One could probably create a helper macro or function to help build a similar pattern if you have to use it over and over again.

💡 2
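A rough sketch of the pattern described above (assumes an xt-aliased xtdb.api and a started node; all the names — :place-order, :order/..., :error/... — are illustrative, not taken from the original code):

;; tx function: either creates the order or, on a business error, creates an
;; "error document" keyed by the caller-supplied correlation id
(xt/submit-tx node
  [[::xt/put
    {:xt/id :place-order
     :xt/fn '(fn [ctx order correlation-id]
               (let [db (xtdb.api/db ctx)]
                 ;; illustrative prerequisite: the referenced customer must exist
                 (if (xtdb.api/entity db (:order/customer order))
                   [[::xt/put order]]
                   [[::xt/put {:xt/id correlation-id
                               :error/type :order/customer-not-found
                               :error/order-id (:xt/id order)}]])))}]])

;; caller side: roughly steps 1-5 from the flow above
(let [correlation-id (java.util.UUID/randomUUID)
      order-id (java.util.UUID/randomUUID)
      order {:xt/id order-id :order/item "widget" :order/customer :some-customer}
      tx (xt/submit-tx node [[::xt/fn :place-order order correlation-id]])]
  (xt/await-tx node tx)
  (cond
    (not (xt/tx-committed? node tx)) ::something-went-terribly-wrong
    (xt/entity (xt/db node) order-id) ::order-placed
    :else (xt/entity (xt/db node) correlation-id)))  ;; the error document, if any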
sheluchin18:11:22

I know it's a very broad question with countless possible solutions, but I'm pretty new to XTDB and running into some performance issues. Most of my needs are around aggregating stats across time periods. The schema is dead simple at the moment. Are there some low hanging fruit to go for when optimizing for that sort of thing? I'm presuming the choice of the index store is impactful. Do backends used for the tx log and docstore matter much? Where does query optimization typically take place in the XTDB stack? Some general advice would be appreciated.

refset19:11:03

> Most of my needs are around aggregating stats across time periods
Can you share an example of the kind of aggregation query? In general, :find aggregates will involve scanning through a lot of data.
> Are there some low hanging fruit to go for when optimizing for that sort of thing?
Firstly I'd compare the shape of your query against various known issues, e.g. https://github.com/xtdb/xtdb/issues/1533 and https://github.com/xtdb/xtdb/issues/1447 Then you could turn on DEBUG logging for xtdb.query to get an insight into the crucial vars-in-join-order part of the query plan. However, if you want to send me a gist privately (or whatever) I'd be happy to give it a little thought before you go too far down any rabbit holes.
> Do backends used for the tx log and docstore matter much?
When querying, the doc-store only comes into play when using pull and there's a cache miss.
> Where does query optimization typically take place in the XTDB stack?
During query compilation, which is before any :in binding cardinalities are taken into account. Due to the underlying execution model there isn't a simple equivalent to SQL's 'EXPLAIN' (...yet)

sheluchin17:11:57

Hey @U899JBRPF, thanks for the offer. The queries I'm dealing with currently look something like this:

(defn- inst->year
  [inst]
  (-> inst
      dt/beginning-of-year
      dt/inst->html-date
      (subs 0 4)))

(defn parent-subchildren-per-year
  [db {parent-id :parent/id}]
  (-> (query db
        {:find '[?year (count ?subchild)]
         :in '[?parent-id]
         :keys '[stats/period stats/count]
         :where '[[?children :child/parent ?parent-id]
                  [?subchildren :subchild/child ?children]
                  [?subchildren :subchild/datetime ?date]
                  [(database-queries/inst->year ?date) ?year]]
         :order-by '[[?year :asc]]}
        parent-id)))
With about 50K records loaded in the DB, this is taking a number of seconds per query... a bit longer than I'd like, even on my own machine. I'm guessing the inst->year translation is kinda expensive, but I'm not aware of a cheaper way of doing it.

Tomas Brejla20:11:24

@UPWHQK562 is the query faster if you remove the (count ?subchild) part from the :find clause of the query?

sheluchin21:11:09

@U01LFP3LA6P doesn't seem like it.

refset13:11:51

(Sorry about the delayed response!) I'm reasonably sure that inst->year will be insignificant here. Could you try hardcoding parent-id first and see if there is a speedup:

(defn parent-subchildren-per-year
  [db {parent-id :parent/id}]
  (-> (query db
        {:find '[?year (count ?subchildren)] ; I'm guessing `?subchild` was a typo
         :keys '[stats/period stats/count]
         :where [['?children ':child/parent parent-id]
                 '[?subchildren :subchild/child ?children]
                 '[?subchildren :subchild/datetime ?date]
                 '[(database-queries/inst->year ?date) ?year]]
         :order-by '[[?year :asc]]})))

refset13:11:06

Also, what is the actual result of the count (like, how wide are these trees 🙂)

sheluchin13:11:18

Hi, yes, I'll try that shortly. The counts are usually somewhere between 1K and 5K. Give or take.

sheluchin13:11:40

No worries about the delayed response!

🙏 1
sheluchin14:11:47

Yeah, there's a bit of a speed improvement there:

;; no :in
"Elapsed time: 3290.282923 msecs"
"Elapsed time: 3178.380908 msecs"
"Elapsed time: 3185.247063 msecs"
"Elapsed time: 3212.157273 msecs"
"Elapsed time: 3258.191435 msecs"

;; with :in
"Elapsed time: 5425.111825 msecs"
"Elapsed time: 6351.666858 msecs"
"Elapsed time: 4552.427955 msecs"
"Elapsed time: 4513.602178 msecs"
"Elapsed time: 5519.375394 msecs"

👍 1
refset14:11:13

Okay, well that's something - it means you are hitting https://github.com/xtdb/xtdb/issues/1447 - as for further improvement ideas, I'm not sure what to try next. Please could you paste the output of the following (I'm hoping to see the :vars-in-join-order entry in particular):

(defn parent-subchildren-per-year
  [db {parent-id :parent/id}]
  (-> (xtdb.query/query-plan-for
        db
        {:find '[?year (count ?subchildren)]
         :keys '[stats/period stats/count]
         :where [['?children ':child/parent parent-id]
                 '[?subchildren :subchild/child ?children]
                 '[?subchildren :subchild/datetime ?date]
                 '[(database-queries/inst->year ?date) ?year]]
         :order-by '[[?year :asc]]})))

refset15:11:41

> The counts are usually somewhere between 1K and 5K My gut feeling is that multiple seconds for this isn't completely surprising, assuming that every index lookup is a cache miss requiring disk I/O

sheluchin15:11:43

@U899JBRPF here's the query plan:

{:depth->constraints
 [nil
  nil
  nil
  nil
  [#function[xtdb.query/eval86466/fn--86469/pred-constraint--86473]]
  nil],
 :var->range-constraints {},
 :var->logic-var-range-constraint-fns {},
 :vars-in-join-order
 [#uuid "ffffffff-ffff-ffff-ffff-000000000001"
  ?children
  ?subchildren
  ?date
  ?year],
 :var->joins
 {#uuid "ffffffff-ffff-ffff-ffff-000000000001"
  [{:id triple98359,
    :idx-fn #function[xtdb.query/triple-joins/fn--85929/fn--85931]}
   {:idx-fn #function[xtdb.query/triple-joins/fn--85929/fn--85935]}],
  ?children
  [{:id triple98359,
    :idx-fn #function[xtdb.query/triple-joins/fn--85929/fn--85931]}
   {:id triple98360,
    :idx-fn #function[xtdb.query/triple-joins/fn--85929/fn--85931]}],
  ?subchildren
  [{:id triple98360,
    :idx-fn #function[xtdb.query/triple-joins/fn--85929/fn--85931]}
   {:id triple98361,
    :idx-fn #function[xtdb.query/triple-joins/fn--85929/fn--85931]}],
  ?date
  [{:id triple98361,
    :idx-fn #function[xtdb.query/triple-joins/fn--85929/fn--85931]}],
  ?year
  [{:id pred-return98362,
    :idx-fn #function[xtdb.query/pred-joins/fn--85958/fn--85963]}]},
 :var->bindings
 {?year
  {:e-var nil,
   :var ?year,
   :attr nil,
   :result-index 4,
   :result-name xtdb.query.value/?year,
   :type :pred,
   :value? true},
  ?children
  {:e-var ?subchildren,
   :var ?children,
   :attr :subchildren/children,
   :result-index 1,
   :result-name ?subchildren,
   :type :entity,
   :value? false},
  ?subchildren
  {:e-var ?subchildren,
   :var ?subchildren,
   :attr :crux.db/id,
   :result-index 2,
   :result-name ?subchildren,
   :type :entity,
   :value? false},
  ?date
  {:e-var ?subchildren,
   :var ?date,
   :attr :subchildren/datetime,
   :result-index 3,
   :result-name ?subchildren,
   :type :entity,
   :value? false}},
 :var->cardinality
 {#uuid "ffffffff-ffff-ffff-ffff-000000000001" 0.0,
  ?children 0.4995115596487688,
  ?subchildren 43551.31091364738,
  ?date 42784.42388250363},
 :in-bindings []}

refset15:11:32

Cool, thank you. Assuming your data model works how I'd expect, that looks about right:

[#uuid "ffffffff-ffff-ffff-ffff-000000000001"
  ?children
  ?subchildren
  ?date
  ?year]
...that's essentially the same join ordering as implied by how you ordered the clauses (although XT doesn't take that into account at all)

sheluchin15:11:40

As for #1447, I can easily restructure my query fns to avoid using :in and instead hardcode the id. That helps at least a bit. I guess I can add some year params to the query function and memoize it to make it good enough until I get into more advanced techniques sometime down the road... or throw more hardware at it. Or is entity-cache-size an XTDB-native alternative to using memoize?

refset16:11:16

entity-cache-size is tied to the lifecycle of db and is used only to memoize the ~opaque version ("content hash") of each entity for the given point-in-time, which means this lookup work can be re-used both within and across multiple queries against the same db

refset16:11:30

you can also try increasing Rocks' block-cache size https://docs.xtdb.com/storage/rocksdb/#blocks-cache
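For reference, a configuration sketch along the lines of the linked RocksDB docs, with the block cache shared and its size bumped up (the sizes are arbitrary; :entity-cache-size on :xtdb/query-engine is my reading of the configuration reference, so double-check it against the docs):

(require '[clojure.java.io :as io]
         '[xtdb.api :as xt])

(def node
  (xt/start-node
   {;; a shared LRU block cache for RocksDB, as per the linked docs page
    :xtdb.rocksdb/block-cache {:xtdb/module 'xtdb.rocksdb/->lru-block-cache
                               :cache-size (* 512 1024 1024)}
    :xtdb/index-store {:kv-store {:xtdb/module 'xtdb.rocksdb/->kv-store
                                  :db-dir (io/file "data/index-store")
                                  :block-cache :xtdb.rocksdb/block-cache}}
    ;; entity-cache-size memoizes entity content-hash lookups per db value (see above)
    :xtdb/query-engine {:entity-cache-size (* 4 32768)}}))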

Tomas Brejla17:11:59

> My gut feeling is that multiple seconds for this isn't completely surprising, assuming that every index lookup is a cache miss requiring disk I/O Multiple seconds for just 5k items sounds a bit slow even for disk-IO cache-miss scenarios, doesn't it? I mean, even if all those 50K items need to be considered, ~5s sounds like quite a lot.

refset19:11:45

for inst->year I think you could probably do it like:

(-> #inst "2021-11-25"
    (.toInstant)
    (.atZone (java.time.ZoneId/systemDefault))
    (.getYear)
    str)

refset19:11:41

> ~5s sounds like quite a lot. I don't disagree. I would be interested to get hold of a small example that reproduces this data model so I can profile it properly

refset19:11:44

it would also be useful to know more about the OS/environment generally, and the options passed to start-node

refset19:11:07

for reference, on my machine, running the most basic possible lookup from Rocks, with a fresh db takes 1ms after the first run (with a cold node it takes 60ms to warm up), doing

(time (do (xt/q (xt/db nn) '{:find [e] :limit 1 :where [[e :xt/id]]}) nil))
(this is a warm JVM though, since I'm at the REPL). Running that code in a tight loop with dotimes averages 0.1ms, and if run in an open-db context, it averages 0.05ms
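A sketch of that measurement, using xt/open-db so the repeated queries share one snapshot (assumes a started node bound to node; the query is the same trivial lookup):

;; one-off query against a fresh db value
(time (do (xt/q (xt/db node) '{:find [e] :limit 1 :where [[e :xt/id]]}) nil))

;; tight loop over a single open snapshot; with-open releases it afterwards
(with-open [db (xt/open-db node)]
  (time
   (dotimes [_ 1000]
     (xt/q db '{:find [e] :limit 1 :where [[e :xt/id]]}))))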

sheluchin19:11:18

"Elapsed time: 6.613754 msecs"
"Elapsed time: 0.388212 msecs"
"Elapsed time: 0.524183 msecs"
"Elapsed time: 0.327393 msecs"

sheluchin19:11:35

Well, I think the problem was that I was using the in-memory document store. Having switched it to Postgres, I'm now seeing times like:

"Elapsed time: 2089.379857 msecs" (cold start JVM)
"Elapsed time: 1015.39728 msecs"
"Elapsed time: 874.131094 msecs"
"Elapsed time: 944.789552 msecs"
"Elapsed time: 953.388092 msecs"
"Elapsed time: 924.663348 msecs"

sheluchin19:11:54

My fault. I was warned about the in-memory DB performance before but thought switching my index store over would have been enough. Sorry for taking up some of your time, but I did learn some stuff along the way 🙂 Thanks for the help!

refset22:11:13

> the problem was that I was using the in-memory document store. Having switched it to Postgres [...] Ahh, even more intriguing! I don't understand how changing the document store here could be making a difference...unless your IDs are large strings or maps perhaps?

refset22:11:41

for context, one reason the in-memory document store is ~slow is that it uses mem-kv underneath, which uses data structures intended as a rough approximation of Rocks/LMDB. The doc-store protocol is actually so simple that it probably deserves its own in-memory implementation, without that inefficient layer of indirection (even though it was only ever intended for testing) https://github.com/xtdb/xtdb/blob/e2f51ed99fc2716faa8ad254c0b18166c937b134/core/src/xtdb/mem_kv.clj https://github.com/xtdb/xtdb/blob/e2f51ed99fc2716faa8ad254c0b18166c937b134/core/src/xtdb/kv/document_store.clj#L62

sheluchin11:11:08

> Ahh, even more intriguing! I don't understand how changing the document store here could be making a difference...unless your IDs are large strings or maps perhaps? Nope, my ids are all uuids at the moment, but I was considering using maps (single kv) at some point. I guess this highlights a possible downside to that approach? Or does that only have much impact with large maps when using the in-mem docstore?

refset13:11:12

Yeah, it's down to whether the value is stored reversibly in the key as a "value ID" or whether it needs to be stored separately - anything less than 224 bits (e.g. a ~small map) is reversibly encoded using Nippy so shouldn't be problematic https://github.com/xtdb/xtdb/blob/013cdfd599795183efef82d58d61ad9fb8155115/core/src/xtdb/codec.clj#L368