#xtdb
2022-09-23
Hukka07:09:02

Is there something peculiar about xt/entity-history which might cause it to show stale transactions, even when transactions have been awaited and checked to be committed? All of our other tests that rely on the awaiting always show the right data in the db (when using q, entity, or pull), but sometimes entity-history doesn't show the right number of transactions

Hukka07:09:23

Seems like about 1/10 of the test runs fail with that. Annoyingly flaky

Hukka07:09:42

Or is it that I need to use

`:with-corrections?` (boolean, default false): specifies whether to include bitemporal corrections in the sequence, sorted first by valid-time, then tx-id
, which I don't really understand (I mean what are bitemporal corrections)?

Hukka07:09:58

Looking at the implementation, it's pretty clear that without the with-corrections, only one entity version at the same time is counted

refset13:09:10

Well, certainly nothing should be non-deterministic :thinking_face: If you can share a snippet of roughly the APIs used (and in which order) then I might be able to piece an explanation/theory together, or if you had time to create a ~minimal repro that would be really useful.

Hukka13:09:46

(deftest test-create-if-absent!
  (let [user1 #:user{:provider :test
                     :id 1
                     :email ""
                     :name nil}]
    (testing "when updating a user, the record is updated"
      (with-open [node (core/create-inmem-db)]
        (let [user-id1 (create-if-absent! node (assoc user1 :user/name "Foo"))
              _ (create-if-absent! node user1)]
          (is (= (count (xt/entity-history (xt/db node) user-id1 :desc {:with-corrections? true}))
                 2))
          (is (= (dissoc (xt/entity (xt/db node) user-id1) :xt/id)
                 (dissoc user1 :user/id :user/provider))))))))
doesn't of course work without our helpers, but shows roughly what is happening. create-if-absent! calls
(let [tx (xt/submit-tx node ops)]
    (xt/await-tx node tx)
    (xt/tx-committed? node tx))
inside to make sure that results should be visible

Hukka13:09:15

I added printing the entity-history to the test, and noticed that every time it failed, both transactions had happened in the same millisecond
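
As a rough illustration of the masking described above (not the actual XTDB implementation), the following plain-Clojure sketch keeps only the first entry of each run of identical valid times, in the spirit of the `(->> (partition-by :vt) (map first))` step mentioned later in the thread. The entry keys `:vt` and `:tx-id`, the sample data and the `without-corrections` helper are all hypothetical.

```clojure
;; Hypothetical history entries, newest first (:desc order).
;; The keys :vt and :tx-id are illustrative, not the real XTDB entry keys.
(def history
  [{:vt #inst "2022-09-23T13:09:00.002Z" :tx-id 3 :name "Baz"}
   {:vt #inst "2022-09-23T13:09:00.001Z" :tx-id 2 :name "Bar"}
   {:vt #inst "2022-09-23T13:09:00.001Z" :tx-id 1 :name "Foo"}])

(defn without-corrections
  "Keeps only the first entry of each run of entries sharing a valid time."
  [entries]
  (->> entries
       (partition-by :vt)
       (map first)))

;; Entries that survive the filter; tx 1 shares its valid time with tx 2.
(def visible (without-corrections history))
```

In this sketch the two writes that landed in the same millisecond collapse into one visible entry, which matches the symptom seen in the flaky test.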

refset10:10:46

Hey @U8ZQ1J1RR apologies for the delayed follow-up. Are you still convinced something weird is happening here? I can dig in next week

Hukka07:10:02

It was at least surprising, if not weird. The documentation talks about bitemporal corrections, but I wouldn't expect the difference between something being a correction or not to be how fast the tests manage to run

Hukka07:10:28

While this is mostly a problem in the tests, it does also mean that it can become a timing issue in production too: if for some reason two different changes manage to hit at the same time, one of them will be masked in the entity-history. Knowing this, I think we would just need to always use :with-corrections?, since I don't see why I would ever not want to see all of the history, if I want the history at all

refset14:10:36

Ah understood, thanks for explaining further. This is a very related discussion to https://discuss.xtdb.com/t/accidental-bi-temporal-correction/112/2

refset14:10:03

Feel free to chime in there if you wish. I acknowledge that this behaviour is not necessarily intuitive but I'm not sure if there are good options available :thinking_face: (I'll make a note about explaining it in the docs though)

Hukka06:10:07

Ah, yes. I think I missed that thread because I don't really know what bi-temporal correction means, so I didn't think of it as relevant in June.

Hukka07:10:45

What is the usecase for a correction? Is there a way to update a record and explicitly declare that it is not one, no matter what the times are?

Hukka07:10:54

I can also ask that in the discuss, if that's relevant, but the time precision doesn't seem like something I would be interested in otherwise

refset08:10:16

> What is the usecase for a correction?

It's about asserting what is or isn't true for a given period of (valid) time, and having auditability over that. A correction is a region in valid time where more than one version has been asserted. You might find this helpful https://bitemporal-visualizer.github.io/ - for example, in that link, the period from 2019-02-08 to 2019-02-15 has 5 versions (looking at the horizontal slice) and therefore 4 corrections

refset08:10:50

> Is there a way to update a record and explicitly declare that it is not one, no matter what the times are?

Unfortunately not

> time precision doesn't seem like something I would be interested in otherwise

Noted, and I think I agree that while higher precision would work around your immediate issue, it feels fundamentally orthogonal. This issue also provides some context https://github.com/xtdb/xtdb/issues/441 I'll reflect on this.

Hukka08:10:33

Ah, hm. So entity-history is not really about the history in the transaction time domain, but in the business time domain?

refset08:10:54

primarily the business time domain, yep. It comes first in the sort order (and after that is sorted by transaction time)

Hukka09:10:18

I have seen the bitemporal-visualizer before, and even tried to google for the "bitemporal correction", but didn't really understand what it means before now. Though I wonder if I did understand it correctly, if it's named bitemporal correction, but is really about the valid time only. But thinking it this way, it would be sufficient then to specify valid times in the tests at least, even if the transaction time for those will be the same. That doesn't avoid the problem of things in production accidentally happening at the same time, but as long as we keep in mind that :with-corrections is really about history in time vs history in transactions, we should manage

Hukka09:10:49

Ah, so it is not treated as a correction if the transaction time doesn't also match?

refset09:10:55

> Though I wonder if I did understand it correctly, if it's named bitemporal correction, but is really about the valid time only.

this is right, and any confusion might well be our fault due to inconsistent usage / invention (or abuse) of language

refset09:10:35

> so it is not treated as a correction if the transaction time doesn't also match?

correction just means something in the valid time past has since (in transaction time) been updated. The transaction times definitely don't have to match. However, if you don't specify a valid time during your put, the valid time is inherited/derived from the transaction time (the tx-log is the source of truth for 'now')

refset09:10:00

> But thinking it this way, it would be sufficient then to specify valid times in the tests at least, even if the transaction time for those will be the same. That doesn't avoid the problem of things in production accidentally happening at the same time, but as long as we keep in mind that :with-corrections is really about history in time vs history in transactions, we should manage

that sounds viable to me, yes 👍
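
A minimal sketch of that test-side workaround, assuming XTDB 1.x (`com.xtdb/xtdb-core`) is on the classpath: each put carries an explicit, distinct valid time, so the two versions no longer share a valid time even if both transactions land in the same millisecond. The `put-and-await!` helper, the `:user-1` id and the dates are hypothetical; the helper just mirrors the submit/await pattern shown earlier in the thread.

```clojure
(require '[xtdb.api :as xt])

(defn put-and-await!
  "Submits a single put at an explicit valid time and waits until it is indexed."
  [node doc valid-time]
  (let [tx (xt/submit-tx node [[::xt/put doc valid-time]])]
    (xt/await-tx node tx)
    (xt/tx-committed? node tx)))

;; In-memory node; both valid times lie in the past and differ from each other.
(def history-count
  (with-open [node (xt/start-node {})]
    (put-and-await! node {:xt/id :user-1 :user/name "Foo"} #inst "2020-01-01")
    (put-and-await! node {:xt/id :user-1 :user/name "Bar"} #inst "2020-01-02")
    ;; No :with-corrections? option is passed here.
    (count (xt/entity-history (xt/db node) :user-1 :desc))))
```

Under these assumptions the history count should no longer depend on how quickly the two transactions are submitted, although this only addresses the tests, not writes in production that rely on the default valid time.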

Hukka09:10:46

Hm. Given times A, B and C in order, record having valid versions from A→C, A→C and B→C, what would be the usecase for only sometimes wanting to know if there has been two versions from time A to C, but always see that there's been more than one version from B to C?

Hukka09:10:36

Or does :with-corrections? change the shown valid time from A→B to the original A→C?

Hukka09:10:04

Hm, no. It seems to be used only as a filter, not changing the history items

Hukka09:10:03

In other words, wouldn't entity-history, in the case shown in https://bitemporal-visualizer.github.io/, always return all transacted versions, no matter if :with-corrections? is true or false, since no stored record has the same app_start and app_end?

refset10:10:13

without :with-corrections?, entity-history would return only the last column shown (3 entries)

Hukka11:10:16

Ok. I assumed that (->> (partition-by :vt) (map first)) would match only if the valid time is exactly the same

👍 1