This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2022-05-20
Hi, this is probably more of a Clojure question but worth a shot.
I have a transaction function (1) for adding new key-value pairs, and one for appending a userID to a set in a document (2):
(1) [[::xt/put (assoc entity :buildings/name buildingName :buildings/area buildingArea) validStartTime]]
(2) [[::xt/put (assoc entity :buildings/persons (conj (:buildings/persons entity) personId)) entryTime]]
Because my data arrives unordered, I may have to create a barebones building doc that only contains the ID as well as an empty set that tx-fn (2) can later append to. The tx-fn (1) is there to add the rest of the building information whenever it arrives in the system.
The problem is, experimenting with this on a new node strangely enough shows that the order in which the tx-fns are run matters? If I create a doc with an ID and an empty set #{} and run (2), the userID is added successfully, but running (1) next does nothing. If I run (1) after creating the empty doc and then (2), it works as expected. Am I missing something? The validStartTime sent to (1) is always earlier than entryTime in (2).
Hey again, it looks like you have a missing buildingPersons value in (1) - could that be related? Or is that just a typo here?
Here's the history where tx 2 is creation of empty doc, tx 3 is running tx-fn (2) adding personId to set, tx 4 is running tx-fn (1) attempting to add fields. From what I can see, even when running in this order, the history (correctly) shows the valid-time timeline
{:tx-time #inst "2022-05-20T18:39:43.462-00:00", :tx-id 3, :valid-time #inst "2016-01-01T00:00:00.000-00:00", :content-hash #xtdb/id "c9354e0f69a5c43fd1cfd91c78773a88c3f83165"}
{:tx-time #inst "2022-05-20T18:39:46.239-00:00", :tx-id 4, :valid-time #inst "2013-01-01T00:00:00.000-00:00", :content-hash #xtdb/id "ad832526ad525ea08ba11c89133f9c6103419935"}
{:tx-time #inst "2022-05-20T18:39:37.503-00:00", :tx-id 2, :valid-time #inst "2000-01-01T00:00:00.000-00:00", :content-hash #xtdb/id "af1f8340456f01cbec7cdf9aad3a7f7f910b7a79"}
If I flip the order and run tx-fn (2) before (1), the valid-time timeline is the same yet when fetching my entity, it has been updated as expected (as opposed to the ordering above)
{:tx-time #inst "2022-05-20T18:46:55.516-00:00", :tx-id 4, :valid-time #inst "2016-01-01T00:00:00.000-00:00", :content-hash #xtdb/id "ad832526ad525ea08ba11c89133f9c6103419935"}
{:tx-time #inst "2022-05-20T18:46:53.900-00:00", :tx-id 3, :valid-time #inst "2013-01-01T00:00:00.000-00:00", :content-hash #xtdb/id "c047d161bca3b5acce49557a9510d664612526b7"}
{:tx-time #inst "2022-05-20T18:46:53.843-00:00", :tx-id 2, :valid-time #inst "2000-01-01T00:00:00.000-00:00", :content-hash #xtdb/id "af1f8340456f01cbec7cdf9aad3a7f7f910b7a79"}
edit: might be worth mentioning this is testing in repl
Nono, I'm the pain here. Here it is, running the transactions in the order that gives unexpected behavior:
{:tx-time #inst "2022-05-20T19:57:11.143-00:00", :tx-id 3, :valid-time #inst "2016-01-01T00:00:00.000-00:00", :content-hash #xtdb/id "c9354e0f69a5c43fd1cfd91c78773a88c3f83165", :doc {:buildings/persons #{"abc123id"}, :xt/id 1234}}
{:tx-time #inst "2022-05-20T19:57:11.260-00:00", :tx-id 4, :valid-time #inst "2013-01-01T00:00:00.000-00:00", :content-hash #xtdb/id "ad832526ad525ea08ba11c89133f9c6103419935", :doc {:buildings/persons #{"abc123id"}, :buildings/name "Victoria Stadion", :buildings/ownerId "51", :xt/id 1234}}
{:tx-time #inst "2022-05-20T19:57:11.054-00:00", :tx-id 2, :valid-time #inst "2000-01-01T00:00:00.000-00:00", :content-hash #xtdb/id "af1f8340456f01cbec7cdf9aad3a7f7f910b7a79", :doc {:buildings/persons #{}, :xt/id 1234}}
And in reverse order:
{:tx-time #inst "2022-05-20T20:09:23.583-00:00", :tx-id 4, :valid-time #inst "2016-01-01T00:00:00.000-00:00", :content-hash #xtdb/id "ad832526ad525ea08ba11c89133f9c6103419935", :doc {:buildings/persons #{"abc123id"}, :buildings/name "Victoria Stadion", :buildings/ownerId "51", :xt/id 1234}}
{:tx-time #inst "2022-05-20T20:09:22.567-00:00", :tx-id 3, :valid-time #inst "2013-01-01T00:00:00.000-00:00", :content-hash #xtdb/id "c047d161bca3b5acce49557a9510d664612526b7", :doc {:buildings/persons #{}, :buildings/name "Victoria Stadion", :buildings/ownerId "51", :xt/id 1234}}
{:tx-time #inst "2022-05-20T20:09:22.466-00:00", :tx-id 2, :valid-time #inst "2000-01-01T00:00:00.000-00:00", :content-hash #xtdb/id "af1f8340456f01cbec7cdf9aad3a7f7f910b7a79", :doc {:buildings/persons #{}, :xt/id 1234}}]
(data slightly masked)
thanks for that, I think I understand now - I reckon you want to be specifying an explicit valid-time-end on the put that is greater than the valid-time-start of :tx-id 3 ...which will essentially replace (/'correct') the :tx-id 3 version
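For reference, a put with an explicit end valid time could look roughly like the sketch below. It only builds the operation as plain data, and nothing is submitted to a node. The helper name `put-op`, the doc contents, and the end date are made up for illustration; the end date is just some instant later than the 2016-01-01 start of the :tx-id 3 version.

```clojure
;; Sketch only: builds an XTDB 1.x put operation as plain data.
;; `put-op` is a hypothetical helper, not part of the XTDB API.
(defn put-op
  "Returns a put operation. With three arguments the doc is valid only
  from valid-time (inclusive) until end-valid-time (exclusive)."
  ([doc valid-time]
   [:xtdb.api/put doc valid-time])
  ([doc valid-time end-valid-time]
   [:xtdb.api/put doc valid-time end-valid-time]))

;; Assumed values: the end date is arbitrary, chosen to be later than
;; the valid-time start of the version that should be corrected.
(def building-info-op
  (put-op {:xt/id 1234
           :buildings/name "Victoria Stadion"
           :buildings/persons #{}}
          #inst "2013-01-01"
          #inst "2017-01-01"))
```

Inside a transaction function the same vector shape would be returned from the `fn` body instead of being bound to a var.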
Thanks, that did the trick! I left the valid time-end out since we don't know how long the building will be in use, but I guess I could set a date like year 2090 and await the delete-building event?
Hey again, kinda the same/similar problem, I'm testing out the order of sending in tx-fn [remove person from set at 2016-01-04], [add person to set at 2016-01-01], [add building info at 2013-01-01]. If I put a valid end-time far in the future on the tx-fn that appends fields to a building, as we discussed yesterday, the changes of adding/removing a person (through conj/disj tx-fn) aren't saved. If I were to remove the explicit valid end-time on the building init, then it works as expected in this specific scenario (but not in another scenario and so on...) A bit confusing, but I'm at fault for seeing this as 'deletions' when in fact it's all about puts and valid-times on the same doc
I'm very sorry for the bother, but it seems I was too hasty with the solution proposed yesterday (when adding building information, set explicit end time that is greater than addingPersons valid time-start). Here is the doc-history for whenever you have time to check it out!
(ascending order)
1. Init empty building doc in year 2000, OK
{:tx-time #inst "2022-05-21T19:06:48.841-00:00", :tx-id 3, :valid-time #inst "2000-01-01T00:00:00.000-00:00",
:doc {:building/persons #{}, :xt/id 1234}}]
2. Add building information in year 2013, OK but it contains a person that doesn't arrive until 2016
{:tx-time #inst "2022-05-21T19:06:49.861-00:00", :tx-id 5, :valid-time #inst "2013-01-01T00:00:00.000-00:00",
:doc {:building/persons #{"42314-XXXXXX-45346"}, :building/name "Victoria Stadion", :building/ownerId "51", :xt/id 1234}}
3. Actual addition of person to building set in 2016 (same content as above)
{:tx-time #inst "2022-05-21T19:06:49.861-00:00", :tx-id 5, :valid-time #inst "2016-01-01T00:00:00.000-00:00",
:doc {:building/persons #{"42314-XXXXXX-45346"}, :building/name "Victoria Stadion", :building/ownerId "51", :xt/id 1234}}
Hey again, it's no problem at all - it seems like a tricky problem, although I think an executable example would really help me grok things. Would you mind elaborating with something like this:
(with-open [n (xt/start-node {})]
  (xt/submit-tx n [[::xt/put {:xt/id :put-building-fn
                              :xt/fn '(fn [ctx i]
                                        [[:xtdb.api/put {:xt/id (str "building" i)}]])}]
                   [::xt/put {:xt/id :add-kvs-fn
                              :xt/fn '(fn [ctx eid & kvs]
                                        [[:xtdb.api/put (apply assoc (xtdb.api/entity (xtdb.api/db ctx) eid) kvs)]])}]
                   [::xt/put {:xt/id :conj-persons-fn
                              :xt/fn '(fn [ctx eid & persons]
                                        [[:xtdb.api/put (update (xtdb.api/entity (xtdb.api/db ctx) eid) :persons #(conj % persons))]])}]
                   [::xt/fn :put-building-fn 1]])
  (xt/sync n)
  (xt/submit-tx n [[::xt/fn :add-kvs-fn "building1" :a 1 :persons #{}]])
  (xt/sync n)
  (xt/submit-tx n [[::xt/fn :conj-persons-fn "building1" "alice" "bob"]])
  (xt/sync n)
  (clojure.pprint/pprint (map #(select-keys % [::xt/tx-id ::xt/valid-time ::xt/doc])
                              (xt/entity-history (xt/db n) "building1" :asc {:with-docs? true}))))
;=>
(#:xtdb.api{:tx-id 0,
:valid-time #inst "2022-05-21T21:20:30.566-00:00",
:doc #:xt{:id "building1"}}
#:xtdb.api{:tx-id 1,
:valid-time #inst "2022-05-21T21:20:30.571-00:00",
:doc {:a 1, :persons #{}, :xt/id "building1"}}
#:xtdb.api{:tx-id 2,
:valid-time #inst "2022-05-21T21:20:30.574-00:00",
:doc
{:a 1, :persons #{("alice" "bob")}, :xt/id "building1"}})
And then be very explicit about what you want the output to be instead with a handcrafted ideal output
Thank you once again for your help! I took your code and added a valid time-start parameter for all three functions:
'(fn [ctx i vts]
[[:xtdb.api/put {:xt/id (str "building" i)} vts]])
'(fn [ctx eid vts & kvs]
[[:xtdb.api/put (apply assoc (xtdb.api/entity (xtdb.api/db ctx) eid) kvs) vts]])
'(fn [ctx eid vts & persons]
[[:xtdb.api/put (update (xtdb.api/entity (xtdb.api/db ctx) eid) :persons #(conj % persons)) vts]])
I also rearranged the order of the tx-fns executions to match my common scenario of having data arrive to the system in the wrong order:
[::xt/fn :put-building-fn 1 #inst "2000-01-01"]])
[::xt/fn :conj-persons-fn "building1" #inst "2016-01-01" "alice" "bob"]
[::xt/fn :add-kvs-fn "building1" #inst "2013-01-01" :a 1 :persons #{}]
This gives:
(#:xtdb.api{:tx-id 0,
:valid-time #inst "2000-01-01T00:00:00.000-00:00",
:doc #:xt{:id "building1"}}
#:xtdb.api{:tx-id 2,
:valid-time #inst "2013-01-01T00:00:00.000-00:00",
:doc {:persons #{}, :a 1, :xt/id "building1"}}
#:xtdb.api{:tx-id 1,
:valid-time #inst "2016-01-01T00:00:00.000-00:00",
:doc {:persons (("alice" "bob")), :xt/id "building1"}})
The ideal output is exactly the one that is produced when your code is left unchanged (besides me wanting the set to contain persons in singles like #{"alice", "bob"}, but that is unrelated). In this case, the building again loses its previously attained key :a.
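On the side issue of getting single persons into the set: `#(conj % persons)` adds the whole rest-args seq as one element, and when `:persons` is missing, `conj` onto nil yields a list. A small sketch of one possible fix, using a hypothetical `conj-persons` helper on a plain map:

```clojure
;; Sketch: `conj-persons` is a hypothetical helper operating on a plain map.
;; `fnil` substitutes an empty set when :persons is missing (nil),
;; and `into` adds each person individually.
(defn conj-persons
  "Adds each of persons to the :persons set of doc, creating the set if absent."
  [doc & persons]
  (update doc :persons (fnil into #{}) persons))

(def with-set    (conj-persons {:xt/id "building1" :persons #{"carol"}} "alice" "bob"))
(def without-set (conj-persons {:xt/id "building1"} "alice" "bob"))
;; with-set    is equal to {:xt/id "building1", :persons #{"carol" "alice" "bob"}}
;; without-set is equal to {:xt/id "building1", :persons #{"alice" "bob"}}
```

In the quoted tx-fn body the equivalent change would presumably be `(update (xtdb.api/entity (xtdb.api/db ctx) eid) :persons (fnil into #{}) persons)`.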
When loading my data, it is common for a "personEnteredBuilding" event to have arrived before "buildingCreated". When "personEnteredBuilding" is received, and there is no building of that id in the db, the plan would be to put a building with an empty set and then add the person to this set. This set will rapidly change as people keep entering and exiting the building, and those events may unfortunately come in the wrong order, something we try to fix by sending in valid-times as per above. Somewhere along the line, the "buildingCreated" event finally arrives and should add kvs without modifying the set or its history. This event has an earlier valid time than any of the "personEntered" or "personExited" events.
Kinda sounds like you cannot really do just temporal inserts of the data once. Either you change all of the affected valid-time ranges every time you get new information, not just insert one new document, or you do fault-tolerant (or rather missing-information-tolerant) event sourcing instead of trying to normalize the state at every point in time from incomplete data
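One way to read "change all of the valid time ranges" is sketched below with a toy model: a sorted map from valid-time to doc stands in for the entity's timeline. This is not how XTDB stores history, and `assoc-from` is a made-up helper; it only illustrates that a late-arriving, early-dated change has to be merged into every later version as well.

```clojure
;; Toy model (assumption): the timeline is a sorted map of valid-time -> doc.
(defn assoc-from
  "Merges kvs into the version valid at vt and into every later version,
  and records a new version starting at vt."
  [timeline vt kvs]
  (let [;; latest version whose valid-time is <= vt, if any
        base    (some-> (rsubseq timeline <= vt) first val)
        with-vt (assoc timeline vt (merge base kvs))]
    (reduce (fn [tl [t doc]] (assoc tl t (merge doc kvs)))
            with-vt
            (subseq with-vt > vt))))

(def timeline
  (sorted-map #inst "2000-01-01" {:xt/id "building1" :persons #{}}
              #inst "2016-01-01" {:xt/id "building1" :persons #{"alice"}}))

;; Building info dated 2013 arrives after the 2016 version already exists.
(def corrected (assoc-from timeline #inst "2013-01-01" {:a 1}))
;; The 2013 and 2016 versions now both carry :a 1; the 2000 version is unchanged.
```

Each changed entry would then have to be written back as its own put at its own valid time; how best to do that inside a transaction function is left open here.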
If even the events like "person entered" and "person exited" can come in any order, event sourcing seems like the better choice. Then you can just ignore exit events that didn't have a corresponding enter event, or even show extra information that some people have been present even though you don't know when exactly
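A minimal sketch of that event-sourcing idea in plain Clojure, assuming a made-up event shape (`:type`, `:person`, `:attrs`, `:valid-time`) and made-up function names. Events are stored as they arrive and the state is derived by replaying them in valid-time order, so arrival order stops mattering and an exit without a matching enter is simply a no-op:

```clojure
;; Sketch under assumptions: event keys and type names are hypothetical.
(defn apply-event
  "Folds one event into the building state."
  [state {:keys [type person attrs]}]
  (case type
    :building-created (merge state attrs)
    :person-entered   (update state :persons (fnil conj #{}) person)
    ;; disj on a person who is not in the set leaves the set unchanged
    :person-exited    (update state :persons (fnil disj #{}) person)
    state))

(defn building-state
  "Replays events in valid-time order, regardless of arrival order."
  [events]
  (reduce apply-event {:persons #{}} (sort-by :valid-time events)))

;; Events listed in arrival order, which differs from valid-time order.
(def events
  [{:type :person-exited    :person "alice" :valid-time #inst "2016-01-04"}
   {:type :person-entered   :person "alice" :valid-time #inst "2016-01-01"}
   {:type :person-entered   :person "bob"   :valid-time #inst "2016-01-02"}
   {:type :person-exited    :person "zed"   :valid-time #inst "2016-01-03"}
   {:type :building-created :attrs {:name "Victoria Stadion"}
    :valid-time #inst "2013-01-01"}])

(def current (building-state events))
;; current is equal to {:persons #{"bob"}, :name "Victoria Stadion"}
```

The state as of an earlier date could be derived the same way by filtering the events on `:valid-time` before replaying them.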
Sounds reasonable, we'll look into it more - thanks!!
Purely FYI: Xodus vs. RocksDB performance: https://blog.aawadia.dev/2021/04/03/xodus-vs-rocks/ TL;DR: RocksDB: 10x faster writes, 2x faster reads, 8x less storage (I'm still using Xodus for xtdb; will eventually switch to rocksdb…)