2019-10-17
How do people handle schema migrations with datomic cloud? I was looking at conformity but it doesn't support the client api.
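For context, a minimal hand-rolled sketch of the conformity idea against the client API might look like the following. This is only an illustration: the connection conn, the :migration/norm attribute, and the helper names are all hypothetical, not an existing library API.
(require '[datomic.client.api :as d])

;; Track installed norms with a unique marker attribute.
(def norm-schema
  [{:db/ident       :migration/norm
    :db/valueType   :db.type/keyword
    :db/cardinality :db.cardinality/one
    :db/unique      :db.unique/identity}])

(defn norm-installed? [db norm]
  (seq (d/q '[:find ?e :in $ ?norm
              :where [?e :migration/norm ?norm]]
            db norm)))

(defn ensure-norm!
  "Transacts tx-data once, recording norm so it is never re-applied."
  [conn norm tx-data]
  (when-not (norm-installed? (d/db conn) norm)
    (d/transact conn {:tx-data (conj (vec tx-data) {:migration/norm norm})})))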
Hi everybody, we are having an issue with the way Datomic tx works. Our application allows users to register driver registrations with a start and end date for a certain license plate via XML files. If a new driver registration overlaps with the start and end date of an old driver registration, we cut off the old dates in our timelines. Our timelines are sorted on the highest tx. Just to be clear about the overlaps, here's an example:

driver registration A inserted at 2019-09-09
{:driver-registration/id "1XXX001-2018-01-01"
 :driver-registration/license-plate "1XXX001"
 :driver-registration/start #inst "2018-01-01"
 :driver-registration/end #inst "2019-01-01"
 :driver-registration/driver {:person/first-name "John"
                              :person/last-name "Doe"}}

driver registration B inserted at 2019-09-10
{:driver-registration/id "1XXX001-2018-06-01"
 :driver-registration/license-plate "1XXX001"
 :driver-registration/start #inst "2018-06-01"
 :driver-registration/end #inst "2020-01-01"
 :driver-registration/driver {:person/first-name "Mary"
                              :person/last-name "Jane"}}

timeline:
2018-01-01           2018-06-01                  2020-01-01
|--driver John Doe---|-----driver Mary Jane------|

As you can see, the end date of driver registration A (John Doe) was cut off by the start date of driver registration B (Mary Jane). We retrieve this data with the following query:

(d/q '[:find ?tx (pull ?dr [* {:driver-registration/person [*]}])
       :in $ ?license-plate
       :where
       [?dr :driver-registration/license-plate ?license-plate]
       [?dr :driver-registration/id _ ?tx]]
     db "1XXX001")

We sort the list by the value of ?tx and cut off the dates where necessary. This works fine for most cases, but now imagine the user has made a mistake and wants the end date of driver registration A to be the cut-off date, like this:

2018-01-01               2019-01-01                 2020-01-01
|----driver John Doe-----|-----driver Mary Jane------|

When the user uploads a new XML with the exact same data as driver registration A, we expect driver registration A to now have the highest tx. But due to Datomic's redundancy elimination, Datomic will filter the data out of the transaction and never update the tx of driver registration A. When the user then asks for the driver registration timeline, they will still receive the old one. Is there a way to solve this issue on the query side? One solution would be to add a field :driver-registration/last-upload with a date value to driver registration, but that feels as if I'm rebuilding the db/txInstant system.
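To make the redundancy elimination concrete, here is a minimal sketch using the client API. It assumes a connection conn and that :driver-registration/id is a :db.unique/identity attribute; the nested driver entity is omitted for brevity.
(def reg-a
  {:driver-registration/id            "1XXX001-2018-01-01"
   :driver-registration/license-plate "1XXX001"
   :driver-registration/start         #inst "2018-01-01"
   :driver-registration/end           #inst "2019-01-01"})

;; First insert: the datoms are added under a new tx.
(d/transact conn {:tx-data [reg-a]})

;; Identical re-insert: the facts already hold, so they are elided and the
;; new transaction carries only its own :db/txInstant datom. reg-a's datoms
;; keep their original tx, which is why sorting by ?tx never sees the re-upload.
(d/transact conn {:tx-data [reg-a]})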
Why are you not sorting by the registration start and end dates? txInstant is a time of record and has no connection to your domain's "business" times. Suppose, for example, you uploaded an old registration?
Maybe helpful: https://vvvvalvalval.github.io/posts/2017-07-08-Datomic-this-is-not-the-history-youre-looking-for.html
Well, the start and end dates don't show at which time the driver registration was entered in the database. In my example they happen to be in order, but it's possible for a user to insert a driver registration that overlaps the start of an old driver registration. If we just sorted on start and end date, the newer one would always be overwritten by the old one, which shouldn't happen. Great article. So going by that article, we should add a field stating when a driver registration was last uploaded?
I guess so? I thought by "newer one" you just meant a start time > the next range's end time. I guess I don't understand your timeline overlap algorithm precisely.
If time of record is really vital to you here you can retrieve the tx of start and end specifically
Sorry about not being clear. Hopefully this flow helps to explain how our application works. With "newer" I mean the latest transaction date:

Insert #1: driver registration A inserted at 2019-09-09
{:driver-registration/id "1XXX001-2018-01-01"
 :driver-registration/license-plate "1XXX001"
 :driver-registration/start #inst "2018-01-01"
 :driver-registration/end #inst "2019-01-01"
 :driver-registration/driver {:person/first-name "John"
                              :person/last-name "Doe"}}

timeline:
2018-01-01               2019-01-01
|----driver John Doe-----|

Insert #2: driver registration B inserted at 2019-09-10
{:driver-registration/id "1XXX001-2018-06-01"
 :driver-registration/license-plate "1XXX001"
 :driver-registration/start #inst "2018-06-01"
 :driver-registration/end #inst "2020-01-01"
 :driver-registration/driver {:person/first-name "Mary"
                              :person/last-name "Jane"}}

timeline:
2018-01-01           2018-06-01                  2020-01-01
|--driver John Doe---|-----driver Mary Jane------|

Insert #3: driver registration A inserted at 2019-09-11
{:driver-registration/id "1XXX001-2018-01-01"
 :driver-registration/license-plate "1XXX001"
 :driver-registration/start #inst "2018-01-01"
 :driver-registration/end #inst "2019-01-01"
 :driver-registration/driver {:person/first-name "John"
                              :person/last-name "Doe"}}

Expected:
2018-01-01               2019-01-01                 2020-01-01
|----driver John Doe-----|-----driver Mary Jane------|

Reality:
2018-01-01           2018-06-01                  2020-01-01
|--driver John Doe---|-----driver Mary Jane------|
Even taking the tx of the start and end dates, the last insert will always be ignored, since it is exactly the same as the first insert and Datomic will apply redundancy elimination.
now that I'm typing this it's starting to make sense to just add an extra field stating upload-date
(d/q '[:find ?tx (pull ?dr [* {:driver-registration/person [*]}])
       :in $ ?license-plate
       :where
       [?dr :driver-registration/license-plate ?license-plate]
       [?dr :driver-registration/start _ ?tx-s]
       [?dr :driver-registration/end _ ?tx-e]
       [(max ?tx-s ?tx-e) ?tx]]
     db "1XXX001")
again, this model assumes that your notion of driver registration “age” exactly corresponds to the sum of its tx times, which might be possible but more likely you actually have a separate explicit domain-specific notion of “record effective date” which you just haven’t noticed yet
also note the mismatch in granularity: tx time is about individual facts not “records”. Datomic doesn’t know anything about records
e.g. it may be the union of attributes from multiple records, or it may be a value or "sub-record" (e.g. in the isComponent case), or it may just be a convenient thing to join on
"but more likely you actually have a separate explicit domain-specific notion of “record effective date” which you just haven’t noticed yet" I think that is the case here. Also, I haven't noticed this statement "also note the mismatch in granularity: tx time is about individual facts not “records”" so your example of the query makes more sense to me now. I think I can figure it out from here. Thanks for taking the time to help me favila!
The Datomic Cloud documentation mentions:
> [:db/unique] attribute must have a :db/cardinality of :db.cardinality/one.
https://docs.datomic.com/cloud/schema/schema-reference.html#db-unique
However, in my project, I've been using the following schema definition just fine:
{:db/ident :user/email
:db/unique :db.unique/identity
:db/valueType :db.type/string
:db/cardinality :db.cardinality/many}
It appears to work as expected:
(d/transact conn {:tx-data [{:user/email "calvin"}]})
=> {:tx-data [#datom[13194139533323 50 #inst "2019-10-17T13:44:42.951-00:00" 13194139533323 true] #datom[10740029580116059 78 "calvin" 13194139533323 true]]
(d/transact conn {:tx-data [{:user/email ["jenny"]}]})
=> [#datom[13194139533325 50 #inst "2019-10-17T13:47:13.676-00:00" 13194139533325 true] #datom[13453624277467228 78 "jenny" 13194139533325 true]]
And I can even pull:
(d/pull db [:user/email] [:user/email "calvin"])
=> #:user{:email ["calvin"]}
(d/pull db '[*] :user/email)
=> #:db{:id 78,
        :ident :user/email,
        :valueType #:db{:id 23, :ident :db.type/string},
        :cardinality #:db{:id 36, :ident :db.cardinality/many},
        :unique #:db{:id 38, :ident :db.unique/identity}}
This is in the output section of my compute stack
DatomicCFTVersion 512
DatomicCloudVersion 8806
@cjsauer you're correct that it can be done. Don't. 🙂 The semantics of unique identity are such that having a card-many attr there is pretty dicey. I'll look into filing this as something that should potentially throw or warn
@marshall ha okay, thanks for checking. I’m a bit puzzled tho. Email addresses seem to challenge those semantics. Is there a technical reason that unique-many attributes can’t exist? Regardless, throwing an exception there would be great.
I suppose not a technical reason so much as a semantic one. Unique identity says "this specific attr/value pair is unique in the database"
>“this specific attr/value pair is unique in the database” This could still hold for a card-many attribute theoretically. I opened an issue/PR for datascript before posting here on this subject, and @tonsky mentioned: > Upsert would not work because it’s not clear which value to look at. Or we must look at all provided values and make sure they all resolve to the same entity. This might be the complication. When upserting multiple values one must ensure that they do indeed resolve to the same entity. https://github.com/tonsky/datascript/issues/320
i.e. what if you have two separate "entities" in your transaction, each using one of your unique emails, but they both contain a separately conflicting datom for some other attr
since they both resolve to the same entity based on that unique id, you then have a conflict of the other datom
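A hypothetical illustration of that scenario, supposing card-many unique emails were allowed:
;; Both maps would upsert to the same user entity via different values of the
;; (hypothetical) card-many unique :user/email, yet they assert conflicting
;; values for the card-one :user/name.
[{:user/email "a@example.com" :user/name "Alice"}
 {:user/email "b@example.com" :user/name "Alicia"}]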
It would seem appropriate for that transaction to fail in that example. The two entities are resolved to one, and then can be constrained from there.
Maybe the detection of that conflict is the hard part tho. I’m ignorant of the details.
Given this limitation, is it possible to model users being uniquely identified by multiple email addresses? This felt like a very natural way to model my domain, where a user can be part of multiple teams/orgs, each with their own email domains. It would be great if those emails did indeed resolve to the same entity.
You could model this with the concept of an "organization user" which can be mapped to a "user". Each "organization user" can have their own unique identifier (email), a reference to the user, and the organization they belong to.
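Sketched as schema, that suggestion might look like the following (all attribute names are hypothetical):
[{:db/ident       :org-user/email      ; unique identifier per organization user
  :db/valueType   :db.type/string
  :db/cardinality :db.cardinality/one
  :db/unique      :db.unique/identity}
 {:db/ident       :org-user/user       ; reference to the underlying user
  :db/valueType   :db.type/ref
  :db/cardinality :db.cardinality/one}
 {:db/ident       :org-user/org        ; organization this email belongs to
  :db/valueType   :db.type/ref
  :db/cardinality :db.cardinality/one}]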
You have the same conflict if you use different identity attributes on the same entity.
Correct, and I've been burned by this before. Datomic will pick one for entity id purposes in some undefined way
I'd prefer entity resolution only happened via :db/id, and if you use a :db/id value with a lookup ref whose attribute is identity, only then would it upsert for you
What is the conflict? I’m not disagreeing, just failing to conceptualize. Is there a minimal example of a transaction that shows this?
Assuming :product/id and :product/asin are two identity attributes, the following tx will throw with :db.error/datoms-conflict:
[{:product/id #uuid "59d1da4a-7de0-4625-ad83-b63ac8346368"
:product/name "A"}
{:product/asin "B00VEVDPXS"
:product/name "B"}]
That feels broken :thinking_face: The ambiguity must lie in the expanded form of this transaction perhaps?
Not sure why it seems broken. It makes sense. I was just pointing out the fact that Datomic detects those conflicts already, so I'm not sure why it could not do it for card-many identity attributes.
My mental model for this must be off. That transaction looks like two products, each with different forms of identity.
Sorry, yes. I should have said that these 2 values refer to the same entity in the database.
>I was just pointing out the fact that Datomic detect those conflicts already so I’m not sure why it could not do it for card many identity attributes. This was my thought as well. The conflict check is similar, just over a collection of identity values instead of one.
Mostly tho, semantically, card-many-unique attributes seem very natural. Using Game of Thrones as an example, royal figures can have many identifying titles: Robert Baratheon, first son of XYZ, slayer of ABC, builder of QRS, etc etc etc
Can you build 2 Datomic Cloud instances in the same region with different names? Can you share the key?
which key @hadilsabbagh18?
we put one Datomic system in one region, we attach a couple query groups to it, and it hosts hundreds of databases
I started deploying to us-west-2 according to instructions and I repeatedly get this error:
Embedded stack arn:aws:cloudformation:us-west-2:962825722207:stack/stackz-StorageF7F305E7-13QLTET3W9OAQ/ad7f7950-f0ff-11e9-b33c-02a77ed54d64 was not successfully created: The following resource(s) failed to create: [DatomicCmk, CatalogTable, FileSystem, LogGroup, LogTable].
Does anyone know why it fails?