#datomic
2023-02-15
onetom03:02:58

We are still having issues with composite keys made up of ref attrs. Filed a full example of the issue here: https://forum.datomic.com/t/upsert-tupleattrs-containing-db-type-ref-using-tempids/2181 Has anyone else here encountered this problem?

favila07:02:44

Yes, and your “related literature” links show the same problem. Tempid resolution (including via upserting) happens before composite tuple re-computation; composite tuple recomputation happens at the very end after all datoms are expanded to make sure nothing else could change the values of the attributes the tuple is indexing. To resolve tempids using the value of composites which itself depends on resolved tempids would require a more complex multi-pass strategy

favila07:02:10

My advice is to use a transaction function and not rely on upserting

onetom14:02:29

we are using transaction functions, but do you mean, we should hand-roll the decision, whether a certain nested entity map (or a set of positional arguments) represent an insert or an update and emit different tx-data accordingly?

favila14:02:17

yes. Something like [:myfn/make-x-y-ref tempid x y] It can see if x and y exist already, and use an existing tuple value to resolve tempid to an existing value accordingly

onetom14:02:09

> To resolve tempids using the value of composites which itself depends on resolved tempids would require a more complex multi-pass strategy
i think, i also had issues using both lookup-refs and keywords representing :db/idents as tuple components. could those be the consequences of this single-pass algorithm too?

favila14:02:44

no, that’s just that d/entid doesn’t resolve them

favila14:02:34

this is the problem that you need to resolve the tempid (possibly create a new entity) before you can resolve another tempid

favila14:02:56

you would need to make the tx itself multi-stage, instead of collect tempids, resolve them, commit

favila14:02:17

txs right now are atomic, they’re not gathering db changes “as it works”

onetom14:02:34

> ... [:myfn/make-x-y-ref tempid x y] ...
what is tempid in this case? u mean something like this:

[{:db/id "x" :db/ident "x"}
 {:db/id "y" :db/ident "y"}
 [:myfn/make-x-y-ref "?" ? ?]]
so i can still stay in a single transaction?

favila15:02:39

I was thinking tempid was the tempid of the joining entity, so you could use it as a reference elsewhere in the tx

onetom15:02:51

ok, so i have to split the tx into two, like i showed in my forum entry?

favila15:02:04

no, just make the tx fn accept more args

favila15:02:14

[:myfn/make-x-y-ref joining-entity-tempid {:x-tempid "x" :x-unique-value "x"} {:y-tempid "y" :y-unique-value "y"}]

onetom15:02:49

and the return value of this tx-fn would be what? different shapes, based on whether [:x-unique-value "x"] & [:y-unique-value "y"] exist already or not?

onetom15:02:45

well, that's what i meant by "we should hand-roll the decision, whether a certain nested entity map (or a set of positional arguments) represent an insert or an update"

favila15:02:11

> yes. Something like [:myfn/make-x-y-ref tempid x y] It can see if x and y exist already, and use an existing tuple value to resolve tempid to an existing value accordingly
yeah, I’m agreeing 🙂
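
[editor's sketch] One possible shape for the hand-rolled insert-vs-update decision discussed above, written as a pure function so it runs without a database. `find-eid` and `find-join` are hypothetical lookups supplied by the caller (in a real tx-fn they would query the db passed to the function); the attribute names `:x/id`, `:y/id`, `:ref/x` and `:ref/y` are borrowed from the example later in this log. This is an assumption-laden illustration, not how Datomic or either participant implements it.

```clojure
;; Sketch only: `find-eid` and `find-join` are hypothetical lookups passed in
;; by the caller; they are not Datomic API functions.
(defn make-x-y-ref
  [find-eid find-join join-tempid
   {:keys [x-tempid x-unique-value]}
   {:keys [y-tempid y-unique-value]}]
  (let [x-eid    (find-eid [:x/id x-unique-value]) ; nil when x does not exist yet
        y-eid    (find-eid [:y/id y-unique-value]) ; nil when y does not exist yet
        x-id     (or x-eid x-tempid)
        y-id     (or y-eid y-tempid)
        ;; A joining entity can only exist already if both x and y exist.
        join-eid (when (and x-eid y-eid) (find-join x-eid y-eid))
        join-id  (or join-eid join-tempid)]
    ;; Always (re)assert the two refs; create x and/or y only when missing.
    (cond-> [[:db/add join-id :ref/x x-id]
             [:db/add join-id :ref/y y-id]]
      (nil? x-eid) (conj [:db/add x-tempid :x/id x-unique-value])
      (nil? y-eid) (conj [:db/add y-tempid :y/id y-unique-value]))))
```

In the update case this sketch emits datoms against the existing entity id instead of the joining tempid. How other parts of the same transaction that mention that tempid should be tied to the existing entity is left open here.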

onetom15:02:42

okay, thanks a lot for confirming it! that's how we were doing it in the past, it's just a bit tedious to do it for every composite key 😞

favila15:02:08

honestly I stay away from upsert, especially the upsert + unique combination

favila15:02:27

I think it’s too surprising

favila15:02:44

If datomic were designed with lookup refs from the beginning, I’m not sure it would exist

favila15:02:12

or at least I wouldn’t have put it in

favila15:02:27

(especially when one entity has more than one upsert attr on it)

onetom15:02:29

we are mirroring data from existing systems and it seems straightforward in that case. the other use-case is storing oauth access and refresh tokens for a combination of our system's user, oauth system's user and the oauth resource they authorized. both of these would work with how datomic does the upsert anyway, if only it would resolve the tempids in tuples 🙂

onetom15:02:57

using tempids in this scenario allows us to use different unique keys for representing the oauth user and the oauth resource, since it varies across systems, yet have the same uniqueness constraint.

favila15:02:10

have you considered either not having a composite tuple at all (use an entity predicate), or using a heterogeneous value tuple and manually maintaining it?

favila15:02:38

possibly with values instead of refs

onetom15:02:21

yes, if i used values, i would need to customize the same overall token refresh & revocation logic, depending on oauth provider, so i would need different heterogeneous tuples based on whether an oauth resource is a uuid or just a string

favila15:02:15

you can always flatten to something common, or leave some slots nil

favila15:02:24

but if this isn’t for lookup, you may not need one at all

favila15:02:38

an entity predicate can ensure the invariant without needing a unique composite-tuple attr as an index

onetom15:02:26

today was the 2nd time we considered using entity predicates for a smaller use-case. previously we were able to solve the problem without any entity predicates, just with tx-fn. but now that u are mentioning it, i will consider them more seriously.

onetom15:02:49

we did consider leaving the oauth resource dimension in the uniqueness constraint nil for one of the oauth providers, so we would be utilizing that capability.

onetom15:02:03

yeah, but the uniqueness constraint on an explicit attribute makes upsert possible... :)

favila15:02:54

well a txfn can do the same I guess

onetom15:02:57

saves us writing boring, parochial code

favila15:02:06

it comes down to whether you think races will be uncommon enough that just protecting against them causing invariant violations is enough (entity predicate), or common enough that you want to make it commute (tx fn or upsert)

onetom15:02:41

i'll sleep on it. thanks a lot again for the exhaustive reply!

favila15:02:55

happy to help!

onetom15:02:12

(let [conn (mk-conn schema)
      txr0 (tx! conn [{:db/id "x" :x/id "x"}
                      {:db/id "y" :y/id "y"}])
      {:strs [x y]} (-> txr0 :tempids)
      ent {:db/id "ent"
           :ref/x [:x/id "x"]
           :ref/y [:y/id "y"]
           :key   [[:x/id "x"] [:y/id "y"]]}]
  (tx! conn [(-> ent (merge {:attr 1}))])
  (tx! conn [(-> ent (merge {:attr 2}))]))
this one gives a
Execution error (ExceptionInfo) at datomic.core.error/raise (error.clj:55).
:db.error/invalid-tuple-value Invalid tuple value
is using lookup-refs like this, as tuple components, a different problem, or also a consequence of this single-pass resolution algo?

favila15:02:00

no, it’s just d/entid doesn’t do it

favila15:02:22

It’s just annoying; I don’t see any technical limitation, they probably just overlooked it

onetom15:02:49

im using datomic cloud, btw, so no d/entid available there

favila15:02:04

ah; well there’s still an internal function doing the same thing

onetom15:02:12

is it worth creating an entry on the datomic forum about this case, then?

favila15:02:37

I mean, I would love it if it supported this, but I think they know about it already

favila15:02:47

we noticed it right when tuples were released

onetom15:02:49

i mean, would it help to get this feature implemented eventually in datomic?

favila15:02:02

I guess it wouldn’t hurt

onetom15:02:34

alright, i will look into it then! because it would be extremely convenient during REPL explorations...

favila15:02:36

you might have to do some reflection in your txfn if you don’t have d/entity sadly

favila15:02:07

long -> entid; vector -> query, normalize to entid, etc
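
[editor's sketch] A minimal illustration of the type dispatch described above: longs pass through, vectors (lookup refs) and keywords (idents) get resolved to an entity id. `resolve-lookup-ref` and `resolve-ident` are hypothetical functions supplied by the caller (in a tx-fn they would presumably run a query against the db); they are not Datomic API names.

```clojure
;; Sketch only: the two resolver arguments are hypothetical, caller-supplied
;; functions, not part of any Datomic API.
(defn normalize-eid
  [resolve-lookup-ref resolve-ident x]
  (cond
    (integer? x) x                      ; already an entity id
    (keyword? x) (resolve-ident x)      ; :db/ident keyword
    (vector? x)  (resolve-lookup-ref x) ; lookup ref such as [:x/id "x"]
    :else (throw (ex-info "Unsupported entity identifier" {:value x}))))
```

Tuple components could then be mapped through such a function before the tuple value is built, e.g. `(mapv #(normalize-eid lookup ident %) [[:x/id "x"] [:y/id "y"]])`, where `lookup` and `ident` stand for the caller's resolvers.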

favila15:02:15

if I were redesigning this from scratch (hah!) I would replace existing upsert behavior with something that only triggers when it gets a lookup ref in a spot where it expects an entity id

favila15:02:47

and your application should just always use lookup refs in its transactions if it wants to not care about entity id and only about “unique-identity”

favila15:02:10

and the system would emit the assertion if the lookup ref failed to resolve and the attr was an upsert type

onetom15:02:25

i have the gut feeling, that we could bolt something on top of the current behaviour, but such logic should know about the schema too... which is available in tx-fns...

favila15:02:57

yep, everything it needs is there, it’s just not made for you

onetom15:02:08

i really don't care about how non-performant it would be, because the convenience would trump that drawback 🙂

favila15:02:21

I don’t think it would be slow

favila15:02:52

at least with on-prem, all that schema and ident stuff is aggressively cached in memory all the time (not object cache, faster)

favila15:02:10

but the apis to use those fast paths aren’t in cloud (d/entid, d/ident, d/attribute)

favila15:02:02

but I suspect they still exist in the cloud implementation; they just couldn’t be fast over a network

favila15:02:10

thus are not in the client api