#datomic
2019-06-13
Drew Verlee03:06:49

Is it possible to create a lookup ref that relies on more than one attribute? The answer seems to be no, but it would not be hard for me to just write the query for the entity id and use that, I suppose.

favila13:06:08

consider creating a composite unique-value attribute

eoliphant14:06:37

I do that, plus usually an associated tx-fn
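A minimal sketch of the composite-key workaround suggested above. The attribute names (`:user/org`, `:user/email`, `:user/org+email`) and the `|` separator are assumptions made for illustration, not anything from the thread. The application (or a tx fn, as eoliphant mentions) would be responsible for keeping the derived value in sync with its parts.

```clojure
;; Hypothetical schema: the composite attribute holds a string derived from
;; the two "real" attributes and is marked unique, so it can back a lookup ref.
(def composite-schema
  [{:db/ident       :user/org+email
    :db/valueType   :db.type/string
    :db/cardinality :db.cardinality/one
    :db/unique      :db.unique/value}])

(defn composite-key
  "Joins the parts into one string. The separator is an assumption; pick one
  that cannot occur in either part."
  [org email]
  (str org "|" email))

(defn user-tx
  "Tx map that asserts both parts plus the derived composite value."
  [org email]
  {:user/org       org
   :user/email     email
   :user/org+email (composite-key org email)})

(defn user-lookup-ref
  "Lookup ref over the composite attribute."
  [org email]
  [:user/org+email (composite-key org email)])
```

With this in place, `(user-lookup-ref "acme" "a@b.c")` can be used wherever an entity id is accepted, provided the composite value was transacted earlier.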

Alex Miller (Clojure team)04:06:41

not currently

👍 4
Lone Ranger13:06:05

so, if I'm running (datomic.api/connect db-uri)... am I supposed to have AMQ running in the background?

Lone Ranger13:06:08

Because I'm getting this

Lone Ranger13:06:24

AMQ119007: Cannot connect to server(s). Tried with all available servers.

Lone Ranger13:06:43

for reference this was attempted with MySQL and the transactor running in docker containers and was attempting to connect to transactor from host machine

Lone Ranger13:06:59

(attempting to troubleshoot why my ring server is having trouble connecting to datomic)

marshall13:06:10

@goomba no. See https://docs.datomic.com/on-prem/deployment.html#peer-fails-to-connect. Most likely your host and alt-host values in your transactor properties file are the issue.

Lone Ranger13:06:47

haha, it's even highlighted on the page 😄 thanks @marshall, I'll take a look

Lone Ranger13:06:01

alright this time I think it actually is the peer

Lone Ranger13:06:40

ring_1        | ERROR: AMQ214016: Failed to create netty connection
ring_1        | javax.net.ssl.SSLException: handshake timed out
ring_1        | 	at io.netty.handler.ssl.SslHandler.handshake(...)(Unknown Source)
ring_1        | 
ring_1        | Jun 13, 2019 1:36:04 PM org.apache.activemq.artemis.core.remoting.impl.netty.NettyConnector createConnection
ring_1        | ERROR: AMQ214016: Failed to create netty connection
ring_1        | javax.net.ssl.SSLException: handshake timed out
ring_1        | 	at io.netty.handler.ssl.SslHandler.handshake(...)(Unknown Source)
ring_1        | 
ring_1        | Exception in thread "main" Syntax error compiling at (db.clj:42:11).
ring_1        | Error communicating with HOST 172.20.0.4 on PORT 4334

Lone Ranger13:06:52

I assume this is because I'm not exposing a port in docker properly

Lone Ranger13:06:00

will look into it

cgrand13:06:23

I've got a weird corner case where datomic and datascript agree: tempids can unify together but not a tempid and an eid:

[[:db/add "foo" :db/ident :foo] [:db/add "bar" :db/ident :foo]] ; works fine and resolve to a single eid
[[:db/add <existing-eid-or-lookup-ref> :db/ident :foo] [:db/add "bar" :db/ident :foo]] ; doesn't work

favila14:06:26

seems to work for me?

favila14:06:35

at least using mem db

favila14:06:59

(-> "datomic:" d/connect
    (d/transact [[:db/add (d/tempid :db.part/user) :db/ident :foo] [:db/add "bar" :db/ident :foo]]) deref)
=>
{:db-before datomic.db.Db@5e4de82f,
 :db-after datomic.db.Db@b6765e2a,
 :tx-data [#datom[13194139534312
                  50
                  #inst"2019-06-13T14:19:50.056-00:00"
                  13194139534312
                  true]
           #datom[17592186045417 10 :foo 13194139534312 true]],
 :tempids {-9223350046623220289 17592186045417, "bar" 17592186045417}}
(-> "datomic:" d/connect
    (d/transact [[:db/add (d/tempid :db.part/user) :db/ident :foo] [:db/add "bar" :db/ident :foo]]) deref)
=>
{:db-before datomic.db.Db@b6765e2a,
 :db-after datomic.db.Db@276d2771,
 :tx-data [#datom[13194139534314
                  50
                  #inst"2019-06-13T14:20:00.447-00:00"
                  13194139534314
                  true]],
 :tempids {-9223350046623220291 17592186045417, "bar" 17592186045417}}

favila14:06:17

maybe the tempid is in a different partition?

favila14:06:02

actually that works too, at least where [:db/ident :foo] exists already

favila14:06:56

seems to work for initial write also, but honestly the partition difference seems like it should be an error. I don't know how it would choose a partition

favila14:06:58

(-> "datomic:" d/connect
    (d/transact [[:db/add (d/tempid :db.part/db) :db/ident :bar] [:db/add "bar" :db/ident :bar]]) deref)
=>
{:db-before datomic.db.Db@28ac123,
 :db-after datomic.db.Db@a8ad04f9,
 :tx-data [#datom[13194139534319
                  50
                  #inst"2019-06-13T14:22:26.281-00:00"
                  13194139534319
                  true]
           #datom[63 10 :bar 13194139534319 true]],
 :tempids {-9223367638809264717 63, "bar" 63}}

Alex Miller (Clojure team)14:06:21

"doesn't work" means? error?

cgrand14:06:37

@favila @alexmiller fuller repro:
1/ transacting with two tempids and one ident works fine

user=> (d/transact conn [[:db/add "foo" :db/ident :foo] [:db/add "bar" :db/ident :foo]])
#object[datomic.promise$settable_future$reify__4751 0x3386c206 {:status :ready, :val {:db-before datomic.db.Db@a02b4ea8, :db-after datomic.db.Db@e2dbad4b, :tx-data [#datom[13194139534321 50 #inst "2019-06-13T14:39:47.738-00:00" 13194139534321 true] #datom[17592186045426 10 :foo 13194139534321 true]], :tempids {"foo" 17592186045426, "bar" 17592186045426}}}]
2/ transacting the same ident on an existing eid AND a tempid fails:
user=> (d/transact conn [[:db/add :foo :db/ident :bar] [:db/add "bar" :db/ident :bar]])
#object[datomic.promise$settable_future$reify__4751 0x2321e482 {:status :failed, :val #error {
 :cause ":db.error/datoms-conflict Two datoms in the same transaction conflict\n{:d1 [:foo :db/ident :bar 13194139534323 true],\n :d2 [17592186045428 :db/ident :bar 13194139534323 true]}\n"
 :data {:d1 [:foo :db/ident :bar 13194139534323 true], :d2 [17592186045428 :db/ident :bar 13194139534323 true], :db/error :db.error/datoms-conflict}
 :via
 [{:type java.util.concurrent.ExecutionException
   :message "java.lang.IllegalArgumentException: :db.error/datoms-conflict Two datoms in the same transaction conflict\n{:d1 [:foo :db/ident :bar 13194139534323 true],\n :d2 [17592186045428 :db/ident :bar 13194139534323 true]}\n"
   :at [datomic.promise$throw_executionexception_if_throwable invokeStatic "promise.clj" 10]}
  {:type datomic.impl.Exceptions$IllegalArgumentExceptionInfo
   :message ":db.error/datoms-conflict Two datoms in the same transaction conflict\n{:d1 [:foo :db/ident :bar 13194139534323 true],\n :d2 [17592186045428 :db/ident :bar 13194139534323 true]}\n"
   :data {:d1 [:foo :db/ident :bar 13194139534323 true], :d2 [17592186045428 :db/ident :bar 13194139534323 true], :db/error :db.error/datoms-conflict}
   :at [datomic.error$argd invokeStatic "error.clj" 77]}]
 ...
where I would have expected the tempid to be assigned the existing eid (upsert semantics)

favila14:06:36

[:foo :db/ident :bar]

favila14:06:44

your tx is malformed

favila14:06:03

you meant [:db/add :foo :db/ident :bar]?

cgrand14:06:37

this one, I edited the above post

favila14:06:02

or [:db/add (d/tempid :db.part/user) :db/ident :bar]?

favila14:06:20

ok, with corrected tx + response

favila14:06:05

I see correction

favila14:06:09

This is unavoidable

favila14:06:23

the :foo resolves to the before-tx eid

cgrand14:06:46

you can replace :foo by an explicit eid

favila14:06:16

:db/ident :bar doesn't exist yet

favila14:06:56

I'm not sure how to get the same datom out of both of these tx ops

cgrand14:06:01

user=> (d/transact conn [[:db/add 17592186045426 :db/ident :bar] [:db/add "tmp" :db/ident :bar]])
#object[datomic.promise$settable_future$reify__4751 0x2964511 {:status :failed, :val #error {
 :cause ":db.error/datoms-conflict Two datoms in the same transaction conflict\n{:d1 [:foo :db/ident :bar 13194139534323 true],\n :d2 [17592186045428 :db/ident :bar 13194139534323 true]}\n"

favila14:06:54

why would "tmp" unify to the current value of :foo and/or the eid 17592186045426 ?

favila14:06:11

"tmp" may unify to current value of [:db/ident :bar] if it existed

favila14:06:24

resolution of a tempid to a possible real id is done using the db-before value; for this to work as expected, it would have to be done with a db-after value

favila14:06:20

i.e., [:db/add "tmp" :db/ident :bar] would have to rewrite "tmp" to 17592186045426 before it knew that 17592186045426 had the ident :bar

favila14:06:38

because 17592186045426 doesn't have the ident :bar until the tx is complete

cgrand14:06:47

What about this one (less heavy on db/idents)?

user=> (d/transact conn [{:db/ident :u/k :db/unique :db.unique/identity :db/valueType :db.type/keyword :db/cardinality :db.cardinality/one}])
user=> (d/transact conn [[:db/add :u/k :u/k :foo] [:db/add "tmp" :u/k :foo]])
(exception)

favila14:06:25

same issue

favila14:06:55

can you get anything from the db using (d/entid db [:u/k :foo])

cgrand14:06:48

but I don't use [:u/k :foo] as a lookup ref (it's not transacted yet)

favila15:06:04

it's not transacted yet, so "tmp" can't be replaced with its eid

cgrand15:06:14

then how come

(d/transact conn [[:db/add "tmp1" :u/k :bar] [:db/add "tmp2" :u/k :bar]])
works?

favila15:06:41

I don't know

cgrand15:06:13

btw I don't need any db to figure this out: it's a purely local unification; I could postprocess the fully expanded tx-data to perform the unification....

cgrand15:06:40

sadly my tx-data involves several tx fns...

favila15:06:26

This seems like a tricky thing to want to do

favila15:06:41

I'm still suspicious of trying to unify against something that hasn't been written yet

favila15:06:19

I'm pretty sure I exploit lack of unification in cases like this to detect real conflicts

favila15:06:01

I can sort of see why [[:db/add "tmp1" :u/k :bar] [:db/add "tmp2" :u/k :bar]] might be allowed to unify because :u/k :bar doesn't exist yet

cgrand15:06:34

how do you know it doesn't exist yet? In fact it works even if it preexists:

user=> (d/transact conn [{:u/k :preexisting}])
#object[datomic.promise$settable_future$reify__4751 0x62cd562d {:status :ready, :val {:db-before datomic.db.Db@1d500de1, :db-after datomic.db.Db@c2499ebc, :tx-data [#datom[13194139534328 50 #inst "2019-06-13T15:10:42.368-00:00" 13194139534328 true] #datom[17592186045433 64 :preexisting 13194139534328 true]], :tempids {-9223301668109598113 17592186045433}}}]
user=> (d/transact conn [[:db/add "tmp1" :u/k :preexisting] [:db/add "tmp2" :u/k :preexisting]])
#object[datomic.promise$settable_future$reify__4751 0x14b5752f {:status :ready, :val {:db-before datomic.db.Db@c2499ebc, :db-after datomic.db.Db@98f067bc, :tx-data [#datom[13194139534330 50 #inst "2019-06-13T15:11:01.592-00:00" 13194139534330 true]], :tempids {"tmp1" 17592186045433, "tmp2" 17592186045433}}}]

favila15:06:59

preexisting makes sense because both unify to the eid

favila15:06:19

asserting ":u/k :bar" on a tempid would trigger replacement of tempid with realid

favila15:06:33

then all future uses of that tempid would also be replaced with realid

favila15:06:28

similarly, if [:u/k :bar] didn't exist and was asserted, kinda makes sense to say that every other tempid trying to assert that would unify to the same newly-minted eid

favila15:06:23

but when :u/k :bar could unify to a real eid, to ask a different tempid asserting a new [:u/k :foo] to unify to the what-is-now :bar but what will be :foo seems like too much mind-reading

favila15:06:34

but that already makes me a little nervous

favila15:06:52

suppose one eid had multiple lookup refs

favila15:06:27

you could have a tempid that could potentially unify against multiple eids

favila15:06:26

e.g. {:db/id 12345 :refa :refa1 :refb :refb1} {:db/id 67890 :refa :refa2} tx [[:db/add "t1" :refa :refa2][:db/add "t2" :refb :refb1]]

cgrand15:06:11

this works ok

favila15:06:33

"works ok" in what sense? no tx error?

favila15:06:02

I'm highlighting the potential ambiguity, I'm actually concerned that it works

favila15:06:27

actually nm this isn't what I was thinking of

favila15:06:29

This is the case I was thinking of (:refa and :refb are unique-identity)

favila15:06:52

@(d/transact conn [{:db/id "t1" :refa :a1 :refb :b1} {:db/id "t2" :refa :a2 :refb :b2}])
@(d/transact conn [[:db/add "t3" :refa :a1] [:db/add "t3" :refb :b3]])
=>
{:db-before datomic.db.Db@20fead49,
 :db-after datomic.db.Db@c13cf2d2,
 :tx-data [#datom[13194139534316
                  50
                  #inst"2019-06-13T15:43:29.444-00:00"
                  13194139534316
                  true]
           #datom[17592186045418 64 :b3 13194139534316 true]
           #datom[17592186045418 64 :b1 13194139534316 false]],
 :tempids {"t3" 17592186045418}}

favila15:06:56

IMO this should be an error

favila15:06:15

it works because resolution to a real eid happened first

favila15:06:28

this btw is also the tx ops from expansion of the tx map `{:db/id "t3" :refa :a1 :refb :b3}`

favila15:06:57

I think this syntax makes the ambiguity more clear

favila15:06:23

this was probably either a mistake or too-clever code

favila15:06:57

If I were going back in time, I think I would make upserting attributes work like lookup refs that may not resolve against the db-before

favila15:06:56

e.g. {:db/id [:refb :b3] :refa :a1} would (if :b3 didn't exist) reliably make a new eid, assert :refb :b3 on it, and assert :refa :a1 on it

favila15:06:23

tempids would always make new eids

favila16:06:17

actually you may be able to replicate some of that behavior by consistently hashing the same ref lookup to the same string-for-tempid
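One way to read that last suggestion, as a sketch only: derive the string tempid deterministically from the lookup ref, so every op in a transaction that targets the same [attr value] pair shares one tempid. The helper names and the "ref:" prefix are hypothetical, and this does not change how the transactor resolves tempids against the db-before value.

```clojure
(defn lookup-ref->tempid
  "Derives a stable string tempid from a lookup ref, so that the same
  [attr value] pair always maps to the same tempid string."
  [[attr value]]
  (str "ref:" (pr-str [attr value])))

(defn upsert-ops
  "Hypothetical helper: asserts the identifying attribute plus the extra
  attribute/value pairs on the tempid derived from `lookup-ref`."
  [[attr value :as lookup-ref] extra]
  (let [tid (lookup-ref->tempid lookup-ref)]
    (into [[:db/add tid attr value]]
          (for [[a v] extra]
            [:db/add tid a v]))))
```

For example, `(upsert-ops [:refb :b3] {:refa :a1})` produces two `:db/add` ops that both use the tempid `"ref:[:refb :b3]"`.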

favila15:06:01

or even worse [[:db/add "t1" :refa :refa2][:db/add "t1" :refb :refb1]]: was I making a new :refa2 or changing :refa1 to :refa2?

cgrand15:06:14

it's an integrity violation: you are merging two existing entities

favila15:06:15

if :refa2 existed prior, would it still be an integrity violation?

favila15:06:09

you have to decide whether forms like [:db/add "t1" :refa :refa2] are primarily a lazy way of resolving tempids or a way to assert a new ident

cgrand15:06:15

you mean if it didn't exist? because it exists {:db/id 67890 :refa :refa2}

favila15:06:38

when both are possible, allowing it to guess doesn't seem like a good idea

favila15:06:37

in general when I want my tx to be an update rather than an upsert, I will use the ident or lookup ref as the eid

favila15:06:13

I will not rely on unification through the assertion

cgrand15:06:16

> you have to decide whether forms like [:db/add "t1" :refa :refa2] are primarily a lazy way of resolving tempids or a way to assert a new ident
Neither (or both), they are the upsert semantics and static analysis of the tx-data is enough

[:db/add eid :ref :A] [:db/add "tmp" :ref :A]
• :A doesn't exist yet in the db, tmp resolves to eid
• :A does exist but on another eid -> it's a uniqueness conflict
• :A does already exist on this eid -> tmp resolves to eid
In all non-conflicting cases we get the same output.
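The local unification described above could be sketched as a pure function over already-expanded tx-data. This is an illustration under assumptions, not Datomic's behavior or API: `unify-tempids` is a hypothetical name, it only handles list-form `:db/add` ops with string tempids in the entity position, and the caller must supply the set of unique attributes. The conflict case (the value already lives on another eid) is left for the transactor to reject.

```clojure
(defn unify-tempids
  "Rewrites string tempids that assert the same unique [a v] pair as an op
  carrying a concrete entity id. `unique-attrs` is a set of attribute idents.
  Only list-form [:db/add e a v] ops are considered."
  [unique-attrs ops]
  (let [tempid? string?
        ;; [a v] -> concrete eid, taken from ops whose e is not a tempid
        claimed (into {}
                      (for [[op e a v] ops
                            :when (and (= op :db/add)
                                       (unique-attrs a)
                                       (not (tempid? e)))]
                        [[a v] e]))
        ;; tempid -> concrete eid it should be replaced with
        resolved (into {}
                       (for [[op e a v] ops
                             :when (and (= op :db/add)
                                        (tempid? e)
                                        (contains? claimed [a v]))]
                         [e (claimed [a v])]))]
    ;; Replace resolved tempids everywhere they appear as e, then drop
    ;; ops that became exact duplicates.
    (into []
          (comp (map (fn [[op e & more :as form]]
                       (if (contains? resolved e)
                         (into [op (resolved e)] more)
                         form)))
                (distinct))
          ops)))
```

For instance, `(unify-tempids #{:u/k} [[:db/add 100 :u/k :foo] [:db/add "tmp" :u/k :foo] [:db/add "tmp" :note "x"]])` rewrites both "tmp" ops to entity 100. As cgrand notes, this only helps if the tx-data is fully expanded before the function runs, which is not the case when tx fns are involved.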

favila15:06:22

yes but in case b, it's possible I made a mistake in my tx

favila15:06:40

wait, what is case 2

favila15:06:21

is that [:ref :A] does not resolve to eid?

cgrand15:06:20

it's [:ref :A] resolves to another-eid

favila15:06:36

so you would expect a conflict? then why the bug report?

favila15:06:40

I expect a conflict too

favila15:06:04

I thought you expected "tmp" to resolve to eid (not another-eid)

favila15:06:45

i.e. the value :ref :A will resolve to after [:db/add eid :ref :A] is applied

cjsauer17:06:13

Hello all, I have a question due to lack of conceptualization: why is :db/cas necessary if the following is true for a Datomic system:
> The transactor queues transactions and processes them serially.
Serial processing seems to imply no need for check-and-set, but I'm surely missing something.

Joe Lane17:06:04

If the value you're about to set is dependent upon its previous value (like a bank account) then you want cas.

cjsauer17:06:47

That makes sense. So then I think I've been assuming that the transaction functions themselves are run serially, but in reality, they might be run (expanded) in parallel, and their resulting datoms are what actually get sent to the transactor for serial processing. Is that correct?

favila17:06:49

A transaction is data describing the change you want done

favila17:06:08

To prepare that data, you may have read values out of the db

favila17:06:28

so your changes are prepared assuming a certain state of the db

favila17:06:53

the problem is by the time that tx data gets to the transactor, your assumptions may be wrong

favila17:06:01

thus invalidating your transaction

favila17:06:19

:db/cas is a way to assert that something still has the value you read at the moment the write occurs

favila17:06:20

the transactions themselves are applied serially, but the transaction data was not prepared serially (i.e. it was prepared by uncoordinated peers reading whatever they read)
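As a small illustration of the bank-account case, here is tx data built on the peer from a previously read balance. The attribute `:account/balance` and the function name are hypothetical; the `[:db/cas entity attribute old-value new-value]` shape is the built-in's argument order. The transaction would only succeed if the balance still equals the value that was read.

```clojure
(defn withdraw-tx
  "Builds tx data for a withdrawal computed on the peer. `balance-read` is the
  value the peer saw earlier; :db/cas makes the transaction fail if the stored
  balance has changed since then."
  [account-eid balance-read amount]
  [[:db/cas account-eid :account/balance balance-read (- balance-read amount)]])
```

For example, `(withdraw-tx 42 100 30)` yields a single op asserting that entity 42 moves from a balance of 100 to 70.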

cjsauer17:06:01

I see, thank you @favila. So then, in a Cloud system, the "compute group" actually can prepare datoms (i.e. run tx fns) in an uncoordinated fashion, but the actual mutation of storage is always serial?

cjsauer17:06:27

This actually would explain why :db/cas is a built-in, because there must be some storage-level magic happening to ensure that CAS's promise is kept.

favila17:06:02

I don't think tx functions are run outside a tx

favila17:06:11

unless this is a cloud vs prem difference

favila17:06:18

it's a surprising difference if so

marshall17:06:23

they are not

marshall17:06:49

the same description for the use/need/purpose of CAS for on-prem that Francis provided above is true for Cloud

cjsauer17:06:04

Okay so to check my own understanding, if I prepare the tx outside of a tx function, then :db/cas might be required. However, if I query for the dependent data inside a tx function, then I shouldn't need CAS, correct?

marshall17:06:24

more or less yes

marshall17:06:34

take a look at the link i dropped in the other thread

Joe Lane17:06:06

Hmmm. I'm not sure, that's a very good question though.

cjsauer17:06:25

Here's actually an interesting example from the docs: https://docs.datomic.com/cloud/transactions/transaction-functions.html#creating
Notice that inc-attr does not use :db/cas even tho it depends on the previous value...

marshall17:06:05

:db/cas is a transaction function. You can use it within a custom transaction function you write, but you don't have to. You can also reimplement it (or something like it) as a transaction function yourself. However, the general use of CAS is more frequently for optimistic concurrency applications, i.e. https://docs.datomic.com/cloud/best.html#optimistic-concurrency

favila17:06:37

This is because transactions run serially. the transaction data expansion and application to make the new DB value begins with the previous db value. All tx functions receive that previous db value. No other transactions are expanded/applied during this process (in essence, the transaction has a lock on the entire database). The datoms that result from db are applied to the previous db value to make the next db value. then the next tx is processed

favila17:06:04

there is no opportunity for a tx function to get a stale read

marshall17:06:35

the tradeoff is that whatever work you're doing in your transaction function is happening in the single-writer-thread of Datomic

marshall17:06:06

so if you try to do something expensive (like call a remote service - eek!) from within the transaction function, all writes are going to wait on that work

marshall17:06:09

if you instead do that work locally in your client (or peer), you can avoid that cost on the transaction stream but you need to ensure that no one has changed your relevant data out from under you in the meantime, so you can often use CAS for that

favila17:06:32

cas (and its general technique of "assert what I read hasn't changed") allows the opposite tradeoff: possible parallel tx preparation, but a stale read is expensive to recover from. (You need to catch the tx error, detect it was a CAS error, and reprepare your tx using a newer db, and reissue hoping you don't race with some other write)
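The prepare / CAS / retry cycle described here can be modelled without Datomic at all, using a plain Clojure atom as a stand-in for the database. This is only an analogy: `cas-with-retry!` is a hypothetical name, and a real peer would instead catch the failed transaction, check that it was a CAS failure, and re-read a newer db value before re-preparing.

```clojure
(defn cas-with-retry!
  "Models optimistic concurrency on an atom standing in for the db.
  `prepare` computes the proposed value from a snapshot, outside any lock.
  The write only lands if the atom still holds that same snapshot;
  otherwise the work is redone against a fresh read."
  [db-atom prepare max-attempts]
  (loop [attempt 1]
    (let [seen     (deref db-atom)
          proposed (prepare seen)]
      (cond
        ;; Nobody wrote in between: the write succeeds.
        (compare-and-set! db-atom seen proposed) proposed
        ;; Stale read: re-prepare from a newer value.
        (< attempt max-attempts) (recur (inc attempt))
        :else (throw (ex-info "CAS failed after retries"
                              {:attempts attempt}))))))
```

Note that `prepare` may run several times under contention, which is the "expensive to recover from" cost mentioned above.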

cjsauer17:06:26

Ahhh okay, that link coupled with these explanations have totally cleared up my confusion. I was totally missing the fact that concurrency was in the hands of the developer. So putting simple query logic into a tx function is totally valid, but when that logic becomes expensive (e.g. remote call as @marshall said), it might make more sense to perform that work outside the tx fn to keep the tx stream clear, and rely on CAS to uphold consistency.

cjsauer17:06:13

Thank you both very much

favila18:06:43

If you are familiar with clojure atoms: this is roughly the difference between (swap! db apply-ops (inc-something db)) and (swap! db (comp apply-ops inc-something))

cjsauer21:06:24

Makes sense. Former performs the inc transformation outside the swap, and the latter performs the inc/apply inside the swap (iiuc).
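The atom analogy can be spelled out as a runnable sketch. The op format (`[:set k v]`) and the bodies of `inc-something` and `apply-ops` are assumptions made so the two shapes can be compared; only the names come from the message above.

```clojure
(defn inc-something
  "Reads the counter from a db value and returns ops describing its new value."
  [db-value]
  [[:set :counter (inc (:counter db-value))]])

(defn apply-ops
  "Applies [:set k v] ops to a db value."
  [db-value ops]
  (reduce (fn [m [_ k v]] (assoc m k v)) db-value ops))

(defn outside!
  "Former shape: ops are prepared from a snapshot, outside the swap.
  If another writer gets in between the read and the swap, the ops are stale."
  [db-atom]
  (let [ops (inc-something @db-atom)]
    (swap! db-atom apply-ops ops)))

(defn inside!
  "Latter shape: the read and the apply both happen inside the swap,
  like a tx fn that always sees the current db value."
  [db-atom]
  (swap! db-atom (fn [v] (apply-ops v (inc-something v)))))

(defn lost-update-demo
  "Interleaves a stale 'outside' write with an 'inside' write."
  []
  (let [a         (atom {:counter 0})
        stale-ops (inc-something @a)]   ; prepared while counter was 0
    (inside! a)                          ; another writer increments first
    (swap! a apply-ops stale-ops)))      ; stale ops overwrite with 1 again
```

Calling `(lost-update-demo)` ends with the counter at 1 even though two increments were attempted, which is the kind of stale write that :db/cas is meant to reject.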

joshkh21:06:48

are there best-practices or guides for query optimisation? we're discovering queries that run orders of magnitude faster after reordering just one of their constraints.

joshkh21:06:29

we've always followed the "most specific first" rule, but it seems there are other patterns that help as well

joshkh21:06:56

thanks @marshall! that's exactly what we found via some quick trial-and-error

joshkh21:06:59

and also moving some top level constraints down into or-joins