#datomic
2020-11-03
joshkh12:11:21

should I be able to upsert an entity via an attribute which is a reference and also unique-by-identity?

joshkh12:11:57

for example

(d/transact conn
            {:tx-data
             [; upsert a school entity where :school/president is a reference and unique-by-identity
              {:school/president {:president/id 12345}
               :school/name      "Bowling Academy of the Sciences"}]})

favila13:11:13

This is actually two upserts isn’t it? :president/id also?

joshkh15:11:42

yes, you are correct and that is indeed the problem. it seems that you cannot upsert two entities that reference each other within the same transaction. for example, running this transaction twice causes a datom conflict

(d/transact conn
            {:tx-data
             [; a president
              {:president/id "The Dude" :db/id "temp-president"}

              ; a school with a unique-by-identity
              ; :school/president reference to the president
              {:school/president "temp-president"
               :school/name      "Bowling Academy of Sciences"}]})
whereas both of these transactions upsert as expected
(d/transact conn
            {:tx-data
             [; a president
              {:president/id "The Dude" :db/id "temp-president"}]})

(d/transact conn
            {:tx-data
             [; a school with a unique-by-identity 
              ; :school/president reference to the president
              {:school/president 101155069755476  ;<- known dbid
               :school/name      "Bowling Academy of Sciences"}]})

joshkh15:11:54

(note the known eid in the second transaction)
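
One workaround, sketched under the assumptions above (same schema and names as the failing example): upsert the president in its own transaction first, then reference it from the school with a lookup ref, which resolves against the now-existing entity much like the known eid does:

(d/transact conn
            {:tx-data
             [; upsert the president on its own first
              {:president/id "The Dude"}]})

(d/transact conn
            {:tx-data
             [; reference the existing president via a lookup ref
              {:school/president [:president/id "The Dude"]
               :school/name      "Bowling Academy of Sciences"}]})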

vncz14:11:52

Is there any specific reason why some kind of selection can only be done using the Peer Server?

favila14:11:35

What do you mean by “selection”?

vncz14:11:48

Let me give you an example

vncz14:11:49

:find [?name ?surname] :in $ :where [?e :p/name ?name] [?e :p/surname ?surname]

vncz14:11:59

This query cannot be executed by the peer library

vncz14:11:38

This one can :find ?name ?surname :in $ :where [?e :p/name ?name] [?e :p/surname ?surname]

favila14:11:08

ah, ok, those are called “find specifications”

vncz14:11:03

Yes, these ones. It seems like the Peer Library can only execute the "Collection of List" one

favila14:11:11

and it’s the opposite: only the peer API supports these; the client API (the peer server provides an endpoint for the client api) does not
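
For illustration, a minimal sketch of the same query through the peer API, which does accept the single-tuple find specification (this assumes an on-prem peer and a database value bound to db):

(require '[datomic.api :as peer])

; returns a single [name surname] tuple instead of a set of tuples
(peer/q '[:find [?name ?surname]
          :where [?e :p/name ?name]
                 [?e :p/surname ?surname]]
        db)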

vncz14:11:38

This is weird, I'm using Datomic-dev (which I guess is using the peer library?!) and I can't execute such queries

favila14:11:02

dev-local?

favila14:11:32

that uses the client api. (require '[datomic.client.api])

favila14:11:43

the peer api is datomic.api

favila14:11:53

but you are using a client api

favila14:11:00

the client api does not support these

vncz14:11:07

Hmm :thinking_face:

vncz14:11:22

Ok so in theory I should just change the namespace requirement?

favila14:11:35

no, datomic.api is not supported by dev-local

vncz14:11:47

Ah ok so there's no way around it basically

favila14:11:08

Maybe historical background would help: in the beginning was datomic on-prem and the peer (`datomic.api` ), then came cloud and the client-api, and the peer-server as a bridge from clients to on-prem peers.

favila14:11:20

dev-local is “local cloud”

favila14:11:24

that came even later

favila14:11:33

(like, less than two months ago?)

vncz14:11:41

Oh ok, so it's a simulation of a cloud environment. I guess I was confused by the fact that it's all in the same process

favila14:11:28

the client-api is designed to be networked or in-process; in dev-local or inside an ion, it’s actually in-process
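
A minimal sketch of that in-process setup with dev-local (the system and db names here are placeholders):

(require '[datomic.client.api :as d])

(def client (d/client {:server-type :dev-local
                       :system      "dev"}))

(d/create-database client {:db-name "example"})

(def conn (d/connect client {:db-name "example"}))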

vncz14:11:56

Got it. So to keep it short I should either move to Datomic Free on-prem or work around the limitation in the code

favila14:11:48

as to why they dropped the find specifications, I don’t know. My guess would be that people incorrectly thought that it actually changed the query performance characteristics, but actually it’s just a convenience for first, map first, etc

favila14:11:06

the query does just as much work and produces a full result in either case
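
Which means the find specifications can be emulated on the client API by post-processing the full relation, roughly like this (assuming datomic.client.api required as d and a database value bound to db):

; like :find [?name ?surname], i.e. a single tuple
(first (d/q {:query '[:find ?name ?surname
                      :where [?e :p/name ?name]
                             [?e :p/surname ?surname]]
             :args  [db]}))

; like :find [?name ...], i.e. a collection of scalars
(map first (d/q {:query '[:find ?name
                          :where [?e :p/name ?name]]
                 :args  [db]}))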

vncz14:11:18

I could see these conveniences being useful though. The idea of having to do that manually every time is annoying.

vncz14:11:22

Not the end of the world, but still

kschltz17:11:33

Hi there. We've been facing an awkward situation with our Cloud system. From what I've seen of the Datomic Cloud architecture, it seemed like I could have several databases in the same system, as long as there are transactor machines available in my transactor group. With that in mind, we scaled our compute group to 20 machines to serve our 19 dbs. All went well for a few months, until 3/4 days ago, when we started facing issues transacting data, getting "Busy Indexing" errors. If I'm not wrong, this is due to our transactors being unable to ingest data at the same pace we are transacting it, or is there something else I'm missing here? Thanks :D

kschltz21:11:49

Another odd thing is that my Dynamo Write Actual is really low, despite my IndexMemDb metric being really high

kschltz21:11:22

I have 130 Write provisioned, but only 2 are used

tony.kay22:11:55

are you running your application on the compute group? Or are you carefully directing clients to query groups that service a narrow number of dbs? If you hit the compute group randomly for app stuff, then you’re going to really stress the object cache on those nodes.

tony.kay22:11:52

which will lead to segment thrashing and all manner of badness

kschltz22:11:14

I'm pointing my client directly at the compute group

tony.kay22:11:43

yeah, I don't work for cognitect, but my understanding of how it works leads me to the very strong belief that doing what you're doing will not scale. Remember that each db needs its own RAM cache space for queries. The compute group has no db affinity, so with 20 dbs you end up causing every compute node to cache stuff for all 20 dbs.

kschltz22:11:06

@U0CKQ19AQ would you say it would be best if I transacted to a query group fed by a specific set of databases?

tony.kay22:11:40

right, so a given user goes with a given db?

tony.kay22:11:11

(a given user won’t need to query across all dbs?)

kschltz22:11:22

From what I've read, transactions to query groups end up in the compute group

tony.kay22:11:31

yes, but that is writes, not memory pressure

kschltz22:11:32

this application is write only

tony.kay22:11:44

writes always go to a primary compute node for the db in question. no way around that

tony.kay22:11:03

the problem is probably that you’re also causing high memory and CPU pressure on those nodes for queries

tony.kay22:11:30

you could also just be ingesting things faster than datomic can handle…that is also possible

tony.kay22:11:50

but 20dbs on compute sounds like a recipe for trouble if you’re using that for general application traffic
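
For what that separation can look like on the client side, a rough sketch only (the group, system, region, and endpoint values below are placeholders; the actual endpoint for a query group comes from the Cloud docs/console): each service that only touches a subset of the dbs gets its own client pointed at a query group provisioned for those dbs, while writes still land on the primary compute group:

(require '[datomic.client.api :as d])

(def reports-client
  (d/client {:server-type :cloud
             :region      "us-east-1"   ; placeholder
             :system      "my-system"   ; placeholder
             ; placeholder endpoint for a query group named "reports"
             :endpoint    "http://entry.reports.my-system.us-east-1.datomic.net:8182/"}))

(def reports-conn (d/connect reports-client {:db-name "reports-db"}))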

kschltz22:11:37

I tried shutting my services down and giving Datomic time to ingest, but to no avail. IndexMemDb is just a flat line

kschltz22:11:38

I will give your suggestion a try, thanks in advance

tony.kay22:11:43

there’s also the possibility that the txes themselves need to read enough of the 20 diff dbs to be causing mem problems. I’d contact support with a high prio ticket and see what they say.

tony.kay22:11:56

could be something broke 🙂

kschltz22:11:39

The way things are built, there is a client connection for each one of the databases; depending on the body of a tx, it is transacted to a specific db

tony.kay22:11:25

the tx determines the db?

tony.kay22:11:29

ooof. much harder to pin limited dbs to a query group then.

Nassin03:11:10

If each node will be indexing/caching all 19 DBs, what's the point of increasing the node count to 20?