datomic 2019-09-25 | Slack Archive

tatut12:09:23

best practices page states that you should annotate transactions with the what, who, when information… but it seems to me that that makes it less convenient to pull because the tx is a separate entity. Are there some good tricks for that or simply do 2 queries?

tatut12:09:53

vs. adding modification timestamp and modifying user info as direct attributes of the entity

favila12:09:55

The consideration here is the same as for time-of-record vs. domain-time

favila12:09:58

Imagine your tx is a git commit. Would the data be part of the commit message or part of the commit body itself?

favila12:09:49

If the message, that’s Tx metadata; if body, that’s not tx metadata

tatut12:09:27

for example I have comments, where the when and who are metadata imo… but it would be convenient to pull it at the same time as the comment itself

tatut12:09:48

in that case I know each comment is added in its own tx
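One way to avoid a second query, sketched with the on-prem peer API (`datomic.api`): bind the transaction from the fourth position of a datom and pull both entities in the same `:find`. The attribute `:comment/text` and the function name are hypothetical, and the sketch assumes the comment's text was asserted in the transaction whose metadata you want.

```clojure
(require '[datomic.api :as d])

(defn comment-with-tx
  "Returns [comment-map tx-map] for one comment, or nil when the entity
   has no :comment/text. `db` is a database value, `comment-eid` an entity id."
  [db comment-eid]
  (first
    (d/q '[:find (pull ?c [*]) (pull ?tx [*])
           :in $ ?c
           :where
           ;; The 4th slot of a data pattern binds the transaction entity
           ;; that asserted this datom.
           [?c :comment/text _ ?tx]]
         db comment-eid)))
```

The tx map carries `:db/txInstant` plus whatever attributes were added to the transaction entity.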

favila13:09:29

comment author and time sound like data not metadata to me. What if you have to backdate a comment? or import them?

tatut13:09:43

that’s a good point, perhaps it is data

tatut13:09:16

better to err on the side of data, it seems… it is more flexible precisely if we need to import

favila13:09:47

that doesn’t mean you still can’t record e.g. the system and user that transacted the comment

favila13:09:15

but the use case is more dev-time auditing and debugging

favila13:09:37

comment author = tx writer may not be a solid assumption, for example

favila13:09:17

e.g. some systems have “impersonation” features (usually for support)

tatut13:09:23

I’ll go with both, thanks
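A minimal sketch of what "both" could look like as transaction data. Every attribute name here (`:comment/*`, `:audit/*`, `:user/email`) is hypothetical and would need to exist in the schema; `"datomic.tx"` is the reserved tempid that refers to the transaction entity itself.

```clojure
(def comment-tx
  [;; Domain data: who wrote the comment and when, stored on the comment.
   ;; These values can be backdated or imported freely.
   {:db/id              "new-comment"
    :comment/text       "Looks good to me"
    :comment/author     [:user/email "alice@example.com"]
    :comment/created-at #inst "2019-09-25T12:00:00.000-00:00"}
   ;; Operational metadata: which system and account performed the write.
   ;; It lives on the transaction entity, next to :db/txInstant.
   {:db/id          "datomic.tx"
    :audit/system   "support-console"
    :audit/acted-as [:user/email "support@example.com"]}])
```

With this shape a plain pull on the comment returns author and time directly, while the audit trail stays out of the domain model.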

cjsauer13:09:33

So maybe a distinction could be made that is “domain metadata” versus “operational metadata”. Tx entity seems to be for the latter.

tatut13:09:45

would you model the domain metadata as a single attribute type that all entities have, or separate (like :comment/author and :file/author etc)

tatut13:09:10

it seems a single one would allow answering questions like “give me all entities authored by this user”

favila13:09:36

a single one dovetails nicely with how spec likes to model namespaced attributes

favila13:09:03

but: make sure you never would need both of them in the same map (that’s a clue there really is some semantic difference between :comment/author and :file/author) and the range (possible legal values) is the same in all contexts

favila13:09:51

other downsides: it’s less index friendly, and it’s less intrinsically obvious what attributes are expected together on an entity “type” (spec can help here)
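To make the trade-off concrete, here is a sketch (on-prem peer API, hypothetical attribute names `:meta/author`, `:comment/author`, `:file/author`) of the "all entities authored by this user" question under each modelling choice. `user` is assumed to be an entity id.

```clojure
(require '[datomic.api :as d])

(defn authored-by-shared-attr
  "Single shared attribute: one data pattern answers the question."
  [db user]
  (set (map first
            (d/q '[:find ?e
                   :in $ ?user
                   :where [?e :meta/author ?user]]
                 db user))))

(defn authored-by-separate-attrs
  "Per-type attributes: every attribute must be listed explicitly."
  [db user]
  (set (map first
            (d/q '[:find ?e
                   :in $ ?user
                   :where (or [?e :comment/author ?user]
                              [?e :file/author ?user])]
                 db user))))
```

The second form has to be edited each time a new authored entity "type" is added, which is the cost of keeping the attributes semantically distinct.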

tatut13:09:13

in rdbms you often have the same created_by, created_at etc fields in all main tables… as a newcomer to datomic I don’t have an intuition about what is best here

favila13:09:20

correct, but 1) those are often really tx metadata in disguise 2) they’re in different tables (usually), so they are different fields. they’re equivalent to :TABLENAME/FIELDNAME

favila13:09:28

(roughly)

favila13:09:46

if you would use “joined table polymorphism” (I think it’s called?) in sql, that would be like using one attribute

favila13:09:29

i.e. you have one table with all “common” fields, and other tables use those fields by joining (in either direction)

cjsauer14:09:53

Given the (abbreviated) schema:

{:db/ident :customer/id
 :db/unique :db.unique/identity}

and the entity

{:db/id 1
 :customer/id "XYZ"}

What terms are given to the following:
- 1
- "XYZ"
- [:customer/id "XYZ"]
- 1 || [:customer/id "XYZ"] (i.e. things you can pass to pull)

The vocabulary in the wild seems to be somewhat inconsistent. I’ve seen words like:
- eid
- lookup
- ident
- entity

Maybe there are specs available?

favila14:09:07

https://docs.datomic.com/on-prem/identity.html

benoit14:09:23

I call them:
- 1: an entity id or eid
- "XYZ": a customer id
- [:customer/id "XYZ"]: a lookup ref

favila14:09:57

“entity-identifier” = entity-id (eid) OR ident (keyword value of :db/ident attr) OR lookup-ref

favila14:09:36

lookup ref is the [attr value] lookup

favila14:09:52

the attr itself can be any entity-identifier for an attr also

favila14:09:18

e.g. if :customer/id is eid 123, then [123 "XYZ"] is also a valid lookup ref

favila14:09:15

d/entid can coerce any entity identifier to an eid

cjsauer14:09:23

Thanks @favila, this link I found appears to agree: https://docs.datomic.com/cloud/schema/schema-reference.html#orgee3fac1 Which would suggest:
- 1: eid
- "XYZ": customer id
- [:customer/id "XYZ"]: lookup-ref
- :customer/id: ident
- 1 || [:customer/id "XYZ"]: identifier (entity identifier)

favila14:09:43

correct
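The vocabulary can be exercised against the on-prem peer API roughly as follows. This is a sketch that assumes the abbreviated `:customer/id` schema above is fully installed; the function name and the keys of the returned map are made up for illustration.

```clojure
(require '[datomic.api :as d])

(defn identifier-forms
  "Resolves one customer through the different kinds of entity identifier."
  [db customer-id]
  (let [lookup-ref [:customer/id customer-id]   ; lookup ref: [attr value]
        eid        (d/entid db lookup-ref)]     ; coerced to an entity id
    {:eid      eid
     ;; An ident (keyword value of :db/ident) is an entity identifier too.
     :attr-eid (d/entid db :customer/id)
     ;; pull's `eid` argument accepts either form.
     :via-ref  (d/pull db [:customer/id] lookup-ref)
     :via-eid  (d/pull db [:customer/id] eid)}))
```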

cjsauer14:09:03

However the datomic API itself seems inconsistent. pull for example uses the arg name eid, even tho it really can take an identifier…

favila14:09:58

probably just not careful naming

favila14:09:36

any “eid” argument in a public api will accept any entity identifier

favila14:09:49

if it helps, eid = Entity IDentifier

favila14:09:59

(not sure that’s what they mean though)

cjsauer14:09:49

Maybe eident would be more specific

cjsauer14:09:47

Does there happen to be a public spec available for these? Seems like that would be very useful for libraries/frameworks.

cjsauer14:09:50

This is perhaps the closest: https://github.com/edn-query-language/eql

favila14:09:08

no afaik. I made one but it’s proprietary. (it’s also harder than it looks to get very narrowly defined types!)

favila14:09:35

e.g. distinguish between :t and :tx

favila14:09:49

or know that a long is a potentially invalid entity id

cjsauer14:09:14

Hm yeah…library authors lament 😅 Public, standalone specs would be amazing for the community imo. Without them I’m afraid the semantics will drift wildly from lib-to-lib.

cjsauer14:09:38

(For example that eql spec uses “ident” where datomic uses “lookup-ref”)

favila14:09:37

it’s also hard to compress these names without ambiguity

favila14:09:54

entity identifier -> eident isn’t bad, but it’s still more than “eid”

favila14:09:18

maybe “edent”

favila14:09:55

edent = eid | ident | eref

cjsauer14:09:47

Yeah edent is nice

cjsauer14:09:24

I don’t have a datomic connection handy at the moment, but is [:db/id 1] a valid lookup-ref?

favila15:09:48

no, :db/id is not an attribute

👍 8

λustin f(n)17:09:49

I want to start using Datomic, at first in parallel with my existing SQL database. Is it possible (and safe) to put Datomic on top of your existing SQL database to take advantage of existing backup infrastructure, etc?

favila17:09:49

You need to use another schema/tablespace/whatever, but you can use the same server

λustin f(n)17:09:47

After all, doesn't Datomic just use a single backing table in SQL? I would assume that it wouldn't break anything in other tables it didn't care about.

genekim17:09:51

I’m getting lots of errors of “Insufficient memory to complete operation” when I run a case-insensitive string matching query against 110K entities on a Datomic Solo instance. I’m running the code on my laptop, running through the datomic proxy. Am I doing something obviously wrong in the query? Or do I need to upgrade the compute instance type? (Ugh. Hoping that’s not the case!!!) THANK YOU in advance! Query is as follows (annotated with Ghostwheel types, which I freaking love):

(>defn get-id-by-screen-name
  "case insensitive search "
  [nm] [string? => any?]
  ; 
  (let [retval (d/q '[:find ?id
                      :in $ ?lowercasename
                      :where
                      [?e :user/id ?id]
                      [?e :user/screen-name ?screenname]
                      [(.toLowerCase ^String ?screenname) ?lowercasename]
                      [(= ?lowercasename ?screenname)]]
                    (d/db (get-conn))
                    (.toLowerCase nm))]
    (ffirst retval)))

favila17:09:58

maybe moving :user/id clause to the end would help

favila17:09:43

ah, actually: [(.toLowerCase ^String ?screenname) ?lowercasename]

favila17:09:07

this is binding to your already-defined :in ?lowercasename

genekim17:09:48

THX for reply!!! ohh… I will try reordering the clauses — as soon as the Datomic instance accepts queries again! 🙂 Is there something I should do differently with ?lowercasename?

favila17:09:04

I think you want [(.toLowerCase ?screenname) ?lcsn] [(= ?lcsn ?lowercasename)]

Joe Lane17:09:06

@genekim That is creating a TON of garbage in the JVM because you're creating 110K new strings. try using https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#equalsIgnoreCase(java.lang.String)

favila17:09:18

your query as-is is actually looking for cases where the input ?lowercasename, ?screenname, and to-lower ?screenname are all equal

favila17:09:24

so, that’s wrong
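Putting the suggested fix together, a sketch using the on-prem peer API (the Cloud client API differs in namespace; the function name is hypothetical). The lowercased screen name gets its own variable, so it no longer unifies with the input before the comparison happens.

```clojure
(require '[datomic.api :as d]
         '[clojure.string :as str])

(defn id-by-screen-name-lowercased
  "Case-insensitive lookup by lowercasing every stored name. Still scans
   all :user/screen-name datoms and allocates one string per datom."
  [db nm]
  (ffirst
    (d/q '[:find ?id
           :in $ ?wanted
           :where
           [?e :user/screen-name ?screenname]
           ;; ?lcsn is a fresh variable, distinct from the ?wanted input.
           [(.toLowerCase ^String ?screenname) ?lcsn]
           [(= ?lcsn ?wanted)]
           [?e :user/id ?id]]
         db (str/lower-case nm))))
```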

Joe Lane17:09:11

Also you could use https://docs.oracle.com/javase/7/docs/api/java/lang/String.html#compareToIgnoreCase(java.lang.String)

favila17:09:33

[(.toLowerCase ?screenname) ?lowercasename] should work without the = clause (but I feel like I’ve gotten burned by “clever” unifications like this before)

favila17:09:58

but also take @Joe Lane’s advice about using better string methods here

favila17:09:11

that would allow you to filter instead of compare
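A sketch of the filtering variant with `String.equalsIgnoreCase`, again assuming the on-prem peer API and a hypothetical function name. The predicate clause binds nothing, so no lowercased copy of each name is created.

```clojure
(require '[datomic.api :as d])

(defn id-by-screen-name-ignore-case
  "Case-insensitive lookup using a predicate instead of a computed value."
  [db nm]
  (ffirst
    (d/q '[:find ?id
           :in $ ?wanted
           :where
           [?e :user/screen-name ?screenname]
           ;; Predicate clause: keeps only rows where the call returns true.
           [(.equalsIgnoreCase ^String ?screenname ?wanted)]
           [?e :user/id ?id]]
         db nm)))
```

This still visits every `:user/screen-name` datom; it only reduces the garbage produced per datom.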

genekim17:09:45

Ah! AWESOME! I’ll give that a try… I think I need to wait 15m while my Datomic instance recovers. @Joe Lane In the meantime, I will study that query, @favila: it does return correct answers, but most of the time, it runs out of memory. I’ll let you know how it goes!!! THX AGAIN!

Joe Lane17:09:43

@genekim You may need to bounce the box. I've run out of mem on those little guys before too and it's annoying 🙂

genekim17:09:54

Wow, that’s amazing! It worked!!! Thanks @Joe Lane and @favila!!!!!! You made my morning!!!

ghadi17:09:28

storing the names pre-lowercased is a good idea, too

ghadi17:09:15

or both ways -- not sure the business req

Joe Lane17:09:06

Yeah, if you can afford the space, I completely agree with @ghadi. Then the datomic = can (I believe) leverage index navigation which is very fast!

genekim19:09:01

@ghadi Oh, that is a super smart idea. I can definitely store :screenname-lowercase! Thx!!
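A sketch of the pre-lowercased approach. The attribute name `:user/screen-name-lower` and the helper names are hypothetical; `:db/index true` is an on-prem setting (it is assumed here, not taken from the thread), and the existing `:user/id` and `:user/screen-name` attributes are assumed to be installed already.

```clojure
(require '[datomic.api :as d]
         '[clojure.string :as str])

(def screen-name-lower-schema
  [{:db/ident       :user/screen-name-lower
    :db/valueType   :db.type/string
    :db/cardinality :db.cardinality/one
    :db/index       true}])

(defn user-tx
  "Entity map that stores the name both as entered and lowercased."
  [id screen-name]
  {:user/id                id
   :user/screen-name       screen-name
   :user/screen-name-lower (str/lower-case screen-name)})

(defn id-by-screen-name
  "Exact match on the lowercased attribute; no per-datom function call."
  [db nm]
  (ffirst
    (d/q '[:find ?id
           :in $ ?lower
           :where
           [?e :user/screen-name-lower ?lower]
           [?e :user/id ?id]]
         db (str/lower-case nm))))
```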

timgilbert17:09:05

Say, any chance of resolving this dependency conflict in the datomic peer library?

[com.datomic/datomic-pro "0.9.5951"] -> [com.google.guava/guava "18.0"]
 overrides
[com.datomic/datomic-pro "0.9.5951"] -> [org.apache.activemq/artemis-core-client "1.5.6" :exclusions [org.jgroups/jgroups commons-logging]] -> [org.apache.activemq/artemis-commons "1.5.6"] -> [com.google.guava/guava "19.0"]

ghadi17:09:20

@genekim side note: pass the db as an argument -- don't reach out from the inside of the function

👍 8

genekim19:09:09

Ah… I can see how that might make testing easier, and allow for better determinism… What are other benefits? And what do you typically name the function that gets the db and wraps the actual query? (Looking for some examples and conventions.) (I’ll go look in the musicbrainz example, too…) Thx!

ghadi01:09:50

Taking the db as an argument allows you to correctly ask higher-level or larger questions, and have all parts of the process use the same basis for query / calculation

ghadi02:09:34

It also makes the function referentially transparent

ghadi02:09:05

Give it the same input and you get the same output.

ghadi02:09:09

colloquially people call defns that contain calls to d/q “queries”

ghadi17:09:31

[db name]
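The `[db name]` convention might look like this in practice. It is a sketch with hypothetical function names, using the on-prem peer API: query functions take a database value, and only the outermost caller touches the connection.

```clojure
(require '[datomic.api :as d])

(defn user-id-by-screen-name
  "Query function: takes a database value, never a connection."
  [db nm]
  (ffirst
    (d/q '[:find ?id
           :in $ ?name
           :where
           [?e :user/screen-name ?name]
           [?e :user/id ?id]]
         db nm)))

(defn user-ids-by-screen-names
  "A larger question built from the smaller one. Every lookup sees the
   same `db`, so the answers share one basis."
  [db names]
  (into {} (map (fn [nm] [nm (user-id-by-screen-name db nm)])) names))

(defn lookup-now
  "Edge of the system: the only place that dereferences the connection."
  [conn names]
  (user-ids-by-screen-names (d/db conn) names))
```

Tests can then call the first two functions with any database value, including one built from an in-memory connection.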

genekim19:09:59

So happy to have solved the datomic memory issue (thank you, all!). I just noticed that the Datoms Cloudwatch graph is totally blank. Is there an easy remedy for this? (Like rebooting something? 🙂)

genekim19:09:25

(The data is there if I zoom out timescale far enough…)

marshall19:09:55

@genekim I believe Datoms is only reported when an indexing job occurs; are you actively transacting against the system?

marshall20:09:42

@genekim actually, i don’t think that’s true ^; not sure why you’re not seeing any data in there

genekim20:09:21

@marshall Yep, as we speak! 🙂
