datomic

A-Helberg 2025-05-29T09:53:29.135839Z

Might be a little pedantic, so feel free to ignore 🙂
Entity identifiers (https://docs.datomic.com/transactions/transaction-data-reference.html#entity-identifiers) consist of, among other things, an eid = nat-int. In (pull db pattern eid & {:as options}), though, eid seems to refer to any of the entity-identifiers except temp-id.
On https://docs.datomic.com/transactions/transaction-data-reference.html#list-forms, :db/add references an entity-id but, as far as I can see, only supports an eid, while :db/retract supports "entity-identifiers, except temp-id".
I'm trying to make an internal API clear. What do people use (as in variable names) to refer to:
• eid = nat-int
• "entity-identifiers, except temp-id"
• entity-identifiers as defined in the first link

favila 2025-05-29T10:46:29.735629Z

:db/add accepts any entity identifier. Datomic public APIs only have two classes of input: entity identifier, and entity identifier + tempid.

favila 2025-05-29T10:49:12.424709Z

If you are designing internal apis that care about distinguishing and accepting only certain identifier types and not others (why?), you should use specs and docs and not rely on idioms

A-Helberg 2025-05-29T11:03:34.621649Z

> If you are designing internal apis that care about distinguishing and accepting only certain identifier types and not others (why?)
It seems :db/add works for existing entity-identifiers (i.e. a lookup-ref that has already been transacted), while the pull and map-transact syntax doesn't care. So :db/add supports temp-ids or other existing entity-identifiers, whereas other operations support existing and non-existing entity-identifiers but not temp-ids.
> you should use specs and docs and not rely on idioms
Sure, I do have schemas and specs, but I was hoping for some better names for those.

A-Helberg 2025-05-29T11:17:20.392479Z

As context, this is part of migrating data from one system to another. Since we're doing it gradually, some things may already exist in the target and can be updated. Some may not, and need to be transacted for the first time. We were originally planning on doing the diff and returning a list of datoms (:db/add, :db/retract) to be transacted later, but we can't know the :db/id ahead of time, and we can't upsert in a :db/add. This caught me off guard, and that started the discussion about what we name the identifiers in different cases. Not a big point of contention though; there are ways around it.

A-Helberg 2025-05-29T11:19:53.951439Z

(yes I know we can use temp-ids in the list of datoms, it just complicates the logic much more than using lookup-refs)
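For reference, a minimal sketch of the two list-form shapes being compared here, written as plain transaction data. The attribute names :item/external-id (assumed to be :db.unique/identity) and :item/name are hypothetical, as is the tempid string; none of this is taken from the actual migration code.

```clojure
;; Hypothetical attributes: :item/external-id (assumed :db.unique/identity)
;; and :item/name.

;; List form with a lookup ref. The lookup ref has to resolve, so this is
;; only expected to work if an entity with :item/external-id "abc-123"
;; already exists in the target database.
(def update-existing
  [[:db/add [:item/external-id "abc-123"] :item/name "Widget"]])

;; List form with a tempid (a string). Asserting the unique-identity
;; attribute on the tempid lets the transactor resolve it to the existing
;; entity, or create a new one if there is none (an upsert).
(def upsert-with-tempid
  [[:db/add "item-abc-123" :item/external-id "abc-123"]
   [:db/add "item-abc-123" :item/name "Widget"]])
```

The second shape is the "temp-ids in the list of datoms" option mentioned above: it covers both the existing and the not-yet-existing case, at the cost of tracking a tempid per entity.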

cch1 2025-05-29T12:21:46.090909Z

I have noticed the inconsistency in the documentation as well. And we have struggled at my company with choosing good idiomatic names for entity identifiers. It's not a big deal, but it would be nice to have an unambiguous vocabulary word for that which can be used in a transaction and that which can be used in a query.

➕ 1
☝️ 1
favila 2025-05-29T14:50:33.580079Z

> We were originally planning on doing the diff and returning a list of datoms :db/add :db/retract to be transacted later, but we can't know the db/id ahead of time, and we can't upsert in a :db/add.
Doesn't this mean it's not type that matters, but provenance? You care about the source of an eid, not whether it's a lookup ref or not.

braai engineer 2025-05-29T14:51:01.608919Z

@helberg.andre, when referring to internal Datomic :db/id Long IDs, I use symbol eid. For identifiers like [:unique/ident 123], I use ident, which can be passed to d/entity or d/entid.

braai engineer 2025-05-29T14:52:11.450659Z

@helberg.andre what do you mean by you can't upsert with :db/add?

A-Helberg 2025-05-29T15:21:21.785469Z

@petrus That's good, but doesn't help in the case where the field can be some superset of those: ident|eid|lookup-ref

👍 1
➕ 1
Sven 2025-05-29T11:09:44.239739Z

A few days ago we had an issue where, in a Datomic Cloud query group, CPU spiked to 100% and free memory dropped to zero. We are looking into the cause of this, but I've found these messages in the logs. As they are not Alerts, should I just consider them expected Datomic behaviour?

{
    "Msg": "RestartingDaemonException",
    "Name": "adopter-2",
    "Ex": {
        "Via": [
            {
                "Type": "clojure.lang.ExceptionInfo",
                "Message": "Unable to load index root ref 456ed91a-ca5f-4448-869b-bbb7e7538625",
                "Data": {
                    "Ret": {
                        "CognitectAnomaliesCategory": "CognitectAnomaliesFault",
                        "CognitectAnomaliesMessage": "Unable to execute HTTP request: Request did not complete before the request timeout configuration.",
                        "Error": "Unable to execute HTTP request: Request did not complete before the request timeout configuration."
                    },
                    "DbId": "456ed91a-ca5f-4448-869b-bbb7e7538625"
                },
                "At": [
                    "datomic.cloud.index$require_ref_map",
                    "invokeStatic",
                    "index.clj",
                    858
                ]
            }

Sven 2025-05-29T11:19:03.188219Z

Also, for as long as I can remember we have always had Alerts with a CreateUpdateSystemFailed message in the logs. Ions deployments work, and app functionality too, so until now I have not bothered looking into it. These happen in all the query groups and (in the past) the primary compute group too. This is not related to the previous post.

{
    "Msg": "CreateUpdateSystemFailed",
    "Ex": {
        "Via": [
            {
                "Type": "clojure.lang.ExceptionInfo",
                "Message": "Unable to execute HTTP request: Request did not complete before the request timeout configuration.",
                "Data": {
                    "CognitectAnomaliesCategory": "CognitectAnomaliesFault",
                    "CognitectAnomaliesMessage": "Unable to execute HTTP request: Request did not complete before the request timeout configuration.",
                    "Error": "Unable to execute HTTP request: Request did not complete before the request timeout configuration."
                },
                "At": [
                    "datomic.core.anomalies$throw_if_anom",
                    "invokeStatic",
                    "anomalies.clj",
                    94
                ]
            }
        ],
        "Trace": [
            [
                "datomic.core.anomalies$throw_if_anom",
                "invokeStatic",
                "anomalies.clj",
                94
            ],

Joe Lane 2025-05-29T16:09:30.508979Z

Can you share what the EC2 InstanceTypes are for the Primary Compute Group and the Query Groups?

Joe Lane 2025-05-29T16:09:42.230549Z

And the Datomic Cloud version you're using?

Joe Lane 2025-05-29T16:10:10.324249Z

(And whether you're storing large strings in your system)

Sven 2025-05-29T16:15:54.617539Z

Currently:
• i3.large for one query group
• t3.medium for the primary compute group and other query groups
It appears in all the query groups except the primary compute group, where we do not deploy anything at the moment. This error was also present in logs in the past, when we were running everything in a single compute group (i3.large or t3.xlarge).

Sven 2025-05-29T16:17:00.857419Z

We are running the latest version of everything, I'll look up versions in a sec

Joe Lane 2025-05-29T16:17:08.084979Z

No need.

Joe Lane 2025-05-29T16:17:24.216289Z

Do you know how far back (months/ releases/ wall clock time) you started seeing these errors?

Sven 2025-05-29T16:19:09.654089Z

> (And whether you're storing large strings in your system)
We generally do not save long strings, but I am pretty sure we have data close to the 4096 string limit.

Sven 2025-05-29T16:21:07.222149Z

> Do you know how far back (months/ releases/ wall clock time) you started seeing these errors?
Not really. I am 90% sure it was present one year ago (100% sure for October), but as we do not retain logs that long I cannot really prove it.

Sven 2025-05-29T16:21:39.016439Z

We have been running pretty much the latest Datomic all the time.

Sven 2025-05-29T16:21:57.862719Z

I recall Jaret noticing the last error during one of our discussions but we did not look further.

Joe Lane 2025-05-29T16:22:04.969749Z

Ok, well I'll have a release I think you'll want coming soon for Cloud.

❀️ 2
Sven 2025-05-29T16:22:36.615799Z

I am waiting for all the Cloud releases like Christmas 🙂

🎅 1
braai engineer 2025-05-29T14:48:38.969909Z

Datomic Lucene Custom Parser request: can we please get custom Lucene parsers for Datomic fulltext search in a future Datomic version? I'm having to do a lot of sanitization and parsing of Lucene grammar to offer user-friendly fulltext search to end-users.

braai engineer 2025-05-29T14:59:59.030919Z

I'm seeing some unexpected behaviour when using tuples in Datalog rules. I'm trying to optimize a complex query by introducing tuples for composite values to reduce the number of clauses. I added tuples for all the heavy stuff, but my tests fail when I replace N clauses with a single tuple-matching clause. I would expect composite tuples to match exactly against any existing clauses that cover the tuple attrs, i.e. if a tuple has attrs [A B C], then in any query which has a clause to constrain A, B & C, then adding the tuple clause on [A B C] with the same logic vars, should not change the output of the query. Why are my queries returning different results? (in this case it's true|false result from an authorization system)

braai engineer 2025-06-04T16:46:31.323899Z

Managed to get a 250x+ speedup (https://github.com/theronic/eacl/pull/6/files) on EACL with the help of Claude o4 Opus (MAX). Tests passing, excluding expand-permission-tree, which is not implemented.

braai engineer 2025-06-03T09:28:07.496199Z

OK Mr @favila, in this PR (https://github.com/theronic/eacl/pull/4) I've modelled arrow permissions under their own set of attrs to re-enable the unique ident on the arrow 4-tuple (avoids the nils in tuples). Then, branching off that in https://github.com/theronic/eacl/pull/5/files, I've started to make different rulesets for can?, lookup-subjects and lookup-resources (build-can?-rules vs. build-slow-rules) so I can take out the slow [?resource :resource/type ?resource-type] clause at the top of each rule that matches everything. It already speeds things up a lot when I move that clause to the bottom. The bulk of changes in PR 4 are in:
• https://github.com/theronic/eacl/pull/4/files#diff-ad9e4d12c19c764fd2c47886c4eae769fe13a2a1d44f02259270e9ba818cdc95, and
• https://github.com/theronic/eacl/pull/4/files#diff-49a2e83709d07fd8c2db4bec3b6ca8d4280ce245fb20d101dac487bf30292bce.
I'm wondering if those bindings "destructuring" the matching tuple values could move down for speed, or if they could be pulled into a second phase. Something like: first traverse the graph for all reachability paths, and then bind values to cull the search space 🤔. Any of your expert feedback is most appreciated 🙂. I'll try to benchmark next to see how it performs with ~10k entities.

favila 2025-05-29T15:19:12.674449Z

1. Meta: composite unique tuples can be footguns: https://favila.github.io/2023-07-28/unique-composite-attribute-footguns/
2. Syntactically, you can't just plop a vector into a data match clause; you need a single binding.
3. Adding a composite doesn't automatically backfill that value to existing entities. Did you backfill?
4. Tuple values are not interpreted for entity identifiers (i.e. they are "raw" when they contain refs). Are ?resource and ?subject values normalized to entity ids in your tuple?

favila 2025-05-29T15:23:26.475889Z

Illustrating 2:
[(tuple ?resource ?relation-name ?subject) ?r+rn+s]
[?relationship :eacl.relationship/resource+relation-name+subject ?r+rn+s]

braai engineer 2025-05-29T15:32:32.887859Z

1. Thanks @favila :) I'm aware of the nil problem (wish this was configurable).
2. Ah, I tried using the tuple fn and get the same results:

'[(has-permission ?subject ?permission-name ?resource)
     [?resource :resource/type ?resource-type]
     [?relationship :eacl.relationship/resource ?resource]
     [?relationship :eacl.relationship/relation-name ?relation-name]
     [?relationship :eacl.relationship/subject ?subject]

  ; I now use (tuple ...):
     [(tuple ?resource-type ?relation-name ?subject) ?res-type+rel-name+subject]
     [?relationship :eacl.relationship/resource+relation-name+subject ?res-type+rel-name+subject]
...]
3. Yes, this is in a test suite that runs against a fresh in-memory Datomic each time, starting from empty.
4. Re: being interpreted for entity identifiers, I think this also relates to using the tuple fn?

favila 2025-05-29T15:33:41.279439Z

tuple is just an alias for vector

favila 2025-05-29T15:33:57.032839Z

it doesn't interpret anything (and can't! it doesn't get a db!)

favila 2025-05-29T15:35:29.040829Z

The problem must be somewhere else that you're not showing me

favila 2025-05-29T15:35:37.303639Z

I notice ?permission-name

braai engineer 2025-05-29T15:35:57.204279Z

that's a join further down; I'll push a more complete example to GH

favila 2025-05-29T15:36:17.522399Z

and you know for sure that ?subject and ?resource are entity ids?

favila 2025-05-29T15:36:52.411939Z

(stepping back, I assume this rule is just for testing this out--it doesn't make sense to join on individual items then join on the composite too)

braai engineer 2025-05-29T15:38:14.039349Z

(yes, joining on items as well as composite tuple was just to test)

braai engineer 2025-05-29T15:42:13.840299Z

?subject & ?resource are entity IDs, and ?relation-name is a keyword. Could this be related to the order of the tupleAttrs, or that ?relation-name is a keyword?

braai engineer 2025-05-29T15:42:33.763229Z

OK sorry there's a bug in my example above (tuple ?resource-type ... should be (tuple ?resource...) (fixing)

braai engineer 2025-05-29T15:43:53.228199Z

Hmm, even if I make a different tuple with just resource + subject, tests still fail:

[(tuple ?resource ?subject) ?resource+subject]
[?relationship :eacl.relationship/resource+subject ?resource+subject]
(combined with the other clauses)

braai engineer 2025-05-29T15:44:14.456949Z

(also tried removing :db/unique constraint)

favila 2025-05-29T15:44:36.413909Z

have you inspected the entity you expect this to read?

favila 2025-05-29T15:46:48.921739Z

If you just do (d/pull db '[*] [:eacl.relationship/resource+relation-name+subject [resource-id relation-name-kw subject-id]]) do you see it?

braai engineer 2025-05-29T15:47:11.745589Z

If I query the relationships pertaining to my test resource :test/server1, I see:

[{:db/id 17592186045465, :eacl.relationship/subject #:db{:id 17592186045452}, :eacl.relationship/relation-name :account, :eacl.relationship/resource #:db{:id 17592186045464}, :eacl.relationship/resource+subject [17592186045464 17592186045452], :eacl.relationship/resource+relation-name+subject [17592186045464 :account 17592186045452]}]
So I can see the tuples are populated.

braai engineer 2025-05-29T15:47:45.986469Z

query:

(d/q '[:find [(pull ?rel [*]) ...]
        :where
        [?rel :eacl.relationship/resource :test/server1]]
  (d/db conn))

favila 2025-05-29T15:48:34.851599Z

and you are absolutely sure that can? is getting subject-id and resource-id as entity ids? because there's no normalization to db-id happening

favila 2025-05-29T15:48:54.990229Z

that if-not guard also doesn't make sense unless they're not entity ids

braai engineer 2025-05-29T15:50:40.438679Z

aha, this could be the issue! I see in can? I am passing the idents through. Let me try that... 🙂

favila 2025-05-29T15:51:19.201379Z

This is my point 4

favila 2025-05-29T15:51:57.710439Z

> Tuple values are not interpreted for entity identifiers (i.e. they are "raw" when they contain refs). Are ?resource and ?subject values normalized to entity ids in your tuple?

braai engineer 2025-05-29T15:52:31.612009Z

gotcha, thanks! I thought I was passing in eids, but the can? function was passing potential idents 🙂

favila 2025-05-29T15:53:19.139529Z

Consider [(datomic.api/entid $ ?subject) ?subject-eid] etc in your rule, because with tuples it matters
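A minimal sketch of what that in-rule normalization could look like, written as quoted rule data. The rule name relationship-by-ids and the var names ending in -eid are hypothetical; the attribute name comes from the thread, and the use of $ inside the rule follows the suggestion above rather than anything verified against EACL.

```clojure
;; Sketch only: relationship-by-ids is a hypothetical rule name.
;; The [...] around the first three vars marks them as required bindings,
;; because entid and tuple are plain functions and need bound inputs.
(def normalizing-rules
  '[[(relationship-by-ids [?subject ?relation-name ?resource] ?relationship)
     ;; Resolve idents, lookup refs or eids to plain entity ids first.
     [(datomic.api/entid $ ?subject) ?subject-eid]
     [(datomic.api/entid $ ?resource) ?resource-eid]
     ;; Only then build the tuple value used for the exact match.
     [(tuple ?resource-eid ?relation-name ?subject-eid) ?r+rn+s]
     [?relationship :eacl.relationship/resource+relation-name+subject ?r+rn+s]]])
```

Keeping the entid calls next to the tuple clause means callers can pass any entity identifier without needing to know that the tuple match only works on raw entity ids.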

favila 2025-05-29T15:53:54.671699Z

or normalize on the outside

favila 2025-05-29T15:54:50.163169Z

or write a function that does this whole thing for you, gives you the eid, bypass query completely

favila 2025-05-29T15:55:32.909409Z

(defn relationship-matching [db subject relation resource] -> eid)

braai engineer 2025-05-29T15:55:33.613019Z

so can? now looks like this:

(defn can?
  "Returns true if subject has permission on resource. Copied from core2.
  Note: we are not checking subject & resource types, but we probably should."
  [db subject-id permission resource-id]
  (let [{:as _subject-ent, subject-eid :db/id} (d/entity db subject-id)
        {:as _resource-ent, resource-eid :db/id} (d/entity db resource-id)]
    (if-not (and subject-eid resource-eid)
      false
      (->> (d/q '[:find ?subject .                          ; Using . to find a single value, expecting one or none
                  :in $ % ?subject ?perm ?resource
                  :where
                  (has-permission ?subject ?perm ?resource)] ; do we still need this?
                db
                rules
                subject-eid
                permission
                resource-eid)
           (boolean)))))
note subject-eid and resource-eid. Is that what you mean?

braai engineer 2025-05-29T15:56:26.459829Z

multiple relationships could match to confer a given permission, so not sure how I would do that in the recursive query, or do you mean like a DB function?

favila 2025-05-29T15:57:02.146029Z

I mean just the "lookup the relationship" part, not the whole rule

favila 2025-05-29T15:57:49.977229Z

this thing about refs in tuples needing to be eids to match into indexes is a fiddly bit that is easy to get wrong in composition

favila 2025-05-29T15:58:11.365249Z

you should isolate that requirement into something that composes well and does the fiddly bit correctly

favila 2025-05-29T15:59:55.148579Z

the rule signature gives the impression that it doesn't matter; it appears your code assumes not-entity-ids

favila 2025-05-29T16:00:05.749799Z

both of these are at odds with the requirement

braai engineer 2025-05-29T16:01:36.232169Z

OK, it works now 🙂 But frustratingly, I still need the other binding clauses along with the tuple clause, because I reuse these rules in lookup-subjects and lookup-resources:

[?relationship :eacl.relationship/resource ?resource]
[?relationship :eacl.relationship/relation-name ?relation-name]
[?relationship :eacl.relationship/subject ?subject]
I assume this would still be a speed up if the tuple clause is first by constraining matches?

favila 2025-05-29T16:02:29.527569Z

Don't you have these values already by virtue of using the tuple?

favila 2025-05-29T16:05:01.524269Z

I see, they are not bound

braai engineer 2025-05-29T16:05:09.772049Z

that's what I expected, but lower down I have this clause:

[(not= ?subject ?resource)]
and Datomic complains that it is not sufficiently constrained if I don't also bind the other clauses:
":db.error/insufficient-binding [?relation-name] not bound in expression clause: [(= ?relation-name ?relation-name-in-perm-def)]"
I would expect that the ?relation-name in tuple would sufficiently constrain it:
[(tuple ?resource ?relation-name ?subject) ?resource+rel-name+subject]
 [?relationship :eacl.relationship/resource+relation-name+subject ?resource+rel-name+subject]
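The binding error is consistent with tuple being a plain function: it builds a value from vars that are already bound and cannot bind them in reverse. Going the other way, Datomic's query language has an untuple function that destructures an already-bound tuple value. A minimal sketch as quoted clause data, assuming the tuple attribute from the thread (the var name ?r+rn+s is illustrative):

```clojure
;; Sketch only: bind the whole tuple value with a data clause, then name
;; its parts with untuple. This binds ?resource, ?relation-name and
;; ?subject from a single attribute.
(def destructuring-clauses
  '[[?relationship :eacl.relationship/resource+relation-name+subject ?r+rn+s]
    [(untuple ?r+rn+s) [?resource ?relation-name ?subject]]])
```

With nothing else bound, the first clause still matches every relationship, so this addresses the binding error rather than the speed of the query.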

favila 2025-05-29T16:05:10.976319Z

you can't use tuple matching for that

favila 2025-05-29T16:05:23.193179Z

You need index-range specifically

favila 2025-05-29T16:05:37.293289Z

data clauses match exact values, you want a cursor range

favila 2025-05-29T16:05:42.565359Z

so you definitely need a helper fn

braai engineer 2025-05-29T16:05:51.328119Z

can I use helper functions from rules? (never tried)

favila 2025-05-29T16:06:10.230649Z

yes

favila 2025-05-29T16:06:17.040129Z

you already are, e.g. tuple

favila 2025-05-29T16:06:30.453889Z

tuple is not an intrinsic, it's just a fn

braai engineer 2025-05-29T16:06:53.115439Z

how would I use index-range in a query like this to match on multiple values? (I'll have to do some reading)

braai engineer 2025-05-29T16:07:38.401809Z

Do I understand correctly that even if I have to provide all the bindings, as long as the big tuple exact match is at the top, it should speed up the query by constraining the search space and avoiding set disjunctions (right word?) ?

favila 2025-05-29T16:08:30.039289Z

if you use the tuple, it gives you a single index on which to get candidates (including exact match). But it can only do that with prefix-matches.

favila 2025-05-29T16:08:50.820019Z

so lookup-subject is pointless for example

favila 2025-05-29T16:09:03.066449Z

the subject is the last thing in the tuple
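To make the prefix point concrete, here is a plain-Clojure analogy that needs no Datomic at all: a sorted set of [resource relation subject] vectors stands in for the tuple attribute's sorted index. The data and the names tuple-index, by-resource and by-subject are made up for illustration; this is not how Datomic stores or scans its indexes, only a model of why the leading element can be range-scanned and the trailing one cannot.

```clojure
;; Stand-in for a sorted index over [resource-eid relation-name subject-eid].
(def tuple-index
  (sorted-set
    [100 :account 7]
    [100 :owner 3]
    [101 :account 7]
    [102 :viewer 3]))

(defn by-resource
  "Range scan: everything whose first element is resource-eid.
  nil sorts before any value, so [eid nil nil] is a lower bound."
  [index resource-eid]
  (subseq index >= [resource-eid nil nil] < [(inc resource-eid) nil nil]))

(defn by-subject
  "No usable prefix: the subject is last, so every entry must be checked."
  [index subject-eid]
  (filter (fn [[_ _ s]] (= s subject-eid)) index))

(by-resource tuple-index 100) ;; => ([100 :account 7] [100 :owner 3])
(by-subject tuple-index 3)    ;; => ([100 :owner 3] [102 :viewer 3])
```

by-resource only touches the matching slice of the set, while by-subject walks all of it, which is the sense in which a subject-first lookup gets nothing from a tuple that ends with the subject.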

cch1 2025-05-29T16:09:12.341979Z

> this thing about refs in tuples needing to be eids to match into indexes is a fiddly bit that is easy to get wrong in composition
^^^ this. I wrote a general-purpose db fn to resolve to an EID, but usage can still be clunky... and you have to remember to use it.

braai engineer 2025-05-29T16:09:55.934479Z

yeah, I suspected I'd need a different set of rules to optimize can?, lookup-subjects and lookup-resources, and probably play with the order of tuples to support prefix-matching, e.g. most systems have many resources but fewer subjects.

favila 2025-05-29T16:11:27.457089Z

using the same rule name to execute different search strategies depending on what is bound is not a thing datomic can do

braai engineer 2025-05-29T16:12:13.029749Z

right, I'd have to pass in different set of rules in can? vs lookup-subjects vs lookup-resources

favila 2025-05-29T16:12:15.219039Z

so I think making lookup-subjects, lookup-resources, and can? all use has-permission as-is is not possible

braai engineer 2025-05-29T16:14:46.716199Z

@cch1 would you mind sharing how that looks and how to use it?

favila 2025-05-29T16:19:03.012999Z

Example of index-range: (d/index-range db tuple-attr [resource-eid nil nil] [(inc resource-eid) nil nil]) gives you a seq of all datoms where resource-eid is the first element in the tuple

braai engineer 2025-05-29T16:19:40.067239Z

can you do that in a d/q, or do you pass that into a d/q?

favila 2025-05-29T16:20:05.910809Z

I would extract to a fn you call in a d/q

favila 2025-05-29T16:20:12.277289Z

(or call in a rule)

favila 2025-05-29T16:20:41.095009Z

in on-prem there's no point to the "make everything pure query" discipline. Stuff that is syntax pain in datalog just encapsulate into fns

👍 1
braai engineer 2025-05-29T16:23:16.348859Z

Here are my changes: https://github.com/theronic/eacl/pull/4. Sorry it's a bit big, but the new schema with tuples is in schema.clj, and the new rules are under fn build-fast-rules in eacl/datomic/impl.clj. I just added the tuple constraints above the other bindings. As next steps I'll try to use index-range and have different rules for different purposes. Direct links:
• https://github.com/theronic/eacl/blob/optimize/tuples/src/eacl/datomic/schema.clj#L3
• https://github.com/theronic/eacl/blob/optimize/tuples/src/eacl/datomic/impl.clj#L142 (hopefully faster; I need to benchmark next)

favila 2025-05-29T16:25:04.994059Z

> [?resource :resource/type ?resource-type] ; would this be faster lower down?
Depends on whether ?resource is bound

favila 2025-05-29T16:25:28.213809Z

if not bound (e.g., lookup subjects) it is way slower

favila 2025-05-29T16:25:44.949279Z

building set of all relationship types from all relations

favila 2025-05-29T16:26:25.881439Z

Really important to use the rule binding syntax if you are optimizing rules to make them direction-specific

braai engineer 2025-05-29T16:26:37.139789Z

what do you mean by rule-binding syntax? I don't quite understand direction-specific

favila 2025-05-29T16:27:09.926499Z

"required bindings" https://docs.datomic.com/query/query-data-reference.html#rule-required-bindings

favila 2025-05-29T16:27:50.485719Z

Looks like [(subject-has-permission [?subject] ?x ?y) ...] (note the extra brackets around ?subject)

favila 2025-05-29T16:28:02.938929Z

This causes the query to error if you call a rule without the bindings it expects
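A minimal sketch of what direction-specific rules with required bindings might look like, as quoted rule data. The rule names relationships-from-subject and relationships-from-resource are hypothetical; the attributes are the ones shown earlier in the thread, and the clause order is only a guess at a sensible starting point.

```clojure
;; Sketch only: two entry points over the same relationship data, each
;; declaring (with the inner [...]) which var the caller must bind.
(def direction-specific-rules
  '[;; For callers that already know the subject.
    [(relationships-from-subject [?subject] ?relation-name ?resource)
     [?relationship :eacl.relationship/subject ?subject]
     [?relationship :eacl.relationship/relation-name ?relation-name]
     [?relationship :eacl.relationship/resource ?resource]]
    ;; For callers that already know the resource.
    [(relationships-from-resource [?resource] ?relation-name ?subject)
     [?relationship :eacl.relationship/resource ?resource]
     [?relationship :eacl.relationship/relation-name ?relation-name]
     [?relationship :eacl.relationship/subject ?subject]]])
```

Both rules describe the same relation; the required binding just makes a call from the wrong direction fail loudly instead of running a slow scan.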

favila 2025-05-29T16:29:34.717159Z

"direction specific" means you are optimizing your rules for a certain direction of traversal (e.g. subject to relationship vs resource to relationship)

favila 2025-05-29T16:30:02.998549Z

maybe not precise way of putting it. Essentially what you know changes how you learn new things.

favila 2025-05-29T16:30:38.834309Z

abstract datalog/logic, nothing changes (you're describing the same relations), but implementation wise what indexes you use in what order changes

braai engineer 2025-05-29T16:34:18.903909Z

OK, great! Thanks so much for your help, @favila 🙂. Re: directionality, it's a bit tricky because subjects & resources are connected via the relation in one or more Relationships, and you can inherit permissions via other resources, e.g. user Favila owns Account A, under which fall 10 servers, and that confers <these> permissions. So you kind of need to traverse that graph, and it's not always obvious in which direction to search:
• up from resource to subject? (probably), or
• down from subject to resource?
and then the permissions fall out of the schema for those Relations mentioned in various Relationships. Hopefully EACL can be useful to you and other Datomic users who need Spice-like AuthZ (once it's fast enough). The goal is not to try and match SpiceDB, just to make it "good enough" for 10k-100k entities. (I posted my https://x.com/PetrusTheron/status/1927409042659422478, if you're interested)