datomic

Tobias Sjögren 2025-08-21T05:55:29.665279Z

I just discovered Data Semantics (J. R. Abrial, 1974) and Database Design – The Semantic Modeling Approach (Naphtali Rishe, 1992). Has anyone here read them? They seem like early predecessors of Datomic.

Tobias Sjögren 2025-08-21T06:20:39.606949Z

Abrial’s “Data Semantics” appears to be the first data modeling paper to make explicit use of binary relations as the foundational construct, which naturally yields triples...

refset 2025-08-21T13:55:05.596539Z

You have me curious - don't suppose you have a link/copy handy?

Tobias Sjögren 2025-08-21T16:14:07.187459Z

This is another paper that is relevant I think: https://cake.fiu.edu/Publications/Rishe-90-SB.Semantic_Binary_Database_Model.published-scanned.pdf

refset 2025-08-21T18:21:05.930109Z

Nice find! All the Date/Darwen 6NF history is also very relevant. As is http://daslab.seas.harvard.edu/reading-group/papers/dsm.pdf

2025-08-21T14:34:44.108439Z

The docs state: > Diagnostic: if your [Memcache] average is substantially less than 1.0, then you can serve more reads from Memcache by increasing the size of your memcached cluster. But I’m a bit unclear on what “average” would be here in the process monitor logs. Is it the ratio of sum to count?

:Memcache {:lo 0, :hi 1, :sum 93, :count 137}

2025-08-22T13:24:47.872719Z

I see, that makes sense and also likely since we are giving quite a bit of ram

Joe Lane 2025-08-21T18:05:39.323129Z

SUM/COUNT

🙏 1
2025-08-21T18:08:16.061029Z

is 0.67 considered “substantially less than 1.0”. It would not shock me that we could increase memcached cluster capacity, however when looking at the utilization stats it doesn’t appear to be particularly full or have high eviction

Joe Lane 2025-08-22T04:47:12.354799Z

So, that count is pretty low per minute meaning your object cache hit ratio is probably actually pretty good.

braai engineer 2025-08-21T18:41:54.647859Z

Is there a way to achieve the same performance as On Prem d/index-range but with reverse index traversal over tuple indices using something like d/datoms?

ghadi 2025-08-21T19:13:04.407689Z

not currently but we're considering it. Can you describe the use case and what your schema is?

braai engineer 2025-08-21T19:42:40.506989Z

It's for https://github.com/theronic/eacl, my SpiceDB-compatible ReBAC authorization library built in Clojure and backed by Datomic. EACL calculates multiple paths through the permission graph and then recursively traverses those paths "in parallel", interleaving + deduplicating results in a stable return order. Since parallel paths can emit duplicate results, EACL lazily deduplicates results and return either subjects or resources in a stable order that matches the order of resource eids stored at-rest, from one of two Relationship indices. Return order must be stable to enable cursor-based pagination, since sorting later would require materialising the full index (can be large, >1M results). The https://github.com/theronic/eacl/blob/main/src/eacl/datomic/schema.clj#L180-L211 is two tuple indices that store a sorted set of Relationships when either the subject or resource is known: • :eacl.relationship/subject-type+subject+relation-name+resource-type+resource for LookupResources, where subject is known, but resources are unknown, and • :eacl.relationship/resource-type+resource+relation-name+subject-type+subject for LookupSubjects, where resource is known but matching subjects are unknown, as well as CheckPermission. The lack of reverse-order traversal in d/index-range makes it hard to do cursor-based pagination in reverse. This makes "< Previous Page" navigation costly: to go back a page, you need to hold on to all previously seen cursors in a list of cursors and pop cursors off the list to navigate to the previous page, since you cannot construct the previous page using only the current cursor. Queries look https://github.com/theronic/eacl/blob/main/src/eacl/datomic/impl/indexed.clj#L264-L288, where we first calculate possible paths through the graph and then traverse those paths recursively, returning lazy deduplicated seqs. I'm sure there are many areas to optimize and functions can be pulled apart, but it works 🙂

favila 2025-08-21T19:44:19.146689Z

You can get close but not exactly the same with d/index-pull. There's an additional cost for what you select, even if what you select is already in the datom you are scanning.

braai engineer 2025-08-21T19:45:01.985889Z

does d/index-pull support nil wildcards in tuple end values like d/index-range?

favila 2025-08-21T19:45:48.607079Z

nil inside a tuple isn't really a wildcard in either case. It's just a value that sorts lower than any other

braai engineer 2025-08-21T19:46:25.705169Z

hmm, good point I can just ask for Long/MAX_VALUE as "wildcard" in end value and take N using d/index-pull?

favila 2025-08-21T19:46:55.538999Z

generally if you're termination condition depends on values within a tuple, omit a termination value and terminate via take-while with appropriate predicate

🙏 1
favila 2025-08-21T19:47:19.945569Z

this is true for all the lazy scanners (d/datoms, d/seek-datoms, d/index-range, d/index-pull)

favila 2025-08-21T19:48:14.534519Z

we don't have an official "highest value" or "lowest value" sentinel value

braai engineer 2025-08-21T19:48:50.719919Z

Roughly how much slower is d/index-pull vs d/index-range to gain reverse traversal?

braai engineer 2025-08-21T19:49:00.899559Z

like 10%? 2x?

favila 2025-08-21T19:50:01.380309Z

all the extra work is satisfying whatever :selector asks for

favila 2025-08-21T19:50:09.485039Z

so how much slower depends on what's in there

braai engineer 2025-08-21T19:50:38.928589Z

in my case, I only care about the tail position of the tuple value, which happens to be an eid

favila 2025-08-21T19:51:07.918229Z

so it will cost you an AEVT per item

🤕 1
braai engineer 2025-08-21T19:51:57.567319Z

that said, I hit the returned results with a d/entity lookup later on anyway, so maybe that can be consolidated

favila 2025-08-21T19:52:32.006609Z

d/entity would use EAVT, so it would be superior to that

👍 1
favila 2025-08-21T19:52:53.935869Z

so would standalone d/pull

favila 2025-08-21T19:54:03.128429Z

d/pull-many, query :find's pull, and index-range all bias towards AEVT for reads to satisfy the pull expression, on the theory that AEVT is more selective when pulling a few datoms from a large number of entities

favila 2025-08-21T19:54:17.033189Z

d/pull and d/entity assume you want a lot from a single entity, so favor EAVT

ghadi 2025-08-21T20:35:46.516069Z

thanks for the info. I get the idea. (Can't click into AGPL stuff though)

braai engineer 2025-08-21T21:17:48.388389Z

Can't click into? Fully intend to relax the licence to something more permissive soon, just need to get approval from employer.

braai engineer 2025-08-21T21:25:32.925289Z

@favila oh, so d/index-pull https://docs.datomic.com/indexes/index-pull.html walk :aevt or :avet, but not a specific tuple index? that's gonna be way too slow. I was hoping it could take a tuple attr as :index. the people need an d/index-range with :reverse true, please

favila 2025-08-21T21:26:13.488689Z

d/index-pull can do :avet , which is exactly what d/index-range does

favila 2025-08-21T21:26:23.181859Z

d/index-pull can only do :avet or :aevt

favila 2025-08-21T21:27:12.420889Z

I'm not sure what you mean by "not a specific tuple index". :start requires an a

favila 2025-08-21T21:28:13.036129Z

(d/index-range db :attr [start tuple] nil) is like (d/index-pull db {:index :avet :start [:attr [start tuple]})

braai engineer 2025-08-21T21:37:55.897529Z

ohh, I see. It was not clear to me that :start could have [:tuple-attr start-tuple], because I wasn't sure how it would destructure it. A tuple example in the docs would go a long way. I totally discarded index-pull as a solution from reading the docs, because I assumed it could only query simple attrs on entities. This can work then.

favila 2025-08-21T21:38:34.897079Z

tuples are just values 🤷 no difference

👍 1
braai engineer 2025-08-21T21:43:28.707369Z

so if you want to traverse a tuple attr with d/index-pull and you specify :start as [:year+semester [2000 1]], what do you put in :selector, :course/year? I don't understand how it destructures the tuple value, if it all. In my case, I want the tupleAttr in the https://github.com/theronic/eacl/blob/main/src/eacl/datomic/schema.clj#L187 which is :eacl.relationship/resource (for lookup-resources)

favila 2025-08-21T21:44:30.217589Z

it does not destructure the tuple value, it pulls from E

favila 2025-08-21T21:45:59.344979Z

equivalent to this

braai engineer 2025-08-21T21:46:02.831599Z

right, so it matches on the tuple, but selects over :e, and in my case the :selector would be [:eacl.relationship/resource]?

favila 2025-08-21T21:46:05.419219Z

(->> (d/index-range db :year+semester [2000 1] nil)
    (map (fn [[e]]
           (d/pull db selector e))))

👍 1
favila 2025-08-21T21:46:23.875399Z

sure, or you could pull the tuple itself if you want the whole thing

braai engineer 2025-08-21T21:46:45.826179Z

would putting the whole tuple be any faster?

favila 2025-08-21T21:46:56.907239Z

no

favila 2025-08-21T21:47:30.869629Z

it's going to do that AVET read to satisfy the pull no matter what

favila 2025-08-21T21:47:59.177979Z

that's what makes this inferior to a true reverse-index scan for your use case

👍 1
braai engineer 2025-08-21T21:51:49.231019Z

I'll need to benchmark. Luckily there are plenty of low-hanging optimizations remaining in EACL, like caching permission paths, using refs for relations & permissions instead of matching on keywords, and a few other things, possibly pmap, and optimizing lazy-dedupe-merge-sort. Thankfully SpiceDB does not support reverse lookup 😅, but we want it internally.

braai engineer 2025-08-21T22:10:24.566679Z

@favila why do you say d/index-pull has to do an AVET read if the :e is known? Wouldn't that be an EAVT given that :e of the attr (e.g. tuple) is available?

braai engineer 2025-08-21T22:11:28.294779Z

as per equivalence,

(->> (d/index-range db :year+semester [2000 1] nil)
    (map (fn [[e]]
           (d/pull db [:year] e))))
...wouldn't d/pull select over e via :eavt?

favila 2025-08-21T22:11:30.621299Z

Sorry meant aEvt

favila 2025-08-21T22:12:13.741319Z

Code is imprecise. More precise would be using d/pull-many

braai engineer 2025-08-21T22:14:35.532329Z

OK, I think I understand: it uses :AE* since both A & E are known, it should be better performance than EA* since num of e's are large, and have better...locality?

braai engineer 2025-08-21T22:15:54.878519Z

i.e. in a DB with millions of entities, not all e's will have a given A, so traversing via AE* first is more selective than EA*, which might span more pages in storage.

favila 2025-08-21T22:20:58.916279Z

If you want few A over many E, EAVT is almost guaranteed to have A you don’t want

favila 2025-08-21T22:21:18.405079Z

(Per segment of index)

favila 2025-08-21T22:21:35.888609Z

Whereas AEVT will only at worse have E you don’t want

👍 1