2026-06-22 datalevin | Clojure Slack Archive

datalevin 2026-06-22

Max 2026-06-22T16:23:03.918159Z

A question that came up while I was reviewing the book: what's the purpose of d/db in datalevin? In datomic, it gives you a consistent view of the data you can use in multiple places to ensure the data isn't changing out from under you across multiple queries. But datalevin doesn't have databases-as-a-value, so how is what d/db returns distinct from a connection? If it doesn't do anything, why can't you skip it and query a connection directly?

Samuel Ludwig 2026-06-22T16:33:08.351469Z

From what I could tell, d/db is identical to derefing the connection, so you could just do that if you wanted I think; I assume its inclusion also adds to API parity

Max 2026-06-22T16:34:20.653719Z

I get the argument from a parity perspective, though I'd imagine that in docs this would be presented as more of a footnote than the default way to use datalevin. Why does a connection need to be deref'd?

Huahai 2026-06-22T16:51:39.558889Z

d/db has a practical purpose in server mode: it refreshes :last-modified by default, which decides whether we want to refresh the cache.

Max 2026-06-22T16:52:49.926979Z

Does it have a purpose outside of server mode?

Huahai 2026-06-22T16:55:09.954999Z

Other than server mode needs it, the purpose was mostly compatibility with datascript/datomic. However, the parity between server/embedded is one of the design goals.

Max 2026-06-22T16:56:41.165949Z

That makes sense I guess. It just seems like unnecessary api overhead in embedded mode, which I feel like will be the more common mode to use (though I may well be mistaken)

Huahai 2026-06-22T17:21:45.910829Z

A level of indirection with conn also allows us to add things like listeners and tx-meta, etc.

⭐ 1

Max 2026-06-22T17:40:53.232679Z

In embedded mode, is there any difference in behavior between reusing a db and creating a new one with d/db?

Huahai 2026-06-22T17:41:46.774129Z

of course, you don't want to reuse a db, it may be stale.

Max 2026-06-22T17:42:58.050249Z

If it were stale would you see old data? I.e. is a db a value or a reference?

Huahai 2026-06-22T17:43:19.155519Z

you may see wrong data

Huahai 2026-06-22T17:43:54.446999Z

db is a mutable object that carries some state, and that state could be wrong in an old db

Max 2026-06-22T17:45:12.853609Z

So dbs are mutable and may change out from under you, but old ones may not be consistent with the state of all transacted data?

Huahai 2026-06-22T17:45:22.673369Z

correct

Max 2026-06-22T17:46:58.581939Z

That seems like kind of a footgun. Would it be crazy to have fns that take a db also accept a conn and auto-coerce it to a db?

Huahai 2026-06-22T17:48:02.930419Z

no, fns accept either db or conn

Max 2026-06-22T17:49:03.122589Z

Right, I’m saying that for example d/q could accept a conn or a db and if it's a conn, call d/db on it

Huahai 2026-06-22T17:49:38.754129Z

there's no point, what's it for?

Huahai 2026-06-22T17:55:46.384689Z

the point is that d/q is not only working with a datalevin db, the point is to stress the asymmetry between txn and query.

Huahai 2026-06-22T17:56:03.973699Z

txn is a connection concern, query is not

Max 2026-06-22T17:56:09.335739Z

Otherwise it seems like the api relies on users to use the api perfectly and always call d/db anew for every query. If they mess up and reuse a db, then they silently get the wrong data back, and the problem wouldn't be immediately obvious until some poor dev ends up debugging a very strange issue in production. Also in general, a db isn't meaningful or useful to an api consumer. They (hopefully) know they have to call d/db for every query and never reuse the return value, so it ends up being boilerplate for every query at best, and a time bomb at worst. Given those issues, the first thing I would do after bringing datalevin into a project would be to wrap all of its db-accepting fns to accept a conn and call d/db on it. That's the only correct way to use them anyways, so I'd want to ensure no one on my team ever messed that up. And if I’m doing that, then I feel like it's worth asking if that should just be the api.

Max 2026-06-22T17:57:42.961419Z

This is different from datomic (et al) where a db value is useful on its own as a point-in-time snapshot, you can reuse it across multiple queries to provide them with a consistent view of the database between the queries

Huahai 2026-06-22T17:58:03.857429Z

they will need to, that's what the book will teach them

Huahai 2026-06-22T17:58:50.493339Z

the point of d/db is to get a fresh view of the data, that much is clear. what's the problem with that?

Huahai 2026-06-22T17:59:49.642999Z

I am updating the book to avoid @conn. It will not even be shown in the book.

Huahai 2026-06-22T18:01:44.234509Z

again, d/q does not work on conn, it works on a snapshot of the data. I think the story is coherent.

Huahai 2026-06-22T18:02:14.944289Z

it also works on things that are not database at all

Huahai 2026-06-22T18:03:26.784119Z

also, it works on more than one database. so I don't think the expectation of symmetry with transactions is warranted. Transactions work with conn, q works with (d/db conn), this is perfectly fine. I wouldn't change a thing.
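
The split described here (transact against the conn, query against a db obtained from it) can be sketched roughly as follows. This is only an illustration: the path, schema, and attribute names are hypothetical and do not come from the thread.

```clojure
(require '[datalevin.core :as d])

;; Hypothetical path and schema, for illustration only.
(def conn
  (d/get-conn "/tmp/datalevin/d-db-example"
              {:user/name {:db/valueType :db.type/string
                           :db/unique    :db.unique/identity}}))

(def all-names '[:find ?n :where [_ :user/name ?n]])

;; Writes go through the connection.
(d/transact! conn [{:user/name "alice"}])

;; Reads take a db obtained right where it is used.
(def names-before (d/q all-names (d/db conn)))

(d/transact! conn [{:user/name "bob"}])

;; Call d/db again instead of reusing an earlier return value.
(def names-after (d/q all-names (d/db conn)))

(d/close conn)
```

The point of the sketch is only that `(d/db conn)` appears inline at every read and its result is never stored for later use.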

Huahai 2026-06-22T18:06:12.413789Z

if you introduce a conn to q, you are making things confusing to people. Keep things simple. There's only one way to do things. Much easier to understand.

Huahai 2026-06-22T18:08:47.846459Z

A connection is different from a db, that much is what people already understand. There's no need to muddy the water.

Max 2026-06-22T18:33:35.686419Z

Maybe there's something I'm not understanding, but I'm having trouble wrapping my head around something you just said:
> d/q does not work on conn, it works on a snapshot of the data
With Datomic, this is 100% true. d/db returns a consistent snapshot of the data that's the same 5ms and 5 hours later. If you query the same db over and over, you always get the same result. In Datalevin's docs, it specifically says that Datalevin does not have db-as-a-value semantics, making me think that if you query the value returned by d/db at different times, then you get different results. Is that true? If not, then I completely understand the split between conn and db and I have no confusion about their separation. If it is true though that querying the same db at different times can give you different results, then d/q in Datalevin does not operate on a snapshot of the data, it works on something else. I don't think I understand what that thing is and what its semantics are. All my other comments/suggestions lead from that lack of understanding, so perhaps I should start there.

Huahai 2026-06-22T19:16:29.024119Z

Db is a mutable object. If you use an old db, it is simply wrong. That's it. There is no such thing as old db.

Huahai 2026-06-22T19:17:33.290659Z

Datalevin only works on current db.

Huahai 2026-06-22T19:18:36.725659Z

There's no concept called an old db. There is no such thing.

Huahai 2026-06-22T19:20:17.664019Z

It is like you are holding a reference that has expired. That's about all you can say. What exactly the thing you are holding is, is undefined.

Huahai 2026-06-22T19:24:17.932889Z

The purpose of calling the db function is to ensure you are working with the current db, the only db you should be working with.

Huahai 2026-06-22T19:25:47.357019Z

Db is not a value, db is a state. Db function allows you to access that state. That's it.

Huahai 2026-06-22T19:29:51.877799Z

A Db is the surrogate of the external world, which is changing. The db function is basically perceiving the world, which gives you a snapshot of the state, in time. You don't hold onto your perception, you are constantly looking, getting a new look before deciding to do anything. That's the model. The so-called db as a value is simply a wrong model. The world is constantly changing, not something you can hold onto.

Huahai 2026-06-22T19:34:04.046489Z

The world is a river, you cannot step into the same river twice. That's the mental model. There is no such thing as a static world, therefore, the idea of a previous world doesn't exist.

Huahai 2026-06-22T19:43:04.895919Z

Added these clarifications to the book. Thanks for pointing these out.

Max 2026-06-22T19:45:20.964839Z

So then what happens if you do reuse an old db reference? Is it essentially undefined behavior?

Huahai 2026-06-22T19:46:17.473499Z

correct. what exactly is going to happen depends on implementation details that will change and are not part of the public API.

Max 2026-06-22T19:51:03.934069Z

And how long does it stay "fresh"? For example, is this code incorrect?

(defn two-queries [conn]
  (let [db  (d/db conn)
        ids (map first (d/q ...some query with a smaller return size... db))]
    (->> ids (filter ...) (mapv #(d/pull db [...larger pattern...] (find :attr %))))))

This is a pretty common datomic pattern. For filters that are complicated to apply in a query or for aggregations that would require complicated subqueries, you can just call d/q multiple times with the same db and all the calls will have the same view of the data.

Huahai 2026-06-22T19:51:10.714869Z

if db is a value, you don't need transaction, ACID etc. That's why Datomic's transaction model is unusual, because it is not compatible with the reason to introduce transaction.

Huahai 2026-06-22T19:52:52.140139Z

you are encouraged to use a single query to do what you want in datalevin.

Huahai 2026-06-22T19:53:29.113119Z

the datomic patterns do not work well, because datomic doesn't have a query optimizer

Huahai 2026-06-22T19:53:59.398999Z

you are not encouraged to do query-like things in your own code, you are supposed to work with the query.
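
As a rough illustration of the single-query approach, the filter and the pull can be expressed in one d/q call by putting a pull expression in the :find clause. The path, schema, attributes, and the age filter below are hypothetical stand-ins for the elided query and pattern in the two-queries example.

```clojure
(require '[datalevin.core :as d])

;; Hypothetical path and schema, for illustration only.
(def conn
  (d/get-conn "/tmp/datalevin/single-query-example"
              {:user/name {:db/valueType :db.type/string
                           :db/unique    :db.unique/identity}
               :user/age  {:db/valueType :db.type/long}}))

(d/transact! conn [{:user/name "alice" :user/age 34}
                   {:user/name "carol" :user/age 12}])

;; One query does both the filtering and the pull, against one (d/db conn).
(def adults
  (d/q '[:find (pull ?e [:user/name :user/age])
         :where
         [?e :user/age ?age]
         [(>= ?age 18)]]
       (d/db conn)))

(d/close conn)
```

Each result tuple holds one pulled map, so there is no second call that needs to see the same db as the first.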

phronmophobic 2026-06-22T19:54:10.277879Z

> if db is a value, you don't need transaction, ACID etc. That's why Datomic's transaction model is unusual, because it is not compatible with the reason to introduce transaction.
Transactions aren't just about reading consistent values. ACID is also about ensuring that the db only can transition between consistent states.

Max 2026-06-22T19:55:24.322819Z

Sure, but what I'm really trying to get at is, if db is a mutable reference that goes "stale" over time and should only be used while "fresh", what are the semantics of "fresh"? Is it 1ms? 10ms? 10000ms?

Huahai 2026-06-22T19:56:43.212349Z

the original reason to introduce the idea of transaction is to simulate the world, which does not do thing in inconsistent ways, but computer can. If db is a value, you don't need to introduce this idea, because it is by definition already the case.

Huahai 2026-06-22T19:57:46.800589Z

it is not about fresh, it is about "current".

Max 2026-06-22T19:58:08.815779Z

Ok, so it stays "current" until the next transaction is applied?

Huahai 2026-06-22T19:59:44.444989Z

all your reads will be current if you use the db function, because that's what the db function does. It is a snapshot: it may not be "really" current, because it is the db state at the moment you called it, and the db may have advanced while you are reading, but what you are reading is current as of that snapshot. So it is a snapshot.

phronmophobic 2026-06-22T20:03:24.714549Z

> the original reason to introduce the idea of transaction is to simulate the world
maybe? but that's not the only reason folks currently reach for a db
> which does not do thing in inconsistent ways, but computer can. If db is a value, you don't need to introduce this idea, because it is by definition already the case.
https://en.wikipedia.org/wiki/ACID#Consistency
Consistency ensures that a transaction can only bring the database from one consistent state to another, preserving database invariants: any data written to the database must be valid according to all defined rules, including constraints, cascades, triggers, and any combination thereof. This prevents database corruption by an illegal transaction. An example of a database invariant is referential integrity, which guarantees the primary key–foreign key relationship.
Datomic supports validation functions that will reject transactions that would introduce invalid states or violate invariants. It also stores transactions for auditing which is useful in its own right.

Huahai 2026-06-22T20:06:34.061549Z

what I am saying is that datomic uses the word transaction not in its original sense. that's why its transaction semantics are considered "unusual", because that's not what that word means. Read the Jepsen report on datomic, "unusual" is not the label I give, it is in that report.

Huahai 2026-06-22T20:10:35.165859Z

What Datalevin does, it is the usual thing almost all other databases do. That's basically what it is.

phronmophobic 2026-06-22T20:11:27.253719Z

Sure. I wasn't objecting to "unusual". I was objecting to the idea that transactions don't make sense for datomic.

Max 2026-06-22T20:11:59.525059Z

If we want to have a convo about comparing datomic and datalevin txn semantics, could we move that to a new thread? I'd prefer not to derail this one

👍 1

Huahai 2026-06-22T20:12:10.050109Z

If you change the meaning of "transaction", yes.

Max 2026-06-22T20:19:20.295649Z

The difference is that with most databases, you query and transact against the same connection object, which serves as a transparent connection to the outside world. Datomic introduced a separate db object specifically as an immutable value to solve a problem they created by not having traditional "fenced", session-based transactions. I guess I'm confused what value having separate db and connection objects provides if Datalevin has fenced transactions and no db-as-a-value semantics. If the only correct thing you can do with a db object is create it from a connection, pass it to a single api fn, and then discard it, then it seems like it's already basically a connection with extra steps.

Huahai 2026-06-22T20:20:33.646919Z

right, but datalevin query is more powerful than working with a single connection object. It can also query other things.

Max 2026-06-22T20:20:49.842639Z

such as?

Huahai 2026-06-22T20:21:42.440359Z

such as querying multiple dbs in the same query, or querying data structures directly.
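
A minimal sketch of what a query across two dbs might look like, assuming the Datomic-style `:in $a $b` named-source syntax. The two connections, their paths, and all attribute names are hypothetical.

```clojure
(require '[datalevin.core :as d])

;; Two separate databases; paths and schemas are hypothetical.
(def users-conn
  (d/get-conn "/tmp/datalevin/multi-db-users"
              {:user/name {:db/valueType :db.type/string
                           :db/unique    :db.unique/identity}}))

(def orders-conn
  (d/get-conn "/tmp/datalevin/multi-db-orders"
              {:order/id   {:db/valueType :db.type/long
                            :db/unique    :db.unique/identity}
               :order/user {:db/valueType :db.type/string}}))

(d/transact! users-conn [{:user/name "alice"} {:user/name "bob"}])
(d/transact! orders-conn [{:order/id 1 :order/user "alice"}])

;; Each named source is bound to a db taken from its own connection.
(def users-with-orders
  (d/q '[:find ?name ?id
         :in $users $orders
         :where
         [$users _ :user/name ?name]
         [$orders ?o :order/user ?name]
         [$orders ?o :order/id ?id]]
       (d/db users-conn)
       (d/db orders-conn)))

(d/close users-conn)
(d/close orders-conn)
```

Here d/q receives two dbs as inputs, which is one case a single-connection argument could not express.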

Huahai 2026-06-22T20:22:45.966869Z

q is more than working with a single connection.

Huahai 2026-06-22T20:24:57.846859Z

I have already addressed your expectation of symmetry between txn and query above. that expectation is wrong.

Max 2026-06-22T20:26:03.250099Z

Ok sure, you could do a query that involved multiple dbs. Are they all based on the same conn, or different conns? How do you create a db for a specific database?

Huahai 2026-06-22T20:27:49.010909Z

they usually are based on different connections. creating a datalevin db is cheap. it's perfectly ok to create many in one application. in fact, a db per user is a common pattern.

👍 1

Huahai 2026-06-22T20:28:25.457229Z

a datalevin db is just a single file

Max 2026-06-22T20:31:30.415849Z

Got it. So then, why not have this tiny bit of code in q?

(let [args (map #(if (conn? %) (d/db %) %) args)]
  ...)

That can handle querying one db, multiple dbs, data structures, whatever

Huahai 2026-06-22T20:32:51.414699Z

args are more complicated than that: there are rules, there are collections.

Max 2026-06-22T20:33:13.497529Z

of course. But for all of those things, conn? will be false, right?

Huahai 2026-06-22T20:33:49.636019Z

conn? is expensive, it is what actually refreshes the cache and may send remote calls

Huahai 2026-06-22T20:34:22.446749Z

conn? calls db?, which checks the timestamp

Max 2026-06-22T20:34:23.046179Z

Oh interesting. Is there no cheap way to check whether an object is a connection? Like via instance? or something?

Huahai 2026-06-22T20:35:45.862819Z

the design decisions were all trade-offs. you can always argue one way or another. it's a judgement. If you only consider one thing, you may disagree, but if you have a fuller picture, your judgment may change.

Huahai 2026-06-22T20:36:20.306959Z

it really comes down to whether you trust the designer's judgment or not.

Max 2026-06-22T20:36:28.085519Z

Sure, I totally get it. I'm more trying to understand what factors led to the decision.

Max 2026-06-22T20:37:07.641039Z

And I'm pushing a little extra as a book reviewer, because I suspect you're going to get this exact same line of questioning from anyone who's used datomic before

Huahai 2026-06-22T20:37:54.221759Z

q is already complicated enough. also, you want a clear conceptual boundary. mixing conn with db is not conceptually helpful.

Max 2026-06-22T20:38:45.979399Z

I think some may argue that they've been already mixed since dbs don't have value semantics

Huahai 2026-06-22T20:39:50.009449Z

that's because you have contacted datomic. most people haven't. my goal is wide adoption, my target is PostgreSQL, datomic is not in my consideration to be honest

Max 2026-06-22T20:40:37.288379Z

Well, I'd imagine postgres users being confused why they need to call d/db at all

Huahai 2026-06-22T20:40:41.213639Z

as I have already mentioned many times, I simply believe db as a value is an anti-pattern.

Max 2026-06-22T20:42:01.062949Z

I'm not disputing that. I'm poking at this a little more than I would otherwise because I can imagine d/db being a little bit of a scary function for users. You have to use it correctly or else undefined behavior might silently occur, and the correct way to use it is a) different from how it would be used in datomic, and b) peculiar coming from sql dbs

Huahai 2026-06-22T20:42:01.190789Z

because datalevin can do more than work with a single connection, and that's enough to get people to accept a new thing. It has new capabilities, so what's the problem with accepting a bit of new boilerplate?

Huahai 2026-06-22T20:42:57.747799Z

no, you use it always, so it's just a matter of course.

Max 2026-06-22T20:43:50.302089Z

Anyways, I think it might be worthwhile to have something in the book that calls out calling d/db as a small piece of necessary boilerplate and explaining how not to use it

👍 1

Huahai 2026-06-22T20:44:15.108099Z

done

Max 2026-06-22T20:46:33.456849Z

I think it's easier to justify its presence as "this is just how it works" than to make a design-based argument, since I think any design-based argument will end up with readers going down the same line of questioning I just went down. If we hadn't had this conversation, then I would've wrapped d/q with that same little bit of mapping code that calls conn? not knowing that there were problems with that approach
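
For a team that still wants a guard against reusing a db, one possible sketch is a helper that takes the conn in a fixed argument position, so no conn? check is needed. `q-conn` is a hypothetical name and is not part of Datalevin; it only covers the single-database case.

```clojure
(require '[datalevin.core :as d])

(defn q-conn
  "Hypothetical helper, not part of Datalevin. Runs `query` against a db
  obtained from `conn` at call time, so callers never hold on to a db.
  Any further inputs are passed through to d/q unchanged."
  [query conn & inputs]
  (apply d/q query (d/db conn) inputs))

;; Usage, with a hypothetical attribute and an extra input:
;; (q-conn '[:find ?e :in $ ?name :where [?e :user/name ?name]] conn "alice")
```

Because the conn position is fixed, the helper never inspects its arguments, at the cost of not supporting queries over several dbs.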

Huahai 2026-06-22T20:48:00.187929Z

sure, in the book, it is presented as how it is.

Huahai 2026-06-22T20:48:49.667639Z

thanks for the discussion, it does improve the book clarity on this.

Huahai 2026-06-22T21:05:46.931439Z

besides, this required (d/db conn) call would make the future distributed db feature much cheaper and easier to implement and be more performant, as it gives a clear boundary of when to observe the database state now. We no longer have to make sure db is in consistent state all the time.

👍 1

Samuel Ludwig 2026-06-22T18:21:00.435499Z

I'd certainly be interested in taking a look! I started learning/integrating Datalevin into a project at $COMPANY (and being a nuisance in this channel) a few months ago now- and I'd still definitely classify myself as someone who could benefit from a book like this :^)

Anton Shastun 2026-06-22T20:22:32.377159Z

wow, I can’t wait to get started reading
