datomic

2026-05-07T13:11:28.833469Z

For those running Datomic in production, what are your operational approaches for performing schema updates and data migrations? Let's say I've got a form:

<form>
  <input name="birthday">
</form>
And a schema:
{:db/ident       :person/birthday
 :db/valueType   :db.type/inst
 :db/cardinality :db.cardinality/one}
And I need to add a notion of a date format preference to my app:
<form>
  <input name="birthday">
  <input name="date-format-preference">
</form>

{:db/ident       :person/date-format-preference
 :db/valueType   :db.type/keyword ;; one of [:yyyy-mm-dd :mm-dd-yyyy]
 :db/cardinality :db.cardinality/one}
So ultimately, to get production fully updated, I need to do three things:
1. Add this new schema attr to Datomic.
2. Update my existing entities to set their initial :person/date-format-preference.
3. Deploy the updated application code with my new form so new entities can be created via the website.

Thus far, my approach has been to:
1. Deploy a new build of the code to my server.
2. On application startup, before opening my web server up to traffic, I re-transact my schema into the DB so the new attribute is present in Datomic. Doing this on startup so I don't get into any funky situations where my app code has a new form field for creating a new attribute, but my DB schema doesn't know about it yet.
3. I create a set of transactions to run after my schema is updated so I can update my existing DB entities with this new schema attribute, and I use Conformity (https://github.com/qtrfeast/conformity) to transact them into the DB.
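;; Assumes the peer API: (require '[datomic.api :as d])
;; Builds tx-data that sets an initial date-format preference on every
;; entity that already has a birthday.
(let [ents-w-birthdays (d/query {:query '{:find [?e ?c]
                                          :in [$]
                                          :where [[?e :person/birthday _]
                                                  [?e :person/continent ?c]]}
                                 :args [(d/db conn)]})
      ;; Each result is an [?e ?c] tuple, so destructure it and emit an
      ;; entity map keyed by :db/id rather than merging into the entity id.
      migrated-entities (mapv (fn [[e continent]]
                                {:db/id e
                                 :person/date-format-preference
                                 (if (= continent "North America")
                                   :mm-dd-yyyy
                                   :yyyy-mm-dd)})
                              ents-w-birthdays)]
  migrated-entities)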
4. After all that, my DB and app code are in alignment, I turn on my web server, and the cycle is complete.

Some drawbacks of this approach:
• If a migration has to update a lot of entities, it can slow down the release process
• If a migration fails for some reason, then the whole application is at risk of failing to start
• "hard coding" a migration into the startup process means maintaining a list of migrations, pruning away the ones that have already run, making sure that not too many run within the scope of a single release, etc.

cch1 2026-05-07T15:00:49.182049Z

We use an approach very similar to what you have described, leveraging https://github.com/recbus/caribou to track and apply migrations to keep things "up-to-date". Caribou has the concept of a data "correction" as a migration (using Clojure's iteration behind the scenes to manage long-running corrections). The drawbacks we see are exactly the ones you have noted. But in three and a half years we have had only one failure to start, and that was indeed while running a data correction written by a developer who was new to the correction process. The benefit of this approach is that application code does not need to live in two worlds at once, which can be very tricky to manage.

cch1 2026-05-07T15:01:22.177319Z

The general idea of tracking all migrations in caribou and leaving it to ensure the schema and data (seeds, corrections) are "current" is pretty liberating.

Gustavo A. 2026-05-07T15:05:41.949079Z

What we usually do is run all the migrations first, before deploying the new code. We only ever add new attributes to the schema, so nothing uses them until the new code is deployed. The old code doesn't know or care about these new attributes, and hence nothing breaks. We don't keep track of which migrations were run; we execute all of them every time we deploy. For old entities, with the new code:
- When reading, we use default values or calculate them if the attribute is not present. If this approach is not possible, then we would need to run a script to set the correct value for each one.
- When updating or inserting, we assign the corresponding value.
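
A minimal sketch of the read-side default, assuming the continent rule from the original question is what decides the fallback. The function name is hypothetical; keyword lookup works the same on a plain map or a Datomic entity.

```clojure
(defn date-format-preference
  "Returns the stored preference, or a default derived from the continent
  when the attribute is absent (i.e. the entity predates the new attribute)."
  [person]
  (or (:person/date-format-preference person)
      (if (= "North America" (:person/continent person))
        :mm-dd-yyyy
        :yyyy-mm-dd)))
```

With this shape, old entities never need a backfill unless something has to query on the attribute directly.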

favila 2026-05-07T15:21:29.752049Z

> On application startup, before opening my web server up to traffic, I re-transact my schema into the DB so the new attribute is present in Datomic. Doing this on startup so I don't get into any funky situations where my app code has a new form field for creating a new attribute, but my DB schema doesn't know about it yet
Are you using conformity for this too? You should. If you are merely retransacting static schema in code you could indeed get into "funky states", e.g. an old app version deploys and changes the schema to an older revision, or you delete :db/index true from your code and expect the index to go away on next deploy. (It won't.)

2026-05-11T14:32:43.206229Z

> an old app version deploys and changes the schema to an older revision
This is a great point! Up until this point, I figured that just re-transacting my schema every time on app startup was basically "harmless", i.e. only ever incrementing the transaction counter because most of the time I'm just transacting facts that Datomic already knows. But, especially when working with a team, the order in which PRs get merged and released may impact whether, how and when those schema alterations get applied. Great tip 🙂

favila 2026-05-11T14:47:02.509889Z

Most rolling deployment methods can't ensure that an old instance is never deployed while the new one is rolling out, so you don't even need team development to hit this.

favila 2026-05-11T14:48:06.036339Z

e.g. you deploy while a scale-up is happening, the scaled-up instance is the old one, it transacts old schema. If you are lucky there's an unresolvable conflict and it dies; otherwise your actual schema is some soupy mix of old and new

favila 2026-05-07T15:21:59.532429Z

I recommend treating all schema changes as if they were a separate library that your application code depends on, and not mixing any of that into your application lifecycle

favila 2026-05-07T15:24:44.450149Z

So: use migration frameworks for schema and data migrations (as you are), apply those separately (e.g. in CI or a one-off job), and only then turn on the application code that depends on them (whether via deployment, feature flag, or whatever). If you are adding an index, include a final blocking d/sync-index to make sure the index is in place before you deploy. (Conformity may do this already, I don't remember.)
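
A rough sketch of what the separately applied schema step could look like as a Conformity norms map, kept in its own EDN resource rather than in the app's startup code. The norm name is made up for illustration, and the map shape (norm name to :txes, a vector of transactions) is as I recall it from Conformity's README, so check it against the version you run.

```clojure
;; Hypothetical norms resource, e.g. loaded and applied by a one-off job
;; (with Conformity's ensure-conforms) before the new application code ships.
{:my-app/add-date-format-preference
 {:txes [[{:db/ident       :person/date-format-preference
           :db/valueType   :db.type/keyword
           :db/cardinality :db.cardinality/one}]]}}
```

Because the norm is recorded once it has been applied, an old instance that starts mid-deploy has nothing to re-transact.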

2026-05-11T14:44:27.131889Z

I've been kicking around the idea of setting up blue-green deployments for a while, and I think this will be the aspect that pushes me to commit to that. The startup + migration time for certain migrations is already kind of long, and it will continue to get longer as my database grows. Being able to start up a secondary Docker container for my app, run migrations from THAT peer, and switch over once everything has been fully transacted in and spot checked, without worrying about the downtime from a long migration, or a migration that hits a transactor timeout, is the way.

favila 2026-05-07T15:30:08.825719Z

a useful thing to have to reinforce this is an "at least this high" marker in your database (conformity already keeps a queryable set of applied norms). An application can check this at startup, and refuse to start if it doesn't find what it needs. You should strive to make backward-compatible changes (so old code can use newer database states), but if you find yourself unable to do this, you can also use a marker like this to make the application refuse to start if it encounters a value it doesn't understand.
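
A minimal sketch of that startup check, assuming you can get the set of applied norm names out of the database somehow (with Conformity, something like filtering the required names through its conforms-to? check). All names below are hypothetical.

```clojure
(require '[clojure.set :as set])

(def required-norms
  "Norms this build of the app depends on (hypothetical names)."
  #{:my-app/add-date-format-preference
    :my-app/backfill-date-format-preference})

(defn missing-norms
  "Returns the required norms that are not in the applied set."
  [applied required]
  (set/difference (set required) (set applied)))

(defn assert-database-ready!
  "Throws if any required norm is absent, so the app refuses to start
  against a database that is behind this build. Returns :ok otherwise."
  [applied]
  (let [missing (missing-norms applied required-norms)]
    (when (seq missing)
      (throw (ex-info "Database is behind this build; refusing to start"
                      {:missing-norms missing})))
    :ok))
```

Calling assert-database-ready! before the web server opens up keeps the failure loud and early, instead of surfacing later as a transaction error on a missing attribute.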