From my local machine I connect to live/prod datahike postgres db (Heroku) and establish conn. From local, I transact new schema and new tx against that conn. On Heroku, I need to restart dynos to see most recent version of my datahike [peer? db?]. Can someone point to article/reading so I can understand the architecture?
It is not entirely clear to me what your setup looks like. I am assuming that you have an instance of an application running on Heroku (a dyno?) with an in-process datahike instance running inside the same JVM (i.e. the default use case datahike was made for, as a dependency inside your clj app). So this application is running and is connected to a Postgres instance (on Heroku). Now you say that you connect from your local machine to the Postgres db to transact? If you transact from local, the index in the app running on Heroku is not updated and it does not know about the data you just wrote. Datahike has no peers in the sense Datomic has; only with datahike-http-server can you have something similar. Please look at the distributed docs: https://github.com/replikativ/datahike/blob/main/doc/distributed.md#single-writer I don't understand what you're saying regarding the restart of your dyno(??). I don't know anything about Heroku, but do you now see the transacted data after restarting your application on Heroku?
Thanks timo
A Heroku dyno is one of the “isolated environments that provide compute, memory, an OS, and an ephemeral filesystem.”
Thing is, my cfg (and therefore @conn) only specifies the address of my Heroku-hosted Postgres db (which is separate from the dyno and persists).
Since I am always deref'ing the conn, I don't understand why tx does not immediately reflect in app but requires me to restart the dyno (container is destroyed and everything re-built and evaluated/loaded).
Here is Heroku schematic of simple app structure:
(By "immediately reflected" I mean on web page refresh, apologies.)
in datahike the store is the only centralized part which was an early design-decision. each datahike instance has an index which is updated only when you transact in this very instance. if you want to transact and read from multiple instances of datahike (e.g. local repl and remote app) you will have to use the datahike-http-server which distributes the index-updates to multiple datahike-instances.
Wow, I see. So despite the "same" conn, my REPL has one instance of datahike and the remote app has an entirely different instance that only "sees" whatever transactions were available when the dyno or container was formed.
yes, right... I am not entirely up to date with datahike atm but I am pretty sure you would have to use http-server to solve that
maybe @whilo can chime in
When I recreate the dyno (with restart) datahike instance retrieves again from postgres store.
I am pretty certain it does a complete reindex but not 100% sure right now
Ok, but this is helpful. Thank you
I will test it myself this afternoon probably
@feedmyinbox02_clojuri does this graphic make sense https://github.com/replikativ/datahike/blob/main/doc/distributed.md ?
when working correctly all connections will virtually point to the same db snapshot. in the distributed setup we have so far, each connection reads from the remote store to get the latest snapshot when you deref it.
the most critical part is that transactions happen in a single writer process
multiple writers would overwrite each other
for readers a problem can be if they think they see all changes automatically and don't do a fresh read from the shared store. when you configure a writer process in the config the clients should automatically be configured correctly and always fetch fresh from the store on deref
i haven't used heroku in a long time and haven't used it with datahike; is there a way to have single worker dyno for the writer?
In the graphic, each black box is a runtime instance of datahike or app containing an instance?
yes
the runtime can serve all reads (queries) directly from the store, only write operations are serialized to the writer
I will check to see if I can establish a single worker dyno for a writer; by defining :writer and setting :url to some address provided by Heroku that exclusively locates a worker dyno to handle writes. Interesting!
yes, that should work. lmk how it goes
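A minimal sketch of what such a config might look like, assuming the `:writer` keys shown in the distributed docs linked above and the `:jdbc` store keys from the datahike-jdbc README. Every host, user, password, URL and token value below is a hypothetical placeholder, and the key names should be checked against the current docs before relying on them:

```clojure
;; Sketch only, not a verified Heroku setup.
;; All string values are made-up placeholders.
{:store  {:backend  :jdbc
          :dbtype   "postgresql"
          :host     "your-postgres-host"
          :port     5432
          :user     "your-user"
          :password "your-password"
          :dbname   "your-dbname"}
 ;; Route all writes to one writer process; reads still go to the store.
 :writer {:backend :datahike-server
          :url     "https://your-writer-dyno.example.com"
          :token   "your-shared-secret"}}
```

The idea would be that the REPL and the web dynos both connect with this same config, so neither of them transacts against the store directly.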
(This is the part I understand least; I don't grasp peers vs. conns vs. db vs. what datahike really is maybe.)
After dyno restart, I don't see transacted data in my live web app (though it appears when querying conn locally).
In the web app, I d/q against what should be the same @conn I transacted against in my REPL.
Response from ChatGPT 5:
> Store: where bits live (you’re using JDBC/Postgres).
>
> Conn (conn): a process-local atom that points at the current db value. Transactions update this atom; deref (@conn or (d/db conn)) to read the latest value known to that process.
>
> Db value (db): an immutable snapshot. If you capture it once (e.g., (def db (d/db conn))) and reuse it, it never “sees” later txs.
>
> Multi-process: Each process has its own conn and caches. You can either let every process connect to the same Postgres store, or funnel all requests through a Datahike server so caches are shared.
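To make the conn vs. db value distinction concrete, here is a small single-process sketch. It assumes an in-memory store (so no Postgres is needed) and `:schema-flexibility :read` so that the made-up `:name` attribute needs no schema; the store id and the data are hypothetical. It only illustrates the snapshot behaviour within one process, not the multi-process case discussed in this thread:

```clojure
(require '[datahike.api :as d])

;; In-memory store for illustration; "snapshot-demo" is a made-up id.
(def cfg {:store {:backend :mem :id "snapshot-demo"}
          :schema-flexibility :read})

(d/create-database cfg)
(def conn (d/connect cfg))

(d/transact conn {:tx-data [{:name "Alice"}]})

;; Capture a db value once: this is an immutable snapshot.
(def snapshot @conn)

(d/transact conn {:tx-data [{:name "Bob"}]})

(def names-q '[:find ?n :where [_ :name ?n]])

;; The captured snapshot should not include the later transaction.
(d/q names-q snapshot)

;; A fresh deref in the same process should include both names.
(d/q names-q @conn)
```

So if the web app holds on to a db value (for example in a top-level `def`), it will keep serving that snapshot no matter how often the page is refreshed.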