#datahike
2023-12-28
silian19:12:24

I seem to have "overwritten" transactions in production by transacting an item locally. How do I "rollback" Datahike? (I am using postgresql/jdbc :store, deploying to heroku.)
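
There is no automatic rollback for a damaged store, but datahike.api does expose as-of for reading past database states. A minimal sketch, assuming the connection is still usable and using a hypothetical cutoff date:

(require '[datahike.api :as d])

;; Read-only view of the database as of a past instant (the date below is
;; a hypothetical cutoff, not taken from this thread).
(def db-before (d/as-of @conn #inst "2023-12-01"))

;; If the missing transactions were still reachable, they would show up here.
(d/q '[:find ?t :where [_ :db/txInstant ?t]] db-before)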

silian19:12:02

I don't exactly understand what I did. I don't have separate local and production instances of the Datahike cfg store; I connect only to the heroku postgresql production instance. This is probably not best practice, but I tried to keep things easy. (Easy but likely not simple!)

silian19:12:34

Okay, maybe I did overwrite something. I am seeing a big gap in :db/txInstant when I run:

(d/q '[:find ?t :where [_ :db/txInstant ?t]] @conn)
I have transacted many times in December (via the production web app) but I am seeing only the most recent transaction I performed locally (yesterday).
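
Sorting those instants makes the gap easier to see; a small sketch over the same query:

(->> (d/q '[:find ?t :where [_ :db/txInstant ?t]] @conn)
     (map first)  ;; unwrap the single-element result tuples
     sort)        ;; oldest first; a month-long gap stands out here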

timo21:12:44

Hey @U02U1T66REZ, do you have history enabled in Datahike?

silian21:12:45

Yes, on startup I can see that:

... DEBUG [datahike.connector:141] - Using config  {:keep-history? true, :search-cache-size 10000, :index :datahike.index/persistent-set, :store {:dbtype "postgresql", :dbname "...", :backend :jdbc, :host "...amazonaws.com", :port 5432, :user "..."}, :store-cache-size 1000, :attribute-refs? false, :writer {:backend :self}, :crypto-hash? false, :schema-flexibility :write, :branch :db}

timo21:12:31

Can you see what you are searching for when querying the historical database?

silian21:12:37

No. For example, the following should result in a hit:

(d/q '[:find ?e
       :where
       [?e :url-link "junji ito"]] (d/history @conn))
But it returns an empty set.
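
One way to cross-check is to join each surviving :url-link datom with its transaction's :db/txInstant, so anything left in the history view shows up with a timestamp. A sketch, assuming four-element patterns work against the history db as they do against the current one:

(d/q '[:find ?v ?t
       :where
       [?e :url-link ?v ?tx]    ;; ?tx is the transaction entity
       [?tx :db/txInstant ?t]]  ;; when that datom was transacted
     (d/history @conn))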

timo21:12:45

Hmm, I don't really know. You don't know what you did?

timo21:12:31

When you look at the Postgres data directly, do you see roughly the amount of data you used to see?

silian21:12:59

I have the app running locally and have not re-started the REPL in over a month. (Just long-lived process on my machine.) About a month ago, I deployed to production. Since that time, I have transacted via the production web app, which shows updates. Then, yesterday, I made a small update locally. No deployments. I checked the production site, and all data made within the month are gone. The production site reflects what I see locally (which is all tx up to a month ago plus the small update I made yesterday).

timo21:12:31

which version are you using?

silian22:12:17

I use Postico (a postgres GUI) to inspect raw data in "traditional" relational db projects but can't make sense of the Datahike postgres store. (But I assume it was not designed to be human-readable.)

silian22:12:39

During the course of the month, I never really understood how my local app showed "old" data when I was continually submitting new data via the production site. (I assumed my local app was using a "stale" conn, which reflected the db at the time I defined conn; what I didn't expect was that running (d/transact conn ... ) would somehow "overwrite" or, in git parlance, set the "old" branch as "HEAD".)

silian23:12:16

I thought it was not possible to "overwrite" anything in Datahike. I have never used d/delete-database so I don't understand how transactions made in production are now "gone" or not retrievable via d/history. I never used the branch feature either. What is best practice to avoid this in future?

silian04:12:44

Just transacted again. The REPL reports success and I can see the change on 127.0.0.1, but I also get multiple lines like this in the REPL:

... [wrapping: SELECT id, header, meta, val FROM konserve WHERE id = '0594e3b6-9635-5c99-8142-412accf3023b.ksv'] closed orphaned ResultSet: com.mchange.v2.c3p0.impl.NewProxyResultSet@67922b8c [wrapping: null]
And (about 18 lines of):
... [wrapping: SELECT 1 FROM konserve WHERE id = '0432b03a-5929-5129-82c8-667da06ca226.ksv'] closed orphaned ResultSet: com.mchange.v2.c3p0.impl.NewProxyResultSet@5b429bd3 [wrapping: null]
All issued by com.mchange.v2.c3p0.impl.NewProxyPreparedStatement — I apologize for the dump!

silian04:12:25

Interesting. After making the transaction, I restarted my heroku dynos and now the production app reflects what I see locally; anything submitted to the production app in the interim (before restarting the dyno) is gone!

silian04:12:24

So it seems as if transactions submitted to the production app are stored in memory? And in-memory transactions are lost when a transaction is executed from the local peer? (Does Datahike support only a single peer?)

silian05:12:23

So before you can make any changes to the db from the local environment, you need to release the conn and then re-connect (this "imports" updates made in production); otherwise, executing a transaction locally writes to the db without syncing the updates made in production, and those updates seem to be wholly lost (they cannot be retrieved with d/history).
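
A minimal sketch of that workflow with datahike.api; d/release and d/connect are the standard calls, cfg is the map quoted further down, and the transacted entity here is hypothetical:

;; Drop the stale connection, then pick up the current store root.
(d/release conn)
(def conn (d/connect cfg))

;; Only transact after reconnecting, so the write builds on the latest root.
(d/transact conn [{:url-link "example"}])  ;; hypothetical tx-data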

timo12:12:59

If this is true it is a serious bug, but transacting always resolves the latest root via the conn, afaik. So it should not be possible to overwrite anything by transacting, even if your REPL has been running for a very long time and is somehow stale.

timo12:12:52

Features were recently added to Datahike for running a Datahike server as a transactor. I need to know what your setup currently is: are you running Datahike locally and on a production server in the classic way, with the same configuration? That would mean you are running two instances (one in your REPL and one in production) that don't know about each other. They would happily overwrite each other's konserve structure in the PG DB.
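
For reference, the startup log quoted above shows :writer {:backend :self}; a sketch of the shape of such a config, which makes each JVM act as its own writer:

;; With :writer {:backend :self} (the default shown in the DEBUG line above),
;; each process assumes it is the only writer. Two such processes sharing one
;; Postgres store do not coordinate and can overwrite each other's konserve
;; roots.
{:store  {:backend :jdbc
          :dbtype  "postgresql"
          :host    "..."
          :port    5432
          :user    "..."
          :dbname  "..."}
 :writer {:backend :self}}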

silian15:12:13

Yes, I will file an issue with more description. Thank you Timo.

silian17:12:36

> Are you running datahike locally and on a production server in the classic way? With the same configuration? That would mean you are running two instances (one in your repl and one in production) that don't know each other.

Hmm, "classic way" ... I am not sure what that means but, yes, I have only a single configuration:

(def cfg
  {:store {:backend :jdbc
           :dbtype "postgresql"
           :host "..."
           :port 5432
           :user "..."
           :password "..."
           :dbname "..."}})
So I suppose this was my mistake! 😭 If so, thank you for helping me understand!

timo22:12:08

I don't know if this is the case; I will have to understand what went wrong.

silian22:12:57

Is there something I can test or provide to help determine what happened?

silian22:12:15

Oh, I see: perhaps you mean you need to reproduce my steps so you can investigate on your machine. Makes sense.

timo22:12:19

yeah, exactly.

whilo18:01:53

@U02U1T66REZ I am sorry that you ran into this issue (I just saw your messages). Ideally you should not be able to accidentally write to the same database from two different peers. Since we want to avoid coordinating processes to access databases, we have not yet decided how to manage a single writer more transparently. It is basically up to the user to ensure that no other peer also comes up as a writer.

whilo18:01:42

@U4GEXTNGZ Thanks for picking up this conversation, you pointed out the issue correctly, I think.
