datahike

silian 2024-01-01T18:55:01.700949Z

Happy New Year all. When trying to connect to a local copy of a postgres db, I am getting this error:

Execution error (ExceptionInfo) at datahike.connector/ensure-stored-config-consistency (connector.cljc:127).
Configuration does not match stored configuration.
I can see from error messages that :stored-config is the credentials for my production postgres db. I am trying to create a local db to experiment with, without corrupting or overwriting data in production.

whilo 2024-01-03T18:18:25.773769Z

@feedmyinbox02_clojuri Yes, this is always possible.

whilo 2024-01-03T18:30:07.128559Z

We can relax this validation; it is mostly there to avoid accidentally overwriting existing DBs with the wrong configuration. @feedmyinbox02_clojuri What are your respective DB configs that clash?

silian 2024-01-03T22:01:28.013889Z

I created a copy of production postgres db via heroku CLI pg:pull (which employs pg_dump). Local config I was attempting to connect to:

(def local-cfg
  {:store {:backend :jdbc
           :dbtype "postgresql"
           :host "localhost"
           :port 5432
           :user "postgres"
           :password "whatever"
           :dbname "from-heroku"}})

But the backup somewhere stores the production credentials, which look like this (sensitive values redacted):

(def cfg
  {:store {:backend :jdbc
           :dbtype "postgresql"
           :host "ec2..."
           :port 5432
           :user "..."
           :password "..."
           :dbname "..."}})

whilo 2024-01-04T00:28:46.487669Z

Hmm, I see. I think the host is the problem here.

silian 2024-01-04T01:27:59.851769Z

How is :host specified?

silian 2024-01-04T01:28:49.613379Z

Sorry if I'm not understanding

whilo 2024-01-04T01:30:39.495209Z

I think this is where the consistency check failed. Could you use the "ec2...amazonaws.com" address in both cases (instead of localhost)?

silian 2024-01-04T01:49:02.453619Z

Hmm, I think I am not being clear. The redacted values in cfg are my production credentials, which I know not to share and have replaced here with "..." The local db shows up in the Postgres GUI app I use to run a local Postgres server. When I try to connect to that local db, wouldn't the :host have to be "localhost"?

silian 2024-01-04T01:59:41.439149Z

The cfg map that connects to production db is actually something like:

(def cfg
  {:store {:backend :jdbc
           :dbtype "postgresql"
           :host "ec2..."
           :port 5432
           :user "ddxjb2hijdllx"
           :password "..." ; random string
           :dbname "..."}}) ; random string

silian 2024-01-04T02:04:19.870529Z

IIUC, if I use the amazonaws.com EC2 address it will connect to the production db, which is not what I want. I want to connect to my local copy to experiment, etc.

whilo 2024-01-04T14:26:05.229929Z

Then they are different databases and should never clash, i.e. for some reason during your connection it sees the other db. I am not sure why this is happening.

silian 2024-01-04T15:20:04.064019Z

Yes, when I (def conn (d/connect local-cfg)) I get the error ensure-stored-config-consistency. I will try again and include more of the error output perhaps.

timo 2024-01-04T15:23:02.474559Z

@whilo isn't the stored-config stored in the konserve store which is a pg_dump from the prod-db? As I understand it we need a way to change the dump to accept the new configuration, isn't that the case here?

alekcz 2024-01-04T15:30:50.105129Z

Yip, that's correct. Doing a pg_dump would copy the config across, hence the error. We'd need something like a create-new-from, i.e. a one-time override. Also, if I remember correctly, the cache is part of the config check (accidentally, I presume), which we need to fix.
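
To make the mismatch described above concrete, here is a minimal sketch in plain Clojure that diffs the config carried inside the dump against the one passed to connect. It is not Datahike's actual check: `config-mismatch` is a hypothetical helper, and the stored values are the redacted placeholders from this thread.

```clojure
(require '[clojure.data :as data])

;; Config that travelled inside the pg_dump (redacted placeholders from the thread).
(def stored-cfg
  {:store {:backend :jdbc
           :dbtype "postgresql"
           :host "ec2..."
           :port 5432
           :user "..."
           :password "..."
           :dbname "..."}})

;; Config used to connect to the local copy.
(def local-cfg
  {:store {:backend :jdbc
           :dbtype "postgresql"
           :host "localhost"
           :port 5432
           :user "postgres"
           :password "whatever"
           :dbname "from-heroku"}})

;; Hypothetical helper (not part of Datahike): returns the parts of each
;; config that differ, or nil when the two configs are equal.
(defn config-mismatch [stored supplied]
  (let [[only-stored only-supplied _] (data/diff stored supplied)]
    (when (or only-stored only-supplied)
      {:stored only-stored :supplied only-supplied})))

(config-mismatch stored-cfg local-cfg)
```

In this sketch the differing keys are :host, :user, :password and :dbname, which is why a dump restored under new credentials would fail an equality-style check.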

whilo 2024-01-04T16:08:23.915859Z

I see, we haven't covered this case yet. Maybe we should allow overriding the consistency check and setting a new config.

whilo 2024-01-04T16:10:16.743739Z

This can be done by adding a guard for this check https://github.com/replikativ/datahike/blob/main/src/datahike/connector.cljc#L187.
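
As a rough illustration of what such a guard could look like, here is a sketch in plain Clojure. Everything in it is an assumption: `check-config-consistency` is a hypothetical function, `:allow-config-override?` is an invented key rather than an existing Datahike option, and the real check in connector.cljc may compare different parts of the config.

```clojure
;; Hypothetical guard around a config consistency check.
;; `:allow-config-override?` is an invented key, NOT a Datahike option.
(defn check-config-consistency [supplied stored]
  (when (and (not (:allow-config-override? supplied))
             (not= (:store supplied) (:store stored)))
    ;; Same message as the error in this thread; the ex-data keys are assumed.
    (throw (ex-info "Configuration does not match stored configuration."
                    {:config supplied :stored-config stored})))
  supplied)

;; With the override flag set, the mismatching local config is accepted.
(check-config-consistency
  {:store {:host "localhost"} :allow-config-override? true}
  {:store {:host "ec2..."}})
```

Without the flag, the same call throws, so the default behaviour of protecting an existing database stays unchanged.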

whilo 2024-01-04T16:10:41.046089Z

@feedmyinbox02_clojuri Are you comfortable with opening PRs?

whilo 2024-01-04T16:11:00.842949Z

Ideally we want people to be able to hack Datahike to their needs.

👍 1
whilo 2024-01-04T16:11:41.038979Z

Where hack means adapt 😉 We are not hacking around in the codebase, but it should be easy to explore modifications.

silian 2024-01-04T16:13:21.301089Z

I've never done one but that shouldn't stop me!

❤️ 1
timo 2024-01-04T16:43:04.971069Z

Great! If you need help just ask here or put up the PR and we can discuss it there

silian 2024-01-01T18:57:56.553009Z

I created the copy using the heroku CLI which provides pg:pull to copy a db. (I believe it runs pg_dump.)

timo 2024-01-01T21:16:01.074609Z

Hey @feedmyinbox02_clojuri. Hmm, I know this problem, but I don't know exactly how to get around it. It is a check that validates the connection, and I did not implement it. I guess the config you were using in prod is stored inside the dump. Maybe @whilo can chime in.

silian 2024-01-01T21:19:48.319799Z

Thanks timo. I suppose I can try to find this row in the local db and just delete it? (Afraid this might corrupt something.)

silian 2024-01-01T21:21:18.496809Z

What is best practice to keep local and production instances of a Datahike db?

silian 2024-01-01T21:22:44.675779Z

I would like to continue developing with Datahike but don't want to experiment on production db. What is best practice to create copy of prod to experiment with?

timo 2024-01-01T21:24:36.421069Z

so the exception you are seeing above is expected. it's a way to globally identify databases that you can connect to. There should be a way to use a dump, not sure how though.

timo 2024-01-01T21:25:08.378189Z

@whilo wants to improve the documentation, and there have been a lot of changes recently, so it is needed.

timo 2024-01-01T21:25:33.396709Z

@alekcz360 might have an idea as well on how to use a dump from prod locally

👍 1
silian 2024-01-01T21:26:32.145139Z

I suppose I could re-create the entire db by replaying all transactions. (I recall seeing this somewhere in docs, I think in migrate namespace.)

silian 2024-01-01T23:33:08.424219Z

I am attempting to use migrate instructions. I was able to create a successful export (an eavt-dump). I now see:

;; ... setup new-conn (recreate with correct schema)
(import-db new-conn "/tmp/eavt-dump")

What does "... setup new-conn (recreate with correct schema)" mean? Is the correct schema created automatically by import-db, or am I required to actually d/transact all over again?

✅ 1
silian 2024-01-01T23:37:13.657439Z

(I can [somewhat] inspect the eavt-dump file and I see schema transaction statements.)