#datomic
2023-09-11
Hendrik 05:09:56

Error communicating with HOST

I am trying to set up Datomic Pro in Kubernetes, but I am stuck with the above error when I try to create a db (complete stack trace in the comments). What I have so far:

- a service “transactor” with one pod that runs the Datomic transactor
- a second service “transactor-backup” with one pod that runs the Datomic transactor
- both expose ports 4334, 4335, 4336
- HOST is set to transactor.datomic.svc.cluster.local for transactor and transactor-backup.datomic.svc.cluster.local for the backup service

These are the fully qualified domain names in the cluster. They are correct: I tested this by swapping the image to traefik/whoami and running curl from an unrelated pod.

datomic-pro version is 1.0.6735 for transactor and peer. Java version is 11 for transactor and peer. I also tested 17 for the transactor, which failed too.

Any ideas where I can look to fix this problem? Any help is appreciated :)

Hendrik 05:09:19

05:27:24.545 [main] DEBUG datomic.peer - {:tid 1, :request :create-database, :cluster {:protocol :sql, :db-name "backend", :system-root "jdbc:"}, :phase :end, :pid 1, :event :peer/transactor-admin-request, :arg {:db-name "backend"}, :msec 940.0, :threw clojure.lang.ExceptionInfo}
Exception in thread "main" java.lang.RuntimeException: could not start [#'com.aeditto.backend.app.db/!conn] due to
	at mount.core$up$fn__10232.invoke(core.cljc:80)
	at mount.core$up.invokeStatic(core.cljc:80)
	at mount.core$up.invoke(core.cljc:78)
	at mount.core$bring.invokeStatic(core.cljc:247)
	at mount.core$bring.invoke(core.cljc:239)
	at mount.core$start.invokeStatic(core.cljc:289)
	at mount.core$start.doInvoke(core.cljc:281)
	at clojure.lang.RestFn.invoke(RestFn.java:397)
	at mount.core$start_with_args.invokeStatic(core.cljc:389)
	at mount.core$start_with_args.doInvoke(core.cljc:385)
	at clojure.lang.RestFn.invoke(RestFn.java:410)
	at com.aeditto.backend.prod$_main.invokeStatic(prod.clj:15)
	at com.aeditto.backend.prod$_main.doInvoke(prod.clj:12)
	at clojure.lang.RestFn.invoke(RestFn.java:397)
	at clojure.lang.AFn.applyToHelper(AFn.java:152)
	at clojure.lang.RestFn.applyTo(RestFn.java:132)
	at com.aeditto.backend.prod.main(Unknown Source)
Caused by: clojure.lang.ExceptionInfo: Error communicating with HOST transactor-backup.datomic.svc.cluster.local on PORT 4334 {:alt-host nil, :peer-version 2, :password "<redacted>", :username "Rq36C27nE5erLFAmZCfKCEzCIKFF00sOqb2iSGTm97E=", :port 4334, :host "transactor-backup.datomic.svc.cluster.local", :version "1.0.6735", :timestamp 1694410041372, :encrypt-channel true}
	at datomic.connector$endpoint_error.invokeStatic(connector.clj:53)
	at datomic.connector$endpoint_error.invoke(connector.clj:50)
	at datomic.connector$create_hornet_factory.invokeStatic(connector.clj:135)
	at datomic.connector$create_hornet_factory.invoke(connector.clj:119)
	at datomic.connector$create_transactor_hornet_connector.invokeStatic(connector.clj:306)
	at datomic.connector$create_transactor_hornet_connector.invoke(connector.clj:301)
	at datomic.connector$create_transactor_hornet_connector.invokeStatic(connector.clj:304)
	at datomic.connector$create_transactor_hornet_connector.invoke(connector.clj:301)
	at datomic.peer$send_admin_request$fn__10270.invoke(peer.clj:800)
	at datomic.peer$send_admin_request.invokeStatic(peer.clj:792)
	at datomic.peer$send_admin_request.invoke(peer.clj:790)
	at datomic.peer$create_database.invokeStatic(peer.clj:812)
	at datomic.peer$create_database.invoke(peer.clj:802)
	at datomic.peer$create_database.invokeStatic(peer.clj:804)
	at datomic.peer$create_database.invoke(peer.clj:802)
	at clojure.lang.Var.invoke(Var.java:384)
	at datomic.Peer.createDatabase(Peer.java:115)
	at datomic.api$create_database.invokeStatic(api.clj:24)
	at datomic.api$create_database.invoke(api.clj:22)
	at com.aeditto.backend.app.db$fn__10448.invokeStatic(db.clj:15)
	at com.aeditto.backend.app.db$fn__10448.invoke(db.clj:11)
	at mount.core$record_BANG_.invokeStatic(core.cljc:74)
	at mount.core$record_BANG_.invoke(core.cljc:73)
	at mount.core$up$fn__10232.invoke(core.cljc:81)
	... 16 more
Caused by: ActiveMQNotConnectedException[errorType=NOT_CONNECTED message=AMQ219007: Cannot connect to server(s). Tried with all available servers.]
	at org.apache.activemq.artemis.core.client.impl.ServerLocatorImpl.createSessionFactory(ServerLocatorImpl.java:701)
	at datomic.artemis_client$create_session_factory.invokeStatic(artemis_client.clj:114)
	at datomic.artemis_client$create_session_factory.invoke(artemis_client.clj:104)
	at datomic.connector$try_hornet_connect.invokeStatic(connector.clj:97)
	at datomic.connector$try_hornet_connect.invoke(connector.clj:81)
	at datomic.connector$create_hornet_factory.invokeStatic(connector.clj:129)
	... 37 more

Hendrik 05:09:40

kubernetes transactor yaml:

apiVersion: v1
kind: Service
metadata:
 name: transactor
 namespace: datomic
spec:
 ports:
   - name: port4334
     port: 4334
     targetPort: port1
   - name: port4335
     port: 4335
     targetPort: port2
   - name: port4336
     port: 4336
     targetPort: port3
   - name: http
     port: 80
     targetPort: web
 selector:
   app: transactor
---
apiVersion: apps/v1
kind: Deployment
metadata:
 name: transactor
 namespace: datomic
spec:
 selector:
   matchLabels:
     app: transactor
 template:
   metadata:
     labels:
       app: transactor
   spec:
     containers:
       - name: transactor
         image: my/transactor
         #image: traefik/whoami
         ports:
          - name: port1
            containerPort: 4334
            protocol: TCP
          - name: port2
            containerPort: 4335
            protocol: TCP
          - name: port3
            containerPort: 4336
            protocol: TCP
          - name: web
            containerPort: 80
            protocol: TCP
         env:
          - name: POSTGRES_HOST
            value: cluster-example-rw.default.svc.cluster.local
          - name: TRANSACTOR_SERVICE
            value: transactor.datomic.svc.cluster.local
          - name: POSTGRES_PASSWORD
            valueFrom:
              secretKeyRef:
                name: datomic-user
                key: password

Hendrik 06:09:33

OK, I finally got it working. I injected the pod's IP address directly as an env var and set host to the pod IP. Now it is working. I also tested transactor failover, and that is working now too. Took me the complete weekend to set this up 🤯.

👍 2
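
For readers who want to reproduce the pod-IP approach described above: one way to hand a pod its own IP address is the Kubernetes downward API. This is a minimal sketch, not the configuration used in the thread. The variable name `POD_IP` is hypothetical, and how the `my/transactor` image turns it into the `host=` setting of the transactor properties file is an assumption.

```yaml
# Fragment of the transactor container spec (sketch only).
env:
  # POD_IP is a hypothetical name. status.podIP is the downward API field
  # that resolves to this pod's own IP address when the container starts.
  - name: POD_IP
    valueFrom:
      fieldRef:
        fieldPath: status.podIP
```

Because each transactor then advertises its own pod IP, peers connect to the pod directly instead of going through a Service name.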
jasonjckn 07:09:53

"HOST is set to transactor.datomic.svc.cluster.local for transactor and transactor-backup.datomic.svc.cluster.local" strikes me as unusual, k8s will load balance that service DNS to whichever pods are ready to accept traffic

jasonjckn 07:09:28

and the one in standby will not accept traffic, so you're guaranteed to be routed to the elected transactor

jasonjckn 07:09:42

glad you got it working though 🙂

Hendrik 08:09:48

Ah ok. Somewhere I read that load balancing could be an issue. So my intention was to have two distinct services (one for the transactor and one for the standby), each with one pod. That way a service could not load balance to a standby transactor pod, because the peer controls which service (the active or standby one) to use. But that did not work and failed with the above-mentioned error. Yeah, I am glad that it is working now with the IP-based solution 🙂

👍 2
jasonjckn 08:09:29

i’m not sure what you read, but i’d be curious to read if you remember the link

jasonjckn 08:09:08

if you actually tried to load balance between two transactors that would be a problem, but with proper readiness probe configuration the load balancer will only send traffic to the elected one
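
The readiness-probe idea could look roughly like the fragment below. It is a sketch under several assumptions: that the transactor's HTTP health check is enabled through the `ping-host` and `ping-port` transactor properties, that 9999 is the chosen ping port, and that only the active transactor reports healthy on that endpoint. None of this was confirmed in the thread, so check how a standby transactor responds before relying on it.

```yaml
# Fragment of the transactor container spec (sketch only).
readinessProbe:
  httpGet:
    path: /health   # assumed health-check path served on ping-port
    port: 9999      # hypothetical value of ping-port
  periodSeconds: 5
  failureThreshold: 2
```

With a probe like this, a single Service selecting both transactor pods would only route to the pod that currently passes the check.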