#xtdb
2023-04-04
rickheere14:04:32

I'm trying to configure the document store to use DigitalOcean Spaces to store the documents, but I get an error and I don't know where to look next.

rickheere14:04:07

I have an Integrant system set up:

(def config
  {:config.xtdb/s3 {:region "us-east-2"
                    :bucket "..."
                    :access-key-id     "..."
                    :secret-access-key "..."
                    :endpoint ""}

   :xtdb.s3/client {:config (ig/ref :config.xtdb/s3)}

   :xtdb-client/document {:s3-config (ig/ref :config.xtdb/s3)
                          :client (ig/ref :xtdb.s3/client)}
   :xtdb-client/client {:document-store (ig/ref :xtdb-client/document)}})

(defmethod ig/init-key :config.xtdb/s3 [_ config]
  config)

(defmethod ig/init-key :xtdb.s3/client [_ {:keys [config]}]
  (let [builder (S3AsyncClient/builder)
        creds (StaticCredentialsProvider/create
               (AwsBasicCredentials/create
                (:access-key-id config)
                (:secret-access-key config)))

        _ (.credentialsProvider builder creds)
        _ (.region builder (Region/US_EAST_1))
        _ (.endpointOverride builder (URI/create ""))]
    (.build builder)))

(defmethod ig/init-key :xtdb-client/document [_ {:keys [s3-config client]}] 
  {:xtdb/module 'xtdb.s3/->document-store
   :bucket (:bucket s3-config)
   :configurator (.makeClient (reify S3Configurator
                                (makeClient [_] client)))})

(defmethod ig/init-key :xtdb-client/client [_ {:keys [document-store]}]
  (xt/start-node {:xtdb/document-store document-store}))
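
(For context, the snippet above presumably relies on requires/imports along these lines; a sketch, with a hypothetical namespace name:)

(ns my.app.system ;; hypothetical namespace
  (:require [integrant.core :as ig]
            [xtdb.api :as xt])
  (:import (java.net URI)
           (software.amazon.awssdk.auth.credentials AwsBasicCredentials StaticCredentialsProvider)
           (software.amazon.awssdk.regions Region)
           (software.amazon.awssdk.services.s3 S3AsyncClient)
           (xtdb.s3 S3Configurator)))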

rickheere14:04:02

I now get an exception

; Execution error (IllegalArgumentException) at xtdb.error/illegal-arg (error.clj:12).
; Unexpected config option #object[software.amazon.awssdk.services.s3.DefaultS3AsyncClient 0x451fe1aa "software.amazon.awssdk.services.s3.DefaultS3AsyncClient@451fe1aa"]

rickheere14:04:52

I don't know why it's angry. Where should I look to get further?

refset16:04:40

Hey @U4XT72NNT I'm not 100% sure what the specific error is here, but it looks like you are missing the configurator 'module' (i.e. don't wire up the client directly). You should either use our S3Configurator, as per https://github.com/xtdb/xtdb/blob/master/modules/s3/test/xtdb/s3_test.clj, or define your own configurator module, e.g. https://gist.github.com/refset/52e61db1cff3c8df3aa057e3625e47f2 (old but hopefully useful)
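
For reference, a plain (non-Integrant) node config for the S3 document store looks roughly like this; a sketch where the bucket name and my.app/->configurator are placeholders:

(xt/start-node
 {:xtdb/document-store
  {:xtdb/module 'xtdb.s3/->document-store
   :bucket "my-bucket"
   ;; the module system resolves this reference and calls makeClient on your behalf
   :configurator {:xtdb/module 'my.app/->configurator}}})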

refset16:04:33

I would be curious about the rest of the stack trace for that error also

rickheere16:04:26

Here is the whole thing I got; I didn't post it before because it didn't look that useful to me.

; Execution error (IllegalArgumentException) at xtdb.error/illegal-arg (error.clj:12).
; Unexpected config option #object[software.amazon.awssdk.services.s3.DefaultS3AsyncClient 0x59da7dd5 "software.amazon.awssdk.services.s3.DefaultS3AsyncClient@59da7dd5"]
clojure.lang.Compiler$InvokeExpr/eval (Compiler.java:3719)
clojure.lang.Compiler$DefExpr/eval (Compiler.java:457)
clojure.lang.Compiler/eval (Compiler.java:7199)
clojure.core/eval (core.clj:3215)
clojure.core/eval (core.clj:3211)
nrepl.middleware.interruptible-eval/evaluate (interruptible_eval.clj:87)
clojure.core/apply (core.clj:667)
clojure.core/with-bindings* (core.clj:1990)
nrepl.middleware.interruptible-eval/evaluate (interruptible_eval.clj:87)
clojure.main/repl (main.clj:437)
clojure.main/repl (main.clj:458)
clojure.main/repl (main.clj:368)
nrepl.middleware.interruptible-eval/evaluate (interruptible_eval.clj:84)
nrepl.middleware.interruptible-eval/evaluate (interruptible_eval.clj:56)
nrepl.middleware.interruptible-eval/interruptible-eval (interruptible_eval.clj:152)
nrepl.middleware.session/session-exec (session.clj:218)
nrepl.middleware.session/session-exec (session.clj:217)
java.lang.Thread/run (Thread.java:829)

👍 2
rickheere16:04:04

About the configurator: can you tell me what I did wrong? I thought I was using it:

....
(defmethod ig/init-key :xtdb-client/document [_ {:keys [s3-config client]}] 
  {:xtdb/module 'xtdb.s3/->document-store
   :bucket (:bucket s3-config)
   :configurator (.makeClient (reify S3Configurator
                                (makeClient [_] client)))})

refset16:04:23

ah sorry, I didn't spot that you had the reify S3Configurator in there already. I think it needs to be more explicit, as a separate def + symbol like in the test I linked, so that the XT module system can call makeClient later on your behalf (instead of you calling it yourself):

(defn ->configurator [{:keys [client]}]
  (reify S3Configurator
    (makeClient [_] client)))

(defmethod ig/init-key :xtdb-client/document [_ {:keys [s3-config client]}] 
  {:xtdb/module 'xtdb.s3/->document-store
   :bucket (:bucket s3-config)
   :configurator {:xtdb/module '->configurator, :client client}})

rickheere16:04:00

And it worked!

rickheere16:04:04

Thank you so much!

woop-woop 2
rickheere16:04:38

- :xtdb/module '->configurator
+ :xtdb/module ->configurator
without the '
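
So the working version presumably ended up roughly like this; a sketch combining the earlier snippet with the unquoted symbol:

(defmethod ig/init-key :xtdb-client/document [_ {:keys [s3-config client]}]
  {:xtdb/module 'xtdb.s3/->document-store
   :bucket (:bucket s3-config)
   ;; pass the constructor fn itself; the module system calls it with this map
   :configurator {:xtdb/module ->configurator, :client client}})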

refset16:04:42

aha, great! it would probably need the ' if you qualify the symbol - thanks for confirming that all worked 🙂

msuess14:04:25

@taylor.jeremydavid is there going to be a virtual meetup event today?

refset15:04:11

Hey @U0K1KAJTB there was in theory but I decided to postpone it once again on the basis that there were no new sign-ups and I wanted to focus on other things. With that said though, were you keen to discuss anything in particular? We could still run something more casual in ~25m if you're around and have burning questions...? In the meantime I'll be running through and responding to the backlog of other threads in this channel 🙂

Young-il Choo23:04:40

First project using XTDB. The documents and transactions are stored in a Postgres DB. My issue is that I have two processes connected to the same node and they are not seeing the same data. I create an object, then delete it, from one process. The other process still sees the deleted object. I assume I am using the in-memory KV store for the index, since I have not specified anything else. Is there some way to sync? Is there a place where the internal KV store is caching?

tatut04:04:04

xt/sync, but also make sure you are getting a new db snapshot from the node, as any db obtained from the xt/db call is tied to a specific point in time
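
A sketch of the pattern, assuming session-id is the entity being deleted:

(xt/submit-tx node [[::xt/delete session-id]])
(xt/sync node)                      ;; block until the node has indexed the latest tx
(xt/entity (xt/db node) session-id) ;; take a fresh snapshot after syncing => nil
;; reusing a db value captured earlier will still show the pre-delete state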

refset11:04:04

Hi @U01CUL007JP can you share your config map? Have you seen any data appear in both processes correctly already?

Young-il Choo17:04:42

Hi @taylor.jeremydavid here is my config map. Yes, most of the time, when I create an object in one process, the other can see it. It seems to be deletes that are not synced.

(defonce thermo-node
  (xt/start-node
   {:xtdb.jdbc/connection-pool {:dialect {:xtdb/module 'xtdb.jdbc.psql/->dialect}
                                :db-spec {:host ""
                                          :dbname "xtdb"
                                          :user "user-name"
                                          :password "user-password"}}
    :xtdb/tx-log {:xtdb/module 'xtdb.jdbc/->tx-log
                  :connection-pool :xtdb.jdbc/connection-pool}
    :xtdb/document-store {:xtdb/module 'xtdb.jdbc/->document-store
                          :connection-pool :xtdb.jdbc/connection-pool}}))

Young-il Choo17:04:08

Also, I have noticed that as more data is added, when I start the node my JVM CPU usage spikes for many minutes. I have increased the JVM memory size, but is there something else happening that causes the huge CPU spike at startup?

tatut17:04:13

If you don't have local indexes with RocksDB or LMDB, plus checkpoints, then every node must index every tx at start. That will naturally cause a CPU spike.
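
A sketch of adding a RocksDB-backed index store to the node config above (needs the com.xtdb/xtdb-rocksdb dependency; the path is a placeholder):

(require '[clojure.java.io :as io])

;; merged into the existing xt/start-node options map
{:xtdb/index-store {:kv-store {:xtdb/module 'xtdb.rocksdb/->kv-store
                               :db-dir (io/file "/var/lib/xtdb/indexes")}}}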

refset19:04:23

what Tatu said 🙂

refset19:04:27

if data is eventually sync'd then deletes should eventually sync also - are you using transaction functions?

Young-il Choo19:04:31

Here is an example of a delete function.

(defn delete-session
  "Delete cook session"
  [session-str node]
  (if-let [session-id (parse-uuid session-str)]
    (if-let [session-doc (xtdb.api/entity (xt/db node) session-id)]
      (let [user-doc (get-user-doc (:userId session-doc) node)
            ;; remove session-id from the user's collection of sessions
            new-user-doc (update user-doc :sessions disj session-id)
            ;; remove session-id from the favorite, if it has one
            fav-id (:favoriteId session-doc)
            fav-doc (when fav-id (xtdb.api/entity (xt/db node) fav-id))
            new-fav-doc (when fav-doc
                          (update fav-doc :sessions (fn [sessions] (disj sessions session-id))))]
        (try
          (xt/submit-tx node
                        (if new-fav-doc
                          [[::xt/put new-user-doc]
                           [::xt/delete session-id]
                           [::xt/put new-fav-doc]]
                          [[::xt/put new-user-doc]
                           [::xt/delete session-id]]))
          (xt/sync node)
          {:status 200
           :body {:sessionId session-str
                  :message "Cook session deleted"}}
          (catch Exception e
            {:status 400
             :body {:sessionId (.toString session-id)
                    :message (str "Error deleting cook session." (.getMessage e))}})))
      (no-cook-session-found-response session-str))
    (invalid-session-id-response session-str)))

👍 2
Young-il Choo19:04:30

Not using transaction functions, but I am considering them for updating time-series data that is frequently updated with new values. Is this the use case for transaction functions? Does this mean that they do not create a new instance?

refset20:04:32

Transaction functions allow you to make fully consistent reads during writes, but at the cost of stalling writes, because evaluation (including any reads) happens synchronously on the single-threaded ingestion process. Compared with ::xt/match operations, though, you can be much more expressive and avoid contention/races (which might be better for overall throughput with frequently updated data). See also https://xtdb.com/blog/xtdb-transaction-functions/
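
A sketch of the idea for the time-series case, assuming one document per series holding a :values vector (the :append-reading id and argument names are made up):

;; install the transaction function as a document with an :xt/fn key
(xt/submit-tx node
  [[::xt/put
    {:xt/id :append-reading
     :xt/fn '(fn [ctx series-id value]
               (let [db  (xtdb.api/db ctx)
                     doc (or (xtdb.api/entity db series-id)
                             {:xt/id series-id :values []})]
                 [[::xt/put (update doc :values conj value)]]))}]])

;; invoke it; the read of the current doc and the write happen atomically during ingestion
(xt/submit-tx node [[::xt/fn :append-reading :sensor-1 42.0]])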

refset20:04:38

I don't have a good explanation for why you may not be observing deletes correctly currently, but if you can distil it down to a reliable minimal failing example then I would definitely like to help fix it

Young-il Choo17:04:02

It turns out the issue was running out of JVM heap space. It seems the default was too small to build the local index, and there was a failure that went unnoticed, since other queries were succeeding. Thanks for your help.

blob_thumbs_up 2
tatut17:04:32

Good to hear. I would still encourage you to move to RocksDB for local indexing and make checkpoints for it.
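
A sketch of what that could look like, extending a RocksDB kv-store with a filesystem checkpoint store (paths and frequency are placeholders):

(import '(java.time Duration))

{:xtdb/index-store
 {:kv-store {:xtdb/module 'xtdb.rocksdb/->kv-store
             :db-dir (io/file "/var/lib/xtdb/indexes")
             :checkpointer {:xtdb/module 'xtdb.checkpoint/->checkpointer
                            :store {:xtdb/module 'xtdb.checkpoint/->filesystem-checkpoint-store
                                    :path "/var/lib/xtdb/checkpoints"}
                            :approx-frequency (Duration/ofHours 6)}}}}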

💯 2
tatut17:04:13

The indexing time on node start will only increase as your data grows. For example, our production system takes 45+ minutes to index from scratch. Compare that to a 100-second checkpoint restore.

tatut17:04:01

But if you mostly run long-lived servers then I guess you can get quite far with only in-memory indexes. We run ephemeral ECS tasks and could not operate without checkpoints.

Young-il Choo02:04:54

Thanks for the tip on using RocksDB. Will definitely do it for the next release.