Fork me on GitHub
#xtdb
<
2022-11-02
>
macrobartfast11:11:22

is the simple day format #inst "1986-10-24" shown in this xtdb tutorial pseudocode or actually something one would see? https://docs.xtdb.com/language-reference/1.21.0/datalog-queries/#valid-time-travel I need the simplest way to find all documents created on a given day and saw this. An added wrinkly may to consider the user’s time zone, but that can wait.

malcolmsparks11:11:58

It's called an EDN tag literal and is something you would see. I may have misunderstood your question

macrobartfast11:11:42

You didn’t… apologies, posted a partial question by accident and just finished it with an edit. Thanks!

nonrecursive15:11:09

hey y’all, I’ve put together a tool that generates and inserts fixture data for tests: https://github.com/donut-party/datapotato there’s an XTDB integration: https://github.com/donut-party/datapotato/wiki/database-integration#xtdb would love any feedback on whether the docs are clear and whether the way I’m using XTDB makes sense 😛 I’m demoing this for london clojurians next Tuesday but if any of the juxt folks would like a little demo separately that would be fun 🙂

❤️ 5
nonrecursive19:11:30

Sweet thanks, I’ll check those links out!

🙏 1
tatut04:11:16

donuts and potatoes… that’s too many carbs for me 😉 . but jokes aside, this looks neat, thanks, I can definitely see using this for testing

🔥 1
zclj18:11:25

Hi! I am aiming at a small production deployment of a low traffic service using sqlite as the storage for XTDB. However, now doing some system testing, my service is restarted alot. Sometimes it cannot start up again due to the sqlite file being locked ("The database file is locked" in the log). Is this to be expected? Only way out of this I know of is to backup/restore the db-file, but this do not seem great for a production deployment where a crashed service would be auto restarted. Any advice?

refset19:11:44

Hey @U1G8B7ZD3 are you "restarting alot" ~instantaneously? Or is this always manual (i.e. a >10ms+ gap in between .close and start-node)? My first guess would be some sort of shutdown / GC issue preventing the sqlite file from being accessed too soon after a .close :thinking_face: Which version are you using? Can you share your start-node config map?

zclj08:11:19

Hi @U899JBRPF, thanks for your answer. The restart are typically automated between different test runs, but it's not super fast, there are some other steps in that process. So I think we are in the 'seconds' ballpark. Regarding .close, if the service crashes (simulated with killing the process), .close will not be called. Will this leave the file locked? I can share the config later today, but it would be very close to https://docs.xtdb.com/storage/1.22.1/jdbc/ with ''sharing connection pools'' and sqlite as the dialect. I need to understand if this is a risk inherent to this setup (using a local file-based storage), instead of using a dedicated storage server (Postgre for example), or if its my usage that is causing this.

zclj16:11:25

@U899JBRPF Here is the config map

(xt/start-node
   {:xtdb.jdbc/connection-pool
    {:dialect 'xtdb.jdbc.sqlite/->dialect
     :db-spec {:dbname (str (io/file dbdir dbname))}}
    :xtdb/tx-log
     {:xtdb/module     'xtdb.jdbc/->tx-log
      :connection-pool :xtdb.jdbc/connection-pool}
     :xtdb/document-store
       {:xtdb/module     'xtdb.jdbc/->document-store,
                   :connection-pool :xtdb.jdbc/connection-pool}})

zclj16:11:59

versions

[com.xtdb/xtdb-core "1.22.0"]

zclj16:11:27

[com.xtdb/xtdb-jdbc "1.22.0"]
[org.xerial/sqlite-jdbc "3.39.3.0" :scope "provided"]

refset18:11:55

Thanks for the follow up info. > Regarding .close, if the service crashes (simulated with killing the process), .close will not be called. Will this leave the file locked? This might well be the culprit, but I don't know the answer off-hand. It should be easy enough to verify with a ~vanilla setup. I'll take a look tomorrow

zclj07:11:42

@U899JBRPF, an update on this. It seems that this problem is not only a problem in my test scenario or when the service have been killed. I did a simple setup with JMeter, 20 users that GET the index page. When serving the index page I write some data to XT (user-agent etc). After running this simple JMeter config, I get multiple errors due to sqlite file locks.

47535 ERROR a.services.resource.xtdb - #error {
 :cause [SQLITE_BUSY] The database file is locked (database is locked)
 :via
 [{:type org.sqlite.SQLiteException
   :message [SQLITE_BUSY] The database file is locked (database is locked)
   :at [org.sqlite.core.DB newSQLException DB.java 1135]}]
 :trace
 [[org.sqlite.core.DB newSQLException DB.java 1135]
  [org.sqlite.core.DB newSQLException DB.java 1146]
  [org.sqlite.core.DB execute DB.java 941]
  [org.sqlite.jdbc3.JDBC3PreparedStatement lambda$execute$0 JDBC3PreparedStatement.java 56]
  [org.sqlite.jdbc3.JDBC3Statement withConnectionTimeout JDBC3Statement.java 429]
  [org.sqlite.jdbc3.JDBC3PreparedStatement execute JDBC3PreparedStatement.java 51]
  [com.zaxxer.hikari.pool.ProxyPreparedStatement execute ProxyPreparedStatement.java 44]
  [com.zaxxer.hikari.pool.HikariProxyPreparedStatement execute HikariProxyPreparedStatement.java -1]
  [juxt.clojars_mirrors.nextjdbc.v1v2v674.next.jdbc.result_set$stmt__GT_result_set invokeStatic result_set.clj 643]
  [juxt.clojars_mirrors.nextjdbc.v1v2v674.next.jdbc.result_set$stmt__GT_result_set invoke result_set.clj 638]
  [juxt.clojars_mirrors.nextjdbc.v1v2v674.next.jdbc.result_set$fn__17628 invokeStatic result_set.clj 855]
  [juxt.clojars_mirrors.nextjdbc.v1v2v674.next.jdbc.result_set$fn__17628 invoke result_set.clj 831]
  [juxt.clojars_mirrors.nextjdbc.v1v2v674.next.jdbc.protocols$fn__16906$G__16884__16915 invoke protocols.clj 33]
  [juxt.clojars_mirrors.nextjdbc.v1v2v674.next.jdbc$execute_one_BANG_ invokeStatic jdbc.clj 272]
  [juxt.clojars_mirrors.nextjdbc.v1v2v674.next.jdbc$execute_one_BANG_ invoke jdbc.clj 256]
  [xtdb.jdbc$insert_event_BANG_ invokeStatic jdbc.clj 106]
  [xtdb.jdbc$insert_event_BANG_ invoke jdbc.clj 104]
  [xtdb.jdbc.JdbcDocumentStore$fn__19080 invoke jdbc.clj 127]
  [juxt.clojars_mirrors.nextjdbc.v1v2v674.next.jdbc.transaction$transact_STAR_ invokeStatic transaction.clj 72]
  [juxt.clojars_mirrors.nextjdbc.v1v2v674.next.jdbc.transaction$transact_STAR_ invoke transaction.clj 51]
  [juxt.clojars_mirrors.nextjdbc.v1v2v674.next.jdbc.transaction$fn__17769 invokeStatic transaction.clj 136]
  [juxt.clojars_mirrors.nextjdbc.v1v2v674.next.jdbc.transaction$fn__17769 invoke transaction.clj 115]
  [juxt.clojars_mirrors.nextjdbc.v1v2v674.next.jdbc.protocols$fn__16961$G__16956__16970 invoke protocols.clj 57]
  [juxt.clojars_mirrors.nextjdbc.v1v2v674.next.jdbc$transact invokeStatic jdbc.clj 348]
  [juxt.clojars_mirrors.nextjdbc.v1v2v674.next.jdbc$transact invoke jdbc.clj 340]
  [xtdb.jdbc.JdbcDocumentStore submit_docs jdbc.clj 118]
  [xtdb.document_store.CachedDocumentStore submit_docs document_store.clj 54]
  [xtdb.node.XtdbNode submit_tx_async node.clj 220]
  [xtdb.node.XtdbNode submit_tx_async node.clj 206]
  [xtdb.node.XtdbNode submit_tx node.clj 223]

refset13:11:15

Ah, maybe it's a case of submit-tx being called in parallel from multiple threads - I believe a SQLite doc-store can only work with a single-threaded writer pattern. Like it pushes the coordination strategy "up the stack" and means you have to handle back-off-and-retry in userspace / introduce a queue

refset14:11:12

Or are you fairly certain that the submit-tx requests are happening serially (no thread pool etc.) and returning successfully between each call?

zclj07:11:13

OK, the single threaded requirement explain the runtime locks I see. However, looking into the log for the test/crash-scenario, it show that the node can not get back up when this has happened. See it seems to be two different problems. As it seems then the runtime locks in on me to solve, but the one below I don't know what to do about..

2022-11-05 06:53:51,450 [main] INFO  a.services.resource.xtdb - starting XTDB node 
2022-11-05 06:53:51,450 [main] INFO  a.services.resource.xtdb - making node with dialect: :sqlite 
2022-11-05 06:53:52,937 [main] INFO  com.zaxxer.hikari.HikariDataSource - HikariPool-1 - Starting... 
2022-11-05 06:53:52,999 [main] INFO  com.zaxxer.hikari.HikariDataSource - HikariPool-1 - Start completed. 
2022-11-05 06:53:53,009 [main] INFO  xtdb.tx - Started tx-ingester 
2022-11-05 06:53:56,034 [main] ERROR a.services.resource.xtdb - failed to start XTDB node 
org.sqlite.SQLiteException: [SQLITE_BUSY] The database file is locked (database is locked)
	at org.sqlite.core.DB.newSQLException(DB.java:1135)
	at org.sqlite.core.DB.newSQLException(DB.java:1146)
	at org.sqlite.core.DB.throwex(DB.java:1106)
	at org.sqlite.core.DB.exec(DB.java:198)
	at org.sqlite.SQLiteConnection.commit(SQLiteConnection.java:444)
	at com.zaxxer.hikari.pool.ProxyConnection.commit(ProxyConnection.java:387)
	at com.zaxxer.hikari.pool.HikariProxyConnection.commit(HikariProxyConnection.java)
	at juxt.clojars_mirrors.nextjdbc.v1v2v674.next.jdbc.transaction$transact_STAR_.invokeStatic(transaction.clj:80)
	at juxt.clojars_mirrors.nextjdbc.v1v2v674.next.jdbc.transaction$transact_STAR_.invoke(transaction.clj:51)
	at juxt.clojars_mirrors.nextjdbc.v1v2v674.next.jdbc.transaction$fn__17769.invokeStatic(transaction.clj:136)
	at juxt.clojars_mirrors.nextjdbc.v1v2v674.next.jdbc.transaction$fn__17769.invoke(transaction.clj:115)
	at juxt.clojars_mirrors.nextjdbc.v1v2v674.next.jdbc.protocols$fn__16961$G__16956__16970.invoke(protocols.clj:57)
	at juxt.clojars_mirrors.nextjdbc.v1v2v674.next.jdbc$transact.invokeStatic(jdbc.clj:348)
	at juxt.clojars_mirrors.nextjdbc.v1v2v674.next.jdbc$transact.invoke(jdbc.clj:340)
	at xtdb.jdbc.JdbcDocumentStore.submit_docs(jdbc.clj:118)
	at xtdb.document_store.CachedDocumentStore.submit_docs(document_store.clj:54)
	at xtdb.node.XtdbNode.submit_tx_async(node.clj:220)
	at xtdb.node.XtdbNode.submit_tx_async(node.clj:206)
	at xtdb.node.XtdbNode.submit_tx(node.clj:223)

1
zclj07:11:37

Are there other local file options for XT that are know to work better in this scenario?

refset08:11:53

RocksDB is multi-threaded and very battle tested, in comparison

zclj15:11:51

OK, so I will switch to RocksDB for my local storage option and see how that goes. In summary, it seems that I have described two different problems above. The runtime locks need to be solved in the application layer, i.e., by me, due to how Sqlite works. The other problem, where the node can not start up due to file lock, I don't see how I could solve in the application. Since I don't have any hard requirement on Sqlite I can switch. However, the first issue (Sqlite require serialization in the application layer) might be nice to document. I guess it might be out of the scope of XT doc, but otoh it can have a big effect on a user of XT. For the startup issue, I don't know if that is also an inherent risk when running Sqlite, that you get dangling file-locks if your service crash.. Thanks for the discussion, I learned some more about the trade-off of using Sqlite vs. RocksDB.

refset23:11:53

Hey again, thanks for following up. I'll circle back to this in a couple of days - good points though, and I appreciate the feedback 🙏

macrobartfast20:11:26

Any tips for finding documents created on a given day? I’m not sure how much of that rests in code vs how much may be a xtdb baked-in functionality.

refset20:11:14

XT doesn't index across document histories in a first-class way. If you need to track things like that then you would need to reify the information into your data model (potentially with additional entities/documents). This is a good example of the kinds of "cross time" querying we are building https://xtdb.com/core2 to address (with a distinctly more powerful temporal index). e.g. see the examples in https://github.com/xtdb/core2-playground/blob/8c96fc79051182837153d1d2330e7f7533ed455a/bitemporal/snodgrass-99.sql

macrobartfast21:11:20

Thanks! Unpacking your response a bit… what would it mean in practical terms to ‘reify’ the information in my data model. I’m building a toy app (that I’ll actually use) that tracks task completions every day. Some tasks have a certain number of repetitions to get checked off. I have a ‘work’ data model (that is enforced by a Malli schema) that looks like

:work/id :uuid
   :work/task :task/id
   :work/recorded-at inst?
   :work/reps :int
   :work [:map {:closed true}
          [:xt/id :work/id]
          :work/task
          :work/recorded-at
          :work/reps]
and so I’ll be wanting to get all the work documents for a given task each day. Then, in code, once the number of the work documents matches the number that the task requires each day the task will display in the ui as ‘completed’. Currently :work/recoreded-at is the product of (java.util.Date.). Do I perform string manipulation to get the relevant parts of that date for a function that matches it for today? Or maybe it’s time I delve into the clojure/java time fundamentals and there will be something there to use. Apologies for my haplessness.

refset21:11:40

Cool, thanks for the added context! > Do I perform string manipulation That should never be necessary. Given :work/recorded-at is just a java.util.Date then you can essentially treat it like a java.lang.Long and add/subtract milliseconds as needed with + / -. However knowing that alone may not be enough to solve your problem sufficiently, depending on: 1. how many tasks are happening in parallel 2. how long-running the tasks are (avg / max number of work documents) 3. how fast you need the queries to be If the answers are: 1. lots 2. long (lots / ~unbounded) 3. fast ...then the exact recommendation could be worth a video chat, or a longer write-up 🙂 > haplessness I need to use this more 🤌

macrobartfast21:11:58

This is awesome! Thank you. As this is a single user toy app right now I won’t need 1, 2, or 3… but once it gets accidentally mentioned by Taylor Swift as her favorite productivity app I’ll trip over myself trying to get on a video chat for scaling help, without a doubt! I’ll work on your tips here for now…

🙏 1
refset22:11:02

> once it gets accidentally mentioned by Taylor Swift 😄 looking forward to that!

😅 1