This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2023-02-26
having some trouble with big(ish) data: we have some reports that have a large number of rows, and we currently model each report and each row as separate documents, with the report referring to its rows like {:rows [r1 r2 …]}. Trouble is that there will be relatively few reports (in the tens of thousands range) but quite a few rows (100+ million in production), and this model seems troublesome
the thing is that the reports are something we want to query in Datalog, but the individual rows not so much. I'm thinking: should I just batch the rows into a serialized map?
one huge report might have 100k or more rows even, so that is a big thing to even transact in a single tx
you could consider chunking the rows into modelled batches, e.g.
{:xt/id :my-report
 :row-data [:my-report-rows-batch-1
            :my-report-rows-batch-2
            ...]}
{:xt/id :my-report-rows-batch-1
 :rows-outer {:rows [:foo :bar :baz ...]}}
something like this should limit the bloat in the indexes
you could also add a batch-count to regain transactional integrity (to know whether all batches have been written as intended)
so using {:rows [...]} as the value would just serialize them... that is probably ok for my case
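The batching idea above can be sketched in plain Clojure. This is only an illustration under assumptions: `report->docs`, `docs->tx-ops`, and the `:batch-count` key are hypothetical names, not part of XTDB; only the `:xtdb.api/put` operation keyword comes from the XTDB 1.x API.

```clojure
(ns report-batches)

;; Hypothetical helper: turns one report's rows into a report document plus
;; one document per batch, following the shape suggested in the thread.
(defn report->docs
  [report-id rows batch-size]
  (let [batches   (partition-all batch-size rows)
        batch-ids (mapv (fn [i]
                          (keyword (str (name report-id) "-rows-batch-" (inc i))))
                        (range (count batches)))]
    (into [{:xt/id       report-id
            :row-data    batch-ids
            ;; assumed key: lets a reader check that every batch arrived
            :batch-count (count batch-ids)}]
          (map (fn [id batch]
                 {:xt/id      id
                  ;; rows nested inside a map, as in the example above
                  :rows-outer {:rows (vec batch)}})
               batch-ids
               batches))))

;; Wraps each document in a put operation, ready for xtdb.api/submit-tx.
(defn docs->tx-ops
  [docs]
  (mapv (fn [doc] [:xtdb.api/put doc]) docs))
```

The resulting operations could be submitted in one transaction or split across several when a report is very large; in the latter case a reader would compare `:batch-count` with the number of batch documents actually present before trusting the report.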
is it true that xt/listen is only available when "embedding XTDB within the JVM application"
Is it possible to run XTDB in a production configuration and connect to it from the clojure API with xt/listen?
> is it true that xt/listen is only available when "embedding XTDB within the JVM application"
yes, but you can also poll the tx-log endpoint over HTTP https://docs.xtdb.com/clients/http/#tx-log
> Is it possible to run XTDB in a production configuration and connect to it from the clojure API with xt/listen?
yes, simply embedding an XT node (but not the tx-log) is considered a production configuration
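For an embedded node, a listener registration might look like the sketch below. It assumes XTDB 1.x with `com.xtdb/xtdb-core` on the classpath; `watch-indexed-txs` is a hypothetical wrapper name, and the in-memory node in the comment block is only for trying it out.

```clojure
(ns listen-sketch
  (:require [xtdb.api :as xt]))

;; Hypothetical wrapper around xt/listen: calls `f` with an event map each
;; time the embedded node finishes indexing a transaction. The return value
;; is closeable; closing it detaches the listener.
(defn watch-indexed-txs
  [node f]
  (xt/listen node
             {::xt/event-type ::xt/indexed-tx
              :with-tx-ops?   true}
             f))

(comment
  ;; In-memory node, for experimentation only. The callback runs
  ;; asynchronously, so a promise is used to wait for the first event.
  (with-open [node (xt/start-node {})]
    (let [seen (promise)]
      (with-open [_ (watch-indexed-txs node #(deliver seen %))]
        (xt/submit-tx node [[::xt/put {:xt/id :hello}]])
        (deref seen 5000 :timed-out)))))
```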
@U899JBRPF starting from https://github.com/hyperfiddle/electric-xtdb-starter, practically how should my XT configuration change if I want to deploy and scale my application on http://Fly.io? I assume I need a “master” XTDB node to which my scaled-up edge nodes will connect over HTTP and poll /tx-log for changes? Should each edge node then be a carbon copy of master, or something else? Can I run an in-process XT node on each edge deployment and let XTDB worry about getting and staying up to speed on master? Thanks! (may I suggest considering adding a section to XT docs for “Deployment”, “Production” and/or “Replication”)
Hey @U051SPP9Z XT nodes are essentially all ~identical and deterministic replicas (with eventually consistent processing of the tx-log). To scale up you will need to use a remote tx-log and doc-store for >1 XTDB node to connect to (e.g. Kafka or a JDBC backend) - currently it is configured to use an embedded KV store which can't be accessed by more than 1 node https://github.com/hyperfiddle/electric-xtdb-starter/blob/ffe3ed23cc51e7dd7a001263b52d52a6fd00738a/src/user.clj#L16-L17 I haven't looked at the specifics of http://Fly.io scaling or deeply thought about whether/how (distributed) Electric intersects with clustering of XTDB nodes, but hopefully that information gives you some intuition.
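As a rough illustration of the remote tx-log and doc-store setup described above, a JDBC-backed configuration for XTDB 1.x might look like the following. This is a sketch under assumptions, not the starter's actual config: the Postgres host, database name, credentials, and index directory are placeholders, and it presumes the `xtdb-jdbc` and `xtdb-rocksdb` modules plus a PostgreSQL driver are on the classpath.

```clojure
(ns shared-log-config
  (:require [clojure.java.io :as io]))

;; Every application instance would start its own node with this map.
;; The tx-log and document store are shared through Postgres, while each
;; instance keeps a private local index store.
(defn node-config
  [local-index-dir]
  {:xtdb.jdbc/connection-pool
   {:dialect {:xtdb/module 'xtdb.jdbc.psql/->dialect}
    ;; placeholder connection details
    :db-spec {:host     "db.example.internal"
              :dbname   "xtdb"
              :user     "xtdb"
              :password "change-me"}}

   :xtdb/tx-log
   {:xtdb/module     'xtdb.jdbc/->tx-log
    :connection-pool :xtdb.jdbc/connection-pool}

   :xtdb/document-store
   {:xtdb/module     'xtdb.jdbc/->document-store
    :connection-pool :xtdb.jdbc/connection-pool}

   ;; local to this instance, never shared between nodes
   :xtdb/index-store
   {:kv-store {:xtdb/module 'xtdb.rocksdb/->kv-store
               :db-dir      (io/file local-index-dir)}}})

(comment
  ;; (xtdb.api/start-node (node-config "/data/xtdb-indexes"))
  )
```

With a layout like this, each instance replays the shared tx-log into its own indexes, which matches the "identical and deterministic replicas" description.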
resurrecting this thread, I'm using the electric-xtdb-starter with embedded KV store and managed to compile and deploy it to http://fly.io. I'm seeing the following runtime error:
[info] ERROR hyperfiddle.electric: #error {
[info] :cause xtdb.api.PXtdb
[info] :via
[info] [{:type java.lang.NoClassDefFoundError
[info] :message xtdb/api/PXtdb
[info] :at [app.xtdb_contrib$latest_db_GT_ invokeStatic xtdb_contrib.clj 12]}
[info] {:type java.lang.ClassNotFoundException
[info] :message xtdb.api.PXtdb
[info] :at [jdk.internal.loader.BuiltinClassLoader loadClass BuiltinClassLoader.java 581]}]
any hints as to what PXtdb does and how I might resolve this error?
here's the line causing the exception: https://github.com/hyperfiddle/electric-xtdb-starter/blob/master/src/app/xtdb_contrib.clj#L12C13-L12C13
Hi @U066TMAKS - is the !xtdb var definitely bound? have you observed that XT's start-node API ever gets called following https://github.com/hyperfiddle/electric-xtdb-starter/blob/1bfc0255997ab6ace19b45536c70946500d48567/src/user.clj#L31 ?
yes - I extracted a minimal example here: https://github.com/yayitswei/electric-xtdb-starter-fly-io. start-node gets called here: https://github.com/yayitswei/electric-xtdb-starter-fly-io/blob/master/src/app/db.clj#L10
trying to compile and run the uberjar locally, I'm actually getting the same java.lang.NoClassDefFoundError but for a different class, which makes me think it's an uberjar build issue. I found some old threads pointing to AOT as a possible culprit, but I don't think we're using AOT compilation here.
WARN org.eclipse.jetty.websocket.common.WebSocketSession: Exception while notifying onClose
java.lang.NoClassDefFoundError: clojure/tools/logging/impl/LoggerFactory
at hyperfiddle.electric_jetty_adapter$electric_ws_adapter$on_close__14649.invoke(electric_jetty_adapter.clj:68)
at ring.adapter.jetty9.websocket$proxy_ws_adapter$fn__14515.invoke(websocket.clj:159)
at ring.adapter.jetty9.websocket.proxy$org.eclipse.jetty.websocket.api.WebSocketAdapter$WebSocketPingPongListener$12d400b6.onWebSocketClose(Unknown Source)
at org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onClose(JettyListenerEventDriver.java:149)
at org.eclipse.jetty.websocket.common.WebSocketSession.callApplicationOnClose(WebSocketSession.java:394)
at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.close(AbstractWebSocketConnection.java:225)
at org.eclipse.jetty.websocket.common.WebSocketSession.close(WebSocketSession.java:130)
at org.eclipse.jetty.websocket.common.events.AbstractEventDriver.openSession(AbstractEventDriver.java:221)
at org.eclipse.jetty.websocket.common.WebSocketSession.open(WebSocketSession.java:493)
at org.eclipse.jetty.websocket.common.WebSocketSession.onOpened(WebSocketSession.java:459)
at org.eclipse.jetty.io.AbstractConnection.onOpened(AbstractConnection.java:213)
at org.eclipse.jetty.io.AbstractConnection.onOpen(AbstractConnection.java:205)
at org.eclipse.jetty.io.AbstractEndPoint.upgrade(AbstractEndPoint.java:444)
at org.eclipse.jetty.server.HttpConnection.onCompleted(HttpConnection.java:401)
at org.eclipse.jetty.server.HttpChannel.onCompleted(HttpChannel.java:820)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:368)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:279)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:135)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:882)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1036)
at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.lang.ClassNotFoundException: clojure.tools.logging.impl.LoggerFactory
... 27 common frames omitted
it’s a subtle clojure issue that the electric xtdb starter doesn’t go out of its way to defend against
as to whether it is technically an XT issue, that might be debatable. tbh I don’t understand the issue
@U899JBRPF How does this change (if it does) with XTDB v2?
Hi 🙂 the v2 integration should probably look quite different - I would start afresh, looking at the current state of the art with electric<->Postgres integration and try to treat v2 similarly
Is this a typo in the docs? https://docs.xtdb.com/language-reference/datalog-transactions/#transaction-time The tx time is provided as the third arg of submit-tx
Sorry I just checked the changelog and it's a new feature in 1.21 apparently. I'm still using 1.20 in this session. The docs don't seem to reflect 100% that this feature is part of the signatures (see individual transactions such as put etc.) so I got confused.
I'll have a look at how to make this clearer. I guess having some sort of "New feature since 1.x" label in the docs somewhere would be a useful indicator too :thinking_face:
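For readers on 1.21 or later, the third argument discussed above is an options map. The sketch below assumes XTDB 1.21+ with `xtdb-core` on the classpath; `submit-at` is a hypothetical helper name. Per the linked docs section, an overridden tx-time has to fit between the latest existing transaction and the tx-log's clock, otherwise the transaction is not committed.

```clojure
(ns tx-time-sketch
  (:require [xtdb.api :as xt]))

;; Hypothetical helper: submits `tx-ops` with an explicit transaction time
;; instead of letting the tx-log assign the current time.
(defn submit-at
  [node tx-ops tx-time]
  (xt/submit-tx node tx-ops {::xt/tx-time tx-time}))

(comment
  ;; In-memory node, for experimentation only.
  (with-open [node (xt/start-node {})]
    (let [tx (submit-at node
                        [[::xt/put {:xt/id :foo}]]
                        #inst "2020-01-01")]
      (xt/await-tx node tx)
      (xt/entity (xt/db node) :foo))))
```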
> Were you hoping to use it?
I'll definitely upgrade to the newest version. The project I'm working on is basically in an experimentation stage. The reason I'm looking at XTDB (and Clojure) is because I wrote (and maintain) an application 2y ago with SQL and a conventional web framework that has a reporting system and a bitemporal data model. It works well and my clients are happy. I've been personally very happy with having a queryable audit trail of all the transactions. But implementing it was painful. And the web framework didn't help me at all outside of very trivial baseline stuff.
I'm currently trying to figure out how to have a "transient" database in front of XTDB (for drafts/work-in-progress documents that update very quickly and frequently), so I was trying out stuff with valid time/tx time.
Ah that's all great to hear ☺️ Out of interest, do you make use of future / proactive valid-time operations (either currently or in your previous app)?
The 'transient' database aspect sounds quite intriguing. I guess in addition to performance concerns you're also hoping to avoid muddying the main database that contains agreed documents with quickly-irrelevant drafts. If you think there are ways XT could make this setup easier (cross-db joins, perhaps?) we would be keen to hear your thoughts :)
in theory yes in the previous app. My client doesn't use it though 🙂 in the current prototype I likely actually need it though. I'm thinking of leveraging this in order to create a publishing workflow. But I'm not quite sure yet whether that's a good idea or not!
> The 'transient' database aspect sounds quite intriguing. I guess in addition to performance concerns you're also hoping to avoid muddying the main database that contains agreed documents with quickly-irrelevant drafts.
> If you think there are ways XT could make this setup easier (cross-db joins, perhaps?) we would be keen to hear your thoughts 🙂
I'm thinking of applying the as-of-now state to the transient database per document that has a draft, on request. I'm thinking of using with-tx for this, where only a single draft is applied for a given query. I'm not sure whether I really need actual joins over all the drafts (yet) and how I would achieve that properly. That's still in hammock stage.
As for how XT could make this easier: if cross joins were possible it would be cool. But that has all kinds of implications, for example how are identities resolved, or what role will tx/valid time play? For my particular use case it would be at least somewhat clear, but I'm not sure whether that's a general enough case. The most convenient (for me) feature would be to have transient documents that are / can be applied over currently valid ones in terms of both querying and transacting within the same db. But that's very specific to my little project and likely not a concern for XT as a whole 🙂
So...something like with-tx but higher-level, durable, and without inflating the main data :thinking_face: I guess the key bits all exist to make it happen in userland, at least
Yes! Functionality-wise everything is there. The only thing that I'm slightly worried about is filling up the tx log with stuff I don't really care about - for performance/resource reasons. I would have to have a bang (!) version of submit-tx etc.
But that's a purely theoretical worry. It's more important for me to have the same API. I have to experiment more and see what happens to say/ask more useful things!
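The with-tx approach mentioned in this thread, where a single draft is applied speculatively for one query, could be sketched like this. It assumes XTDB 1.x with `xtdb-core` on the classpath; `db-with-draft` is a hypothetical helper name, and where the draft document comes from (the "transient" store) is left open.

```clojure
(ns draft-overlay-sketch
  (:require [xtdb.api :as xt]))

;; Hypothetical helper: returns a speculative db value in which `draft-doc`
;; overrides the stored version of the same :xt/id. Nothing is written to
;; the tx-log, so the main database stays free of drafts.
(defn db-with-draft
  [node draft-doc]
  (xt/with-tx (xt/db node) [[::xt/put draft-doc]]))

(comment
  ;; In-memory node, for experimentation only.
  (with-open [node (xt/start-node {})]
    (xt/await-tx node
                 (xt/submit-tx node [[::xt/put {:xt/id :doc-1 :title "Agreed"}]]))
    (let [draft-db (db-with-draft node {:xt/id :doc-1 :title "Draft"})]
      ;; the speculative db sees the draft, the node's own db does not
      [(xt/entity draft-db :doc-1)
       (xt/entity (xt/db node) :doc-1)])))
```

Queries via `xt/q` work against the speculative db value in the same way as against a regular one, which keeps the API identical for drafts and agreed documents.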