2024-05-14 xtdb | Clojure Slack Archive

xtdb 2024-05-14

nivekuil 2024-05-14T20:20:37.181649Z

does the indexing strategy of xt2 facilitate speculative transactions like xt1?

refset 2024-05-17T10:42:08.367029Z

makes sense, and is it important for anything in the pipeline to be durable / recoverable in the event of some failure? like do you see this as something that sits within a transaction?

nivekuil 2024-05-17T20:18:06.716589Z

that's an interesting question, I think for perf reasons it would make sense to cache each stage in the pipeline, analogous to building a container image, or more generally building an ultimate state of affairs through immutable layers. ostree does this too
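A minimal sketch of the per-stage caching idea above, under stated assumptions: plain Python dicts stand in for database states, each stage is a pure function from state to state, and a stage's name is assumed to identify its behaviour (rename the stage when its logic changes). The names `layer_key`, `state_key` and `run_pipeline` are hypothetical and not part of any XTDB API. As with container image layers, each layer's key is derived from its parent's key, so changing one stage invalidates only that layer and the ones after it.

```python
import hashlib
import json


def state_key(state):
    """Content hash of a state (a dict of doc-id -> doc)."""
    return hashlib.sha256(json.dumps(state, sort_keys=True).encode()).hexdigest()


def layer_key(parent_key, stage_name):
    """Derive a layer's cache key from its parent layer's key and the stage name."""
    return hashlib.sha256(f"{parent_key}:{stage_name}".encode()).hexdigest()


def run_pipeline(initial_state, stages, cache):
    """Run (name, fn) stages in order, reusing cached layers where possible.

    Returns the final state and the names of the stages actually executed.
    """
    key = state_key(initial_state)
    state = initial_state
    executed = []
    for name, fn in stages:
        key = layer_key(key, name)
        if key in cache:
            state = cache[key]      # cache hit: reuse the stored layer
        else:
            state = fn(state)       # stages must be pure and return a new dict
            cache[key] = state
            executed.append(name)
    return state, executed


# Two illustrative stages (hypothetical domain logic).
def add_totals(state):
    return {k: {**doc, "total": doc["qty"] * doc["price"]} for k, doc in state.items()}


def flag_large(state):
    return {k: {**doc, "large": doc["total"] > 100} for k, doc in state.items()}
```

Running the same pipeline twice against one `cache` dict executes no stages the second time; replacing only the last stage re-executes only that stage.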

nivekuil 2024-05-17T20:21:03.351879Z

at some point we're thinking about a dataflow graph but more opinionated: passing around a reference to a single central state instead of ad-hoc arguments. In contrast to UI dataflow systems we don't get reactivity but we don't want it, the memory cost would be astronomical and just querying indexes should be good enough latency. Indexes are like "virtual" (to borrow a term from Deleuze) dataflows, real but not yet actualized in memory

nivekuil 2024-05-17T20:26:10.848789Z

this dataflow is unidirectional, like react, and the rendering is inspired by fulcro's idea of rendering as a function of a projection from a single graph db. but here rendering is more general than just writing HTML to a page, and the queries are (much) more powerful than EQL

nivekuil 2024-05-18T01:56:27.490039Z

but I think you're concerned with the design of tx-fns in xt2, right? unless you really need to use the tx log's serialization power it seems better to do the work asynchronously using another table as the "staging branch" and then merge the docs back into the canonical table

nivekuil 2024-05-18T02:01:20.007009Z

can't think of a need for this to be done synchronously given bitemp makes the semantics deterministic
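A rough sketch of the "staging branch" approach described above, assuming plain Python dicts stand in for tables (this is not XTDB's API, and `run_in_staging` and `merge_back` are hypothetical names). The pipeline runs against a copy, every intermediate state stays queryable, and only docs whose final value differs are written back, so a doc that ends up unchanged never touches the canonical table.

```python
import copy


def run_in_staging(canonical, stages):
    """Copy the canonical table into a staging table and run every stage on the copy."""
    staging = copy.deepcopy(canonical)
    snapshots = []
    for stage in stages:
        staging = stage(staging)
        snapshots.append(copy.deepcopy(staging))  # queryable intermediate state
    return staging, snapshots


def merge_back(canonical, staging):
    """Write back only docs whose final value differs; return the ids written."""
    changed = []
    for doc_id, doc in staging.items():
        if canonical.get(doc_id) != doc:
            canonical[doc_id] = doc
            changed.append(doc_id)
    # Docs removed by the pipeline are removed from the canonical table too.
    for doc_id in [d for d in canonical if d not in staging]:
        del canonical[doc_id]
        changed.append(doc_id)
    return sorted(changed)


# Illustrative stages: every doc passes through "pending" before being published.
def mark_pending(table):
    return {k: {**d, "status": "pending"} for k, d in table.items()}


def publish(table):
    return {k: {**d, "status": "published"} for k, d in table.items()}


canonical = {
    "a": {"title": "A", "status": "published"},  # ends the pipeline unchanged
    "b": {"title": "B", "status": "new"},        # ends the pipeline changed
}
staging, snapshots = run_in_staging(canonical, [mark_pending, publish])
changed_ids = merge_back(canonical, staging)
```

Here doc `a` is modified in the intermediate state but is not rewritten in the canonical table, so a renderer querying "changed since last run" would see only `b`.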

jarohen 2024-05-15T08:40:00.720769Z

yep - it's a very similar idea to XT1 in that respect, we'd add the speculative tx as another source of log events to be taken into account

refset 2024-05-15T09:09:45.285909Z

what's the use case here? e.g. is this for testing? or for preparing transactions ahead of submission (like a dry-run to save needless work on the hot thread)? or for running some domain-level speculative queries (https://docs.xtdb.com/tutorials/financial-usecase/time-in-finance.html#8-what-if-time-travelling-sources-of-truth)?

nivekuil 2024-05-15T09:33:31.956039Z

queryable intermediate states in a transform (db->db) pipeline, that doesn't thrash the canonical vt timeline

nivekuil 2024-05-15T09:40:47.924269Z

so docs that are unchanged at the end of that pipeline should not affect anything, otherwise an ultimately unchanged doc will be thrashed by the intermediate states and show up as new. At the end of this transform pipeline, renderers will perform cross-time queries, starting from a last changed time, and shouldn't render docs that were unchanged as an optimization. can demo the exact use case if desired

🙏 1

nivekuil 2024-05-15T09:41:37.322859Z

not sure what the best solution is considering further requirements, but speculative tx came to mind as an easy way out

refset 2024-05-20T14:56:40.718859Z

> the queries are (much) more powerful than EQL

sounds very interesting 🙂 ...I'm guessing this is totally unrelated to Catablog 😅

> you're concerned with the design of tx-fns in xt2, right?

essentially yes

> unless you really need to use the tx log's serialization power it seems better to do the work asynchronously using another table as the "staging branch" and then merge the docs back into the canonical table

based on your descriptions I would agree with this, thanks for the back-and-forth 🙏

nivekuil 2024-05-20T20:18:59.548219Z

oh, that's gone in a much more ambitious direction as I better understand the problem.. this is like the 6th evolution of hatchery, and it's converging on juxt/site I think. I finally understand the paradigm I'd been reaching for to be powerful enough not only to manage infrastructure but also be a static site generator and program microcontrollers, and other use cases I'm still looking for. The LOC keeps getting smaller so I must be on the right track 🙂

refset 2024-05-16T13:57:08.268619Z

> queryable intermediate states in a transform (db->db) pipeline

did you see the Datomic Jepsen report & discussion? 🙂 (https://news.ycombinator.com/item?id=40373311) The design space here is interesting... do you need to support concurrent writers during the pipeline or can you afford to stall other writers? How long might a pipeline take to execute?

nivekuil 2024-05-16T20:13:00.959239Z

I glanced at it, didn't see anything too surprising (i.e. transactions are declarative, not imperative?) I think using the transaction log for the transform pipeline is wrong, because it implies that there's a canonical state of the db when that's really a concretion. Ideally it's more like git where "master" branch is just another branch that clients happen to treat as canonical

👍 1

nivekuil 2024-05-15T03:32:35.328619Z

maybe this would be ~equivalent to cheaply merging documents between tables, as branches could just be implemented as tables
