👋 Hello everyone, I have been working on a state management library https://github.com/jumraiya/wizard which works alongside datascript. Would love to have folks try it out and provide any feedback.
A datom is a fundamental data-model building block, i.e., a fact. And yes, a datom cannot be added "twice", so from that perspective, weights seem unnecessary. However, weights are essential for performing most operations within the DBSP computation model: joins, aggregate queries (which Datomic Datalog supports), etc. For example, an aggregate query like:
'[:find (count ?e)
:where [?e :user/id]]
... can use zsets (with weights) to represent the intermediary state of the count computation. Overall, in my experience so far, zsets are a great fit for datoms and the Datomic model in general.
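To make the count example concrete, here is a minimal sketch in Python (picked only for brevity; this is not the zsxf or Datomic API). It assumes a z-set is a plain mapping from item to integer weight, that `:user/id` is cardinality-one, and that datoms are simple `(entity, attribute, value)` tuples. The helper names `zset_add`, `project`, and `count_delta` are hypothetical.

```python
def zset_add(a, b):
    """Add two z-sets (item -> integer weight), dropping zero-weight items."""
    out = dict(a)
    for item, w in b.items():
        out[item] = out.get(item, 0) + w
    return {item: w for item, w in out.items() if w != 0}


def project(zset, f):
    """Map each item through f, summing the weights of items that collide."""
    out = {}
    for item, w in zset.items():
        key = f(item)
        out[key] = out.get(key, 0) + w
    return {key: w for key, w in out.items() if w != 0}


def count_delta(delta):
    """Change in (count ?e) implied by a delta z-set of :user/id datoms."""
    return sum(delta.values())


# Weight +1 asserts a datom, weight -1 retracts it.
tx1 = {(1, ":user/id", "a"): 1, (2, ":user/id", "b"): 1}
tx2 = {(2, ":user/id", "b"): -1, (3, ":user/id", "c"): 1, (4, ":user/id", "d"): 1}

db = {}
count = 0
for tx in (tx1, tx2):
    db = zset_add(db, tx)
    count += count_delta(tx)  # incremental: only the delta is examined

# Projecting away entity and value collapses every datom onto one key,
# so the intermediate z-set carries a weight above 1.
by_attr = project(db, lambda datom: datom[1])
```

Each datom in `db` still has weight 1, but the projected z-set `by_attr` maps `":user/id"` to the running count. That is one place where weights above 1 show up even though the underlying facts are unique.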
@taylor.jeremydavid thanks for the shoutout! It's a heavy work-in-progress at the moment; the README is generally up to date on what's currently working. The one thing that can perhaps work standalone is the zset implementation at https://github.com/saberstack/zsxf/blob/8e80ac2dffdba20802a5e5b4b8fd5a32f5d31883/src/org/zsxf/type/zset.cljc, which has a few basic generative tests around it. I'm working on dramatically improving the efficiency of the intermediary representation (hopefully by an order of magnitude when multiple queries are declared to be materialized/maintained live). Currently, every query is a brand new world, which works but is not ideal in terms of a) total memory use and b) speed of initial materialization of each query.
This looks great! Just making sure you're also aware of the team working on https://github.com/saberstack/zsxf - glancing briefly at your tests you may even have already got more things working. Cc @raspasov
I am! We talked briefly at the Conj.
Hey @jumraiya and @raspasov, @taylor.jeremydavid just mentioned this thread to me. I am also working on an incremental Datalog engine (albeit mostly JVM based) to explore some ideas. I am wondering if we should create an extra channel to exchange ideas and concepts, maybe ask questions about DBSP, etc., as it can be quite a rabbit hole. What do you think? #incremental-datalog? #dbsp-datalog?
I'd use incremental instead of dbsp to be more generic, as dbsp is just one technique. Hyperfiddle is also in that game with v3 @dustingetz although it's not datalog to be fair
I think Datalog, SQL, etc are just “front end”, in a way. Datalog/SQL/etc -> Compiler -> DBSP circuits (aka Clojure code, or other implementation). The most general concept in that context is probably “incremental computation” without specifying languages, models, techniques, etc.
Yes, DBSP is a specific model of incremental computation, and as far as I know it’s the only one that has a proof. It effectively allows fully Turing-complete incremental computation (which is both a blessing and a curse, depending on the perspective). I think a #dbsp channel is worth having. I wouldn’t sub-specify it to dbsp-datalog, unless there’s clear demand for that granularity.
Created #dbsp.
Fascinating. Is there an underlying intent like data sync between client and server? Or is it just experimentation for now? One thing that's confusing me in DBSP is this notion of z-sets, and how the weight allows for capturing deltas. Coming from my understanding of datoms, I imagine they are like all datoms relating to one entity concatenated into a table row with an added weight column. But then I can't make sense of what a value above 1 or below -1 means, since it would mean inserting or deleting the same row multiple times.
I won't claim to fully understand the theory behind the DBSP paper; some of the stuff flew over my head. But my interpretation of z-sets is that they are meant to represent any database changeset, or "bags" in general. You can insert rows containing the same values into a SQL table multiple times (usually we have some uniqueness constraint, but it's not a given). You're right about it not making sense for datoms, because they are assumed to be unique per entity; that's why I chose not to have numeric weights in my library. Instead, they're booleans indicating an assertion/retraction. If there are multiple identical tuples in a z-set, they are simply deduped, like Datomic does for datoms in a transaction. There are multiple use cases I am thinking of; one is a data sync between client and server like you said. The use case I am most interested in is writing business logic triggered by deltas from materialized views. I wrote a post about it: https://jumraiya.github.io/posts/datalog.html. By "experimental" I meant it's very far from ready for production use and will probably change significantly.
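As a rough illustration of the boolean-flag idea described above, here is a small Python sketch. It is not taken from the wizard codebase. It assumes a delta is a list of `(datom, added)` pairs, and the function name `apply_delta` is hypothetical. The sketch also makes one arbitrary choice: if the same datom is both asserted and retracted in one delta, the assertion wins.

```python
def apply_delta(facts, delta):
    """Apply (datom, added) pairs to a set of datoms.

    Identical pairs collapse because the delta is first turned into a set,
    so asserting the same datom twice has the same effect as asserting it once.
    """
    unique = set(delta)
    retracted = {datom for datom, added in unique if not added}
    asserted = {datom for datom, added in unique if added}
    # Assumption of this sketch: assertions are applied after retractions.
    return (set(facts) - retracted) | asserted


facts = {(1, ":user/id", "a")}
delta = [
    ((2, ":user/id", "b"), True),   # assertion
    ((2, ":user/id", "b"), True),   # duplicate assertion, deduped
    ((1, ":user/id", "a"), False),  # retraction
]
new_facts = apply_delta(facts, delta)
```

With set semantics there is no multiplicity to track, which fits unique datoms. The trade-off is that bag-style counts would have to be represented some other way than through weights.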
Yes, the only explanation I could give was to support tables that are bags without unique constraints.