
thanks @rauh @jrychter it was really interesting reading about your use-cases. I tend to use DS more as @rauh described: storing entities & navigating between them


@rauh I wonder if there’s something that can make DS better for this use-case (apart from throwing away queries). Is there any potential for big speed gains if we re-purpose DS for this use-case only?


@jrychter did you find Clojure maps to be significantly faster than DS? I envisioned that even if you don’t use the rich DS capabilities, it can still be a nice place to store all your data anyway. Is the performance difference the only thing that forced you to move part of your data out of DS?


I actually spotted a mistake in what I wrote: "I found read performance of DS to be *superb*".


So, I've been really liking the automatic re-render and automatic "dependency" recording when accessing attributes of entities. One thing that's slightly problematic: forcing a re-render of only one component that has DS entities as args will not refresh those entities. I have a mixin that automatically updates the entities, but it's not super nice. I kind of want a "mutable entity" that always stays up to date with the DS connection, is snapshottable, and is then attached to a single DB.


I haven't measured the performance. I optimized -attrs-by to do a faster JS map lookup, and cmp-datoms, for CLJS only.


E.g. I'm doing (inside datascript.db, where the combine-cmp macro is in scope):

(defn cmp-datoms-eavt-quick
  [d1 d2]
  (combine-cmp
    (- (.-e d1) (.-e d2))
    (js* "~{} > ~{} ? 1 : ~{} < ~{} ? -1 : 0"
         (.. d1 -a -fqn) (.. d2 -a -fqn) (.. d1 -a -fqn) (.. d2 -a -fqn))
    (compare (.-v d1) (.-v d2))
    (- (.-tx d1) (.-tx d2))))


Obviously hacky, but it works well.


yeah, I have no intuition for how much slower entities are than maps: 2×, less than that, or 10×. Would be interesting to measure


I'd add "datoms are perfect for client-server sync", I'd keep datascript in worker/thread just for that


@rauh are you comparing strings through JS? Is it significantly faster?


hah, keywords are actually compared differently, I believe


@tonsky Yeah, it's faster, but I haven't measured by how much. I'd have to set up DataScript again and measure the changes. The .fqn hack also means that it's a different sort order than vanilla CLJS.


first namespaces are compared and then actual names
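To illustrate how the two orders can diverge, here is a small sketch (the concrete keywords are just illustrative); in CLJS keyword comparison an unqualified keyword sorts before any namespaced one, while a raw `.-fqn` string comparison does not:

```clojure
;; CLJS compares keywords namespace-first: a keyword without a
;; namespace sorts before any keyword that has one.
(compare :z :a/a)                   ;; negative — :z sorts first

;; Comparing the fully-qualified name strings directly flips that:
(compare (.-fqn :z) (.-fqn :a/a))   ;; positive — "z" > "a/a"
```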


yes I can see that


Yeah I'm aware. But not an issue for the index order.


Also, I only use namespaced keywords, so actually not a difference for me.


well I guess sorting order doesn’t really matter as long as we use the same algorithm during binary search


Yeah, exactly.


I also strip out a few validate-.... calls in production. Certainly doesn't harm performance


back to your idea about “live” entities. You actually track during component render all attribute access, right? And then set up listeners that trigger re-render when those attributes change? Is that what you do?


Yeah, I track every attribute access and store it in a dynamic var; I actually got the idea from your DS examples.


Then, similar to your examples, I listen on the DS connection and force rerender every component.


should-update is static to #(-> false), and a mixin inspects the args and calls a (updated-entity ent) on each entity before re-rendering
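A minimal sketch of that access-tracking idea (all names here are hypothetical, not the actual mixin):

```clojure
;; Dynamic var holding the set of [eid attr] pairs touched during
;; the current render (a sketch of the approach described above).
(def ^:dynamic *accessed* nil)

(defn tracked-get
  "Looks up attr on entity, recording the access if tracking is on."
  [entity attr]
  (when *accessed*
    (vswap! *accessed* conj [(:db/id entity) attr]))
  (get entity attr))

(defn render-tracked
  "Runs render-fn while collecting accessed [eid attr] pairs, so the
   component can be subscribed to changes of exactly those datoms."
  [render-fn]
  (binding [*accessed* (volatile! #{})]
    {:result (render-fn)
     :deps   @*accessed*}))
```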


My queries are like e-by-av, e-by-a, es-by-a, es-by-av which also record the access so I get automatic rerendering of those queries
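Such helpers can be sketched directly on top of the index API (this assumes the attribute is in the AVET index, i.e. `:db/index` or a ref; the names just mirror the ones mentioned above):

```clojure
(require '[datascript.core :as d])

(defn e-by-av
  "First entity id whose attribute a has value v."
  [db a v]
  (:e (first (d/datoms db :avet a v))))

(defn es-by-av
  "All entity ids whose attribute a has value v."
  [db a v]
  (map :e (d/datoms db :avet a v)))
```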


I modeled my data so that all of it can be fetched very quickly and easily with such queries; only rarely (twice in my app) do I filter on a second attribute of the result.


It's quite similar to Precept, which also only re-renders the components that need to be re-rendered.


I think entity could be sped up: store an Iter in the entity that represents all datoms for the given entity, then add a seek to btset that can seek on an Iter. Would that be a good idea @tonsky?


I was actually thinking about that


like arraymap works


most entities are small and a linear seek might be more efficient


they have a small number of attributes


Yeah, if Iter had an efficient ICounted then the cache could just be filled right away if the Iter has, let's say, <= 20 datoms.


So basically an implicit/automatic touch, which isn't done if the entity has very many refs.


actually you wouldn’t need a cache at all


no need to move attributes from one linear collection to another one


Oh right, if a seek on Iter is faster or as fast as CLJS maps, then yeah... no cache.


Another unrelated idea: we often need only ONE single datom, so (first (datoms ...)) is quite common in DS. Could this be implemented in btset?
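For reference, the pattern in question looks like this (today it still builds a full slice/iterator before the first element is taken):

```clojure
(require '[datascript.core :as d])

(def conn (d/create-conn {}))
(d/transact! conn [{:db/id 1 :user/name "Ada"}])

;; take just the first matching datom from the EAVT index,
;; even though d/datoms prepares the whole range
(first (d/datoms @conn :eavt 1))
```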


my only concern is how to figure out whether an entity has a lot of attributes or only a few


I tried, but avoiding calling -rseek in -slice gave funky results.


So is (count some-iter) == right - left ?


sometimes, yes. But if the iter starts and ends on different leaves of the btree it’s not that simple :)


Any thoughts on (first (datoms ...))? IMO that'd be more beneficial, since it's used a bunch in transacting.


still can be calculated relatively fast


how do you see the perf improvement in (first (datoms ...))?


Yeah, I think the count can be made fast; I've been doing a manual count using the chunking of the Iter.


one binary lookup instead of two?


Yeah I tried that but it didn't work. I'm missing something


(defn slice-one
  "BROKEN! DONT USE! Like slice but returns a single Datom."
  [btset x]
  (let [path (-seek btset x)]
    (when-not (neg? path)
      (when-some [keys (keys-for btset path)]
        (aget keys (path-get path 0))))))


yeah, I won’t be able to look into it any time soon, but you can leave a note in the issue tracker for such an API


Ok I'll create some. Do you also want: 1. Count on Iter. 2. Optimize entity?


yeah, put it there so I won’t forget


@tonsky Yes, Clojure maps were significantly faster. And performance was the only reason why I moved data out. Now I have to maintain another map ("table") and manage changes (store incoming data, process updates, delete unnecessary data), so there is some code overhead. But it makes sense: if I'm not querying on attributes, "exploding" them into EAV and then assembling data together when I need it is a lot of work. I also found that in a React app managing re-rendering is easier if your data isn't in DataScript. If performance wasn't a concern I would rather have everything in DataScript, but I'd have to figure out a good way to process re-rendering without pulling everything.


@jrychter on the topic of re-rendering on DS changes, have you looked at Posh?


Works by listening to tx report queue, then pattern matching to re-query affected queries
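That tx-report approach can be sketched with `d/listen!` (the attribute and `request-render!` here are hypothetical placeholders, not Posh's actual implementation):

```clojure
(require '[datascript.core :as d])

(defn watch-attr!
  "Calls request-render! whenever a transaction touches attr.
   request-render! is a placeholder for the app's render trigger."
  [conn attr request-render!]
  (d/listen! conn ::watch
    (fn [tx-report]
      (when (some #(= attr (:a %)) (:tx-data tx-report))
        (request-render!)))))
```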


@colindresj No, I use Rum for interfacing with React (migrated from Reagent). But I did once implement a solution that processed the tx queue, along with a pub/sub mechanism so that components could subscribe to "interesting" changes. It didn't work that well in practice, although I'm hazy on the details. I know I scrapped it pretty quickly, so there must have been drawbacks.


The first thing that bites you is that you need to use pull, because lazy entity resolving will not cause components to update properly. And then it quickly turns out (not unexpected) that pulling lots of data eventually becomes noticeable performance-wise. Posh has the right idea and I think it could work well for many apps — but a similar idea did not work for me.


@jrychter what was your target environment? Like, an old Android mobile browser, or desktop Chrome?