datascript

Ben Sless 2022-08-20T06:22:56.220779Z

It's a bad habit I have of sometimes profiling cool projects and finding some performance lying in the floor. Especially in light of this https://tonsky.me/blog/datascript-2/ I was wondering if these findings could be useful and if there's interest in applying the results

❤️ 1
Niki 2022-08-22T12:35:11.820499Z

Looks great! Could you make a PR?

Niki 2022-08-22T12:37:12.390039Z

Re: precompile queries, I had a similar idea. I have a query parser that I wrote for query engine rewrite, but I didn’t finished it. Old query engine doesn’t use it, and new one I didn’t finish. The parser is the only part of the rewrite that is finished and used for validation right now

Ben Sless 2022-08-22T14:08:34.774179Z

I could make a PR, but I want better test cases first to have a better picture

Ben Sless 2022-08-22T14:08:44.654259Z

I'd also want to bench both clj and cljs

Ben Sless 2022-08-22T14:08:50.907819Z

to make sure I don't cause damage

Niki 2022-08-22T15:01:42.057379Z

There are some basic benchmark in scripts folder, take a look at them. I think they should be easy to run, but if not, let me know

Ben Sless 2022-08-22T15:47:45.362349Z

Are they representative of real use cases?

Niki 2022-08-22T15:52:19.537459Z

Not really, but it's something

Niki 2022-08-22T15:52:53.515749Z

I mean if they become stricly better than probably real use-cases will become better too

Ben Sless 2022-08-22T15:56:45.845709Z

Alright, I'll start and split the changes up

Ben Sless 2022-08-23T13:55:33.658229Z

Right, after a few rounds of golf serious work the first round of improvements looks like: (with profiling on, java 8, linux)

add-1                 372.2 ms/op  256.9 
add-5                 616.8 ms/op  503.9 
add-all               602.6 ms/op  518.0 
freeze                629.8 ms/op  582.6 
init                   23.9 ms/op   19.4 
pull-many               1.2 ms/op    1.1 
pull-many-entities      3.2 ms/op    2.9 
pull-one              0.633 ms/op  0.580 
pull-one-entities       1.1 ms/op  0.994 
pull-wildcard           1.5 ms/op    1.4 
q1                      1.4 ms/op    1.1 
q2                      3.5 ms/op    2.7 
q3                      5.2 ms/op    4.2 
q4                      8.2 ms/op    5.9 
qpred1                  4.9 ms/op    4.7 
qpred2                 14.8 ms/op   10.2 
retract-5             486.8 ms/op  383.6

Ben Sless 2022-08-23T14:35:00.002159Z

Wow, here's something cool I just pulled off, q1 0.685

Ben Sless 2022-08-23T14:41:59.009129Z

q4 down to 5ms

Ben Sless 2022-08-23T15:00:27.659709Z

I think that without changing some of the essential algorithms there, it's the best an initial round of optimizations can get

Ben Sless 2022-08-23T16:39:49.821369Z

If you feel like these improvements are worth it I can push the branch and we'll review the changes, I did a few things to help the JIT along which are weird

Ben Sless 2022-08-23T17:21:37.453499Z

👀 I managed even better

q1    0.667 ms/op
q2      2.1 ms/op
q3      3.0 ms/op
q4      4.4 ms/op

🔥 1
Ben Sless 2022-08-23T17:39:23.510619Z

@tonsky this is the one thing I'm really not sure about, besides the fact it causes a test to fail, because it's perverse code

(defmacro defcomp
  [sym [arg1 arg2] & body]
  `(def ~sym
     (reify
       java.util.Comparator
       (compare [~'_ ~(with-meta arg1 {}) ~(with-meta arg2 {})]
         (let [~arg1 ~arg1 ~arg2 ~arg2] ~@body))
       clojure.lang.IFn
       (invoke [~'this ~(with-meta arg1 {}) ~(with-meta arg2 {})]
         (.compare ~'this ~(with-meta arg1 {}) ~(with-meta arg2 {})))
       IFn$OOL
       (invokePrim [~'this ~(with-meta arg1 {}) ~(with-meta arg2 {})]
         (.compare ~'this ~(with-meta arg1 {}) ~(with-meta arg2 {}))))))
But it gives a direct code path with primitives all the way out to compare

Ben Sless 2022-08-23T17:43:03.850339Z

All comparison combinations happen on the stack, which is nice

Niki 2022-08-23T17:48:18.181099Z

I don’t see why you would call it perverse. I think it’s pretty clear what it does. Maybe move (with-meta arg1/2 {}) outside quote and give it another name, so that macro body reads easier, otherwise looks nice. Which test fails?

Niki 2022-08-23T17:52:28.012129Z

Also I don’t get what’s the idea with (let [~arg1 ~arg1 ~arg2 ~arg2]?

Niki 2022-08-23T18:02:11.683869Z

Otherwise happy to review PR

Ben Sless 2022-08-23T18:14:54.555899Z

The test which fails:

ERROR in (test-uncomparable-356) (RT.java:1249)
Uncaught exception, not in assertion.
expected: nil
  actual: java.lang.IllegalArgumentException: Value out of range for int: 2829092613
 at clojure.lang.RT.intCast (RT.java:1249)
    datascript.db$reify__583.compare (db.cljc:419)
    me.tonsky.persistent_sorted_set.Leaf.searchFirst (Leaf.java:54)
    me.tonsky.persistent_sorted_set.PersistentSortedSet.slice (PersistentSortedSet.java:70)
    me.tonsky.persistent_sorted_set.PersistentSortedSet.slice (PersistentSortedSet.java:49)
    me.tonsky.persistent_sorted_set$slice.invokeStatic (persistent_sorted_set.clj:30)
    me.tonsky.persistent_sorted_set$slice.invoke (persistent_sorted_set.clj:24)
    datascript.db.DB._search (db.cljc:558)
    datascript.db$fsearch.invokeStatic (db.cljc:515)
    datascript.db$fsearch.invoke (db.cljc:514)
    datascript.db$transact_add.invokeStatic (db.cljc:1345)
    datascript.db$transact_add.invoke (db.cljc:1335)
    datascript.db$transact_tx_data.invokeStatic (db.cljc:1601)
    datascript.db$transact_tx_data.invoke (db.cljc:1426)
    datascript.core$with.invokeStatic (core.cljc:263)
    datascript.core$with.invoke (core.cljc:256)
    datascript.core$with.invokeStatic (core.cljc:258)
    datascript.core$with.invoke (core.cljc:256)
    datascript.core$db_with.invokeStatic (core.cljc:270)
    datascript.core$db_with.invoke (core.cljc:266)
    datascript.test.transact$fn__9660.invokeStatic (transact.cljc:421)
    datascript.test.transact/fn (transact.cljc:402)
    clojure.test$test_var$fn__9763.invoke (test.clj:717)
    clojure.test$test_var.invokeStatic (test.clj:717)
    clojure.test$test_var.invoke (test.clj:708)
    clojure.test$test_vars$fn__9789$fn__9794.invoke (test.clj:735)
    clojure.test$default_fixture.invokeStatic (test.clj:687)
    clojure.test$default_fixture.invoke (test.clj:683)
    clojure.test$test_vars$fn__9789.invoke (test.clj:735)
    clojure.test$default_fixture.invokeStatic (test.clj:687)
    clojure.test$default_fixture.invoke (test.clj:683)
    clojure.test$test_vars.invokeStatic (test.clj:731)
    clojure.test$test_all_vars.invokeStatic (test.clj:737)
    clojure.test$test_ns.invokeStatic (test.clj:758)
    clojure.test$test_ns.invoke (test.clj:743)
    clojure.core$map$fn__5885.invoke (core.clj:2757)
    clojure.lang.LazySeq.sval (LazySeq.java:42)
    clojure.lang.LazySeq.seq (LazySeq.java:51)
    clojure.lang.ChunkedCons.chunkedNext (ChunkedCons.java:59)
    clojure.core$chunk_next.invokeStatic (core.clj:710)
    clojure.core$reduce1.invokeStatic (core.clj:944)
    clojure.core$reduce1.invokeStatic (core.clj:936)
    clojure.core$merge_with.invokeStatic (core.clj:3063)
    clojure.core$merge_with.doInvoke (core.clj:3055)
    clojure.lang.RestFn.applyTo (RestFn.java:139)
    clojure.core$apply.invokeStatic (core.clj:669)
    clojure.test$run_tests.invokeStatic (test.clj:768)
    clojure.test$run_tests.doInvoke (test.clj:768)
    clojure.lang.RestFn.applyTo (RestFn.java:137)
    clojure.core$apply.invokeStatic (core.clj:667)
    clojure.test$run_all_tests.invokeStatic (test.clj:780)
    clojure.test$run_all_tests.invoke (test.clj:780)
    datascript.test$test_clj$fn__10311.invoke (test.cljc:46)
    datascript.test.core$wrap_res.invokeStatic (core.cljc:41)
    datascript.test.core$wrap_res.invoke (core.cljc:39)
    datascript.test$test_clj.invokeStatic (test.cljc:46)
    datascript.test$test_clj.invoke (test.cljc:45)
    clojure.lang.Var.invoke (Var.java:380)
    user$eval144.invokeStatic (form-init836534765378270893.clj:1)
    user$eval144.invoke (form-init836534765378270893.clj:1)
    clojure.lang.Compiler.eval (Compiler.java:7181)
    clojure.lang.Compiler.eval (Compiler.java:7171)
    clojure.lang.Compiler.load (Compiler.java:7640)
    clojure.lang.Compiler.loadFile (Compiler.java:7578)
    clojure.main$load_script.invokeStatic (main.clj:475)
    clojure.main$init_opt.invokeStatic (main.clj:477)
    clojure.main$init_opt.invoke (main.clj:477)
    clojure.main$initialize.invokeStatic (main.clj:508)
    clojure.main$null_opt.invokeStatic (main.clj:542)
    clojure.main$null_opt.invoke (main.clj:539)
    clojure.main$main.invokeStatic (main.clj:664)
    clojure.main$main.doInvoke (main.clj:616)
    clojure.lang.RestFn.applyTo (RestFn.java:137)
    clojure.lang.Var.applyTo (Var.java:705)
    clojure.main.main (main.java:40)

Ben Sless 2022-08-23T18:15:30.322039Z

let [~arg1 ~arg1 ~arg2 ~arg2]
is for metadata

Ben Sless 2022-08-23T18:15:37.588439Z

to propagate the Datom type hint

Ben Sless 2022-08-23T18:16:11.475109Z

I can probably refactor the macro to look more readable, if it's okay to move it past the wild experiment phase

Ben Sless 2022-08-23T18:17:04.317919Z

I guess one of the comparisons doesn't return 0,1-1

Niki 2022-08-23T18:19:12.667369Z

> is for metadata Oh I think I get it! Clever

Niki 2022-08-23T18:19:57.933749Z

I’m ok with the macro, as long as you figure out what’s going on with the test and if it needs fixing

Ben Sless 2022-08-23T18:42:40.629829Z

Found what was failing the test, the last case of value-compare

Ben Sless 2022-08-23T19:17:11.457299Z

Here we go https://github.com/tonsky/datascript/pull/436

Ben Sless 2022-08-23T19:20:41.421259Z

I'll update the benchmark results soon with java 8, 11, 17

Niki 2022-08-23T19:26:50.809279Z

Awesome, will take a look soon!

Ben Sless 2022-08-23T19:44:22.949719Z

I think the coolest part for me was realizing I can build up a chain of eductions during -collect which finally let me do a everything in a single lazy pass. It reduces straight over the sorted set with everything else being a transducer on top of it, which is cache friendly 🙂

Ben Sless 2022-08-23T19:44:34.196529Z

now just to figure out a way to further optimize join

Ben Sless 2022-08-23T20:16:09.691499Z

This is a bit of trivia, but adding vpred had an incredible effect on query performance

Darrick Wiebe 2022-08-24T02:13:42.775479Z

@ben.sless Sorry to butt in here. I was a bit confused by this code and had a couple of questions. While working through it I think I may have answered them but I'll post it anyway with what seems to me to be a better variation of it below:

(defmacro defcomp
  [sym [arg1 arg2] & body]
  `(def ~sym
     (reify
       java.util.Comparator
       (compare [~'_ ~(with-meta arg1 {}) ~(with-meta arg2 {})]
         (let [~arg1 ~arg1 ~arg2 ~arg2] ~@body))
       clojure.lang.IFn
       (invoke [~'this ~(with-meta arg1 {}) ~(with-meta arg2 {})]
         (.compare ~'this ~(with-meta arg1 {}) ~(with-meta arg2 {})))
       IFn$OOL
       (invokePrim [~'this ~(with-meta arg1 {}) ~(with-meta arg2 {})]
         (.compare ~'this ~(with-meta arg1 {}) ~(with-meta arg2 {}))))))
1. Why are you removing the metadata from the arguments using ~(with-meta arg1 {})? 2. Is the idea of (let [~arg1 ~arg1 ~arg2 ~arg2] ~@body) to restore the metadata back to the arguments? 3. Given that you aren't exposing the arguments, why not make the macro both more standard and safer with this form instead?
(defmacro defcomp
  [sym [arg1 arg2] & body]
  `(def ~sym
     (reify
       java.util.Comparator
       (compare [_# arg1# arg2#]
         (let [~arg1 arg1# ~arg2 arg2#] ~@body))
       clojure.lang.IFn
       (invoke [this# arg1# arg2#]
         (.compare this# arg1# arg2#))
       IFn$OOL
       (invokePrim [this# arg1# arg2#]
         (.compare this# arg1# arg2#)))))

Darrick Wiebe 2022-08-24T02:16:18.340059Z

I didn't know that you can implement primitive variants of IFn like IFn$OOL 🤔

Ben Sless 2022-08-24T04:18:51.925229Z

@darrickw I'm happy for anyone with good feedback to butt in, thank you. You're right the macro can be written like that, I ended up with something in between which made it into the PR. Regarding invoke prim, there are many of illegal fun things you can get the compiler and runtime to do

Ben Sless 2022-08-24T05:22:59.650449Z

This is the current version in HEAD

#?(:clj
   (defmacro defcomp
     [sym [arg1 arg2] & body]
     (let [a1 (with-meta arg1 {})
           a2 (with-meta arg2 {})]
       `(def ~sym
          (reify
            java.util.Comparator
            (compare [~'_ ~a1 ~a2]
              (let [~arg1 ~arg1 ~arg2 ~arg2]
                ~@body))
            clojure.lang.IFn
            (invoke [~'this ~a1 ~a2]
              (.compare ~'this ~a1 ~a2))
            IFn$OOL
            (invokePrim [~'this ~a1 ~a2]
              (.compare ~'this ~a1 ~a2)))))))

Darrick Wiebe 2022-08-24T05:59:57.818229Z

I think this block is technically wrong, though it will probably work ok for now. It should really refer to the a1 and a2 args.

(let [~arg1 ~arg1 ~arg2 ~arg2]
  ~@body)

Niki 2022-08-21T15:32:45.524179Z

Of course!

Ben Sless 2022-08-21T18:31:01.994939Z

With a couple of caveats: I profiled relatively small data sets, didn't run all the tests, and looked at a particular scenario ---- For example, I found that often there's a lot of duplicated work and redundant collections which are created, take hash-join and collapse-rels for example That whole code path would benefit greatly from an efficient implementation which gives both the keys' intersection and difference between the rels. hand rolling those gives quite a performance boost (vs. using (set (keys ,,)))

(defn vintersect-keys [attrs1 attrs2]
  (persistent! (reduce-kv (fn [vec k _] (if (attrs1 k) (conj! vec k) vec)) (transient []) attrs2))
  #_(set/intersection (set (keys attrs1))
                    (set (keys attrs2))))

;; (vintersect-keys {:a 1 :b 2} {:b 2})

(defn keys-intersect?
  [m1 m2]
  (if (> (count m2) (count m1))
    (recur m2 m1)
    (reduce-kv (fn [_ k _] (if (contains? m1 k) (reduced true) false)) false m2)))

(defn -group-by
  [f init coll]
  (persistent!
   (reduce
    (fn [ret x]
      (let [k (f x)]
        (assoc! ret k (conj (get ret k init) x))))
    (transient {}) coll)))

(defn hash-attrs [key-fn tuples]
  (-group-by key-fn '() tuples))

(defn hash-join [rel1 rel2]
  (let [tuples1       (:tuples rel1)
        tuples2       (:tuples rel2)
        attrs1        (:attrs rel1)
        attrs2        (:attrs rel2)
        common-attrs  (vintersect-keys (:attrs rel1) (:attrs rel2))
        common-gtrs1  (mapv #(getter-fn attrs1 %) common-attrs)
        common-gtrs2  (mapv #(getter-fn attrs2 %) common-attrs)
        keep-attrs1   (keys attrs1)
        ;; keep-attrs2   (vec (set/difference (set (keys attrs2)) (set keep-attrs1))) ; keys in attrs2-attrs1
        keep-attrs2   (persistent! (reduce-kv (fn [vec k _] (if (attrs1 k) vec (conj! vec k))) (transient []) attrs2)) ; keys in attrs2-attrs1
        keep-idxs1    (to-array (vals attrs1))
        keep-idxs2    (to-array (->Eduction (map attrs2) keep-attrs2)) ; vals in attrs2-attrs1 by keys
        key-fn1       (tuple-key-fn common-gtrs1)
        hash          (hash-attrs key-fn1 tuples1)
        key-fn2       (tuple-key-fn common-gtrs2)
        new-tuples    (->>
                        (reduce (fn [acc tuple2]
                                  (let [key (key-fn2 tuple2)]
                                    (if-some [tuples1 (get hash key)]
                                      (reduce (fn [acc tuple1]
                                                (conj! acc (join-tuples tuple1 keep-idxs1 tuple2 keep-idxs2)))
                                              acc tuples1)
                                      acc)))
                          (transient []) tuples2)
                        (persistent!))]
    (Relation.
     (into
      {}
      (map-indexed (fn [x y] [y x]))
      (->Eduction cat [keep-attrs1 keep-attrs2]))
     #_(zipmap (concat keep-attrs1 keep-attrs2) (range))
               new-tuples)))
Another one is silly, since join-tuples suffers when the tuple is an array, just add a predicate if it's an array and run another loop instead:
(defn join-tuples [t1 #?(:cljs idxs1
                         :clj  ^{:tag "[[Ljava.lang.Object;"} idxs1)
                   t2 #?(:cljs idxs2
                         :clj  ^{:tag "[[Ljava.lang.Object;"} idxs2)]
  (let [l1  (alength idxs1)
        l2  (alength idxs2)
        res (da/make-array (+ l1 l2))]
    (if (.isArray (.getClass ^Object t1))
      (dotimes [i l1]
        (aset res i (aget ^objects t1 (aget idxs1 i))))
      (dotimes [i l1]
        (aset res i (#?(:cljs da/aget :clj get) t1 (aget idxs1 i)))))
    (if (.isArray (.getClass ^Object t2))
      (dotimes [i l2]
        (aset res (+ l1 i) (get ^objects t2 (aget idxs2 i))))
      (dotimes [i l2]
        (aset res (+ l1 i) (#?(:cljs da/aget :clj get) t2 (aget idxs2 i)))))
    res))
Another place which kept popping up in profiling was looks-like?. big optimization would be to precompile patterns such that they won't have to be interpreted at query time, but even cutting out last/butlast where vectors are involved and avoiding sequence comparisons gave quite a boost:
(declare looks-like?)

(defn seq-looks-like? [[x & xs] [y & ys]]
  (if (looks-like? x y)
    (if (and xs ys)
      (recur xs ys)
      true)
    false))

(defn symbol-looks-like?
  [pattern form]
  (if (= '_ pattern)
    true
    (= form pattern)))

(defn sequence-looks-like?
  [pattern form]
  (if (and (== (count pattern) 1) (= '* (first pattern)))
    (sequential? form)
    (let [vec? (vector? pattern)
          _last (if vec? peek last)
          _butlast (if vec? pop butlast)]
      (if (= (_last pattern) '*)
        (and (sequential? form)
             (seq-looks-like? (_butlast pattern) form))
        (and (sequential? form)
             (= (count form) (count pattern))
             (seq-looks-like? pattern form))))))

(defn- looks-like? [pattern form]
  (cond
    (symbol? pattern) (symbol-looks-like? pattern form)
    (sequential? pattern) (sequence-looks-like? pattern form)
    :else (pattern form)))
These were all quick and dirty experiments, not a methodological study, but they look promising enough to pursue things further. Next items would be: proper methodology, good test cases, ClojureScript (which I pretty much neglected here), balancing readability and performance, and what counts as good enough improvements to justify ugly/hand-rolled code

Ben Sless 2022-08-24T09:36:10.837449Z

Technically you're right, but this macro is very anaphoric, they all represent the same symbols behind the scenes

Ben Sless 2022-08-24T16:45:48.538719Z

I'm not averse to changing it 🤷

Darrick Wiebe 2022-08-24T17:14:46.530519Z

I'm a mere nosy onlooker here! My personal style is to only inject non-gensym symbols in a macro in places they are absolutely required, but this isn't my project. I'm looking forward to these speedups, though! Great work! I use this project in many contexts...

Ben Sless 2022-08-24T19:11:07.216939Z

It's welcome feedback, I can't claim to be sure regarding the proper style here

Darrick Wiebe 2022-08-24T19:21:35.560689Z

Using gensym or sym# protects you from odd bugs. For instance, what if I call one of my arguments this? You could argue that it's not a good idea, but it would break this macro.

Niki 2022-08-24T19:26:44.581329Z

Oh you are right, I missed it. Luckily this macros is only used internally, but I’ll change ~'this to this# in the next version

👍 1
Niki 2022-08-24T19:30:11.171229Z

Thanks @ben.sless this is a fantastic contribution! Well-presented, too

Ben Sless 2022-08-24T19:34:12.549339Z

🙏

Ben Sless 2022-08-24T19:35:04.562069Z

partyparrot

Ben Sless 2022-08-24T19:35:22.569319Z

Thank you, I hope we can find more optimization opportunities there

Niki 2022-08-24T19:35:57.646319Z

100%, let me know if you spot anything else

Niki 2022-08-24T19:37:47.246369Z

BTW I never used Eduction before, nice to see an example

Ben Sless 2022-08-24T19:38:46.584109Z

Eductions are awesome if you can guarantee you'll consume them once, maximal laziness on one hand and operators fusion on the other

Ben Sless 2022-08-25T04:22:01.003159Z

It's interesting how different jvm versions have such widely varying results