Fork me on GitHub
#datascript
<
2022-08-20
>
Ben Sless06:08:56

It's a bad habit I have of sometimes profiling cool projects and finding some performance lying in the floor. Especially in light of this https://tonsky.me/blog/datascript-2/ I was wondering if these findings could be useful and if there's interest in applying the results

❤️ 1
Niki15:08:45

Of course!

Ben Sless18:08:01

With a couple of caveats: I profiled relatively small data sets, didn't run all the tests, and looked at a particular scenario ---- For example, I found that often there's a lot of duplicated work and redundant collections which are created, take hash-join and collapse-rels for example That whole code path would benefit greatly from an efficient implementation which gives both the keys' intersection and difference between the rels. hand rolling those gives quite a performance boost (vs. using (set (keys ,,)))

(defn vintersect-keys [attrs1 attrs2]
  (persistent! (reduce-kv (fn [vec k _] (if (attrs1 k) (conj! vec k) vec)) (transient []) attrs2))
  #_(set/intersection (set (keys attrs1))
                    (set (keys attrs2))))

;; (vintersect-keys {:a 1 :b 2} {:b 2})

(defn keys-intersect?
  [m1 m2]
  (if (> (count m2) (count m1))
    (recur m2 m1)
    (reduce-kv (fn [_ k _] (if (contains? m1 k) (reduced true) false)) false m2)))

(defn -group-by
  [f init coll]
  (persistent!
   (reduce
    (fn [ret x]
      (let [k (f x)]
        (assoc! ret k (conj (get ret k init) x))))
    (transient {}) coll)))

(defn hash-attrs [key-fn tuples]
  (-group-by key-fn '() tuples))

(defn hash-join [rel1 rel2]
  (let [tuples1       (:tuples rel1)
        tuples2       (:tuples rel2)
        attrs1        (:attrs rel1)
        attrs2        (:attrs rel2)
        common-attrs  (vintersect-keys (:attrs rel1) (:attrs rel2))
        common-gtrs1  (mapv #(getter-fn attrs1 %) common-attrs)
        common-gtrs2  (mapv #(getter-fn attrs2 %) common-attrs)
        keep-attrs1   (keys attrs1)
        ;; keep-attrs2   (vec (set/difference (set (keys attrs2)) (set keep-attrs1))) ; keys in attrs2-attrs1
        keep-attrs2   (persistent! (reduce-kv (fn [vec k _] (if (attrs1 k) vec (conj! vec k))) (transient []) attrs2)) ; keys in attrs2-attrs1
        keep-idxs1    (to-array (vals attrs1))
        keep-idxs2    (to-array (->Eduction (map attrs2) keep-attrs2)) ; vals in attrs2-attrs1 by keys
        key-fn1       (tuple-key-fn common-gtrs1)
        hash          (hash-attrs key-fn1 tuples1)
        key-fn2       (tuple-key-fn common-gtrs2)
        new-tuples    (->>
                        (reduce (fn [acc tuple2]
                                  (let [key (key-fn2 tuple2)]
                                    (if-some [tuples1 (get hash key)]
                                      (reduce (fn [acc tuple1]
                                                (conj! acc (join-tuples tuple1 keep-idxs1 tuple2 keep-idxs2)))
                                              acc tuples1)
                                      acc)))
                          (transient []) tuples2)
                        (persistent!))]
    (Relation.
     (into
      {}
      (map-indexed (fn [x y] [y x]))
      (->Eduction cat [keep-attrs1 keep-attrs2]))
     #_(zipmap (concat keep-attrs1 keep-attrs2) (range))
               new-tuples)))
Another one is silly, since join-tuples suffers when the tuple is an array, just add a predicate if it's an array and run another loop instead:
(defn join-tuples [t1 #?(:cljs idxs1
                         :clj  ^{:tag "[[Ljava.lang.Object;"} idxs1)
                   t2 #?(:cljs idxs2
                         :clj  ^{:tag "[[Ljava.lang.Object;"} idxs2)]
  (let [l1  (alength idxs1)
        l2  (alength idxs2)
        res (da/make-array (+ l1 l2))]
    (if (.isArray (.getClass ^Object t1))
      (dotimes [i l1]
        (aset res i (aget ^objects t1 (aget idxs1 i))))
      (dotimes [i l1]
        (aset res i (#?(:cljs da/aget :clj get) t1 (aget idxs1 i)))))
    (if (.isArray (.getClass ^Object t2))
      (dotimes [i l2]
        (aset res (+ l1 i) (get ^objects t2 (aget idxs2 i))))
      (dotimes [i l2]
        (aset res (+ l1 i) (#?(:cljs da/aget :clj get) t2 (aget idxs2 i)))))
    res))
Another place which kept popping up in profiling was looks-like?. big optimization would be to precompile patterns such that they won't have to be interpreted at query time, but even cutting out last/butlast where vectors are involved and avoiding sequence comparisons gave quite a boost:
(declare looks-like?)

(defn seq-looks-like? [[x & xs] [y & ys]]
  (if (looks-like? x y)
    (if (and xs ys)
      (recur xs ys)
      true)
    false))

(defn symbol-looks-like?
  [pattern form]
  (if (= '_ pattern)
    true
    (= form pattern)))

(defn sequence-looks-like?
  [pattern form]
  (if (and (== (count pattern) 1) (= '* (first pattern)))
    (sequential? form)
    (let [vec? (vector? pattern)
          _last (if vec? peek last)
          _butlast (if vec? pop butlast)]
      (if (= (_last pattern) '*)
        (and (sequential? form)
             (seq-looks-like? (_butlast pattern) form))
        (and (sequential? form)
             (= (count form) (count pattern))
             (seq-looks-like? pattern form))))))

(defn- looks-like? [pattern form]
  (cond
    (symbol? pattern) (symbol-looks-like? pattern form)
    (sequential? pattern) (sequence-looks-like? pattern form)
    :else (pattern form)))
These were all quick and dirty experiments, not a methodological study, but they look promising enough to pursue things further. Next items would be: proper methodology, good test cases, ClojureScript (which I pretty much neglected here), balancing readability and performance, and what counts as good enough improvements to justify ugly/hand-rolled code

Niki12:08:11

Looks great! Could you make a PR?

Niki12:08:12

Re: precompile queries, I had a similar idea. I have a query parser that I wrote for query engine rewrite, but I didn’t finished it. Old query engine doesn’t use it, and new one I didn’t finish. The parser is the only part of the rewrite that is finished and used for validation right now

Ben Sless14:08:34

I could make a PR, but I want better test cases first to have a better picture

Ben Sless14:08:44

I'd also want to bench both clj and cljs

Ben Sless14:08:50

to make sure I don't cause damage

Niki15:08:42

There are some basic benchmark in scripts folder, take a look at them. I think they should be easy to run, but if not, let me know

Ben Sless15:08:45

Are they representative of real use cases?

Niki15:08:19

Not really, but it's something

Niki15:08:53

I mean if they become stricly better than probably real use-cases will become better too

Ben Sless15:08:45

Alright, I'll start and split the changes up

Ben Sless13:08:33

Right, after a few rounds of golf serious work the first round of improvements looks like: (with profiling on, java 8, linux)

add-1                 372.2 ms/op  256.9 
add-5                 616.8 ms/op  503.9 
add-all               602.6 ms/op  518.0 
freeze                629.8 ms/op  582.6 
init                   23.9 ms/op   19.4 
pull-many               1.2 ms/op    1.1 
pull-many-entities      3.2 ms/op    2.9 
pull-one              0.633 ms/op  0.580 
pull-one-entities       1.1 ms/op  0.994 
pull-wildcard           1.5 ms/op    1.4 
q1                      1.4 ms/op    1.1 
q2                      3.5 ms/op    2.7 
q3                      5.2 ms/op    4.2 
q4                      8.2 ms/op    5.9 
qpred1                  4.9 ms/op    4.7 
qpred2                 14.8 ms/op   10.2 
retract-5             486.8 ms/op  383.6

Ben Sless14:08:00

Wow, here's something cool I just pulled off, q1 0.685

Ben Sless14:08:59

q4 down to 5ms

Ben Sless15:08:27

I think that without changing some of the essential algorithms there, it's the best an initial round of optimizations can get

Ben Sless16:08:49

If you feel like these improvements are worth it I can push the branch and we'll review the changes, I did a few things to help the JIT along which are weird

Ben Sless17:08:37

👀 I managed even better

q1    0.667 ms/op
q2      2.1 ms/op
q3      3.0 ms/op
q4      4.4 ms/op

🔥 1
Ben Sless17:08:23

@U050UBKAA this is the one thing I'm really not sure about, besides the fact it causes a test to fail, because it's perverse code

(defmacro defcomp
  [sym [arg1 arg2] & body]
  `(def ~sym
     (reify
       java.util.Comparator
       (compare [~'_ ~(with-meta arg1 {}) ~(with-meta arg2 {})]
         (let [~arg1 ~arg1 ~arg2 ~arg2] ~@body))
       clojure.lang.IFn
       (invoke [~'this ~(with-meta arg1 {}) ~(with-meta arg2 {})]
         (.compare ~'this ~(with-meta arg1 {}) ~(with-meta arg2 {})))
       IFn$OOL
       (invokePrim [~'this ~(with-meta arg1 {}) ~(with-meta arg2 {})]
         (.compare ~'this ~(with-meta arg1 {}) ~(with-meta arg2 {}))))))
But it gives a direct code path with primitives all the way out to compare

Ben Sless17:08:03

All comparison combinations happen on the stack, which is nice

Niki17:08:18

I don’t see why you would call it perverse. I think it’s pretty clear what it does. Maybe move (with-meta arg1/2 {}) outside quote and give it another name, so that macro body reads easier, otherwise looks nice. Which test fails?

Niki17:08:28

Also I don’t get what’s the idea with (let [~arg1 ~arg1 ~arg2 ~arg2]?

Niki18:08:11

Otherwise happy to review PR

Ben Sless18:08:54

The test which fails:

ERROR in (test-uncomparable-356) (RT.java:1249)
Uncaught exception, not in assertion.
expected: nil
  actual: java.lang.IllegalArgumentException: Value out of range for int: 2829092613
 at clojure.lang.RT.intCast (RT.java:1249)
    datascript.db$reify__583.compare (db.cljc:419)
    me.tonsky.persistent_sorted_set.Leaf.searchFirst (Leaf.java:54)
    me.tonsky.persistent_sorted_set.PersistentSortedSet.slice (PersistentSortedSet.java:70)
    me.tonsky.persistent_sorted_set.PersistentSortedSet.slice (PersistentSortedSet.java:49)
    me.tonsky.persistent_sorted_set$slice.invokeStatic (persistent_sorted_set.clj:30)
    me.tonsky.persistent_sorted_set$slice.invoke (persistent_sorted_set.clj:24)
    datascript.db.DB._search (db.cljc:558)
    datascript.db$fsearch.invokeStatic (db.cljc:515)
    datascript.db$fsearch.invoke (db.cljc:514)
    datascript.db$transact_add.invokeStatic (db.cljc:1345)
    datascript.db$transact_add.invoke (db.cljc:1335)
    datascript.db$transact_tx_data.invokeStatic (db.cljc:1601)
    datascript.db$transact_tx_data.invoke (db.cljc:1426)
    datascript.core$with.invokeStatic (core.cljc:263)
    datascript.core$with.invoke (core.cljc:256)
    datascript.core$with.invokeStatic (core.cljc:258)
    datascript.core$with.invoke (core.cljc:256)
    datascript.core$db_with.invokeStatic (core.cljc:270)
    datascript.core$db_with.invoke (core.cljc:266)
    datascript.test.transact$fn__9660.invokeStatic (transact.cljc:421)
    datascript.test.transact/fn (transact.cljc:402)
    clojure.test$test_var$fn__9763.invoke (test.clj:717)
    clojure.test$test_var.invokeStatic (test.clj:717)
    clojure.test$test_var.invoke (test.clj:708)
    clojure.test$test_vars$fn__9789$fn__9794.invoke (test.clj:735)
    clojure.test$default_fixture.invokeStatic (test.clj:687)
    clojure.test$default_fixture.invoke (test.clj:683)
    clojure.test$test_vars$fn__9789.invoke (test.clj:735)
    clojure.test$default_fixture.invokeStatic (test.clj:687)
    clojure.test$default_fixture.invoke (test.clj:683)
    clojure.test$test_vars.invokeStatic (test.clj:731)
    clojure.test$test_all_vars.invokeStatic (test.clj:737)
    clojure.test$test_ns.invokeStatic (test.clj:758)
    clojure.test$test_ns.invoke (test.clj:743)
    clojure.core$map$fn__5885.invoke (core.clj:2757)
    clojure.lang.LazySeq.sval (LazySeq.java:42)
    clojure.lang.LazySeq.seq (LazySeq.java:51)
    clojure.lang.ChunkedCons.chunkedNext (ChunkedCons.java:59)
    clojure.core$chunk_next.invokeStatic (core.clj:710)
    clojure.core$reduce1.invokeStatic (core.clj:944)
    clojure.core$reduce1.invokeStatic (core.clj:936)
    clojure.core$merge_with.invokeStatic (core.clj:3063)
    clojure.core$merge_with.doInvoke (core.clj:3055)
    clojure.lang.RestFn.applyTo (RestFn.java:139)
    clojure.core$apply.invokeStatic (core.clj:669)
    clojure.test$run_tests.invokeStatic (test.clj:768)
    clojure.test$run_tests.doInvoke (test.clj:768)
    clojure.lang.RestFn.applyTo (RestFn.java:137)
    clojure.core$apply.invokeStatic (core.clj:667)
    clojure.test$run_all_tests.invokeStatic (test.clj:780)
    clojure.test$run_all_tests.invoke (test.clj:780)
    datascript.test$test_clj$fn__10311.invoke (test.cljc:46)
    datascript.test.core$wrap_res.invokeStatic (core.cljc:41)
    datascript.test.core$wrap_res.invoke (core.cljc:39)
    datascript.test$test_clj.invokeStatic (test.cljc:46)
    datascript.test$test_clj.invoke (test.cljc:45)
    clojure.lang.Var.invoke (Var.java:380)
    user$eval144.invokeStatic (form-init836534765378270893.clj:1)
    user$eval144.invoke (form-init836534765378270893.clj:1)
    clojure.lang.Compiler.eval (Compiler.java:7181)
    clojure.lang.Compiler.eval (Compiler.java:7171)
    clojure.lang.Compiler.load (Compiler.java:7640)
    clojure.lang.Compiler.loadFile (Compiler.java:7578)
    clojure.main$load_script.invokeStatic (main.clj:475)
    clojure.main$init_opt.invokeStatic (main.clj:477)
    clojure.main$init_opt.invoke (main.clj:477)
    clojure.main$initialize.invokeStatic (main.clj:508)
    clojure.main$null_opt.invokeStatic (main.clj:542)
    clojure.main$null_opt.invoke (main.clj:539)
    clojure.main$main.invokeStatic (main.clj:664)
    clojure.main$main.doInvoke (main.clj:616)
    clojure.lang.RestFn.applyTo (RestFn.java:137)
    clojure.lang.Var.applyTo (Var.java:705)
    clojure.main.main (main.java:40)

Ben Sless18:08:30

let [~arg1 ~arg1 ~arg2 ~arg2]
is for metadata

Ben Sless18:08:37

to propagate the Datom type hint

Ben Sless18:08:11

I can probably refactor the macro to look more readable, if it's okay to move it past the wild experiment phase

Ben Sless18:08:04

I guess one of the comparisons doesn't return 0,1-1

Niki18:08:12

> is for metadata Oh I think I get it! Clever

Niki18:08:57

I’m ok with the macro, as long as you figure out what’s going on with the test and if it needs fixing

Ben Sless18:08:40

Found what was failing the test, the last case of value-compare

Ben Sless19:08:41

I'll update the benchmark results soon with java 8, 11, 17

Niki19:08:50

Awesome, will take a look soon!

Ben Sless19:08:22

I think the coolest part for me was realizing I can build up a chain of eductions during -collect which finally let me do a everything in a single lazy pass. It reduces straight over the sorted set with everything else being a transducer on top of it, which is cache friendly 🙂

Ben Sless19:08:34

now just to figure out a way to further optimize join

Ben Sless20:08:09

This is a bit of trivia, but adding vpred had an incredible effect on query performance

Darrick Wiebe02:08:42

@UK0810AQ2 Sorry to butt in here. I was a bit confused by this code and had a couple of questions. While working through it I think I may have answered them but I'll post it anyway with what seems to me to be a better variation of it below:

(defmacro defcomp
  [sym [arg1 arg2] & body]
  `(def ~sym
     (reify
       java.util.Comparator
       (compare [~'_ ~(with-meta arg1 {}) ~(with-meta arg2 {})]
         (let [~arg1 ~arg1 ~arg2 ~arg2] ~@body))
       clojure.lang.IFn
       (invoke [~'this ~(with-meta arg1 {}) ~(with-meta arg2 {})]
         (.compare ~'this ~(with-meta arg1 {}) ~(with-meta arg2 {})))
       IFn$OOL
       (invokePrim [~'this ~(with-meta arg1 {}) ~(with-meta arg2 {})]
         (.compare ~'this ~(with-meta arg1 {}) ~(with-meta arg2 {}))))))
1. Why are you removing the metadata from the arguments using ~(with-meta arg1 {})? 2. Is the idea of (let [~arg1 ~arg1 ~arg2 ~arg2] ~@body) to restore the metadata back to the arguments? 3. Given that you aren't exposing the arguments, why not make the macro both more standard and safer with this form instead?
(defmacro defcomp
  [sym [arg1 arg2] & body]
  `(def ~sym
     (reify
       java.util.Comparator
       (compare [_# arg1# arg2#]
         (let [~arg1 arg1# ~arg2 arg2#] ~@body))
       clojure.lang.IFn
       (invoke [this# arg1# arg2#]
         (.compare this# arg1# arg2#))
       IFn$OOL
       (invokePrim [this# arg1# arg2#]
         (.compare this# arg1# arg2#)))))

Darrick Wiebe02:08:18

I didn't know that you can implement primitive variants of IFn like IFn$OOL :thinking_face:

Ben Sless04:08:51

@U01D37REZHP I'm happy for anyone with good feedback to butt in, thank you. You're right the macro can be written like that, I ended up with something in between which made it into the PR. Regarding invoke prim, there are many of illegal fun things you can get the compiler and runtime to do

Ben Sless05:08:59

This is the current version in HEAD

#?(:clj
   (defmacro defcomp
     [sym [arg1 arg2] & body]
     (let [a1 (with-meta arg1 {})
           a2 (with-meta arg2 {})]
       `(def ~sym
          (reify
            java.util.Comparator
            (compare [~'_ ~a1 ~a2]
              (let [~arg1 ~arg1 ~arg2 ~arg2]
                ~@body))
            clojure.lang.IFn
            (invoke [~'this ~a1 ~a2]
              (.compare ~'this ~a1 ~a2))
            IFn$OOL
            (invokePrim [~'this ~a1 ~a2]
              (.compare ~'this ~a1 ~a2)))))))

Darrick Wiebe05:08:57

I think this block is technically wrong, though it will probably work ok for now. It should really refer to the a1 and a2 args.

(let [~arg1 ~arg1 ~arg2 ~arg2]
  ~@body)

Ben Sless09:08:10

Technically you're right, but this macro is very anaphoric, they all represent the same symbols behind the scenes

Ben Sless16:08:48

I'm not averse to changing it 🤷

Darrick Wiebe17:08:46

I'm a mere nosy onlooker here! My personal style is to only inject non-gensym symbols in a macro in places they are absolutely required, but this isn't my project. I'm looking forward to these speedups, though! Great work! I use this project in many contexts...

Ben Sless19:08:07

It's welcome feedback, I can't claim to be sure regarding the proper style here

Darrick Wiebe19:08:35

Using gensym or sym# protects you from odd bugs. For instance, what if I call one of my arguments this? You could argue that it's not a good idea, but it would break this macro.

Niki19:08:44

Oh you are right, I missed it. Luckily this macros is only used internally, but I’ll change ~'this to this# in the next version

👍 1
Niki19:08:11

Thanks @UK0810AQ2 this is a fantastic contribution! Well-presented, too

Ben Sless19:08:22

Thank you, I hope we can find more optimization opportunities there

Niki19:08:57

100%, let me know if you spot anything else

Niki19:08:47

BTW I never used Eduction before, nice to see an example

Ben Sless19:08:46

Eductions are awesome if you can guarantee you'll consume them once, maximal laziness on one hand and operators fusion on the other

Ben Sless04:08:01

It's interesting how different jvm versions have such widely varying results