Fork me on GitHub
#datomic
<
2022-09-24
>
hanDerPeder21:09:10

I’ve modelled a linked list thing and, given a list element, want to find the head of the list. Attributes in play: • list/next ref to next elem in list • list/head a string placed on the first elem by convention I’ve made a recursive rule which walks the list backwards until it finds an entity with a list/head attribute

(def list-head-rule
  '[[(list-head ?x ?head)
     [?head :list/head]
     [(= ?x ?head)]]
    [(list-head ?x ?head)
     [?y :list/next ?x]
     [list-head ?y ?head]]])
And call the query as follows
(d/q '[:find [?head ...]
       :in $ % ?elem
       :where
       [list-head ?elem ?head]]
     (d/db conn)
     list-head-rule
     some-ent-in-list)
if some-ent-in-list is the head entity the query takes 2ms. if it’s the 100th it takes some-ent-in-list it takes 66ms. I find this surprisingly slow. If I use a recursive pull from the head I get the whole list in ~1ms. I though walking references backwards had the same perf as forward. Am I doing something terrible in the code above?

hanDerPeder21:09:04

using this pull gives me the result I want in ~8ms. Albeit, not pretty

(loop [obj (d/pull (d/db conn) '[* {[:list/_next :as :prev] ...}] some-ent-in-list)]
  (if-let [prev (some-> obj :prev first)]
    (recur prev)
    (dissoc obj :list/next)))

favila21:09:29

The first rule variant says [?head :list/head] but head is never bound when the rule is evaluated

favila21:09:02

So that realizes all head entities

favila21:09:38

Try [?x :list/head][(identity ?x) ?head]

favila21:09:46

Is so critical for perf to know what is bound so you can do clause ordering that it's nearly always unsafe to declare rules without that special “require bound variable” syntax

favila21:09:08

So I recommend something like (list-head [?x] ?head) in your definitions

hanDerPeder21:09:06

reading up on that now. what query time would you say is to be expected here? roughly same magnitude as the pull or slower?

favila21:09:27

Roughly same

hanDerPeder22:09:42

hmm, using

'[[(list-head [?x] ?head)
     [?x :list/head]
     [(identity ?x) ?head]]
    [(list-head [?x] ?head)
     [?y :list/next ?x]
     [list-head ?y ?head]]]
and query
(d/q '[:find ?head .
        :in $ % ?elem
        :where
        [?head :list/head]
        [list-head ?elem ?head]]
      (d/db conn)
      list-head-rule
      item-99-in-list)
I’m still getting ~66ms

hanDerPeder22:09:42

schema definition:

[#:db{:ident :list/head,
      :valueType :db.type/string,
      :cardinality :db.cardinality/one,
      :unique :db.unique/identity}
 #:db{:ident :list/next,
      :valueType :db.type/ref,
      :cardinality :db.cardinality/one}]

favila22:09:39

Is this db small enough to fit in memory?

hanDerPeder22:09:55

yes, toy example

hanDerPeder22:09:15

though currently using dev storage transactor

favila22:09:57

Do you see a difference if you def the query body? Perhaps there's some constant overhead

favila22:09:06

Otherwise idk. There is a tiny difference in index use (pull uses eavt for the first and last lookup; query generally uses aevt if a is static) but that's not a difference I'd expect t matter here

hanDerPeder22:09:55

no difference if I def the query body. Here’s a complete example if you want to try: https://gist.github.com/handerpeder/d6db931cd8fceca629ea4b42102d6f05 either way, thanks for the help!

favila01:09:23

The gist still has the problem of a too-early [?head :list/head] (line 48, that should just be removed), but that doesn’t matter at this data size.

favila01:09:33

AFAIKT recursive rules are just really slow…

favila01:09:49

This is an equivalent rule:

favila01:09:37

(defn heads [db elem]
  ;; same thing as the list-head rule,
  ;; including no assumptions about number of matches
  ;; and fully evaluating each branch.
  (-> #{}
      (into (map :e) (d/datoms db :aevt :list/head elem))
      (into (mapcat #(heads db (:e %)))
            (d/datoms db :vaet elem :list/next))))

(def list-head
  ['[(list-head [?x] ?head)
     [?x :list/head]
     [(identity ?x) ?head]]
   '[(list-head [?x] ?head)
     [?y :list/next ?x]
     (list-head ?y ?head)]
   ;; function call instead to avoid recursion
   ['(list-head2 [?x] ?head)
    [(list `heads '$ '?x) '[?head ...]]]])

favila01:09:11

(time
 (d/q '[:find ?head
        :in $ % ?value
        :where
        [?elem :list/value ?value]
        (list-head ?elem ?head)]
      (d/db conn)
      list-head
      99))
"Elapsed time: 40.233292 msecs"
=> #{[17592186045418]}
(time
 (d/q '[:find ?head
        :in $ % ?value
        :where
        [?elem :list/value ?value]
        (list-head2 ?elem ?head)]
      (d/db conn)
      list-head
      99))
"Elapsed time: 1.5925 msecs"
=> #{[17592186045418]}

favila01:09:55

but the rule is probably stack overflow resistant, my fn is not