This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2021-12-01
Channels
- # adventofcode (93)
- # announcements (44)
- # asami (23)
- # aws (1)
- # babashka (48)
- # beginners (112)
- # calva (26)
- # cider (57)
- # clj-kondo (17)
- # cljfx (5)
- # cljs-dev (21)
- # clojure (124)
- # clojure-europe (19)
- # clojure-hungary (40)
- # clojure-nl (3)
- # clojure-spec (7)
- # clojure-uk (3)
- # clojurescript (3)
- # cursive (81)
- # datalog (11)
- # events (21)
- # exercism (1)
- # fulcro (37)
- # graalvm (1)
- # introduce-yourself (8)
- # jobs (1)
- # lsp (1)
- # malli (5)
- # membrane-term (17)
- # minecraft (3)
- # nextjournal (5)
- # off-topic (14)
- # other-lisps (14)
- # polylith (58)
- # reagent (16)
- # reclojure (3)
- # reitit (6)
- # remote-jobs (1)
- # shadow-cljs (55)
- # spacemacs (15)
- # testing (2)
- # tools-build (7)
- # tools-deps (191)
Can somebody explain why these two seemingly identical queries produce different results? Specifically, second result is out of order:
(d/transact conn
[{:db/id :tg/node-root :children [:tg/node-1]}
{:db/id :tg/node-1 :children [:tg/node-2]}
{:db/id :tg/node-2 :children [:tg/node-3]}
{:db/id :tg/node-3 :children [:tg/node-4]}
{:db/id :tg/node-4}])
(d/q '[:find ?parents
:in $
:where [?parents ?a+ :tg/node-4] [?parents :children]]
(d/db conn))
;; => ([:tg/node-root] [:tg/node-1] [:tg/node-2] [:tg/node-3])
(d/q '[:find ?parents
:in $ ?child
:where [?parents ?a+ ?child] [?parents :children]]
(d/db conn) :tg/node-4)
;; => ([:tg/node-root] [:tg/node-2] [:tg/node-1] [:tg/node-3])
It happens for nodes 3 and 4.
One thing I've noticed is they have different execution plans. Setting :planner :user
on the first query makes the plans and results the same (out-of-order). Okay, so re-arranging the vectors in the :where
clause of the second query and using :planner :user
returns the correct result. But, this is not an efficient execution plan, it seems to me (there are more nodes with :children
than nodes which have :tg/node-4
for an attribute's value).
What to do?The order of results is not specified. Despite being returned as a seq, it’s defined as set operations, so order should not matter. (I guess this can be addressed with an :order-by
clause, like SQL has).
The reason the order is different is because the query optimizer sees the :where
clause differently, since it doesn’t know what ?child
is being bound to
That could be addressed by taking bindings to scalars (like this), and rewriting the query before the planner is executed
Hm, okay, I guess I was misled by the first couple of example results being ordered in the https://github.com/threatgrid/asami/wiki/6.-Querying#transitive-attributes. So, just to confirm I'm not misunderstanding, there is no way to obtain the ordered path using transitive attributes (in a single query)?
There should be…
Actually, I was really thinking of it as a standard pattern query, and not transitive. Looking at it, I’m surprised that there is a difference in the result. One thing I’ll note though is that even with ?child
bound, you’re asking for an unbound attribute, which is a bit of a wildcard. This causes a breadth-first search across the database, which is kind of expensive! It’s always better to bind the attribute to a value or set of values, if that’s feasible
(I realize that this isn’t the same thing. I’m just wondering that’s the reason for the lack of order)
So, we have the chain
:tg/node-3 :children :tg/node-ARRAY
:tg/node-ARRAY :tg/contains :tg/node-4
About binding, not sure how I would re-write it. The idea is to have a function which returns the ordered path back to the root from given a node.
Hmm, OK.
I would definitely want it to be evaluating the [?parents :children]
first with :planner :user
. The reason is because without it, it could find:
:tg/node-3 :children :tg/node-ARRAY
:tg/node-ARRAY :tg/contains :tg/node-4
:tg/node-ARRAY :tg/first :tg/node-4
Worse… say there was a node-5 that was also in the array along with node-4:
:tg/node-3 :children :tg/node-ARRAY
:tg/node-ARRAY :tg/first :tg/node-4
:tg/node-ARRAY :tg/rest :tg/node-ARRAY-2
:tg/node-ARRAY :tg/contains :tg/node-4
:tg/node-ARRAY :tg/contains :tg/node-5
:tg/node-ARRAY-2 :tg/first :tg/node-5
This ends up being a lot of things to traverse before it filters it down to paths from the :tg/node-3
, which occurs when the [?parents :children]
pattern is brought in. But if it starts from that pattern, then the paths are only started from those nodes that have children.
It’s possibly this variety of paths that leads to the lack of ordering, since it does breadth-first searching (meaning that it will include all the :tg/first
and :tg/rest
paths before going down the :tg/contains
path). This will lead to lots of duplicates, but they will be removed a the end. So some of the nodes could be encountered through unexpected paths, and printed early, and then they don’t get printed again when they show up in the normal path because they’ve already appeared once.I've realised I don't have a mental model of how transitive attributes work. I'll think it through. Thank you for the discussion and your help :)
Well, it would be most welcome 🙂 So, I can understand something like
[?e :name "Washington Monument"] [?e :is-in+ ?e2]
or
[?parents ?a+ :tg/node-4]
(I just go backwards, starting from node-4). That last one has 9 results, so in that way I can see your point about starting from [?parent :children]
.If I think about doing a merge join of the two result sets, ok. But if I think about resolving something like [:tg/node-3 ?a+ :tg/node-4]
, which would be a one of the steps in doing a regular join... Well, since the join is on ?parents
, :tg/node-3
will be a result of the complete query if [:tg/node-3 ?a+ :tg/node-4]
is non-empty? Not thinking algorithmically, but I'm doing better 🙂
It still seems like ordering is purely accidental, Matter of fact, by re-ordering the original transaction data, I can produce different orderings.
(I'm starting to imagine people looking forward to new Asami features are getting really annoyed with me wasting your time like this... 🙂 And I'm one of them!)
I’m not spending much time on Asami in recent weeks, but I do hope to pick up on it again after this weekend. It's not a great month though (Christmas, vacation, and all that)