#xtdb
2023-01-09
phill17:01:12

Another perspective on explicit Datalog term order. I had a query that ran fast and returned a few things too many. I added one teeny-tiny term to the where clause, just to filter the results that had already been obtained. But the query then took seemingly infinite time.

refset17:01:18

Hey @U0HG4EHMH was the filter a < or > predicate, out of interest?

refset17:01:18

If you can share the query that would be really helpful. We are keen to make sure the query planner is both reliable and generally useful, so understanding the scenarios in which it makes the wrong assumptions is important for us to make the experience nicer.

refset17:01:15

Did you work around the issue in the end?

phill10:01:20

The additional "teeny-tiny" term was [e a v], and the attribute (and that particular value for the attribute) were very common throughout the documents. So I worked around it by making the query pull that attribute and programming a post-query filter.
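
A rough sketch of that workaround in Clojure, assuming XTDB 1.x's `xtdb.api` and an already-started `node`. The attribute names, the filter value, and the single selective clause are placeholders rather than the actual query, and the recursive rule is left out.

```clojure
(require '[xtdb.api :as xt])

;; Assumes `node` is an already-started XTDB 1.x node.
;; :rare/attr, :my-att and :whatever-filter-val are placeholders.
(defn matching-docs [node]
  (->> (xt/q (xt/db node)
             '{:find [(pull ?e [:xt/id :my-att])]
               ;; only the selective part of the query stays in :where
               :where [[?e :rare/attr ?x]]})
       ;; each result tuple holds one pulled document map
       (map first)
       ;; post-query filter; assumes :my-att holds a single value
       (filter #(= :whatever-filter-val (:my-att %)))))
```

The common attribute never appears in `:where`, so it cannot influence the join order. The cost is that every candidate document is pulled before being filtered in application code.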

phill10:01:37

The query also involved a recursive rule that traced attributes that were not very common in the data. If I squint, I can imagine the planner thinking "OMG recursion, this other thing is simpler, maybe I'll start with the simple part"

refset18:01:24

Ah good to know, and yes, triple clauses kind of have this complected semantic that can make the optimizer go down the wrong path. I wonder if you might have been able to use [(get-attr ?e :my-att) [?v ...]] [(= ?v :whatever-filter-val)] instead
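
The two clause shapes side by side, as a sketch only. `:rare/attr` is a placeholder for the selective part of the query, and `:my-att` / `:whatever-filter-val` are the stand-in names from the message above. The comments describe the intended effect as understood from this thread, not a guarantee about the planner.

```clojure
;; Triple clause: the planner may treat the very common
;; attribute/value pair as a candidate starting point for the join.
{:find [?e]
 :where [[?e :rare/attr ?x]
         [?e :my-att :whatever-filter-val]]}

;; get-attr plus an equality predicate: the attribute is read for
;; an ?e that other clauses have already bound, and the comparison
;; then acts as a filter on those values.
{:find [?e]
 :where [[?e :rare/attr ?x]
         [(get-attr ?e :my-att) [?v ...]]
         [(= ?v :whatever-filter-val)]]}
```

The `[?v ...]` collection binding yields one `?v` per value of `:my-att`, so the equality check would also work if the attribute holds several values.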

refset18:01:06

actually I am just going to add this to the docs as a general tip 💡
