#xtdb
2021-05-30
rschmukler15:05:02

Hey guys. I’m working on a custom indexer for Crux, based somewhat loosely on the code in crux-lucene. I’m still getting familiar with the vocabulary and architecture used in Crux, so please forgive me if I ask a silly question. I believe I may have encountered an issue where the query planner doesn’t use the logic variables returned from a predicate constraint to join more efficiently. Mapping this back onto crux-lucene, it’d be something like the following (note: I haven’t tested this issue in crux-lucene, but I believe it would have the same behavior):

{:find [?a-name ?b-age]
 :limit 5
 :where [[_ :person/name ?a-name]
         [(text-search :person/name ?a-name) [[?b]]]
         [?b :person/age ?b-age]]}
In a node with ~800K records this query times out. I believe it’s because it first uses the triple store to determine all records for ?b, instead of taking the :limit into account and then joining later… but I’m not too sure.

nivekuil15:05:14

actually yeah, :limit always happens last afaik

nivekuil15:05:33

what are you trying to take the first 5 of? probably have to have an attribute that supports range queries in there first

rschmukler15:05:52

The example is, perhaps, a bit contrived. But basically I could imagine displaying a page of records to a user, traversing via a predicate w/ a custom index, and then wanting a nested property off of the joined record

nivekuil15:05:21

I think the custom index clause would have to be a subquery then?

✔️ 3
rschmukler15:05:18

That’s an interesting thought, I’ll look into that!
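
For reference, a minimal and untested sketch of what the subquery idea might look like. It assumes Crux's `q` subquery predicate and the crux-lucene style `text-search` predicate from the example above. The search term "Ivan" is a placeholder, and whether the planner actually evaluates the subquery before the outer triple clause is an assumption, not something confirmed in this thread:

```clojure
;; Sketch only: not tested against crux-lucene or a custom index.
;; The :limit lives inside the subquery, so at most 5 ?b values
;; should be handed to the outer join on :person/age.
{:find [?b ?b-age]
 :where [[(q {:find [?b]
              :limit 5
              ;; "Ivan" is a placeholder search term
              :where [[(text-search :person/name "Ivan") [[?b]]]]})
          [[?b]]]
         [?b :person/age ?b-age]]}
```

The intent is that the limited result set from the custom index drives the join, rather than every entity with a :person/age.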

rschmukler15:05:02

It’d be cool if the query planner could potentially be informed by the number of results anticipated from a predicate so that it could use that when planning the join order…

rschmukler15:05:15

E.g. Lucene queries that have a finite limit specified, etc.

nivekuil15:05:35

yeah I could really use that too. I wonder if the new hll attribute counter path is generic enough to extend to our own counters

rschmukler15:05:05

It does indeed look like it uses the two triples first and then attempts to join afterwards

rschmukler15:05:41

I also wonder if predicates could potentially have the opportunity to modify join order…

rschmukler15:05:34

I figured I’d come here before logging an issue on GH to make sure I could describe it in a useful way