Fork me on GitHub
#datomic
<
2023-06-09
>
favila21:06:20

Is there any supported way to encourage a database datasource to use a particular index? I just wrangled with a query where I had the pattern [?e :ref-attr ?v] and both ?e and ?v are bound, and it chose VAET then unified against ?e, which was…not the right choice (4 million rows vs 1 row). My workaround was to use (datomic.api/datoms $ :aevt :ref-attr ?e ?v) to force the index used, but wondered if there was a less hacky way

favila21:06:53

And honestly it still wasn’t great because it migrated the clause down further than I wanted

dazld12:06:41

pass that index as datoms, perhaps? not sure

dazld12:06:12

not sure I like this idea much ^

Keith18:06:42

Hey @U09R86PA4, just trying to understand the scenario here. The following information would be very helpful to give me enough context to understand what's going on here: • The query you were running • https://docs.datomic.com/pro/api/query-stats.html + https://docs.datomic.com/pro/api/io-stats.html • Cardinalities of values for respective E/A (rough estimate is fine) • Schema for attributes used in the query To answer your question, boundness of your logic variables (along with other criteria) is what determines which index is consulted for answers, and you there currently isn't a way to influence which index the query engine consults. If you've encountered a scenario where the decision made by the query engine is highly insufficient, I think that's a very interesting scenario that I really want to better understand.

favila20:06:47

It’s a lot for me to put together. I can paste you the relevant stuff I still had in a buffer, it’s really quite straightforward

favila20:06:44

{:find [?wfs-idx ?story-pos ?entity-id],
 :in [$ % [?type ...] [[?wfs-idx ?wfs]]],
 :where [[?wfs :workflow-state/type ?type]
         [(my-ns/wfs->story+pos $ ?wfs) [[?entity-id ?story-pos]]]]}

favila20:06:53

is the query

favila20:06:59

?wfs and ?type are both bound

favila20:06:08

This is the choice it made:

favila20:06:17

:clauses [{:clause [(ground $__in__3) [?type ...]],
           :rows-in 0,
           :rows-out 3,
           :binds-in (),
           :binds-out [?type],
           :expansion 3}
          {:clause [?wfs :workflow-state/type ?type],
           :rows-in 3,
           :rows-out 4312467,
           :binds-in [?type],
           :binds-out [?wfs],
           :expansion 4312464}
          {:clause [(ground $__in__4) [[?wfs-idx ?wfs]]],
           :rows-in 4312467,
           :rows-out 9,
           :binds-in [?wfs],
           :binds-out [?wfs ?wfs-idx]}

favila20:06:29

Note the expansion to 4million rows

favila20:06:59

this is a cardinality-one ref attribute

favila20:06:09

the trouble is ?type is an enum entity shared by the entire db

favila20:06:14

?wfs is a particular record

favila20:06:39

if ?wfs is known, then AEVT is 1 row

favila20:06:53

would changing the argument order of :in influence this?

favila20:06:07

so it does the third clause first instead of what it just did here?

Keith20:06:56

Oh very interesting! :bind-in [?type] implies that the result set contained 3 rows consisting of only ?type #{[?type] [?type] [?type]} Do you observe different behavior if you pass ?wfs in directly like this: :in [$ % [?type ...] ?wfs-idx ?wfs] ?

favila20:06:16

I’m not sure what you mean by “directly”, it’s already an in-var; your proposal just removes the destructuring?

favila20:06:37

I have many (but not 4 million!) ?wfs

Keith20:06:37

Correct - removing the destructuring. "directly" wasn't the best word to describe that

favila20:06:47

I think 8 in the query I was testing

Keith20:06:12

You might also be able to force the query engine to bind ?wfs early by placing a ground clause in the beginning (just to better understand if the delayed binding is what's causing the problematic join behavior). Something like: [(identity ?wfs) ?wfs2] then use ?wfs2 in subsequent clauses Obviously this wouldn't be a solution, but more to better understand what we're seeing here

favila20:06:40

I’m going to have to get back to you on that, but thanks for the new things to try

👍 2
Keith20:06:59

Just fyi for when you get some time to test: I did a small test just now and it seems to have the effect we were (or I was) hoping to achieve. Curious to hear your findings.