#xtdb
2021-04-05
seeday 21:04:10

Is it possible/reasonable to use range predicates on mixed-type fields? Mixing strings/longs causes type errors during the queries, but I don’t like wrapping it in some safe-gte function that has a try/catch because I think crux optimizes ranges and I’d prefer to keep that.

refset 21:04:31

Sadly not, since the encoding for each value is prefixed with a type id, which means the internal sorting can only be per-type. Are you consciously wanting to avoid enforcing the type of each attribute before submitting?
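To illustrate the point about type-id prefixes, here is a minimal Python sketch. It is not Crux's actual encoding: the type ids, `encode`, and `range_scan` are hypothetical names made up for this example. It only shows why a byte-sorted index whose keys start with a type id can serve a range scan within one type, and why such a scan naturally stops at the type boundary.

```python
import bisect
import struct

# Hypothetical type ids, chosen only for this illustration.
TYPE_LONG = 0x01
TYPE_STRING = 0x02


def encode(value):
    """Encode a value as bytes: a one-byte type id followed by a sortable payload."""
    if isinstance(value, bool):
        raise TypeError("bool is not handled in this sketch")
    if isinstance(value, int):
        # Offset by 2**63 so signed 64-bit values sort correctly as unsigned big-endian bytes.
        return bytes([TYPE_LONG]) + struct.pack(">Q", value + (1 << 63))
    if isinstance(value, str):
        return bytes([TYPE_STRING]) + value.encode("utf-8")
    raise TypeError(f"unsupported type: {type(value).__name__}")


def range_scan(values, lower):
    """Return every stored value >= lower that has the same type as lower."""
    entries = sorted((encode(v), v) for v in values)  # stands in for the sorted index
    keys = [k for k, _ in entries]
    lower_key = encode(lower)
    start = bisect.bisect_left(keys, lower_key)
    result = []
    for key, value in entries[start:]:
        if key[:1] != lower_key[:1]:
            break  # crossed into the next type's key range, so stop scanning
        result.append(value)
    return result


data = [5, "apple", 42, "zebra", -3, 17]
ints_from_10 = range_scan(data, 10)
strings_from_b = range_scan(data, "b")
```

In this sketch a range query for longs only ever returns longs, because all long keys sit in one contiguous block of the sorted key space and all string keys sit in another.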

seeday 22:04:55

Yeah, the data I’m working with has a pretty loose schema. I’m fine enforcing the type of the attribute on query time though - as in, I query for a range of ints, I only get back ints. I think that would work with what you said about the index?

seeday 22:04:38

I dunno if that’s reasonable behavior for Crux to have in general though. The ClassCastException is jarring but at least it’s exact.

refset 22:04:17

That's fair, we're not completely satisfied with how the range predicates work today. For instance, there's this open issue: https://github.com/juxt/crux/issues/1298

You can, however, explicitly use the actual Clojure comparison functions by using their fully-qualified names (e.g. clojure.core/>). This will work around all of Crux's Datalog range query logic, at the cost of potentially not using the indexes efficiently.

> I’m fine enforcing the type of the attribute on query time though - as in, I query for a range of ints, I only get back ints. I think that would work with what you said about the index?

I think that will be okay, yep. Just make sure to write some decent tests, otherwise I know from experience that it can easily get confusing 🙂

seeday 13:04:50

Yeah that’s an interesting one, especially the Big* because those are a pain to promote into. Is the expected index bump because the idea is to add a type/subtype column so that all number types get grouped together? Then you’d just need a suitable comparator for all those number types.

refset 14:04:50

Something like that, yep, although I don't know exactly what the solution in mind would look like. We try to avoid asking users to re-build indexes/nodes when possible
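
The idea floated in this exchange, grouping all number types under one type family and comparing them with a shared comparator, can be sketched roughly as follows. This is purely an illustration in Python, not a description of any planned Crux design: the family ids and `sort_key` are hypothetical, and exact rational conversion via `Fraction` is just one possible way to compare mixed number types.

```python
from decimal import Decimal
from fractions import Fraction

# Hypothetical type families, chosen only for this illustration.
FAMILY_NUMBER = 0
FAMILY_STRING = 1


def sort_key(value):
    """Order values by type family first, then by value within the family."""
    if isinstance(value, bool):
        raise TypeError("bool is not handled in this sketch")
    if isinstance(value, (int, float, Decimal)):
        # Fraction converts ints, finite floats and Decimals exactly,
        # so all number types share one total order.
        return (FAMILY_NUMBER, Fraction(value))
    if isinstance(value, str):
        return (FAMILY_STRING, value)
    raise TypeError(f"unsupported type: {type(value).__name__}")


mixed = [3, "pear", 2.5, Decimal("2.75"), 10**30, "apple"]
ordered = sorted(mixed, key=sort_key)
```

With a key like this, a range over numbers would cover longs, doubles and big decimals together while still never comparing a number with a string. NaN and infinite floats are not handled here, since `Fraction` rejects them.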