#xtdb
2021-08-06
Samuel McHugh 10:08:15

Any recommendations on grabbing all entity ids for documents containing any key from a list of keys? My naive approach doesn't work because, I guess, you can't use a variable for the attribute keys.

(q '{:find [e]
     :in [[k ...]]
     :where [[e k _]]}
    [:a/id :b/id :c/id])

jarohen 10:08:39

> I guess you can't use a variable for the attribute keys.

That's correct, yep. If you know the set of keys ahead of query time (i.e. you're not looking to read the keys themselves from Crux), you could generate a query with an or clause (it's just a Clojure data structure).

jarohen 10:08:17

Something like this, maybe:

{:find '[?e]
 :where [(cons 'or (for [k ks]
                     ['?e k '_]))]}
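
To make the snippet above concrete, here is a minimal sketch that wraps it in a function so `ks` is an explicit argument. The name `any-key-query` is hypothetical (not part of Crux), and the sketch assumes `ks` is non-empty, since an or clause with no branches is unlikely to be a valid query.

```clojure
;; Hypothetical helper (not a Crux API): builds a Datalog query map that
;; matches any entity having at least one of the given attribute keys.
;; Assumes ks is a non-empty collection of keywords.
(defn any-key-query
  [ks]
  {:find '[?e]
   ;; (cons 'or ...) yields the seq (or [?e k1 _] [?e k2 _] ...)
   :where [(cons 'or (for [k ks]
                       ['?e k '_]))]})

(any-key-query [:a/id :b/id :c/id])
;; => {:find [?e], :where [(or [?e :a/id _] [?e :b/id _] [?e :c/id _])]}
```

The resulting map is plain data, so it can be passed to the query function in place of a quoted literal, e.g. as the query argument in the `q` call from the original question.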

refset 11:08:47

I wrote something to do this using subqueries inside a top-level query a few months back (I've just updated it after spotting a few easy improvements; still can't think of a good name though): https://gist.github.com/refset/93135adcbf41fccab9b641638ab10997

refset 11:08:04

Broadly speaking, all of the options - or-clauses / subqueries / multiple queries (when using an open-db) - should have roughly the same performance
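
As a rough illustration of the "multiple queries" option, the sketch below runs one small query per key and unions the result tuples. Both the name `ids-with-any-key` and the `run-query` parameter are hypothetical: `run-query` stands for any function that takes a query map and returns a collection of result tuples, which keeps the sketch independent of a running node.

```clojure
;; Hypothetical helper (not a Crux API): one query per key, results unioned.
;; run-query is assumed to take a query map and return a collection of
;; result tuples such as #{[:doc-1] [:doc-2]}.
(defn ids-with-any-key
  [run-query ks]
  (into #{}
        ;; run a single-clause query for each key and concatenate the tuples;
        ;; the set removes entities that matched more than one key
        (mapcat (fn [k]
                  (run-query {:find '[?e]
                              :where [['?e k '_]]})))
        ks))
```

With a real node, `run-query` would presumably close over a single db value obtained from open-db (opened and closed around the whole call), so that every per-key query sees the same snapshot.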

markaddleman 15:08:49

I've been imagining an architecture for Crux on AWS. LMDB allows for multiple processes to read a single index file. What do you think of putting a single and shared Crux index on EFS to be read by several nodes?

jarohen 15:08:01

This will likely depend on the latency of requests to the networked drive. Crux's query engine is quite chatty with the index storage, which works fine when the disk is fast and local, but I'm not sure whether the performance will be acceptable when the disk is remote. Worth a try, certainly 🙂

refset 20:08:04

Hi Mark, note that RocksDB also allows for multiple readers: https://github.com/facebook/rocksdb/wiki/Secondary-instance. I have looked at this "shared index store" concept in the context of Redis (https://github.com/crux-labs/crux-redis), and conceptually it should work correctly if you can make sure of only having a single writer, but enabling multiple writers (as would be possible with Redis) almost certainly opens the door to various complex race conditions. I have maintained a list of various related "remote KV" options in an issue here: https://github.com/juxt/crux/issues/617 (feel free to comment there too!)

markaddleman 20:08:03

It's great that you're looking at this. Thanks!
