#xtdb
2021-08-11
Tuomas 07:08:37

I'm trying to do full-text search with both leading and trailing wildcards. It seems the Lucene module only supports trailing wildcards, so I'm using regular queries instead. The problem is that when I combine a limit with a predicate, the limit has little effect: it seems that all the docs with the attribute used in the predicate have to be realised anyway.

(time (q '{:find [e s]
           :in [search]
           :where [[e :person/last-name s]
                   [(clojure.string/lower-case s) lcs]
                   [(clojure.string/includes? lcs search)]]
           :limit 10}
         "est")) ; "Elapsed time: 6203.292439 msecs"
(time (q '{:find [e s]
           :in [search]
           :where [[e :person/last-name s]
                   [(= s search)]]
           :limit 10}
         "test")) ; "Elapsed time: 6238.996598 msecs"
(time (q '{:find [e search]
           :in [search]
           :where [[e :person/last-name search]]
           :limit 10}
         "test")) ; "Elapsed time: 1.007032 msecs"
Is there a way to check the predicate while realising rows and stop once the limit is reached?

Tuomas 08:08:57

I guess open-q is my best bet, at least if the limit fills up fast.

(for [search ["aateli" "est"]]
    (let [n 10]
      (time
       (with-open [res (crux/open-q (crux/db @node)
                                    '{:find [e s]
                                      :where [[e :person/last-name s]]})]
         (reduce (fn [ac [e s]]
                   (if (string/includes? s search)
                     (if (= (count ac) (- n 1))
                       (reduced (conj ac [e s]))
                       (conj ac [e s]))
                     ac))
                 []
                 (iterator-seq res)))))) ; "Elapsed time: 2.488118 msecs" "Elapsed time: 2001.318958 msecs"
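
The same early-exit idea can probably be written more compactly with transducers. This is only a sketch: `take-matching` is a hypothetical helper name, and the in-memory rows stand in for whatever `(iterator-seq res)` would yield. Unlike the reduce above, it lower-cases both sides so the match is case-insensitive, which is an assumption about what is wanted. The `take` transducer stops the reduction once `n` matches are found, although a chunked source may still realise a few rows beyond the last match.

```clojure
(require '[clojure.string :as string])

(defn take-matching
  "Hypothetical helper: returns up to n [e s] rows whose s contains
  search (case-insensitively), stopping the reduction once n are found."
  [n search rows]
  (let [needle (string/lower-case search)]
    (into []
          (comp (filter (fn [[_ s]]
                          (string/includes? (string/lower-case s) needle)))
                (take n))
          rows)))

;; In-memory stand-in for the rows a streaming query would produce.
(take-matching 2 "est" [[1 "Testman"] [2 "Smith"] [3 "West"] [4 "Best"]])

;; Assumed usage against the node from the snippet above (not run here):
;; (with-open [res (crux/open-q (crux/db @node)
;;                              '{:find [e s]
;;                                :where [[e :person/last-name s]]})]
;;   (take-matching 10 "est" (iterator-seq res)))
```

As with the reduce version, this only helps when matches turn up early in the scan; a rare search term still walks most of the attribute.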