2020-11-26
here's something weird I'm experiencing, still on 20.11. await-tx here blocks forever:
(crux/await-tx node
               (crux/submit-tx
                node
                [[:crux.tx/put {:crux.db/id 14 :whatever/id "a"}]]))
not for `:whatever/idfoo` or `:whatever/idbar`, just if `(name attr)` is `id`, including `:id`
now, I know that some other docs with a `:foo/id` attribute have transacted into the entirely in-memory stores just fine, but at some point this seems to have broken
;; returns true
(crux/tx-committed? node
                    (crux/await-tx node
                                   (crux/submit-tx
                                    node
                                    [[:crux.tx/put {:crux.db/id 512 :oaeueoaueoa/id 1}]])))
;; blocks forever
(crux/tx-committed? node
                    (crux/await-tx node
                                   (crux/submit-tx
                                    node
                                    [[:crux.tx/put {:crux.db/id 512 :oaeueoaueoa/id "1"}]])))
;; returns true
(crux/tx-committed? node
                    (crux/await-tx node
                                   (crux/submit-tx
                                    node
                                    [[:crux.tx/put {:crux.db/id 512 :oaeueoaueoa/idd "1"}]])))
@kevin842 Have raised an issue for the attribute being `:id`: https://github.com/juxt/crux/issues/1274
Although your exact example is fixed by https://github.com/juxt/crux/pull/1273
If you try this snapshot, it should have the fixes: [juxt/crux-lucene "20.09-1.11.1-alpha-SNAPSHOT"]
works great :) I noticed something with the performance; using `or` with even a single clause slows it down dramatically. I was hoping to query two attributes in one query, but it looks like I'm better off making two `crux/q`s?
user> (quick-bench (q '{:find [?e ?v] :where [(or [(text-search :view/feed "xk*") [[?e ?v]]] ) [?e :view/id]]}))
Evaluation count : 6 in 6 samples of 1 calls.
             Execution time mean : 245.380112 ms
    Execution time std-deviation : 217.685410 ms
   Execution time lower quantile : 151.759149 ms ( 2.5%)
   Execution time upper quantile : 622.555545 ms (97.5%)
                   Overhead used : 1.788354 ns
Found 1 outliers in 6 samples (16.6667 %)
    low-severe   1 (16.6667 %)
 Variance from outliers : 83.1035 % Variance is severely inflated by outliers
nil
user> (quick-bench (q '{:find [?e ?v] :where [[(text-search :view/feed "xk*") [[?e ?v]]] [?e :view/id]]}))
Evaluation count : 108 in 6 samples of 18 calls.
             Execution time mean : 1.339781 ms
    Execution time std-deviation : 28.631001 µs
   Execution time lower quantile : 1.313759 ms ( 2.5%)
   Execution time upper quantile : 1.385600 ms (97.5%)
                   Overhead used : 1.788354 ns
Found 1 outliers in 6 samples (16.6667 %)
    low-severe   1 (16.6667 %)
 Variance from outliers : 13.8889 % Variance is moderately inflated by outliers
nil
oh, the search value can't be parameterized with `:in`?
(db/q '{:find [?id ?v]
        :in [input]
        :where [[(text-search :view/feed input) [[?e ?v]]]
                [?e :view/id ?id]]}
      "xk*")
;; throws class crux.query.VarBinding cannot be cast to class java.lang.String
resorting to this for now
(db/q
 `{:find [?id ?v]
   :where [[(~(symbol "text-search") :view/feed ~input) [[?e ?v]]]
           [?e :view/id ?id]]})
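The same query can also be assembled as plain data, which avoids the `~(symbol ...)` trick and keeps syntax-quote from namespace-qualifying the logic variables. This is only a minimal sketch: `feed-search-query` is a hypothetical helper name, and it assumes the query is otherwise the one shown above.

```clojure
;; Hypothetical helper (not part of Crux): builds the query map as data,
;; splicing the search string in with `list` instead of syntax-quote.
(defn feed-search-query
  [input]
  {:find '[?id ?v]
   :where [[(list 'text-search :view/feed input) '[[?e ?v]]]
           '[?e :view/id ?id]]})
```

The resulting map can then be passed to the thread's own wrapper, e.g. `(db/q (feed-search-query "xk*"))`.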
(q '{:find [?e ?v]
     :where [[(text-search :view/title "/") [[?e ?v]]]
             [?e :crux.db/id]]})
;; throws
2. Unhandled org.apache.lucene.queryparser.classic.ParseException
Cannot parse '/': Lexical error at line 1, column 2. Encountered: <EOF> after : ""
QueryParserBase.java: 114 org.apache.lucene.queryparser.classic.QueryParserBase/parse
lucene.clj: 135 crux.lucene/search
lucene.clj: 126 crux.lucene/search
lucene.clj: 150 crux.lucene/full-text
lucene.clj: 149 crux.lucene/full-text
lucene.clj: 163 crux.lucene/pred-constraint/pred-get-attr-constraint
query.clj: 1161 crux.query/constrain-join-result-by-constraints/fn
core.clj: 2681 clojure.core/every?
core.clj: 2672 clojure.core/every?
query.clj: 1160 crux.query/constrain-join-result-by-constraints
query.clj: 1158 crux.query/constrain-join-result-by-constraints
query.clj: 1501 crux.query/build-sub-query/constrain-result-fn
query.clj: 1512 crux.query/build-sub-query
query.clj: 1474 crux.query/build-sub-query
query.clj: 1674 crux.query/query
query.clj: 1654 crux.query/query
query.clj: 1792 crux.query.QueryDatasource/fn
query.clj: 1791 crux.query.QueryDatasource/openQuery
query.clj: 1754 crux.query.QueryDatasource/query
api.clj: 374 crux.api/eval18945/fn
api.clj: 249 crux.api/eval18783/fn/G
api.clj: 337 crux.api/q
api.clj: 331 crux.api/q
RestFn.java: 425 clojure.lang.RestFn/invoke
REPL: 15 app.db/q
REPL: 13 app.db/q
REPL: 496 app.db/eval122334
REPL: 496 app.db/eval122334
Compiler.java: 7177 clojure.lang.Compiler/eval
Compiler.java: 7132 clojure.lang.Compiler/eval
core.clj: 3214 clojure.core/eval
core.clj: 3210 clojure.core/eval
interruptible_eval.clj: 91 nrepl.middleware.interruptible-eval/evaluate/fn
main.clj: 437 clojure.main/repl/read-eval-print/fn
main.clj: 437 clojure.main/repl/read-eval-print
main.clj: 458 clojure.main/repl/fn
main.clj: 458 clojure.main/repl
main.clj: 368 clojure.main/repl
RestFn.java: 137 clojure.lang.RestFn/applyTo
core.clj: 665 clojure.core/apply
core.clj: 660 clojure.core/apply
regrow.clj: 20 refactor-nrepl.ns.slam.hound.regrow/wrap-clojure-repl/fn
RestFn.java: 1523 clojure.lang.RestFn/invoke
interruptible_eval.clj: 84 nrepl.middleware.interruptible-eval/evaluate
interruptible_eval.clj: 56 nrepl.middleware.interruptible-eval/evaluate
interruptible_eval.clj: 155 nrepl.middleware.interruptible-eval/interruptible-eval/fn/fn
AFn.java: 22 clojure.lang.AFn/run
session.clj: 190 nrepl.middleware.session/session-exec/main-loop/fn
session.clj: 189 nrepl.middleware.session/session-exec/main-loop
AFn.java: 22 clojure.lang.AFn/run
Thread.java: 832 java.lang.Thread/run
1. Caused by org.apache.lucene.queryparser.classic.TokenMgrError
Lexical error at line 1, column 2. Encountered: <EOF> after : ""
QueryParserTokenManager.java: 1119 org.apache.lucene.queryparser.classic.QueryParserTokenManager/getNextToken
QueryParser.java: 822 org.apache.lucene.queryparser.classic.QueryParser/jj_scan_token
QueryParser.java: 666 org.apache.lucene.queryparser.classic.QueryParser/jj_3R_3
QueryParser.java: 702 org.apache.lucene.queryparser.classic.QueryParser/jj_3_1
QueryParser.java: 646 org.apache.lucene.queryparser.classic.QueryParser/jj_2_1
QueryParser.java: 225 org.apache.lucene.queryparser.classic.QueryParser/Query
QueryParser.java: 215 org.apache.lucene.queryparser.classic.QueryParser/TopLevelQuery
QueryParserBase.java: 109 org.apache.lucene.queryparser.classic.QueryParserBase/parse
lucene.clj: 135 crux.lucene/search
lucene.clj: 126 crux.lucene/search
lucene.clj: 150 crux.lucene/full-text
lucene.clj: 149 crux.lucene/full-text
lucene.clj: 163 crux.lucene/pred-constraint/pred-get-attr-constraint
query.clj: 1161 crux.query/constrain-join-result-by-constraints/fn
core.clj: 2681 clojure.core/every?
core.clj: 2672 clojure.core/every?
query.clj: 1160 crux.query/constrain-join-result-by-constraints
query.clj: 1158 crux.query/constrain-join-result-by-constraints
query.clj: 1501 crux.query/build-sub-query/constrain-result-fn
query.clj: 1512 crux.query/build-sub-query
query.clj: 1474 crux.query/build-sub-query
query.clj: 1674 crux.query/query
query.clj: 1654 crux.query/query
query.clj: 1792 crux.query.QueryDatasource/fn
query.clj: 1791 crux.query.QueryDatasource/openQuery
query.clj: 1754 crux.query.QueryDatasource/query
it does not seem to like slashes in the search. done poking around for now, thanks @U050DD55V
Have raised here: https://github.com/juxt/crux/issues/1278
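A possible explanation, not confirmed in this thread: Lucene's classic query syntax reserves `/` as the delimiter for regular-expression terms, so a lone `/` reads as an unterminated regex. Below is a minimal sketch of one workaround, assuming backslash-escaping is acceptable for your search strings. `escape-slashes` is a hypothetical helper, not part of Crux or Lucene, and it deliberately leaves wildcards such as `*` untouched.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: backslash-escapes every forward slash so the Lucene
;; query parser treats it as a literal character rather than a regex delimiter.
;; With a string match and a string replacement, str/replace is literal.
(defn escape-slashes
  [s]
  (str/replace s "/" "\\/"))
```

The escaped string would then go wherever the raw search string went before, e.g. `(escape-slashes "a/b*")` in place of `"a/b*"`.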
> using `or` with even a single clause slows it down dramatically. I was hoping to query two attributes in one query, but it looks like I'm better off making two crux/qs?
In your example the contents of the `or` will get treated as a subquery that implicitly executes after its sibling clauses. This means the engine will actually be scanning through all `[e :view/id]` entities and running the text search for each. In theory the query engine could be smart enough to lift the contents of the `or`, given that it only has one leg, and avoid the scanning altogether, but I'm sure that idea would involve other trade-offs.
Is this essentially the question you are hoping to model: "Find all view entities (which necessarily have a `:view/id`) with their `:view/feed` value where that value begins with `xk*`"?
Or is it something more subtle?
@U899JBRPF: I want to find all entities that match some query with either of two attributes:
(db/q `{:find [?id ?v ?s]
        :where [(~(symbol "or")
                 [(~(symbol "text-search") :view/feed ~input) [[?e ?v ?s]]]
                 [(~(symbol "text-search") :view/title ~input) [[?e ?v ?s]]])
                [?e :view/id ?id]]})
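The "two queries" alternative mentioned earlier could look roughly like the sketch below: run one fast `text-search` query per attribute and merge the results in application code. `search-attrs` is a hypothetical helper, and `q` is assumed to be any one-argument function that takes a query map and returns a collection of result tuples (such as the `db/q` wrapper used in this thread).

```clojure
;; Hypothetical helper: one text-search query per attribute, results merged
;; into a single set. `q` is an assumed one-argument query function.
(defn search-attrs
  [q attrs input]
  (into #{}
        (mapcat (fn [attr]
                  (q {:find '[?id ?v ?s]
                      :where [[(list 'text-search attr input) '[[?e ?v ?s]]]
                              '[?e :view/id ?id]]})))
        attrs))
```

Usage would be along the lines of `(search-attrs db/q [:view/feed :view/title] "xk*")`. Note that an entity matching on both attributes shows up once per distinct `[?id ?v ?s]` tuple, so further de-duplication by `?id` may be needed.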
Thanks, will give this some thought 🙂
Please see our conclusions on querying with /
https://github.com/juxt/crux/issues/1278#issuecomment-734396067 (documentation to follow!)
This `or` should be faster: https://github.com/jonpither/crux/blob/337a443175592463dc13163d814ea6a95ff17fc3/crux-lucene/test/crux/lucene_test.clj#L230-L254 🙂
from what I can see or-text-search is searching from one of many values? My use case is maybe a little odd, in that I want to search one value from many attributes
I think the many value case might be covered with a regex? But I definitely don't have a clear view of lucene.. lots going on with the tokenizers and analyzers
Well, I don't think it's that odd of a use case really. You could think of a google search result as a title and a description that a query is run against
Ah yep, that makes sense, it would be like or-wildcard-search
I guess! The idea with that defmethod approach is you can define these search predicates as a user and make it work exactly as you wish. We've only just done the refactoring to make this possible though, and we will try to get it merged in time for the new release (within the next few days)
in summary, a doc with a tuple where `(= (name attr) "id")` and value is a string will never be indexed. My other domain entities with `:foo/id` point to uuids so they were fine, but one uses a string value for `:bar/id`
There's no explicit compression in the Crux `DocumentStore` protocol or our doc-store module implementations. However RocksDB, for instance, certainly provides a lot of useful low-level compression by default (with many config options available also). I don't have data points to offer on exactly how well Rocks or any of the other doc-store backends typically fare with our Nippy-ified documents.
I believe there's a lot of theoretical scope for improvement for doc-store implementations to implement some document-level structural sharing before any low-level compression takes place, i.e. do things more like git. Working in this area is not a near-term priority for us at the moment, but it's certainly important in the long run 🙂