This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2021-08-16
Hey guys. I am trying to construct a function that would let me pass a vector of attribute-value tuples and build a query from those, but I am getting a lot of complaints from the query spec. Any ideas on how to achieve this?
(defn q
  [where-clauses]
  (crux/q
   (crux/db node)
   '{:find [(pull ?i [*])]
     :where (mapv #(vec (cons '?i %)) where-clauses)}))
(q '[[:id ?id] [:sub-type :start]])
Hey @U09UV3WP6 🙂 I think you just need to adjust your quoting so the mapv gets evaluated before handing over to crux/q, i.e.
(defn q
  [where-clauses]
  (crux/q
   (crux/db node)
   {:find '[(pull ?i [*])]
    :where (mapv #(vec (cons '?i %)) where-clauses)}))
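To illustrate the quoting point without a running node, here is a minimal sketch in plain Clojure. The names `quoted-query` and `build-query` are hypothetical and only for illustration. Quoting the whole map leaves the `mapv` form as unevaluated data, whereas quoting only the `:find` vector lets `mapv` run and produce real triple clauses:

```clojure
;; Whole map quoted: the :where value is the unevaluated list (mapv ...),
;; which is not a valid vector of clauses, hence the spec complaints.
(def quoted-query
  '{:find [(pull ?i [*])]
    :where (mapv #(vec (cons '?i %)) where-clauses)})

(first (:where quoted-query))
;; => mapv

;; Only the :find vector quoted: mapv is evaluated before the query is used.
(defn build-query
  [where-clauses]
  {:find '[(pull ?i [*])]
   :where (mapv #(vec (cons '?i %)) where-clauses)})

(build-query '[[:id ?id] [:sub-type :start]])
;; => {:find [(pull ?i [*])], :where [[?i :id ?id] [?i :sub-type :start]]}
```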
Is there an explanation for why using `pull [*]` to pull a large set of documents is slower than mapping over ids and calling `(crux.api/entity conn id)`?
Well, certainly `pull` has the full machinery of the query engine surrounding it (unlike `entity`), but ideally it wouldn't be that much slower. Some of the overhead could be down to all the set comparisons involved, because a query will naturally return a set (I guess you could try `open-q` to see if that's faster). Can you say roughly how large the result set is in both count and KB?
~30 records actually, not huge
It was a difference between ~200 ms (mapping over `entity`) vs 2 seconds
the query was doing the `(cons 'or (map ...))` trick suggested above for matching on a set of ids
ah, if there's an `or` clause with many legs involved then the slowness is a lot more understandable, as each leg will generate intermediate set operations (and subqueries, behind the scenes). Have you tried supplying the ids via an `:in` binding? e.g. https://opencrux.com/reference/1.18.0/queries.html#_collection_binding
you could also pass the ids in via a literal set directly in a clause like https://github.com/juxt/crux/blob/1619657cb472179bb57a6beb8d774f5673014d41/crux-test/test/crux/query_test.clj#L1037-L1040
I've tried unsuccessfully with an `:in` binding
I will try the `(==` approach
if you can share the query I'd be happy to help get it working. It should look roughly like this: https://github.com/juxt/crux/blob/1619657cb472179bb57a6beb8d774f5673014d41/crux-test/test/crux/query_test.clj#L268-L273
the use of `:in [$ [name ...]]` in the documentation is confusing. is that a literal `...` or is it implying that I would care about other fields?
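For reference, a small sketch of the collection binding form, on the assumption that it follows the linked 1.18.0 documentation: the `...` is a literal symbol inside the binding vector, and the collection itself is passed as an extra argument to the query call. The name `docs-by-id-query` and the ids in the usage comment are made up for illustration.

```clojure
;; The query is plain data; `...` is an ordinary symbol marking a
;; collection binding, so ?doc-id is bound to each element in turn.
(def docs-by-id-query
  '{:find [(pull ?doc [*])]
    :in [[?doc-id ...]]
    :where [[?doc :crux.db/id ?doc-id]]})

;; Hypothetical usage, assuming `node` is a started Crux node and
;; `crux` aliases crux.api:
;; (crux/q (crux/db node) docs-by-id-query [:id-1 :id-2])
```

Note that the bound variable has to appear in a `:where` clause for the binding to constrain the results.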
the `==` predicate in that other example is orthogonal, the main point is that you can embed sets directly in the `e` or `v` position of a triple clause (there wasn't a test showing this directly, sorry 🙂)
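A rough sketch of what embedding a set in the `v` position could look like when the ids are only known at runtime. The helper name `docs-by-ids-query` and the example ids are hypothetical; only the set-in-a-triple idea comes from the message above.

```clojure
;; Build the query map at runtime, placing the ids as a set literal
;; in the value position of the triple clause.
(defn docs-by-ids-query
  [ids]
  {:find '[(pull ?doc [*])]
   :where [['?doc :crux.db/id (set ids)]]})

(docs-by-ids-query [:a :b])
;; => {:find [(pull ?doc [*])], :where [[?doc :crux.db/id #{:a :b}]]}

;; Hypothetical usage, assuming `node` is a started Crux node and
;; `crux` aliases crux.api:
;; (crux/q (crux/db node) (docs-by-ids-query [:a :b]))
```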
I can get you the exact query in a couple moments
Here's an abbreviated version:
{:find [?doc],
 :in [$ [?doc-id ...]],
 :where
 [[?doc
   :crux.db/id
   #{:RIDzY97NM6RLLGMqiAax9HO
     :RIDzXptwzORzPnJTUHxAPbC
     :PAS0SWNaJSZLHNkMJYlfDzR
     :PAS0S9GI8PM10KMWcZdmsPF
     :PAS0SYMjiLyrt7irJNnRHBf
     :0QyNgGPNKYywttc30I3C
     :RIDzXYos2NLq8sb6z7BSQI3
     :RIDzXnztkT9FFcrFAtIe6xc
     :RIDzXn46mrvs1Z1sC0jYMDk
     :0RPXWayax0danE62OVdv}]]}
these are valid crux ids and I get no results for this
I will try the == predicate
Ah I see I did something different in my implementation on that so that explains the no results issue.
What's the upper limit on these operators? Can I pass thousands of values in the set?
This works and takes 1615ms to run:
{:find [?doc],
 :where
 [[?doc :crux.db/id ?doc-id]
  [(==
    ?doc-id
    #{:RIDzY97NM6RLLGMqiAax9HO
      :RIDzXptwzORzPnJTUHxAPbC
      :PAS0SWNaJSZLHNkMJYlfDzR
      :PAS0S9GI8PM10KMWcZdmsPF
      :PAS0SYMjiLyrt7irJNnRHBf
      :0QyNgGPNKYywttc30I3C
      :RIDzXYos2NLq8sb6z7BSQI3
      :RIDzXnztkT9FFcrFAtIe6xc
      :RIDzXn46mrvs1Z1sC0jYMDk
      :0RPXWayax0danE62OVdv})]]}
(mapv #(crux/entity (crux/db crux-node) %)
      id-or-ids)
this takes 75ms
> What's the upper limit on these operators? Can I pass thousands of values in the set?
Interesting question... I'd guess it's just a question of heap space.
To keep things like-for-like, how long does this take:
{:find [(pull ?doc [*])],
 :where
 [[?doc :crux.db/id
   #{:RIDzY97NM6RLLGMqiAax9HO
     :RIDzXptwzORzPnJTUHxAPbC
     :PAS0SWNaJSZLHNkMJYlfDzR
     :PAS0S9GI8PM10KMWcZdmsPF
     :PAS0SYMjiLyrt7irJNnRHBf
     :0QyNgGPNKYywttc30I3C
     :RIDzXYos2NLq8sb6z7BSQI3
     :RIDzXnztkT9FFcrFAtIe6xc
     :RIDzXn46mrvs1Z1sC0jYMDk
     :0RPXWayax0danE62OVdv}]]}
I'm not sure I understand how your `==` example is compiling, let alone returning results, as the predicate should need wrapping in a vector 🤔
Yeah I fixed it after pasting
Ah, that's more like it! The query engine batches doc fetches so is probably able to be a little faster than individual entity calls
IMO the use of set literals should be on the documentation. Comparing a value against a known set is more common than comparing with an explicit value "Ivan"
> use of set literals should be on the documentation.
fixed (pending release): https://github.com/juxt/crux/commit/6be07b23e636e83293297213ec32028f1d84267f#diff-868d3597c7324a08da9c6f15712d4d98972d2f582e9b1314dcc0c53b5c096fc5R276
thanks for the prompt!
Is there a way to stop a Jetty server started with `:crux.http-server/server`?
I use the reloaded pattern and I do call `.close` on the node on stop, but it seems the HTTP server is still running on restart and I get back an `Address already in use` error