#xtdb
2020-12-03
seeday15:12:57

Is there a way to join against the keys of a map? It looks like eql/project is too explicit about the fields it wants and I can't quite get the obvious [(?es :crux.db/id (set (keys ?m)))] to work

jarohen15:12:17

Hey @U012QF1QVQV - would you mind posting an example? I'm not quite sure what you're aiming to do

seeday15:12:19

Yeah sure, so given something like

(crux/submit-tx node [[:crux.tx/put {:crux.db/id :linka :relations {:linkb "relation descriptor"
                                                                    :linkc "relation descriptor 2"}}]
                      [:crux.tx/put {:crux.db/id :linkb :value "hello"}]
                      [:crux.tx/put {:crux.db/id :linkc :value "world"}]])
I'd expect to be able to run a query like
(crux/q (crux/db node) '{:find [?e ?e2] :where [[?e :crux.db/id :linka]
                                                [?e :relations ?ls]
                                                [(?e2 :crux.db/id (set (keys ?ls)))]]})
but it claims that ?e2 is undefined, which I guess is true, but declaring it with something like [?e2 :crux.db/id _] just causes it to return the whole db

seeday15:12:42

It would be super easy to do this in two queries, but I'd obviously like to do it in one

seeday15:12:15

The docs suggest expanding the map into something like {:relations/linkb "value" :relations/linkc "value"} but then you'd have to specify the names of those relations explicitly in an eql/project

seeday23:12:15

I ended up coming up with this query, but obviously this does a full table scan of ids for contains? instead of a nice hashed lookup

(crux/q (crux/db node)
        '{:find [(eql/project ?e [*]) (eql/project ?e2 [*])]
          :where [[?e :crux.db/id :linka]
                  [?e :relations ?ls]
                  [?e2 :crux.db/id _]
                  [(contains? ?ls ?e2)]]})

seeday23:12:29

And the expected alternative of {:find [?e (eql/project ?e2 [*])] :where [[?e :crux.db/id :linka] [?e :relations ?ls] [?e2 :crux.db/id (set (keys ?ls))]]} with or without the (set (keys )) both just return an empty map

jarohen03:12:30

If you'd like to be able to model properties on the edge, you might consider having an entity for the edge itself

(import 'java.util.UUID)

(let [rel-ab (UUID/randomUUID)
      rel-ac (UUID/randomUUID)]
  (crux/submit-tx node [[:crux.tx/put {:crux.db/id :linka}]
                        [:crux.tx/put {:crux.db/id rel-ab
                                       :descriptor "relation descriptor"
                                       :source :linka
                                       :destination :linkb}]
                        [:crux.tx/put {:crux.db/id rel-ac
                                       :descriptor "relation descriptor 2"
                                       :source :linka
                                       :destination :linkc}]
                        [:crux.tx/put {:crux.db/id :linkb :value "hello"}]
                        [:crux.tx/put {:crux.db/id :linkc :value "world"}]]))
you can then query this with
(crux/q (crux/db node)
        '{:find [(eql/project ?e [*])
                 (eql/project ?e2 [*])]
          :where [[?e :crux.db/id :linka]
                  [?rel :source ?e]
                  [?rel :destination ?e2]]})

jarohen03:12:29

in this, I've assumed:
* your edges are directed - if not, then a set of :nodes #{:linka :linkb} on the relation would work better, with [?rel :nodes ?e], [?rel :nodes ?e2], [(not= ?e ?e2)] in your :where clause
* you can have multiple edges between the same two nodes - if not, you might want to prevent multiple edges by using a map as the entity id for the edge, like {:e1 :linka, :e2 :linkb} (ensuring that e1 < e2), which will ensure uniqueness
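
As a rough sketch of the undirected, one-edge-per-pair variant described above: the helper names edge-id and edge-doc are made up for illustration, and it assumes node ids are keywords (so they can be sorted). It only builds plain data, which you would then pass to crux/submit-tx and crux/q as in the earlier snippets.

```clojure
;; Hypothetical helper (not part of Crux): a canonical map id for an
;; undirected edge. Sorting the endpoints gives e1 < e2, so the same pair
;; always produces the same id, whichever order it is passed in.
(defn edge-id [a b]
  (let [[e1 e2] (sort [a b])]
    {:e1 e1 :e2 e2}))

;; Hypothetical helper: the edge document, with both endpoints in a set
;; under :nodes so either end can be matched by [?rel :nodes ?e].
(defn edge-doc [a b descriptor]
  {:crux.db/id (edge-id a b)
   :nodes #{a b}
   :descriptor descriptor})

;; Transaction operations as plain data, ready for crux/submit-tx.
(def edge-tx-ops
  [[:crux.tx/put (edge-doc :linka :linkb "relation descriptor")]
   [:crux.tx/put (edge-doc :linka :linkc "relation descriptor 2")]])

;; Query as plain data, ready for crux/q: everything that shares an edge
;; with :linka, excluding :linka itself.
(def neighbours-query
  '{:find [?e2]
    :where [[?e :crux.db/id :linka]
            [?rel :nodes ?e]
            [?rel :nodes ?e2]
            [(not= ?e ?e2)]]})
```

Because the id is derived from the sorted pair, putting the "same" edge again (in either direction) overwrites the existing document rather than creating a second edge.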

seeday16:12:48

Yeah that makes sense - I was trying to get away from having edge entities as well, but I think it's the cleanest solution so far

practicalli-johnny17:12:28

Any examples of using Crux in the general realm of data science, time series, etc? I'm learning more about data science and wondered if there were use cases / stories about where Crux has been used in this context. I found the movie database for the reClojure workshop very useful in helping me get started with Crux. Thanks.

jjttjj22:12:27

I looked into time series things with crux briefly a while ago. This issue might be informative: https://github.com/juxt/crux/issues/129 As well as the ts_*.clj examples here: https://github.com/juxt/crux/tree/master/crux-bench/src/crux/bench

jjttjj22:12:05

for what it's worth I'm pretty sure that crux isn't particularly ideal as a time series database (nor is it designed to be), though you could definitely make it work for small datasets. I'm very far from an expert on crux and would be happy to be wrong :)

jjttjj22:12:29

And it may well be good for other kinds of data science after outgrowing csvs

practicalli-johnny22:12:50

Thanks for sharing your thoughts and the link, very helpful.

refset22:12:04

Sorry to chime in late - but yeah Crux isn't competitive for classic time series analysis today, due to the structure of the current indexes and because there is no columnar compression going on. The ts_devices and ts_weather tests also demonstrate very clearly that running point-in-time Datalog queries is not a very concise way of doing analysis across time 🙂

practicalli-johnny23:12:44

Thanks, that’s really useful to know.

🙏 3