Fork me on GitHub
#xtdb
<
2020-09-16
>
Nejc09:09:02

Hi, great work on new version :thumbsup:! Is there any update on OUTER JOIN functionality. e.g.:

{:find [foo, bar, foo_bar]
          :where [
                  [rel_0 :start foo]
                  [rel_0 :eType "RELATED_TO"]
                  [rel_0 :end bar]
                  [foo :eType "Foo"]
                  [ean :eType "Bar"]
                  [ean :id "some-id"]
                  [rel_1 :start foo]
                  [rel_1 :eType "BELONGS_TO"]
                  [rel_1 :end foo_bar]
                  [foo_bar :eType "some-group"]
                  ]
          :full-results? true}
Is it possible to return foo and bar when BELONGS_TO relationship does not exists? The idea is to get a response in the following way:
=> #{[:Foo/_some-id :Bar/_some-id null]}

refset11:09:16

Thanks - the actual release is hopefully happening tomorrow 🤞 We haven't added anything recently that makes representing OUTER JOIN type queries any simpler, but I think or-join is what you are looking for: e.g. https://github.com/juxt/crux/blob/master/crux-test/test/crux/query_test.clj#L1661 If you are able to share an example of the documents I think the intent of your query will be clearer to me and I can help construct the right query. Is ean in your query a typo? Is it supposed to be bar? If not, what does it mean?

refset11:09:48

A related issue with an example of a LEFT OUTER JOIN: https://github.com/juxt/crux/issues/835

Nejc11:09:55

yes, ean should be bar Let me prepare a test set

👍 3
🙏 3
Nejc12:09:06

With the following dataset:

crux/submit-tx node
                [[:crux.tx/put
                  {:crux.db/id :foo/some-id
                   :eType "foo"
                   :name "Foo"}]

                 [:crux.tx/put
                  {:crux.db/id :bar/some-id
                   :eType "bar"
                   :name "Bar" }]

                 [:crux.tx/put
                  {:crux.db/id :category/foo
                   :eType "category"
                   :name "Foo Category" }]

                 [:crux.tx/put
                  {:crux.db/id :category/bar
                   :eType "category"
                   :name "Bar Category" }]

                 [:crux.tx/put
                  {:crux.db/id :related_to/foo_some-id-bar_some-id
                   :eType "RELATED_TO"
                   :start :foo/some-id
                   :end :bar/some-id }]

                 [:crux.tx/put
                  {:crux.db/id :belongs_to/bar_some-id-category_foo
                   :eType "BELONGS_TO"
                   :start :bar/some-id
                   :end :category/foo }]

                 [:crux.tx/put
                  {:crux.db/id :belongs_to/category_foo-category_bar
                   :eType "BELONGS_TO"
                   :start :category/foo
                   :end :category/bar }]]
you can check that foo is RELATED_TO bar and that foo BELONGS_TO foo category and that foo category BELONGS_TO bar category Basically:
foo-RELATED_TO->bar
bar-BELONGS_TO->foo_category
foo_category-BELONGS_TO->bar_category
So if I run this query:
(crux/q (crux/db node)
        '{:find [foo, bar, c_0, c_1, b_0, b_1]
          :where [
                  [foo :name "Foo"]
                  [foo :eType "foo"]
                  [r_0 :start foo]
                  [r_0 :eType "RELATED_TO"]
                  [r_0 :end bar]

                  [b_0 :start bar]
                  [b_0 :eType "BELONGS_TO"]
                  [b_0 :end c_0]
                  [c_0 :eType "category"]

                  [b_1 :start c_0]
                  [b_1 :eType "BELONGS_TO"]
                  [b_1 :end c_1]
                  [c_1 :eType "category"]]
          :full-results? true})
I get all of those entities out. And I get non if I run:
(crux/q (crux/db node)
        '{:find [foo, bar, c_0, c_1, b_0, b_1]
          :where [
                  [foo :name "Foo"]
                  [foo :eType "foo"]
                  [r_0 :start foo]
                  [r_0 :eType "RELATED_TO"]
                  [r_0 :end bar]

                  [b_0 :start bar]
                  [b_0 :eType "BELONGS_TO"]
                  [b_0 :end c_0]
                  [c_0 :eType "unknown-category"]

                  [b_1 :start c_0]
                  [b_1 :eType "BELONGS_TO"]
                  [b_1 :end c_1]
                  [c_1 :eType "category"]]
          :full-results? true})
I would like to outer join the first part with the category part to still get foo and bar . I'm now trying to build that outer join query with the help of issue/835

Nejc12:09:31

@U899JBRPF I think I'm gonna need some help here. That's as close as I got:

(crux/q (crux/db node)
        '{:find          [e x]
          :where         [[e :name "Foo"]
                          [e :eType "foo"]
                          [e :crux.db/id e]
                          (or-join [e x]
                                   [x :start e]
                                   ;[x :eType "RELATED_TO"]
                                   (and [(identity :none) x]
                                        (not [e :name])))]
          :full-results? true})
As soon as I add another where statement in or-join (un-commenting the "RELATED_TO" statement) I get an error. I than started looking into into this example, but I can't figure out how to apply this to my case:
{:find [name planet id0]
 :where [(or (and
               [id :name name]
               [id :planet planet]
               [id0 :name name0]
               [id0 :climate climate]
               [(= planet name0)])
             (and
               [id :name name]
               [id :planet planet]
               (not-join [id]
                         [id :name name]
                         [id :planet planet]
                         [id0 :name name0]
                         [id0 :climate climate]
                         [(= planet name0)])
               [(identity :none) climate]
               [(identity :none) id0]
               [(identity :none) name0]))]}

Nejc12:09:14

Are there any good tutorials on datalog or documentation which could be studied.. I was looking at this https://docs.datomic.com/on-prem/query.html

refset15:09:06

Hey again, sorry for the delay, you can add another no-op leg like so:

'{:find [e x]
   :where [[e :name "Foo"]
           [e :eType "foo"]
           [e :crux.db/id e]
           (or-join [e x]
                    [x :start e]
                    (and [x :eType "RELATED_TO"]
                         [(identity e)])
                    (and [(identity :none) x]
                         (not [e :name])))]
   :full-results? true}
But I'm not really sure what you want to be returned still. Can you show me a couple of expected results? In terms of explanatory resources, we're still very early on in the journey of producing advanced tutorials for our query language. Fortunately the capabilities are very similar to the other Clojure Datalog engines so most of the existing resources out there on the web apply quite happily.

refset15:09:44

Also, I'm not sure if you're signed up to our Zulip org or not (see the link in the topic above this channel), but the Zulip threading model makes it much easier to read and discuss code compared to Slack threads. I'm happy to continue here if you are though 🙂

Nejc11:09:49

Sorry for my late reply, I was unfortunately AFK few days. As you suggested I joined Zulip and posted my question there also tagging you along. Thank you again for all the efforts!

🙏 3