Fork me on GitHub
#datalog
<
2022-04-28
>
vlad_poh12:04:12

Hi All Learning datalog and trying to figure out how to do the equivalent of a full outer join In the example below I pull a summary of player1's victories over player 2

(def played 
  '[[(played ?p1 ?p2)
   [?e :winner_name ?p1]
   [?e :loser_name ?p2]]
  [(played ?p2 ?p1)
   [?e :winner_name ?p1]
   [?e :loser_name ?p2]]])

 (d/q '[:find ?t (count ?e)
        :in $ % ?p1 ?p2
        :where
        (played ?p1 ?p2)
        [?e :tourney_name ?t]
        [?e :winner_name ?p1]
        [?e :loser_name ?p2]]
      db played "Roger Federer" "Novak Djokovic")
this returns the following
(["Cincinnati Masters" 3]
 ["Dubai" 3]
 ["Australian Open" 1]
 ["Davis Cup WG PO: SUI vs SCG" 1]
 ["Canada Masters" 1]
 ["Basel" 1]
 ["Tour Finals" 4]
 ["Wimbledon" 1]
 ["Monte Carlo Masters" 3]
 ["Shanghai Masters" 2]
 ["US Open" 3]
 ["Roland Garros" 1])
then I do the reverse and i get the following
(["Cincinnati Masters" 1]
 ["Dubai" 1]
 ["Australian Open" 5]
 ["Canada Masters" 1]
 ["Paris Masters" 2]
 ["Basel" 1]
 ["Tour Finals" 4]
 ["Wimbledon" 4]
 ["Indian Wells Masters" 3]
 ["Rome Masters" 3]
 ["Miami Masters" 1]
 ["US Open" 3]
 ["Roland Garros" 1])
What i would ultimately like is a listing of wins vs losses
(["Cincinnati Masters" 3 1]
 ["Dubai" 3 1]
 ["Australian Open" 1 5]
 ["Rome Masters" 0 3]
...
How do i create a query that returns this

refset14:04:00

It's not necessarily the most efficient approach (especially depending on which DB you're discussing!) but I think this should work and hopefully give a clue on how to improve it:

(d/q '[:find ?t (count ?e) (count ?e2)
        :in $ % ?p1 ?p2
        :where
        (played ?p1 ?p2)
        [?e :tourney_name ?t]
        [?e :winner_name ?p1]
        [?e :loser_name ?p2]
        [?e2 :tourney_name ?t]
        [?e2 :winner_name ?p2]
        [?e2 :loser_name ?p1]]
      db played "Roger Federer" "Novak Djokovic")

vlad_poh14:04:24

@U899JBRPF It returns what seems like an inner join and a product of the wins and losess

(["Cincinnati Masters" 3 3]
 ["Dubai" 3 3]
 ["Australian Open" 5 5]
 ["Canada Masters" 1 1]
 ["Basel" 1 1]
 ["Tour Finals" 16 16]
 ["Wimbledon" 4 4]
 ["US Open" 9 9]
 ["Roland Garros" 1 1])

refset16:04:45

Ah of course, sorry 😅 I wasn't paying enough attention earlier and missed that your rule was already reflexive. Thinking about it again now though, I don't think you can really avoid issuing multiple queries here (or subqueries, if supported), since you can't count values that don't exist, i.e. to get ["Rome Masters" 0 3] can't work because (count ?e) needs to have at least 1 value to aggregate over (so will never produce 0!)

vlad_poh16:04:00

@U899JBRPF thanks didn't know i could do subqueries. Going to look that up now.

🙏 1
refset16:04:10

Well you can in XT at least :)

😅 1