Fork me on GitHub
#xtdb
<
2023-09-15
>
Martynas Maciulevičius18:09:46

I'm trying to read documentation on subqueries because I want to write an aggregate subquery with a default value: https://docs.xtdb.com/language-reference/datalog-queries/#where-subqueries But these subqueries are very basic. I tried "just writing" the query but then I don't know how to deal with a list of tuples properly because then I still don't know how to do the default value for an aggregation. Do you think it's better if I'd do several separate queries instead? I want to calculate one value for each record row.

Martynas Maciulevičius19:09:30

I already implemented it in the basic query way. The subquery is too much for me.

refset08:09:08

Hey @U028ART884X you should be able to achieve just about anything within a single Datalog query, but there are definitely some complexity/verbosity hoops (and potentially performance hoops) when attempting to do so. If you can share some sort of ~minimal example of what you're trying to achieve I would be happy to help offer solutions

Martynas Maciulevičius08:09:17

I was trying to fetch a list of documents and sum its likes in one query. So the outer query is a normal one but the inner query can have no registered likes and I want that case to default to zero. This is my summation query:

{:find (sum vote-value)
 :where [[vote :vote/value vote-value]
         [vote :vote/target-id target-id]]
 :in [target-id]}
Then after this query I do this: (or vote-result 0) And the outer query just queries the list of these target-id docs. So currently I run these small queries separately because it's less complicated and I don't know how to handle missing vote values.

refset22:09:04

I think you can do something like replace the current :where vector with:

[(or-join [vote-value]
          (and [vote :vote/value vote-value]
               [vote :vote/target-id target-id])
          [(identity 0) vote-value])]

Martynas Maciulevičius09:09:07

I tried this. And I think there is a problem. My vote/value is a number -1, 0 or 1. If there are more than two votes then or-join returns a set of results which means that I can't sum them up and produce a vote count. It's because my intermediate result with or-join becomes this #{-1 0 1} but not this: [-1 -1 -1 0 1] Currently my additional query works well. This also seems to work:

(sum ?vote-value)
...
                         '(or (and [?vote :vote/target-id ?target]
                                   [?vote :vote/value ?vote-value])
                              (and [(identity nil) ?vote]
                                   [(identity 0) ?vote-value]))
But if I omit sum then it doesn't return a set of values and instead returns distinct ?vote-value list.

Martynas Maciulevičius10:09:32

I refactored my queries to use the or but I think they're a little less performant now. I'll see what happens.

refset10:09:28

I think you should be able to force it to not deduplicate by returning the vote entity also (and then just don't use it)

refset10:09:46

or-join is good for performance if or is slow - it's just a matter of making sure which vars are in scope - you could also consider using a rule instead, which allows for binding hints to ensure best performance

👍 1
Martynas Maciulevičius10:09:28

Ok. Now I understand how to write or-join 😄

🙌 1