#xtdb
2022-07-01
ksd02:07:57

does anyone have recommendations for how to get acquainted with the symbols and terminology used in papers such as this one https://arxiv.org/pdf/1310.3314.pdf? it is one of the Ngo papers from the XTDB bibliography

Martynas Maciulevičius05:07:29

Would you like to give some example terms?

chucklehead07:07:21

just from skimming over the paper and without knowing what background you already have, you might check out some of the videos from this playlist (specifically lecture 2 covers the relational algebra operators I think you might be referring to, and lectures 4 and 11-13 cover other relevant topics): https://www.youtube.com/playlist?list=PLSE8ODhjZXjYutVzTeAds8xUt1rcmyT7x

ksd19:07:24

this, for instance, is mostly incomprehensible to me

chucklehead20:07:25

I'll probably mess this up a little, but a couple pages before it defines Q as the join of three relations where the relational algebra translates roughly to:

SELECT R.*, S.*, T.*
FROM R NATURAL JOIN S NATURAL JOIN T;
then it's defining an intermediate relation that's basically all the (B,C)-tuples of Q for a given A value:
SELECT B, C
FROM Q
WHERE A = ?;
and comparing the number of rows of output to the number of rows in:
SELECT *
FROM R NATURAL JOIN T
WHERE A = ?;
and if the second has more rows, then that value for A is heavy.
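To make that comparison concrete, here is a minimal sketch using Python's built-in sqlite3 module. Everything in it is an assumption for illustration: the schema R(A,B), S(B,C), T(A,C), the toy rows, and the helper names q_rows, rt_count and is_heavy are all made up and do not come from the paper. The joins are written with explicit ON conditions instead of NATURAL JOIN so the column references stay unambiguous, and "heavy" follows the comparison as described above (strictly more rows), which may not match the paper's exact inequality.

```python
import sqlite3

# Toy data, made up for illustration.
# Assumed schema: R(A,B), S(B,C), T(A,C).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE R (A INTEGER, B INTEGER);
    CREATE TABLE S (B INTEGER, C INTEGER);
    CREATE TABLE T (A INTEGER, C INTEGER);
    INSERT INTO R VALUES (1, 10), (1, 11), (2, 10);
    INSERT INTO S VALUES (10, 20), (11, 21);
    INSERT INTO T VALUES (1, 20), (1, 21), (1, 22), (2, 20);
""")

def q_rows(a):
    """Return the (B, C) pairs of Q = R JOIN S JOIN T for one value of A."""
    return conn.execute(
        """
        SELECT R.B, S.C
        FROM R
        JOIN S ON S.B = R.B
        JOIN T ON T.A = R.A AND T.C = S.C
        WHERE R.A = ?
        ORDER BY R.B, S.C
        """,
        (a,),
    ).fetchall()

def rt_count(a):
    """Count the rows of R JOIN T (joined on A only) for one value of A."""
    (n,) = conn.execute(
        "SELECT COUNT(*) FROM R JOIN T ON T.A = R.A WHERE R.A = ?",
        (a,),
    ).fetchone()
    return n

def is_heavy(a):
    """Heavy here means R JOIN T has strictly more rows than Q for this A."""
    return rt_count(a) > len(q_rows(a))

for a in (1, 2):
    print(a, q_rows(a), rt_count(a), is_heavy(a))
```

With these toy rows, A = 1 has 2 rows in Q but 6 rows in R JOIN T, so it comes out heavy, while A = 2 has 1 row in each and does not.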

ksd20:07:03

thanks! I'll have to read what you just wrote a few times... definitely need to brush up on the basics, but it's slowly making more and more sense

ksd20:07:51

so Q is like all the "loops" that can be traced from A to B to C in this diagram?

ksd20:07:20

and then SELECT B, C FROM Q WHERE A = ? could be broken into WHERE A = ?, which keeps all the "loops" that include the given A, and then SELECT B, C grabs the values of B and C for all of those loops?

ksd20:07:59

and isn't R JOIN S a mistake in "In other words, the value a_i is heavy if its contribution to the size of intermediate relation R JOIN S is greater than its contribution to the size of the output"? shouldn't it be R JOIN T?

chucklehead20:07:27

I think the inequality should've been for R JOIN S instead of R JOIN T, but I would need to read the rest more carefully. Either way, pretty sure you're right that it's a mistake, and that the text and the left side of the inequality should refer to the same two relations.

ksd20:07:28

thanks!
