#xtdb
2022-05-16
Hukka06:05:47

https://github.com/xtdb/xtdb/pull/1054
> the transaction function is pure and deterministic (we can replace a call to such a function with its result)
But is it? While the function's own code is in the DB, it can call libraries that might not be the same version or implementation at different times or on different nodes?

tatut06:05:39

I've thought that it is "up to you" to make sure that you don't have tx functions that call code that changes with time

✔️ 2
refset08:05:12

^ That's correct, XT doesn't attempt to analyse a transaction function for purity, and instead relies on the user to make this determination before using it.

Hukka08:05:23

I was actually a bit surprised that the function cannot for example generate new entities with random uuids

Hukka08:05:25

I kinda assumed that they are more direct counterparts to stored procedures in SQL dbs. While of course they are run on the nodes, I hadn't realized that they might be run many times, on multiple nodes, and they need to return the same result

👍 1
refset08:05:25

> Perhaps worth mentioning in [the docs]
Agreed, that is definitely a good idea - I'll add a note to the board so we/I don't forget. And yep, generating a UUID inside a function would definitely count as an impure side-effect that undermines the overall consistency.
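
(Editor's sketch.) One way to keep a transaction function deterministic while still creating entities with random ids is to generate the UUID at the call site and pass it in as an argument. The names `:create-with-id`, `new-entity-ops` and `create-entity-tx` are hypothetical. The tx ops are written with fully qualified keywords (`:xtdb.api/put` is what `::xt/put` expands to), so the snippet runs without XTDB on the classpath; nothing is actually submitted to a node here.

```clojure
;; Sketch only: all names below are illustrative, not part of XTDB.

(defn new-entity-ops
  "Pure: the same id and attrs always produce the same tx ops."
  [id attrs]
  [[:xtdb.api/put (assoc attrs :xt/id id)]])

;; The document that would be stored (via a put) to install the function.
;; The body is quoted data; it receives the id instead of generating one.
(def create-with-id-fn-doc
  {:xt/id :create-with-id
   :xt/fn '(fn [ctx id attrs]
             [[:xtdb.api/put (assoc attrs :xt/id id)]])})

(defn create-entity-tx
  "Impure on purpose: the random UUID is generated here, outside the
   transaction function, and travels with the transaction as an argument."
  [attrs]
  (let [id (java.util.UUID/randomUUID)]
    [[:xtdb.api/fn :create-with-id id attrs]]))
```

Because the id is part of the submitted transaction, every node that evaluates the function sees the same argument and returns the same ops.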

Hukka11:05:51

Is it possible to do cascading, atomic deletes? So if you have typeA with a many-to-many relationship with typeB, with the modelled relationship as its own type/document, you could make sure that when one A or B is deleted, all the many-to-many links are deleted too? If this is not possible, I think we have to always ::xt/match for A and B too, not just the link, to make sure that we are not operating on dangling relationships
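
(Editor's sketch.) The "always ::xt/match for A and B" approach could look roughly like this: the transaction that writes a link also asserts that both endpoints still hold the documents the caller last read, so the whole transaction is rejected if either has changed or been deleted in the meantime. `guarded-link-ops`, `:link/a` and `:link/b` are hypothetical names, and the ops are plain data using fully qualified keywords so the snippet runs without XTDB.

```clojure
;; Sketch only: guarded-link-ops, :link/a and :link/b are illustrative names.

(defn guarded-link-ops
  "Builds a transaction that puts a link document only if both endpoint
   documents are still exactly what the caller read earlier."
  [a-doc b-doc link-id]
  [[:xtdb.api/match (:xt/id a-doc) a-doc]
   [:xtdb.api/match (:xt/id b-doc) b-doc]
   [:xtdb.api/put {:xt/id  link-id
                   :link/a (:xt/id a-doc)
                   :link/b (:xt/id b-doc)}]])
```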

Hukka11:05:52

Maybe more concretely, if we delete an instance of type A, and want to delete all links too, how can we make sure that nobody added more links between querying for existing links, and issuing the delete transaction?

Hukka11:05:53

Hm, perhaps this is the wrong way to look at it. In my code handling type A, I have no idea what other places might be linking to A. I cannot do the deletion reliably

Martynas Maciulevičius11:05:13

Did you try transaction functions?

Hukka11:05:49

How would the code handling the deletion know what all functions to "call"?

Martynas Maciulevičius11:05:58

You would write your own transaction function from which you would return ::xt/delete operations that would delete your docs.

Hukka11:05:04

Yes, but how would the namespace that handles creating and removing type A know what all transaction functions to call, to delete all the other documents that refer to type A?

Hukka11:05:10

It's starting to look like cascades are a fool's errand, and every place that cares about the links has to check if the linked documents exist, not just rely on the link field

Martynas Maciulevičius11:05:04

I didn't even think that you would want to have more than one transaction function for deletions. I thought that you would walk the DB graph inside of this single function. Which essentially would mean "has to check if the linked documents exist".

Hukka11:05:54

Even that would require knowing what the name of the link field is, for doing the query

Hukka11:05:27

Sure, it could be a convention, but that is a very different level of checks compared to a DB-enforced foreign key

refset11:05:09

Interesting :thinking_face: I think despite my reply in the other thread about the importance of keeping transaction functions pure, this may be a valid exception to that rule given XT's current APIs. You should be able to achieve what you want by enumerating the output of attribute-stats (which unfortunately necessitates referring to the node somehow!) - this code may help: https://gist.github.com/refset/25d8ae03c69993831c9c23cf7e12a537

refset11:05:31

Given how transaction functions are executed during indexing, this should be completely safe. I wouldn't recommend doing it for general querying though, as the output from attribute-stats can change even with an otherwise stable db basis (because indexing happens concurrently)

Hukka11:05:33

I would say that since the db is an input, it is still pure. With identical db, the resulting output of operations should not change

refset11:05:17

Another workaround could be to enumerate your list of system-wide attributes in an additional entity (i.e. stored explicitly in userspace)
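
(Editor's sketch.) The "additional entity" workaround might be modelled as a registry document listing every attribute that holds a reference, from which a cascading delete can build one backlink query per attribute. The registry id `:our-app/link-attributes`, the key `:our-app/attributes`, the example attributes and `backlink-queries` are all assumptions made for illustration.

```clojure
;; Sketch only: the registry shape and all names are illustrative.

(def link-attribute-registry
  {:xt/id              :our-app/link-attributes
   :our-app/attributes #{:order/customer :invoice/customer}})

(defn backlink-queries
  "Returns one Datalog query map per registered attribute. Each query finds
   the entities that point at `target` through that attribute. Attributes
   are sorted so the result order is stable."
  [registry]
  (for [attr (sort (:our-app/attributes registry))]
    {:find  '[e]
     :in    '[target]
     :where [['e attr 'target]]}))
```

Inside a transaction function the registry would be read from the db and each query run with the id being deleted as `target`.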

Hukka11:05:30

What is $ in that gist?

refset11:05:40

It's the implicit db basis for that query. In Datomic/DataScript I think it always has to be passed in via :in but I might be wrong

Hukka12:05:47

And yeah, attribute-stats would work. A bit heavy, but works

refset12:05:51

Another workaround is to hard code the list of all known attributes directly into the transaction function, and then just create new functions each time your schema evolves

Hukka12:05:27

That certainly works, but makes a bit too much spaghetti. Trying to find ways where the namespaces don't need to know about internals in other namespaces

thinking-face 1
👍 1
Hukka12:05:33

Hmh, although that might be a bit overreaching

Hukka12:05:22

So for comparison, sometimes in SQL you would refer to another table without a foreign key, to avoid the cascades

tatut12:05:59

fwiw, I would not create a tx function that indiscriminately walks the db and deletes stuff... rather I would have something per entity type like :customer/delete which would know statically which subentities it needs to cascade to

tatut12:05:44

but if you really have such free form data that you don't know which attributes in which entities are links, then you may need to walk

Hukka12:05:50

Yeah, some kind of convention where it's easy to test that the cascading delete really happens, and is hard to trigger accidentally when just wanting a "normal" link

tatut12:05:41

in a deeply nested structure that is shredded to the database as multiple documents, there might be some documents that are links that should not be deleted and some that should

tatut12:05:46

{:order/id 1 :order/items [{:order-item/id 1} ...] :order/customer {:customer/id 1}}
If you have an order entity, I would expect that you want to delete the order items when you delete the order... but not delete the customer
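
(Editor's sketch.) A per-entity-type delete function along these lines could cascade to the order items while leaving the customer alone. This assumes the order has been shredded so that `:order/items` holds the ids of separate item documents and `:order/customer` holds a customer id; `:order/delete` and `order-delete-ops` are hypothetical names. The ops are plain data with fully qualified keywords, so the snippet runs without XTDB.

```clojure
;; Sketch only: assumes :order/items is a collection of item document ids.

(defn order-delete-ops
  "Pure: deletes every order item and then the order itself.
   The customer referenced by :order/customer is deliberately left alone."
  [order]
  (conj (mapv (fn [item-id] [:xtdb.api/delete item-id])
              (:order/items order))
        [:xtdb.api/delete (:xt/id order)]))

;; The same logic as a tx function document. The body is quoted data that
;; XTDB would evaluate during indexing; it reads the order from the db.
(def order-delete-fn-doc
  {:xt/id :order/delete
   :xt/fn '(fn [ctx order-id]
             (let [order (xtdb.api/entity (xtdb.api/db ctx) order-id)]
               (conj (mapv (fn [item-id] [:xtdb.api/delete item-id])
                           (:order/items order))
                     [:xtdb.api/delete order-id])))})
```

It would be invoked with a transaction such as `[[:xtdb.api/fn :order/delete :order-1]]`.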

tatut12:05:40

same as patching, no perfect completely generic solution exists...

Hukka12:05:08

Yeah, although in this case the question was particularly about many-to-many link documents, that is, the document doesn't serve any other purpose except to make the many-to-many relationship indexed. Then it's pretty clear that if one of the linked documents is deleted, all links should be too. But indeed, how to recognize link documents from the rest…

Hukka12:05:11

Something like :customer/delete-me would work as the field name 😉 Though have to think a bit more about the naming.

tatut12:05:02

yeah, consistent naming of the link attributes would help

Hukka12:05:22

Such lovely queries

'{:find [(pull order [*])]
  :where [[link :do-not-delete-this-customer customer-id]
          [link :delete-this-order order]]
  :in [customer-id]}

Hukka12:05:21

Well, not even that, since you would not delete an "order" if only one of many "customers" would be deleted, just the link. Names are hard

Hukka12:05:42

:delete-me-not-this-customer

🙂 1
tatut12:05:14

if you do all entities that need cascade delete with backlinks and have a single name for the link, then when you are deleting customer c you can just query for [to-delete :customer/link c] to find what needs to be deleted

Hukka12:05:15

Could be global name too, something like :our-app/link, and use generic cascading delete functions for all kinds of entities

Hukka12:05:50

Well, needs at least two of them to make the many-to-many links work.

tatut12:05:50

you mean a link doc is {:our-app/link1 doc1 :our-app/link2 doc2} ?

tatut12:05:12

why not {:our-app/link [doc1 doc2]}?

tatut12:05:21

would be found when either is deleted

Hukka12:05:39

I guess, if you can generate the docs in the right order 🙂

Hukka12:05:00

Well, same problem with two keys, which is which

lgessler15:05:56

I've been using the strategy @tatut suggests--having a tx function per entity type that implements its own deletion policy for related records

refset15:05:25

>> why not {:our-app/link [doc1 doc2]}?
>> I guess, if you can generate the docs in the right order 🙂
XT is also happy with {:our-app/link #{doc1 doc2}} (p.s. great discussion!)

😅 1
Hukka17:05:39

> XT is also happy with {:our-app/link #{doc1 doc2}}
But how would you query "all links to/from doc1"?

refset21:05:27

[link :our-app/link doc1]

refset21:05:20

or if you mean "all docs linked to/from doc1", then:

[link :our-app/link doc1]
[link :our-app/link other-doc]
[(!= doc1 other-doc)]
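
(Editor's sketch.) Written out as full query maps, the two clauses above could look like the following, together with a generic cascading delete built on the single `:our-app/link` attribute discussed earlier. `cascade-delete-ops` and `:our-app/cascade-delete` are hypothetical names. Since transaction functions are evaluated during indexing, the query and the deletes it produces belong to the same transaction, which is the point refset makes above about this being safe. Everything is plain data, so the snippet runs without XTDB.

```clojure
;; Sketch only: names are illustrative; queries mirror the clauses above.

(def links-to-doc-query
  '{:find  [link]
    :in    [doc]
    :where [[link :our-app/link doc]]})

(def linked-docs-query
  '{:find  [other-doc]
    :in    [doc]
    :where [[link :our-app/link doc]
            [link :our-app/link other-doc]
            [(!= doc other-doc)]]})

(defn cascade-delete-ops
  "Pure: deletes every link document and then the document itself.
   Link ids are sorted so the op order is stable (ids must be comparable)."
  [doc-id link-ids]
  (conj (mapv (fn [link-id] [:xtdb.api/delete link-id]) (sort link-ids))
        [:xtdb.api/delete doc-id]))

;; The same idea as a tx function document (quoted, evaluated by XTDB).
(def cascade-delete-fn-doc
  {:xt/id :our-app/cascade-delete
   :xt/fn '(fn [ctx doc-id]
             (let [db    (xtdb.api/db ctx)
                   links (xtdb.api/q db
                                     '{:find  [link]
                                       :in    [doc]
                                       :where [[link :our-app/link doc]]}
                                     doc-id)]
               (conj (mapv (fn [[link]] [:xtdb.api/delete link]) links)
                     [:xtdb.api/delete doc-id])))})
```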

Hukka04:05:17

Oh, I had no idea that the normal triple clause will also match values inside sets. Is it just sets, or all kinds of collections? But it isn't indexed, right?

Hukka04:05:16

Then again, reading https://docs.xtdb.com/language-reference/1.21.0/datalog-queries/#_maps_and_vectors_in_data again I notice that even with vectors I could query without caring about the order

Hukka04:05:57

Because if the vectors actually would be indexed, I could just put the many to many relationship directly into a field of doc1

✔️ 1
refset10:05:00

top-level sets and vectors get decomposed into triples, fully indexed 🙂 https://docs.xtdb.com/language-reference/datalog-transactions/#indexing
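
(Editor's sketch.) The decomposition can be pictured with a few lines of plain Clojure. This is an illustration of the idea only, not XTDB's actual indexing code, and `doc->triples` is a made-up name: each element of a top-level set or vector becomes its own entity-attribute-value triple, while a nested map stays a single value.

```clojure
;; Illustration only: not how XTDB implements indexing internally.

(defn doc->triples
  "Expands a document into [entity attribute value] triples. Top-level sets
   and vectors yield one triple per element; other values (including nested
   maps) yield a single triple."
  [doc]
  (let [eid (:xt/id doc)]
    (for [[attr value] (dissoc doc :xt/id)
          element      (if (or (set? value) (vector? value))
                         value
                         [value])]
      [eid attr element])))
```

This is why a clause like `[link :our-app/link doc1]` matches a link document whose `:our-app/link` value is `#{doc1 doc2}` or `[doc1 doc2]`, regardless of order.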

Hukka10:05:32

Ok, that's explicit. We had read "The same thing should apply for maps so instead" on https://docs.xtdb.com/language-reference/datalog-queries/#_maps_and_vectors_in_data which sounded like maps and vectors work the same: they need to be manually flattened

refset10:05:24

Ah, yes that could/should definitely be clearer :thinking_face: As an aside, taking that idea to the extreme looks like... https://gist.github.com/refset/ead5cc5ab36c0726b85c1713f0455509

Hukka10:05:38

But how to name them with nested vectors 😉 Well, if you assume that maps don't have keys like :0 then you can just put numbers. And of course better hope that the maps don't have string keys 😬

🙂 1
Bhougland16:05:33

I am thinking of using xtdb for a software implementation I am working on. The client is making updates to approved "gold standard" master data files and we need to be able to do the following:

Bhougland16:05:07

• Identify differences from the previous gold standard file after each extract is taken from the DB

Bhougland16:05:36

• Review each change and apply a comment reason that supports each change

Bhougland16:05:51

Would this be a good usecase for xtdb?

Bhougland16:05:25

• Roll back to a previous state if needed

Bhougland20:05:45

After looking at the Datomic documentation, I think I am looking for something along the lines of Redundancy Elimination.

Bhougland20:05:59

Does that exist in xtdb?

Adrian Smith21:05:44

Hey is https://github.com/xtdb-labs/crux-console still the recommended “UI” for xtdb? Or is something else being used as a GUI on to the DB?

lgessler21:05:29

the http-server module has a GUI component to it: https://docs.xtdb.com/extensions/http/ I don't know if the docs do a super good job of advertising it but I've been using it in my own projects--if you navigate in a browser to the :port specified in :xtdb.http-server/server you should see it

refset21:05:51

We lost momentum to get the http server GUI to a 'finished' state, but it works for some things 🙂 There's also @tatut's inspector https://docs.xtdb.com/ecosystem/ (which is newer, and arguably a much slicker architecture than the http server's cljs-heavy re-frame app)

richiardiandrea23:05:03

FWIW We are using the console atm but it feels unfinished indeed. We are planning to move to tatut's work soon-ish

tatut07:05:56

I'm happy to take any suggestion on how to improve the XTDB inspector... so far it has been mostly me scratching my own itch

Adrian Smith19:05:23

Hi @tatut, if I want to bring inspector into my project using deps, what would the package name be?

<<what.goes.here?>> {:git/url ""
                             :git/sha "c3f80a545ab095184a13fc2445a96459de56e26f"}

tatut04:05:04

tatut/xtdb-inspector

tatut04:05:10

for example

Dmitri Akatov10:07:12

Hi @tatut - I was able to bring in tatut/xtdb-inspector as a dependency into my own project by creating a ring adapter that checks the URL and either dispatches to my default handler or to xtdb-inspector.core/inspector-handler. All “static” URLs work fine, but I’m having trouble getting the __ripley-live?... URLs to work properly. The WS connection is closed every time with a 404. It doesn’t look like the ripley.live.context/connection-handler is even being used. Could this have something to do with me using the jetty server in my project, whereas xtdb-inspector may rely on http-kit for the WS connection to work properly?

tatut04:08:33

hi, sorry for the late answer, just now back at a machine… I haven’t tried with jetty, but you could raise an issue in github describing the setup