#xtdb
2022-06-07
zeitstein04:06:00

Still haven't decided which durable Datalog db to use, so I'm working with Datascript for now. I'm finding myself using entity and datoms a lot. As far as I was able to find, those are not supported in XTDB. (To get ahead of a question I've seen @taylor.jeremydavid ask – this is meant to be production code 🙂) In general, am I going to have a hard time translating Datascript code to XTDB?

tatut04:06:09

`xt/entity` is supported, but `datoms` obviously is not, since XT is document-based rather than datom-based

zeitstein05:06:35

Ah, the https://github.com/xtdb/xtdb/issues/1497 is about the "lazy" version. Thanks. So, when dealing with writing, say, transaction functions featuring a lot of recursion and small lookups (reverse lookup on an attribute or just a couple of attributes from an entity) at each step, what's the preferred API to use? Just query and pull? I'd wonder about performance.

tatut05:06:50

I would say if you are using a couple of attributes, then those should be queried so that it uses indexes... but you should measure performance

jarohen08:06:29

+1 for query + pull, yep

jarohen08:06:15

XT has had lazy queries from the start (`xt/open-q`) so a lot of the use cases for datoms fall under that instead
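
A minimal sketch of the lazy query API mentioned above, assuming XTDB 1.x with `com.xtdb/xtdb-core` on the classpath and an in-memory node. The documents and the `:kind` attribute are made up for illustration.

```clojure
(require '[xtdb.api :as xt])

;; Assumption: a throwaway in-memory node and two invented documents.
(def first-rows
  (with-open [node (xt/start-node {})]
    (xt/await-tx node
                 (xt/submit-tx node [[::xt/put {:xt/id :a :kind :item}]
                                     [::xt/put {:xt/id :b :kind :item}]]))
    ;; open-q returns a closeable cursor, so rows are pulled on demand
    ;; and the cursor has to be closed (here via with-open).
    (with-open [res (xt/open-q (xt/db node)
                               '{:find [e]
                                 :where [[e :kind :item]]})]
      ;; Realise only what is needed before the cursor is closed.
      (doall (take 1 (iterator-seq res))))))
```

The `doall` matters: a lazy seq that escapes `with-open` unrealised would try to read from a closed cursor.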

zeitstein08:06:54

Alright, thank you both!

refset09:06:30

It's also worth knowing that the performance of running lots of small queries within a single open-db is going to be about as good as any 'native' lazy-`entity` implementation that we could construct in the core. Also, for extra context, the reason we can't implement lazy-`entity` like-for-like in XT as a first class API is that you need schema information to handle cardinality and attribute references properly.

refset09:06:41

Somewhat related and potentially of interest is datafy/nav - I did see this recently: https://github.com/FiV0/datafy-nav-demo/blob/master/src/datafy_nav_demo.clj

zeitstein10:06:11

Thanks for the extra info! Just dipping my toes and immediately hit upon this:
> you need schema information to handle cardinality and attribute references properly
Somehow didn't think of this before. My db (graph) uses a lot of refs. So, with XT, I'd have to manage these manually? That might be a deal-breaker 😞

jarohen10:06:05

refs are dynamic in XT - if you ask to join two documents in a query (either by pull or by :where clauses) XT will know that you intend that value to refer to another document

✔️ 1
jarohen10:06:06

so long as I've understood you correctly regarding 'manually managing refs', this isn't necessary 🙂
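
To illustrate the dynamic refs described above, here is a small sketch assuming XTDB 1.x and an in-memory node. The `:alice` and `:bob` documents and the `:name` and `:friend` attributes are invented. Nothing declares `:friend` as a reference; the join in the `:where` clauses or in the pull pattern is what makes XT treat its value as a document id.

```clojure
(require '[xtdb.api :as xt])

;; Assumption: in-memory node; all ids and attributes are made up.
(def joins
  (with-open [node (xt/start-node {})]
    (xt/await-tx node
                 (xt/submit-tx node [[::xt/put {:xt/id :alice :name "Alice" :friend :bob}]
                                     [::xt/put {:xt/id :bob :name "Bob"}]]))
    (let [db (xt/db node)]
      {;; Join expressed through :where clauses: f is used as an entity.
       :where-join (xt/q db '{:find [friend-name]
                              :where [[a :xt/id :alice]
                                      [a :friend f]
                                      [f :name friend-name]]})
       ;; Join expressed through a nested pull pattern.
       :pull-join (xt/q db '{:find [(pull a [:name {:friend [:name]}])]
                             :where [[a :xt/id :alice]]})})))
```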

zeitstein10:06:56

Yes, I've seen that query and pull work like that. What I meant is adding/deleting refs in transactions. Example: deleting a doc referenced in several places, means I need to manually delete it from those places. Guess I'll have to think if the trade-off is worth it. Currently, for my use case:
• XT > Datahike because of integration with Lucene. (Also docs, and seems more widely used, etc.)
• XT > Datalevin because of history.
• both Datalevin and Datahike:
  ◦ have simpler architecture – perhaps a downside with XT when building self-hosted apps
  ◦ are closer to Datascript which is what my code is currently written with.
Realising the mismatch between Datascript and XT is making me try to decide if I should commit to XT before writing any more Datascript code 🙂

👍 1
jarohen10:06:15

> Example: deleting a doc referenced in several places, means I need to manually delete it from those places.
yes, this is true - because we don't have explicit refs, we also don't have an equivalent for SQL's 'on delete cascade'

gratitude 1
jarohen10:06:21

on the query side, in particular, XT is (deliberately) very similar to the other EDN Datalog databases

jarohen10:06:59

on the ingest side, we have some additions/differences due to our bitemporal support

refset10:06:10

Atomic retractions of references can be done by querying within transaction functions. Here is an old example I was toying with to implement a more "datom-oriented" userspace API (just an experiment...not necessarily a good idea!): https://gist.github.com/refset/a00be06443bc03ccc84a2874af3cdb8a

gratitude 1
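
For a rough idea of what "querying within transaction functions" can look like, here is a sketch that is not taken from the linked gist. It assumes XTDB 1.x, an in-memory node, and a hypothetical transaction function `:delete-with-refs` that only handles a single, cardinality-one reference attribute passed in by the caller.

```clojure
(require '[xtdb.api :as xt])

;; Hypothetical tx fn body: deletes `eid` and strips `ref-attr` from every
;; document whose `ref-attr` value equals `eid`, all in one transaction.
;; Assumption: `ref-attr` holds a single value, not a set or vector.
(def delete-with-refs
  '(fn [ctx eid ref-attr]
     (let [db (xtdb.api/db ctx)
           ;; The attribute must be a literal keyword in the clause,
           ;; so the query is built as data with ref-attr spliced in.
           referrers (xtdb.api/q db
                                 {:find ['e]
                                  :in ['target]
                                  :where [['e ref-attr 'target]]}
                                 eid)]
       (into [[:xtdb.api/delete eid]]
             (for [[e] referrers]
               [:xtdb.api/put (dissoc (xtdb.api/entity db e) ref-attr)])))))

(def result
  (with-open [node (xt/start-node {})]
    (let [submit! #(xt/await-tx node (xt/submit-tx node %))]
      ;; Install the function and some made-up documents.
      (submit! [[::xt/put {:xt/id :delete-with-refs :xt/fn delete-with-refs}]
                [::xt/put {:xt/id "foo" :ref "bar"}]
                [::xt/put {:xt/id "bar"}]])
      ;; Invoke it: delete "bar" and clean up :ref pointers to it.
      (submit! [[::xt/fn :delete-with-refs "bar" :ref]])
      (let [db (xt/db node)]
        {:foo (xt/entity db "foo")
         :bar (xt/entity db "bar")}))))
```

Since there is no schema, the caller has to say which attributes count as references, which is the manual bookkeeping discussed above.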
refset10:06:35

> the reason we can't implement lazy-`entity` like-for-like in XT as a first class API is that you need schema information to handle cardinality and attribute references properly
Concrete examples to clarify:
1. if you have a `{:xt/id "foo" :ref "bar"}` document, and `{:xt/id "bar"}`, and you want to do the equivalent of `(:ref (d/entity db "foo"))` (i.e. lazy-`entity`) - without a full write-time knowledge of the schema, like whether `:ref` accepts a scalar (cardinality-one) or a collection (cardinality-many), you'd always have to return a set of entities
2. similarly, without knowing for certain whether `:ref` is actually a reference attribute, you might accidentally de-reference a scalar value as a document ID (i.e. a kind of collision) when really it was only ever meant to be a scalar
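
Both points can be reproduced without a database. The sketch below is plain Clojure over a made-up in-memory map of documents; `docs` and `follow` are hypothetical names and not part of any XT API.

```clojure
;; Hypothetical stand-in for a db: document id -> document.
(def docs
  {"foo" {:xt/id "foo" :ref "bar"}            ; :ref used as cardinality-one
   "baz" {:xt/id "baz" :ref #{"bar" "qux"}}   ; :ref used as cardinality-many
   "bar" {:xt/id "bar"}
   "qux" {:xt/id "qux" :label "bar"}})        ; :label is meant to be a plain string

(defn follow
  "Schema-less navigation: treats every value under `attr` as a possible
  document id and always returns a set, because cardinality is unknown."
  [docs id attr]
  (let [v  (get-in docs [id attr])
        vs (if (coll? v) v [v])]
    (into #{} (keep docs) vs)))

(follow docs "foo" :ref)   ;; => #{{:xt/id "bar"}}  (a set, even for one value)
(follow docs "qux" :label) ;; => #{{:xt/id "bar"}}  (accidental de-reference)
```

The second call shows the collision: `"bar"` was only ever a label, yet it resolves to a document because nothing says `:label` is not a reference.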

zeitstein11:06:54

Have you thought about auto-resolving e.g. [:xt/id "foo"] as refs? IIRC, Asami does that while being schema-less. I guess managing refs in transactions might not be that big of an issue. One potential upside: as far as I understand it vectors are just stored as such, so on pull or entity access the order in the vector is retained (set semantics otherwise)? Thank you both for being so helpful!

Steven Deobald12:06:07

@mithrandir03 :drum_with_drumsticks: Since I'm guessing you might have more questions in the future, I would strongly encourage the official channels: Zulip (https://juxt-oss.zulipchat.com/#narrow/stream/194466-xtdb) or Discourse (https://discuss.xtdb.com/). Slack works, but it's an unofficial support channel that some of us would love to kill off. 😉 Zulip and Discourse have a few nice properties. They are both:
a) open source (self-host-able)
b) public by default (index-able by search engines)
c) threaded by default (old conversations can be resurfaced)
d) comfortable for non-Clojure programmers
Of course, we all have Slack open all the time, so it's hard not to appreciate its convenience. But you're also welcome to fork a Slack conversation by linking to the thread in Zulip or Discourse if you start a conversation in here and later decide it's useful to a wider audience. 🙂

refset12:06:59

> Have you thought about auto-resolving e.g. [:xt/id "foo"] as refs? IIRC, Asami does that while being schema-less.
It's an interesting idea. I can imagine introducing a new edn reader tag for something like this (e.g. `:ref #xt/ref "bar"`), but I think that would then need to be applied universally (like, that should be the only way to indicate a reference), and therefore make it harder for us to cater for JSON-first (non-Clojure) users. There are probably other implications too :thinking_face:
> as far as I understand it vectors are just stored as such, so on pull or entity access the order in the vector is retained (set semantics otherwise)?

refset12:06:06

Funnily enough, there's some interesting active discussion about the merits of the entity API over on #datomic https://clojurians.slack.com/archives/C03RZMDSH/p1654604789954149?thread_ts=1654525204.879369&amp;cid=C03RZMDSH

👍 1
zeitstein12:06:54

@U01AVNG2XNF I understand and support that. You guys are incredibly responsive, so asking questions from a smaller community might not matter much 👍 You should go the draconian route and close the Slack channel 😅
> I think that would then need to be applied universally
Yup. This also means you could do `:refs [[:xt/id "foo"] [:xt/id "bar"]]` and it should be able to resolve it into a multi-cardinality ref. And also support managing refs during transactions? (Maybe https://github.com/threatgrid/asami/issues/223 would be needed?) No idea about JSON consumers, but enforcing a key by convention (like you do with :xt/id) doesn't seem too novel a concept?

👑 1
refset13:06:45

It would definitely be a fun change to contemplate! For context, this is probably the last juncture at which some form of "lookup refs" was given any serious thought: https://github.com/xtdb/xtdb/issues/25 🙂 For now though, I can only recommend seeing what's already possible in userspace via transaction functions (e.g. write-time schema validation also!).

Martynas Maciulevičius08:06:00

Q1: Does XTDB's DI framework support closing resources? For instance, I found that some older code in the project I work on returns a Closeable from its initialization functions. Do those get closed one by one?

jarohen08:06:52

yep - we close Closeable resources one-by-one in reverse startup order
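
As an illustration of the behaviour described above (not XTDB's actual implementation), here is a plain Clojure sketch that starts components in order and closes them in reverse. `start-all`, `component` and `log` are hypothetical names. The sketch ignores failures part-way through startup, which a real system would need to handle.

```clojure
(import 'java.io.Closeable)

;; Records start/stop events so the ordering is visible.
(def log (atom []))

(defn component
  "Returns a start fn that logs its start and yields a Closeable that logs its stop."
  [id]
  (fn []
    (swap! log conj [:start id])
    (reify Closeable
      (close [_] (swap! log conj [:stop id])))))

(defn start-all
  "Calls each start fn in order; returns one Closeable that closes the
  started resources in reverse startup order."
  [start-fns]
  (let [started (mapv #(%) start-fns)]
    (reify Closeable
      (close [_]
        (doseq [^Closeable c (rseq started)]
          (.close c))))))

(with-open [_ (start-all [(component :kv) (component :index)])]
  :running)

@log ;; => [[:start :kv] [:start :index] [:stop :index] [:stop :kv]]
```

Reverse order means a component is closed before anything it was built on, so `:index` stops while `:kv` is still open.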

Martynas Maciulevičius08:06:08

Q2: Are there any plans to extract the dependency injection framework from XTDB? IMO component is quite verbose, when the amount of verbosity that XTDB's framework has would be enough.

jarohen08:06:14

not currently, no - tbh I'd now recommend James Reeves's https://github.com/weavejester/integrant library instead

Martynas Maciulevičius06:06:24

I looked into Integrant and I think that XTDB's framework is better. It's better because it doesn't abuse keywords and I can jump to the implementations that I care about. I can also use an `if` to swap in a different module config when I need to, instead of inheriting a keyword. I also don't like that Integrant allows keywords to be inherited: not only can you not jump to the implementation to understand it, but if you don't import it you'll crash without knowing how to fix it. By importing functions you not only load the code, you can also jump to it from an IDE. With keyword-only configs you write parameter lists in the DI config and also couple them to a place in the code that you can't link to directly (yes, they use spec for it, but then why can't I simply jump?). It's probably OK in an .edn file, but not in .clj source files.
