#xtdb
2023-02-02
Hukka06:02:38

Has anyone had to dig into an unknown XTDB database, and how did you do it? I don't have the need, but I wonder what the approach would be compared to SQL where I would first study the schema.

tatut06:02:38

I actually made some code that queries the database and generates pretty graphviz diagrams from the “document types”

tatut06:02:01

it isn’t fully generic sadly so I haven’t published it, maybe I should at some point

tatut06:02:16

it works for our db, because we model “entity type” in the :xt/id (e.g. {:person #uuid "…"}) so it can build the model by looking at the docs
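
A minimal sketch of that idea in Python (this is not tatut's unpublished code). It assumes documents have already been read out of the database into plain dicts, that every id is a single-key map whose key names the entity type, and that any attribute value with the same shape is a reference to another entity. The names `entity_type`, `infer_model` and `to_dot` are hypothetical.

```python
def entity_type(doc_id):
    """Return the type tag of an id shaped like {"person": "<uuid>"}, else None."""
    if isinstance(doc_id, dict) and len(doc_id) == 1:
        return next(iter(doc_id))
    return None


def infer_model(docs):
    """Map each entity type to the attributes seen on it and the types it references."""
    model = {}
    for doc in docs:
        etype = entity_type(doc.get("xt/id"))
        if etype is None:
            continue  # id does not follow the assumed convention; skip it
        entry = model.setdefault(etype, {"attrs": set(), "refs": set()})
        for attr, value in doc.items():
            if attr == "xt/id":
                continue
            entry["attrs"].add(attr)
            # Assumption: a value shaped like an id points at another entity.
            target = entity_type(value)
            if target is not None:
                entry["refs"].add((attr, target))
    return model


def to_dot(model):
    """Render the inferred model as Graphviz DOT text, one edge per reference."""
    lines = ["digraph model {"]
    for etype in sorted(model):
        lines.append(f'  "{etype}";')
        for attr, target in sorted(model[etype]["refs"]):
            if target in model:  # only draw edges to types that were actually seen
                lines.append(f'  "{etype}" -> "{target}" [label="{attr}"];')
    lines.append("}")
    return "\n".join(lines)


# Made-up sample documents, purely for illustration.
docs = [
    {"xt/id": {"person": "p1"}, "name": "Ada", "employer": {"company": "c1"}},
    {"xt/id": {"company": "c1"}, "name": "Acme"},
]
print(to_dot(infer_model(docs)))
```

The printed text can be fed to Graphviz's `dot` tool to get a diagram. A database that does not encode the type in the id would need some other way to group documents.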

👀 2
tatut06:02:06

but you can also use XTDB inspector to explore a database in the browser

refset19:02:48

I feel like Malli's inference capabilities must be able to help out here.

refset19:02:09

the attribute-stats API may help, but I don't think you can really avoid a full scan of the data to understand the range of values and figure out what may or may not be ~FKs (foreign keys)
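
A rough sketch of what such a full scan could look like, in Python over documents already loaded as plain dicts. It does not call any XTDB API: `attribute_stats` here is just a local per-attribute document count, and `fk_candidates` is a hypothetical heuristic that flags an attribute when most of its values match some document's id. The flat string ids and the 0.9 threshold are assumptions for illustration.

```python
from collections import Counter


def attribute_stats(docs):
    """Count how many documents carry each attribute (local helper, not the XTDB API)."""
    return Counter(attr for doc in docs for attr in doc)


def fk_candidates(docs, id_key="xt/id", threshold=0.9):
    """Return {attr: ratio} for attributes whose values mostly match known ids."""
    ids = {doc[id_key] for doc in docs if id_key in doc}
    seen, hits = Counter(), Counter()
    for doc in docs:
        for attr, value in doc.items():
            if attr == id_key:
                continue
            # Treat collection values as several individual values.
            values = value if isinstance(value, (list, set, tuple)) else [value]
            for v in values:
                seen[attr] += 1
                try:
                    if v in ids:
                        hits[attr] += 1
                except TypeError:
                    pass  # unhashable value (e.g. a nested map) cannot be an id here
    return {a: hits[a] / seen[a] for a in seen if hits[a] / seen[a] >= threshold}


# Made-up sample documents, purely for illustration.
docs = [
    {"xt/id": "p1", "name": "Ada", "employer": "c1"},
    {"xt/id": "p2", "name": "Bob", "employer": "c1"},
    {"xt/id": "c1", "name": "Acme"},
]
print(attribute_stats(docs))
print(fk_candidates(docs))
```

A high ratio only suggests a reference. Values can coincide with ids by accident, so the result is a list of candidates to check by hand.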

Hukka08:02:42

Have you considered exposing RocksDB checkpointing via API? I'm thinking of doing something like starting a new node on demand, and creating a checkpoint for that node by calling a function, instead of just configuring constant, time based snapshots. If I had a handle to the RocksKV, I could call the save-checkpoint easily. Or perhaps even have a direct handle to the RocksDB db instance. Though is that safe while XTDB is also using the handle :thinking_face:

tatut09:02:33

this would be good

tatut09:02:17

another use case, migrating lots of data from another system… I want to make a snapshot immediately after the migration is done, and only start new nodes after that

tatut11:02:58

also, would be good to trigger a manual checkpoint just before doing an upgrade so new nodes don’t have to work so much

refset20:02:02

However, there has been reluctance to blindly implement and offer something like this without having a more holistic understanding of the use-case(s), as it is likely that a more comprehensive high-level solution is the better route.

refset20:02:21

(I'm going to copy these messages there and link back here.)

Hukka06:02:36

Looks good, seems like a good workaround

Hukka15:02:49

Feels like I've seen this discussed, but I couldn't find it… (I need to move it into discuss if I get an answer!). Is there a performance difference between having a document with a vector of values vs. splitting them into many documents that refer to the original document? I suppose that's assuming it's a flat list, because if it isn't, then splitting into many documents will fill the indexes a lot more. I'm wondering whether there's a trade-off between the write amplification caused by changing the elements (or just appending) vs. possibly slower queries (if querying for all elements from the parent)
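
To put rough numbers on the write-amplification side of that trade-off, here is a toy count in Python. It assumes every change to the parent writes a complete new version of the document, so the k-th append rewrites all k elements, while the split layout writes one small child document per append. It deliberately ignores index growth and query cost, so it is not a benchmark.

```python
def elements_written_vector(n_appends):
    """Parent holds a vector: the k-th append writes a new version with k elements."""
    return sum(range(1, n_appends + 1))


def elements_written_split(n_appends):
    """One child document per element: each append writes exactly one element."""
    return n_appends


for n in (10, 100, 1000):
    print(n, elements_written_vector(n), elements_written_split(n))
# prints:
# 10 55 10
# 100 5050 100
# 1000 500500 1000
```

Under these assumptions the vector layout writes n(n+1)/2 elements in total against n for the split layout, so the gap only starts to matter for long, frequently appended lists.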

refset21:02:55

I can't think of more specific guidance right now (sorry!), but I suppose running some experiments could make for an interesting blog post :thinking_face:

Hukka06:02:47

Heh, yeah. If we had time for a technical blog, XTDB has a lot of new ground to cover 😉

marciol19:02:33

Hey, a really quick question that should be easy for anyone here to answer. Does XTDB have the same way of using the client's computational power to help query scalability/performance as Datomic peers do?

refset22:02:31

Hey @U28A9C90Q you may want to give this a read https://docs.xtdb.com/resources/faq/#comparisons but the short answer is that XT uses a full replica of all the data locally, which is subtly different from the "peer" model that I believe dynamically manages a cache of the working set. So the scaling is different.

marciol22:02:31

Yes, but in general it's better to use it through the JVM so that we get all the power that comes from the data being local in this replica, right?

refset22:02:20

Correct, yep. The main complication is how closely you really want to couple the "application" to the "database" as the line can become very blurred. Sometimes this is good, but frequent redeploying/rebuilding of the database has operational costs that can add up in the long run.

marciol22:02:00

Nice, thanks for the explanation.

🙏 2
marciol22:02:56

I was thinking about how Rails is still a thing and how usable XTDB would be from a JRuby app with Java bindings

marciol22:02:29

But I'm drifting off topic

refset11:02:08

It would be great to see XT become a de facto choice for another ecosystem. Have you been looking at other interesting alternative Rails backends for comparison?

refset11:02:43

I was thinking briefly about how/why Django is still so prominent the other day. That would be another target in this vein.

marciol00:02:36

Hey @U899JBRPF for sure, but I have seen that Rails still has great reach in the webdev space among startups that need to bootstrap a product quickly

👍 2
marciol00:02:41

And I don't know about Python, but Ruby has a great history on the JVM, even more so with what is coming in TruffleRuby, which has direct development support from Shopify

💯 2
marciol00:02:57

So it'd be something to pay attention to

marciol00:02:12

Elixir is another great platform, but since it's based on Erlang we can only use HTTP wrappers. There are some in the wild, a friend of mine @U884GE2FJ did this one: https://github.com/naomijub/translixir

🙂 2
marciol02:02:34

hey, good catch, gonna see it

marciol19:02:51

What I'm trying to figure out is if there is any advantage to using the JVM client vs the HTTP interface

refset22:02:34

If you have lots of small queries, the roundtrips to the RocksDB instance sat next to the JVM will be much faster than running those same queries over HTTP
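
A back-of-the-envelope version of that point in Python. The per-query overheads and the query work time are made-up placeholders, not measurements of XTDB. The only thing the sketch shows is that a fixed per-query overhead is multiplied by the number of queries.

```python
def total_time_us(n_queries, overhead_us, work_us):
    """Total time when every query pays a fixed overhead plus its own work."""
    return n_queries * (overhead_us + work_us)


# Hypothetical figures, chosen only to illustrate the shape of the cost.
IN_PROCESS_OVERHEAD_US = 50    # assumed cost of a call inside the same JVM
HTTP_OVERHEAD_US = 1000        # assumed cost of one HTTP roundtrip
QUERY_WORK_US = 200            # assumed time to execute one small query

n = 1000
print("in-process:", total_time_us(n, IN_PROCESS_OVERHEAD_US, QUERY_WORK_US), "us")
print("over HTTP: ", total_time_us(n, HTTP_OVERHEAD_US, QUERY_WORK_US), "us")
```

With these placeholder numbers, 1000 small queries take 250000 µs in-process and 1200000 µs over HTTP. The real ratio depends on network and serialization costs that would need measuring.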

🤘 2