xtdb 2022-04-05 | Slack Archive

Martynas Maciulevičius07:04:31

Can I do this in :find (the map syntax instead of a vector tuple (I can write a layer on top of it but first I want to ask directly))?

[{:result         (pull panda [*])
  :something-else (distinct something-else-id)}]

refset08:04:05

In short, no, but you could use a subquery or custom function

jarohen08:04:30

if you're looking to output maps, you can do this with :keys :

{:find [(pull panda [*]) (distinct something-else-id)]
 :keys [:result :something-else]}

🙂 1

👍 1

Martynas Maciulevičius08:04:22

Correction (I can't use keywords there and I have to reuse the symbols):

{:find [(pull panda [*]) (distinct something-else-id)]
 :keys [panda something-else-id]}

Docs: https://docs.xtdb.com/language-reference/datalog-queries/#return-maps
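Editor's sketch of the `:keys` form discussed above, written as plain query data. The attribute names `:type` and `:something-else` and the key names `result` and `something-else` are hypothetical. It assumes the behaviour described in the linked docs, where `:keys` takes one symbol per `:find` element and each result row comes back as a keyword-keyed map. The call itself sits in a `comment` form because it needs a running XTDB 1.x node.

```clojure
;; Query as data. The attribute names (:type, :something-else) are
;; hypothetical and only illustrate the shape of the map.
(def panda-query
  '{:find  [(pull panda [*]) (distinct something-else-id)]
    ;; One symbol per :find element, in the same order.
    :keys  [result something-else]
    :where [[panda :type :panda]
            [panda :something-else something-else-id]]})

(comment
  ;; Assumes com.xtdb/xtdb-core on the classpath and `node` bound to a
  ;; started node. Each row should then be a map with :result and
  ;; :something-else keys instead of a positional tuple.
  (require '[xtdb.api :as xt])
  (xt/q (xt/db node) panda-query))
```

Keeping the query as a plain map means a wrapping layer can add or remove `:find` and `:keys` entries together before the query runs.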

Martynas Maciulevičius09:04:15

I don't yet know if it's useful for me though. I wanted to group the arguments so that they would be grouped for composability. I.e. if I have multi-level transformation of the query that may introduce more result variables and then take them away as the parsing function stack unrolls. Currently I only want one of these "levels" and I don't yet know what's the best approach. It's good that this mapping approach exists though. But I could also use numbers just fine. I have to decide what I want first. Thanks.

👍 1

Martynas Maciulevičius10:04:26

Is it possible to only have current version of a document without any history at all? No time travel -- no future, no past. I.e. completely empty entity history that won't accumulate but with updating of the doc.

tatut10:04:23

well you can evict old versions

tatut10:04:36

but there are datalog implementations (like datalevin) that don't have history, so those may be more suitable for your use case

Steven Deobald11:04:34

@U028ART884X It depends if you're referring to storage or query, I guess. If you're referring to storage, @U11SJ6Q0K is right — Datalevin is probably the only Datalog database that does what you want (all others keep history). If you're referring to query, xtdb behaves this way by default. If you don't specify any historical or bitemporal parameters, time is completely ignored and it behaves as any other "now-focused" mutable database.
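Editor's sketch of the "query" side of this answer, assuming XTDB 1.x and an in-memory node. The document id `:panda-1` and its attributes are hypothetical. The transaction operations are plain data; the calls that need a node are kept in a `comment` form.

```clojure
;; Two versions of the same (hypothetical) document at different valid times.
(def tx-ops
  [[:xtdb.api/put {:xt/id :panda-1 :name "Po" :weight 100} #inst "2020-01-01"]
   [:xtdb.api/put {:xt/id :panda-1 :name "Po" :weight 120} #inst "2021-01-01"]])

(comment
  ;; Assumes com.xtdb/xtdb-core on the classpath.
  (require '[xtdb.api :as xt])
  (with-open [node (xt/start-node {})]
    (xt/await-tx node (xt/submit-tx node tx-ops))
    ;; No time parameters: only the current version is visible.
    (xt/entity (xt/db node) :panda-1)
    ;; History is only consulted when a valid time is asked for explicitly.
    (xt/entity (xt/db node #inst "2020-06-01") :panda-1)
    (xt/entity-history (xt/db node) :panda-1 :asc)))
```

The older version is still stored, so this shows the default query behaviour only, not a way to avoid keeping history.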

✅ 1

Martynas Maciulevičius12:04:47

Hey. I was looking for :args documentation and I can't find it. I looked here: https://docs.xtdb.com/language-reference/1.20.0/datalog-queries/

Steven Deobald12:04:11

:args is deprecated so the kw you're looking for is :in
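Editor's sketch of a query parameterised with `:in`, assuming XTDB 1.x. The attribute `:name` and the argument value are hypothetical. The query is plain data, and the call that needs a node is kept in a `comment` form.

```clojure
;; `name` is bound from the extra argument passed to the query call.
(def by-name-query
  '{:find  [e]
    :in    [name]
    :where [[e :name name]]})

(comment
  ;; Assumes com.xtdb/xtdb-core on the classpath and `node` bound to a
  ;; started node. Arguments after the query map line up with :in.
  (require '[xtdb.api :as xt])
  (xt/q (xt/db node) by-name-query "Ivan"))
```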

lepistane14:04:44

Maybe this is too basic a question to ask, but I will do it anyway. I am putting some documents in XTDB. I am setting :xt/id (random-uuid) and everything works fine. But I am wondering whether I should just use a plain string:

:xt/id (str (random-uuid))

I can't think of any benefits of having a UUID, only the hassle of converting it back and forth to/from a string. Am I missing something? Are there any wins from having a UUID as the id?

refset14:04:04

Nothing is too basic around here! 🙂 Actual UUIDs take up many fewer bits, both when on disk and in memory - these kinds of things can definitely add up

lepistane14:04:14

jesus christ... I didn't know. Is this correct? UUID: 128 bits. String: 16 bits per character, so in my case times 49 = 784 bits. That's quite a big difference.

refset15:04:06

😄 that sounds about right
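Editor's sketch of the arithmetic above, assuming the canonical 36-character UUID string rather than the 49 characters mentioned in the thread. It only counts payload bits. JVM object overhead and XTDB's own on-disk encoding are not modelled.

```clojure
(import 'java.util.UUID)

(def id     (UUID/randomUUID))
(def id-str (str id))            ; canonical form: 32 hex digits + 4 dashes

;; A UUID value is always two 64-bit longs.
(def uuid-bits 128)

;; Same id as text, at 16 bits per character (UTF-16, as in the thread).
(def utf16-bits (* 16 (count id-str)))                   ; 16 * 36 = 576

;; Same id as text, at 8 bits per character (UTF-8 / ASCII).
(def utf8-bits (* 8 (count (.getBytes id-str "UTF-8")))) ; 8 * 36 = 288
```

Even with the more compact 8-bit encoding, the string form is more than twice the size of the 128-bit value.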

🙌 1

refset15:04:19

this is a good example of why so many post-JSON people are drawn towards protobuf

👍 2

💯 1

lepistane15:04:33

nice, thanks for this piece of info

☺️ 1

Steven Deobald17:04:15

I would add that the string representation of the uuid is its derivative form, not the other way around. One could argue that's splitting hairs in this case, but the UUID itself is a real type with meaningful component parts. If you ever want the UUID interface for any reason, you're forced to manually parse it "back" into its source form. A string UUID can be thought of in a couple ways: either as a serialized form of a true UUID or as an untyped UUID representation. Either way, string uuids as ids always feels wrong to me on some fundamental, philosophical level. 🙂
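Editor's sketch of the point about the UUID being a real type, using only `java.util.UUID` from the JDK. The helper name `parse-id` is hypothetical.

```clojure
(import 'java.util.UUID)

(def id (UUID/randomUUID))

;; The typed value exposes its component parts directly.
(def id-version (.version id))   ; randomUUID produces version 4 UUIDs
(def id-variant (.variant id))   ; 2 is the standard (RFC 4122) variant

;; The string form has to be parsed "back" before any of that is available.
(def round-tripped (UUID/fromString (str id)))

;; Hypothetical helper: returns nil instead of throwing on a bad string.
(defn parse-id [s]
  (try
    (UUID/fromString s)
    (catch IllegalArgumentException _ nil)))
```

`UUID/fromString` is fairly lenient about some malformed input, so a stricter check may be needed if the string form is part of a contract.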

Hukka05:04:43

String uuids, as in string that happens to be an uuid representation, or something that is defined to be one? In our case we internally store real UUIDs, but in the API we stringify it (well, of course, since it's JSON), and never promise anything else about the id except that it's a string, so we can change our mind in the future without breaking changes. Perhaps we are being too careful? External systems shouldn't be using our id as an index anyway, since they have their own primary keys.

Steven Deobald14:04:46

> String uuids, as in string that happens to be an uuid representation, or something that is defined to be one?

The latter.

> Perhaps we are being too careful?

Depends on your definition of "careful"? 😉 That is roughly what I do as well. We have a documented assumption that string UUIDs we provide over HTTP/JSON are, in fact, parseable as UUIDs. But that's not enforced anywhere except at the time they're generated, and the client(s) have no way of notifying the server that a UUID is breaking that soft contract. Because we only ever type UUIDs as UUIDs internally, the string representation only exists at the boundary, just before it goes out on the wire. Thus it's pretty unlikely that contract will ever be broken. If you were passing around string ids internally instead of UUIDs, I'd lean toward the "we only ever said this was a string" promise. It might be six of one, half a dozen of another, though.

> External systems shouldn't be using our id as an index anyway, since they have their own primary keys.

Agreed, so maybe I'm overthinking it in your case. (Our clients don't have their own primary keys, so a stricter API promise is helpful.)

sheluchin14:04:44

I'm finding it difficult to arrive at a query composition strategy. I end up with a whole lot of query functions that seem to get too specific and many repeating clauses. I know rules can be used to help, but they seem to have some performance gotchas and I haven't managed to use them effectively to arrive at elegant query composition. Any tips or example repos to help with this? Using Pathom in a similar fashion to https://github.com/wilkerlucio/pathom-datomic seems like an interesting approach, but I think I need something simpler for now, if possible.

refset15:04:41

I don't have a handy library or small example to point you at, but you might be interested to poke around Site also (e.g. https://github.com/juxt/site/blob/9769aa364933535ec54cdf51a93ed5f8eeea7cad/test/juxt/site/graphql_test.clj, https://github.com/armincerf/site-wordle/tree/main/src/operations) to see how we map (many) XT requests to GraphQL in that world

sheluchin15:04:23

Thanks @U899JBRPF, I've been meaning to check out Site at some point anyhow. Will take a look at that.

☺️ 1

PB18:04:06

Sorry if I'm missing something obvious. I'm looking to experiment with XTDB. I cannot find any documentation on how to set it up.

Steven Deobald18:04:25

https://docs.xtdb.com/administration/installing/ https://docs.xtdb.com/guides/quickstart/

PB19:04:10

Thank you

PB19:04:52

I was more looking for resources on how this would be deployed

Steven Deobald19:04:44

Ah, gotcha. More or less in the same way. Maybe if you describe your specific scenario we can help point you to something?

Steven Deobald19:04:34

At the moment, there are a lot of ways to deploy xtdb (probably too many, tbh) so it might help to know what you're hoping to achieve.

PB19:04:01

So aside from the data source, there doesn't appear to be a "transactor" if I were to talk in datomic terms. How does it deal with the challenges we would see in a distributed environment?

PB19:04:30

I guess that's kinda what I'm missing

refset20:04:13

Each XT node is an isolated island of local, deterministic indexes and an XTDB system is only distributed by virtue of a shared transaction log and document store (...and index checkpoints), therefore all of the hard durability and consistency challenges are taken care of by the underlying storage implementations, e.g. Kafka, S3.

Which kind of challenges were you thinking of specifically? I suppose one challenge example I can think of that came up today in a very tangential context (there's a Zulip thread) is the notion of "where the default valid-time clock source originates from" - see https://github.com/xtdb/xtdb/issues/1665

richiardiandrea23:04:48

Hi there, has anybody ever tried using the https://impossibl.github.io/pgjdbc-ng driver with XTDB?

richiardiandrea23:04:04

I did 😄 It seems like I can establish a db connection but then I get:

Node out of sync - requested '#:xtdb.api{:tx-id 156, :tx-time #inst "2022-04-05T23:18:59.920-00:00"}', available 'null'

In the DB I do see:

156 |                                          | 2022-04-05 23:18:59.920625+00 | txs

Not sure what the deal is there

refset07:04:35

Interesting! I haven't heard of anyone trying that before though. What's the motivation for wanting to use it?

richiardiandrea17:04:26

@U899JBRPF it has some extended functionality like LISTEN/NOTIFY listeners. However, I see too many blockers at the moment for it (xtdb aside) so I might just abandon the idea

👍 1

richiardiandrea17:04:45

For instance there is no implementation for PGObject

refset17:04:19

Ah cool, thanks for updating us - that's a shame though, it does seem interesting!
