xtdb

tatut 2025-07-02T07:06:32.883699Z

the intro docs seem to use integers as _id. are there any restrictions on what can be used? UUIDs seem to work; DATE, URIs etc. do not

tatut 2025-07-02T07:07:38.849199Z

it seems you can use different _id types in the same table, any problems with that?

tatut 2025-07-02T07:09:30.536709Z

if I have a table with UUID and integer ids, then I can't use non-equality comparisons (like >) in a where clause, as I get:

ERROR:  class java.lang.Long cannot be cast to class java.nio.ByteBuffer (java.lang.Long and java.nio.ByteBuffer are in module java.base of loader 'bootstrap')

tatut 2025-07-02T07:09:44.384129Z

makes sense, probably not a good idea to have different _id types in the same table

refset 2025-07-02T08:33:01.987719Z

UUIDv4 is the default recommendation for _id, per https://github.com/xtdb/xtdb/blame/3ca573544b121860977710912f1289b383724948/docs/src/content/docs/intro/data-model.adoc#L13
Integers are really only used in the examples for brevity.
You can cast to strings if you need to compare across different _id types.
The exhaustive list of currently supported _id types hasn't been documented yet, but we'll fix that.
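
To illustrate the cast-to-string suggestion above, here is a rough Python analogy. It is not XTDB code and says nothing about XTDB's actual cast syntax; the helper name `sort_as_text` and the sample ids are made up for the example. It only shows why comparing an integer with a UUID fails, why comparing their text forms works, and the catch that text ordering is not numeric ordering.

```python
import uuid

# Hypothetical sample ids of mixed types, as in a table with both integer and UUID _id values.
mixed_ids = [42, uuid.UUID("0f8fad5b-d9cb-469f-a165-70867728950e"), 7]


def sort_as_text(ids):
    """Order ids of mixed types by their string form instead of their native type."""
    return sorted(ids, key=str)


def native_compare_fails(a, b) -> bool:
    """Return True if `a < b` is not defined for these two types."""
    try:
        a < b
    except TypeError:
        return True
    return False
```

Note that text ordering is lexicographic, so `str(10) < str(9)` holds even though `10 < 9` does not; a string cast makes mixed-type comparison possible, but it does not give integers their numeric order.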

jarohen 2025-07-02T08:42:13.512559Z

> are there any restrictions on what can be used?

yes, there's a spec in the code somewhere that I'll dig out. we started very restrictive on this and haven't yet released the ropes again 🙂 anything else you'd particularly want?

jarohen 2025-07-02T08:42:29.102749Z

> it seems you can use different _id types in the same table, any problems with that?

nope, should be the same as putting different types in any other col 🙂

jarohen 2025-07-02T08:43:05.358659Z

> if I have a table with UUID and integer ids, then I can't use non-equality comparisons (like > ) where clause

that's true, but I was expecting a different error message there - we usually get 'compare not supported on types integer and uuid'

tatut 2025-07-02T08:45:01.476079Z

well, UUID makes sense for many tables, but some could have a DATE id as well (a one-entry-per-day type of thing)

jarohen 2025-07-02T08:45:12.000659Z

but yeah, as @taylor.jeremydavid says, best to use UUIDv4s if you've no other preference - in spite of all of the usual database warnings about "don't use random UUIDs in indices!", "if you must use UUIDs make sure they're the timestamp ones" etc - XT works best with a well distributed id space

tatut 2025-07-02T08:45:15.569269Z

but for integers you'd really need a sequence/auto_increment functionality

tatut 2025-07-02T08:46:12.434389Z

so random UUIDs are actually the best id type for xtdb?

tatut 2025-07-02T08:47:02.178579Z

I was searching for some discussion of _id in the documentation, but couldn't find it. Best practices for those would be valuable in the docs

👍 1
jarohen 2025-07-02T08:47:06.295959Z

yep, contrary to all the usual database advice 🙂 it's because the index we keep on _id is a hash trie rather than the usual b-tree

✅ 1
jarohen 2025-07-02T08:49:19.769529Z

you can also (advanced, optional) use that to your advantage if you curate the first few bits/bytes of the UUID to be a partition key - e.g. if you have a parent/child relationship between entities, giving the child UUID a few bits from the parent's UUID will naturally group all of the children for a given parent in the index; they'll be co-located
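
The prefix idea above can be sketched in Python as follows. This is a minimal illustration, not an XTDB utility: the function name `child_uuid` and the choice of 16 prefix bits are assumptions made for the example, and how many leading bits actually matter for co-location in the index is not stated in this thread. The sketch copies the leading bits of the parent's UUID into a fresh random UUIDv4. Keeping the prefix within the first 48 bits leaves the version and variant fields alone, so the result is still a well-formed v4 UUID.

```python
import uuid

# Assumed prefix length for illustration only; must be <= 48 so the
# UUIDv4 version bits (which start at bit 48) are never overwritten.
PREFIX_BITS = 16


def child_uuid(parent: uuid.UUID, prefix_bits: int = PREFIX_BITS) -> uuid.UUID:
    """Return a random UUIDv4 whose top `prefix_bits` bits are copied from `parent`."""
    if not 0 < prefix_bits <= 48:
        raise ValueError("prefix_bits must be between 1 and 48")
    shift = 128 - prefix_bits
    # Keep only the leading bits of the parent, zeroing everything below them.
    prefix = (parent.int >> shift) << shift
    # Take the remaining bits from a fresh random UUIDv4.
    random_part = uuid.uuid4().int & ((1 << shift) - 1)
    return uuid.UUID(int=prefix | random_part)
```

For example, `child_uuid(parent_id)` returns an id that starts with the same four hex digits as `parent_id`. The trade-off is that the children's ids carry fewer random bits (106 rather than 122 with a 16-bit prefix), which is still ample for uniqueness.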

jarohen 2025-07-02T08:49:29.618049Z

definitely one for a best practices doc

tatut 2025-07-02T08:53:47.475139Z

hmm... wrangling bits in a UUID might be cumbersome

jarohen 2025-07-02T08:54:16.881649Z

yeah, agreed, partly why we haven't promoted it as a recommendation as yet

jarohen 2025-07-02T08:54:23.262369Z

needs some utilities, really

jarohen 2025-07-02T08:55:01.939579Z

now that we have a database actually working E2E with reasonable performance and stability it's time for us to start dotting these i's and crossing these t's

tatut 2025-07-02T08:55:03.363939Z

or even a database supported "type" for it

jarohen 2025-07-02T08:56:41.794949Z

ooh, I wonder what that would look like...

tatut 2025-07-02T08:56:59.195649Z

🤷

tatut 2025-07-02T08:57:09.890709Z

but some blessed "best" ID type would be neat

tatut 2025-07-02T08:57:42.102819Z

also, what I would like, in addition to the SQL time travel, would be a git-like "time proof" reference type

jarohen 2025-07-02T08:58:00.581159Z

I mean, we could also go for supporting composite IDs, getting the user to declare it with lightweight DDL (obv keeping to our schemaless roots)

tatut 2025-07-02T08:58:01.301489Z

like, "references table X, row Y at tx Z" in a neatly packed single value

🤔 1
jarohen 2025-07-02T09:00:26.602279Z

so yeah, in git that's objects with hashes... ok... 💭

2025-07-02T11:40:07.308279Z

jarohen 2025-07-02T11:41:18.124009Z

o..k..

😅 1
jarohen 2025-07-02T11:44:04.120499Z

looks like a github employee. we're investigating.

jarohen 2025-07-02T11:52:16.279559Z

confirmed, was a github employee. working on getting it reinstated

jarohen 2025-07-02T11:52:33.377259Z

just what you need on a wet Wednesday! 🤯

tatut 2025-07-02T12:08:54.004189Z

wth, github employees can delete your repos at will?

jarohen 2025-07-02T12:09:03.121419Z

evidently!

jarohen 2025-07-02T12:09:16.537319Z

the guy was only trying to delete a test security advisory

tatut 2025-07-02T12:09:30.999019Z

scary

jarohen 2025-07-02T12:09:41.357709Z

repo mostly reinstated, few details still to iron out but looks like no loss of data. heart rates slowly descending 😵‍💫

😅 3
Samuel Ludwig 2025-07-02T14:30:12.370279Z

that is... scary; that's HN/Lobsters front-page bait if I've ever seen it

jarohen 2025-07-02T14:33:41.292469Z

ha, if you'd asked me a couple of hours ago you could probably have tempted me, but thankfully calm prevailed 🧘‍♂️ there's a nice cold beer in the fridge with my name on it for the end of the day though 🍺

❤️ 7