#xtdb
2024-04-13
Panel 01:04:53

There are many flavours of UUID out there. Some are more human-friendly (shorter / no hyphens), some have a "type" prefix (user-37b2e-…), and some are sortable, like UUID v7-8. Any opinions on which UUID I should use for :xt/id in XTDB v2?

mikejcusack 03:04:40

The most commonly used for IDs is type 4

mikejcusack 03:04:34

The ones that are more "human friendly" are going to be less unique because fewer bits are available for randomness. They may also require additional logic to handle. Generating type 4 is trivial and is included in Java.

👍 1
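
As a minimal sketch of the point about type 4 being built into Java: from Clojure, `java.util.UUID/randomUUID` returns a random (type 4) UUID object, which has 122 random bits. The var name `id` is only illustrative.

```clojure
(import 'java.util.UUID)

;; Generate a random (type 4) UUID using the JDK's built-in generator.
(def id (UUID/randomUUID))

;; The version field confirms which UUID type was produced.
(.version id) ; => 4
```

On Clojure 1.11 and later, `(random-uuid)` does the same thing without the import.
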
jarohen 07:04:45

Yep, type 4 is definitely a good default 🙂 Main thing for XT2 is to use UUID objects rather than strings (the former we store as 16 bytes, and use it directly for our internal IDs; the latter we store as a string and have to generate our own internal ID - i.e. avoid the prefix 🙂). Also to ensure that they're well distributed - type 4 are good for this out of the box; 3 and 5 are hashes so these work well too if you want to generate them from a natural key. 7 and 8 are also acceptable but you're more in charge of making sure they're well distributed.

👍 2
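
The advice above (UUID objects rather than strings, and hash-based UUIDs from a natural key) could look roughly like this. It is a sketch, not XTDB code: the literal UUID, the `"user:alice@example.com"` key format, the helper `natural-key->uuid`, and the `:user/email` attribute are all made up for illustration. The JDK only ships a type 3 (MD5) name-based generator; type 5 would need a library.

```clojure
(import 'java.util.UUID)

;; An ID that arrived as text (hypothetical value): parse it into a UUID
;; object rather than storing the string itself under :xt/id.
(def parsed-id (UUID/fromString "123e4567-e89b-42d3-a456-426614174000"))

;; A name-based (type 3) UUID derived from a natural key. The same key
;; always yields the same UUID.
(defn natural-key->uuid [natural-key]
  (UUID/nameUUIDFromBytes (.getBytes ^String natural-key "UTF-8")))

;; A document whose :xt/id is a UUID object, not a prefixed string.
(def user-doc
  {:xt/id (natural-key->uuid "user:alice@example.com")
   :user/email "alice@example.com"})
```
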
littleli 08:04:17

There are some articles out there discussing why UUIDs are wrong for DB keys. If I remember correctly, the argument was more about uniformity of distribution, generation time, and such things. Alternatives: ulid, nanoid, cuid2 (https://github.com/hden/cuid2). The choice is yours, of course.

jarohen 08:04:56

Yep, agreed, it's important to use them in the right contexts and not in the wrong ones - there are plenty of databases where they're the wrong choice, as you say. In the case of XT2, though, the primary index has been chosen/designed so that UUID v4 (or any other well-distributed UUID version) works best 🙂

👍 1
jarohen 08:04:22

Particularly, it's preferable not to use string IDs (like ulid, nanoid, cuid) in XT2 (unless they're natural keys) for the reasons above

littleli 08:04:14

It's good to know

Panel 08:04:02

While probably very inefficient, could we build a UUID UX similar to git, where given the first few chars of a UUID I could find the matching :xt/id? Using v2.

1
jarohen 08:04:02

we probably could, yeah - good idea 🙇 not even sure it'd be that inefficient, they're stored sorted already

jarohen 08:04:52

we'd have to do the collision detection, same as git, but I doubt that'll set the profiler on fire

jarohen 08:04:37

interesting 🙂
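
To make the git-style idea concrete, here is a rough sketch of prefix lookup with collision detection over a sorted collection. It is not how XTDB does or would implement it: it works on an in-memory sorted set of lower-case UUID strings, and the function names and result shape are invented for the example. Because the set is sorted, all matches for a prefix are contiguous, so the scan can start at the prefix and stop at the first non-match.

```clojure
(require '[clojure.string :as str])

(defn find-by-prefix
  "Returns the IDs in `ids` (a sorted set of lower-case UUID strings)
  that start with `prefix`."
  [ids prefix]
  (->> (subseq ids >= prefix)                       ; jump to the first candidate
       (take-while #(str/starts-with? % prefix))))  ; stop at the first non-match

(defn resolve-prefix
  "Resolves `prefix` to a single ID, reporting ambiguity the way git does."
  [ids prefix]
  (let [matches (find-by-prefix ids prefix)]
    (case (count matches)
      0 {:status :not-found}
      1 {:status :ok, :id (first matches)}
      {:status :ambiguous, :candidates (vec matches)})))

;; Hypothetical IDs, chosen so that two of them share the prefix "0a1b".
(def ids
  (sorted-set "0a1b2c3d-0000-4000-8000-000000000001"
              "0a1b9999-0000-4000-8000-000000000002"
              "ffeeddcc-0000-4000-8000-000000000003"))

(resolve-prefix ids "0a1b2") ; a unique match
(resolve-prefix ids "0a1b")  ; ambiguous, two candidates
```
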

littleli 08:04:11

Looks like I revived the discussion. I'd say sorry, but a good discussion should go on, I guess :clojure-spin: 😇

😄 2
nivekuil 21:04:33

(xt/submit-tx (::db env)
              [[:put-docs :resources
                {:scv/data {:driver :macvlan}
                 :xt/id "//127.0.0.1/etc/containers/networks/local-macvlan.json"}
                {:scv/type :systemd,
                 :xt/id "//127.0.0.1/etc/containers/systemd/jellyfin.container"}]])
results in
=> Execution error (UnsupportedOperationException) at org.apache.arrow.vector.complex.impl.NullableStructWriter/ (NullableStructWriter.java:264).
Unknown type: EXTENSIONTYPE
it seems the nested map with a keyword key is causing problems (is xt2 doing something fancier here?), but when it's the only doc in the tx the tx goes through successfully, seems like a bug?

nivekuil 21:04:52

with a number key the error becomes invalid struct key: '1' but it works with a string key

jarohen 06:04:09

hey @U797MAJ8M 👋 thanks for the report. I wonder if it's related to https://github.com/apache/arrow/issues/40773? :thinking_face: If so, it'll be the keyword value causing issues - could you swap :macvlan for a string to confirm? If that works, then yep, it's another one we'll raise upstream with the Arrow folks 🙂

nivekuil 07:04:56

yeah it works as a string -- meant to say string value above, not string key

👌 1
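
A possible stop-gap based on the confirmation above (string values work where keyword values inside the nested map did not) is to convert keyword values to strings before submitting. The helper name `keyword-vals->strings` is made up for this sketch; it leaves map keys alone and only rewrites values, recursing into nested maps.

```clojure
(defn keyword-vals->strings
  "Replaces keyword values in `m` (and in any nested maps) with strings.
  Keys are left untouched."
  [m]
  (into {}
        (map (fn [[k v]]
               [k (cond
                    (map? v)     (keyword-vals->strings v)
                    (keyword? v) (subs (str v) 1) ; drop the leading colon, keep any namespace
                    :else        v)]))
        m))

;; The document from the report, with only the nested map rewritten.
(def doc
  (update {:scv/data {:driver :macvlan}
           :xt/id "//127.0.0.1/etc/containers/networks/local-macvlan.json"}
          :scv/data
          keyword-vals->strings))
```

Here `(:scv/data doc)` becomes `{:driver "macvlan"}`, which matches the string-valued form reported to work.
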