#xtdb
2020-05-23
rschmukler18:05:04

Hey Crux folks! I was wondering what the design reasoning was behind enforcing that IDs be keywords and not allowing strings? Since Clojure keywords are interned, doesn't this mean that a stream ingesting an infinite sequence of data will eventually run out of memory, as the keywords for each ID are never released from the runtime?

jarohen18:05:03

Hey @rschmukler 🙂 Keywords are just one of the ID types we support - currently we also support UUIDs, URLs and URIs. We're also actively considering how best to support string IDs as part of our upcoming index changes - currently strings are handled differently in the index because they can be range-searched efficiently (which the other ID types can't).

rschmukler18:05:42

Thanks for the quick reply! That makes a lot of sense. I've currently just been wrapping transactions with something that generically encodes strings to keywords - but supporting range queries would definitely be nice. Similarly, it may be worth considering integer-based primary keys, which may also have a range aspect. One other thing that I've been doing is allowing for tuples so that I can use other service providers' primary keys as IDs. E.g., if I'm ingesting a tweet I use [:twitter/id 12345], and then I have a function that encodes that into a keyword of :db.id.twitter/12345 - this feels especially weird since that keyword is not syntactically valid (i.e. you couldn't write :db.id.twitter/12345 and have Clojure pick it up), despite the fact that you can create it with the keyword function.
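
For readers following along, here is a minimal sketch of the kind of tuple-to-keyword encoding described above. The name tuple->id and the "db.id." prefix handling are assumptions made for illustration; this is not rschmukler's actual function and not part of Crux.

```clojure
;; Hypothetical helper (not a Crux API): encodes an [attr value] tuple
;; such as [:twitter/id 12345] as a keyword ID.
(defn tuple->id
  [[attr value]]
  ;; Assumption: the namespace of the attribute ("twitter") is appended to a
  ;; fixed "db.id." prefix and the value becomes the keyword's name.
  (keyword (str "db.id." (namespace attr)) (str value)))

;; The keyword function accepts a name that starts with a digit, which is why
;; this ID can be constructed even though it cannot be written as a literal.
(def tweet-id (tuple->id [:twitter/id 12345]))

(namespace tweet-id) ;; "db.id.twitter"
(name tweet-id)      ;; "12345"
```

Note that this sketch drops the attribute's own name (id), so two attributes in the same namespace would collide; a real encoding would need to account for that.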

teodorlu22:05:28

I'm curious about what people are doing for foreign keys / composite keys like this. I guess URI encoding could work? Coming from datascript, composite keys like [:twitter/id 1234] smell familiar ...

teodorlu22:05:38

Idea: Edit: the #crux/id reader doesn't seem to like this.

refset22:05:44

I can see range searches for IDs being handy - will give it some thought. @U3X7174KS for composite IDs - are maps not sufficient for your use-case?

👍 8
rschmukler04:05:08

Just tossing in my $.02 on this thread. I'd lean away from URI-based IDs to solve this - it feels like moving towards strings over data structures. Not saying that they don't have a place in general, but data structures feel better when possible (e.g. structured editing). Regarding maps vs tuples, personally I lean towards tuples because they feel more Datomic-like, but perhaps it's just a preference on my end.

👍 4
teodorlu07:05:31

Maps seem to be sufficient 👍 I guess it's largely a matter of familiarity. {crux.db.id #crux/id {:twitter/id 1234}} was a little surprising to me.

🙂 4
👍 4
rschmukler18:05:14

I do definitely think it's useful to retain the compare-based aspects of ID order - e.g. Reddit's pagination API uses the order of the IDs with an :after parameter. While that can be re-implemented in Crux elsewhere, I think having it baked into the notion of identity is powerful.
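
To illustrate the idea of paginating on ID order with an :after-style cursor, here is a minimal sketch in plain Clojure. It uses an in-memory sorted set as a stand-in for an ordered ID index; the names ids and page-after are hypothetical, and nothing here reflects how Crux stores or queries IDs.

```clojure
;; Hypothetical stand-in for an ordered ID index (not the Crux API).
(def ids (sorted-set 101 205 317 420 588))

(defn page-after
  "Returns up to `limit` IDs strictly greater than `after`, in ID order.
  A nil `after` starts from the beginning."
  [ids after limit]
  (take limit
        (if (nil? after)
          (seq ids)
          ;; subseq walks the sorted set starting just past the cursor
          (subseq ids > after))))

(page-after ids nil 2) ;; first page:  (101 205)
(page-after ids 205 2) ;; next page:   (317 420)
```

The cursor here is simply the last ID of the previous page, which only works because the IDs themselves are comparable.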