Fork me on GitHub

I’ve been playing with Crux the past few days and I can’t quite figure out how to handle the following case. Say I’m storing a bunch of user accounts and each account has an account name that can be changed, stored with a :account attribute. The account name is NOT used as the :crux.db/id. This ensures that joins are stable in the face of users changing their name. How do I ensure the uniqueness of the :account? I can query to determine if somebody else is using the desired account name and then, if not, generate a transaction to update the user’s account, but then there’s a race condition where two users could both choose “bob” as their account name at the same time. I can’t use a CAS transaction because it doesn’t allow me to query. It simply allows me to check that the document I’m updating is the same as I thought it should be. In Datomic, I would handle this sort of thing in the transactor.


Regarding uniqueness, this discussion might be useful:


Hi @U7BEY9U10 ^that recommendation in the chat history is still the answer I'd give today (thx @U3SG7RX7A) - there's probably enough consensus for it now that I will add it to the docs somewhere.


Hm. OK, thanks. I see what’s going on. Given that crux.db/id has very set formats, how is this done in general (for numbers, arbitrary strings, etc.)? In the previous chat you reference, the programmer wants to make email IDs unique, which you suggest to convert to a mailto: URL. That’s fine for email, but it won’t work if I want an arbitrary string with spaces and nasty characters. Yes, I could hash it somehow, etc, but at some point you just wonder if being able to enforce some sort of constraints is just a necessary function for a DB to have.


For converting arbitrary strings to IDs you could base-64 encode and make a keyword (prefixed with a non-numeric character). Maybe there are simpler ways though...I would be happy to continue brainstorming! As for enforcing constraints, the core model of Crux relies on not being able to peek inside of documents in order to process transactions, as this is what enables eviction to occur whilst retaining an immutable tx-log. There may be domain-specific cases where eviction isn't actually needed, in which case you can safely build a constraint layer around the core, but for the general case it may simply not be a resolvable problem, the jury is still out 🙂


@U899JBRPF, I think there are a couple of things that are interacting here. There’s a need for some of these atomic actions, where I need to query the DB to find out something, then transact based on that, atomically. A typical SQL database would allow you to have constraints on a column to help with some of that or sometimes wrap a whole transaction around it. Datomic handles it by allowing you to send functions to the transactor where the transactor will execute them “between” other transactions and they are therefore atomic. The Crux mechanism documented here is obviously a workaround and it doesn’t seem like this sort of functionality was really designed in from the start.


There is actually a not-so-secret transaction function feature already, but we have it hidden behind a feature flag for now because it exposes the unresolved consistency problems relating to use of eviction (as mentioned):


In regards the uniqueness workaround, I see it as a natural pattern that emerges from the design rather than a mere hack. This is good feedback though, thank you, and we're certainly open to ideas!


OK, thanks, good to know.

🙂 4

OK, I implemented this today for my test and it wasn’t too bad. One thing that’s mildly annoying is that :crux.db/id does not accept an arbitrary string, but only keywords, UUIDs, URIs, URLs, and maps. It would be nice if you could just give it an arbitrary Clojure data type, including strings and numbers, without having to turn those into the supported types. Sure, a map is probably simple, but say I want to make the DB ID an email like “<mailto:[email protected]|[email protected]>”, why force me to turn that into a URI (“<mailto:[email protected]>”) or a map ({:email “<mailto:[email protected]|[email protected]>”})? Is there are particular reason that IDs can’t be strings or numbers directly?


I believe the theory is that using strings/integers for IDs would increase complexity in the long term (both for the core team and for users), as it suddenly becomes hard to know which values might be references, i.e. it's valuable to impose a semantic distinction. I think the current indexes could hypothetically cope with allowing strings/indexes (because of how everything is hashed), but we don't want to preclude using more succinct indexes in the future


OK. You might want to add some discussion to the docs about why things are the way they are. Obviously, URIs/URLs are strings of a sort, as are keywords. So it seems odd to not have just strings or numbers, particularly since maps are allowed and maps can contain strings and numbers. I guess in terms of enforcing uniqueness, that was one of the parts of the workaround that seemed like needless steps. Again, not fatal, but odd, particularly when there isn’t any sort of explanation.

👍 4

In SQL, I’d use a uniqueness constraint on a column and the update would fail if somebody already chose that name.


Is there an easy way to force Crux to return all the entity IDs in use (`crux.db/id`)? I tried to query for it using a :where [[e a v]], but Crux doesn’t seem to allow the attribute to be a wildcard, just entity and value. If there’s no common attribute for all entities, how do I return all entities. What’s the equivalent of SQLs SELECT * FROM Foo?


You can do this 🙂

(crux/q (crux/db db)
             {:find '[id]
              :where '[[id :crux.db/id _]]})

🙏 4

Ah, thanks.


The CAS example at doesn’t actually show what the documents look like (it just has {…} for each). Is the crux.db/id part of the expected document?


The {…} represents a full document, so :crux.db/id and all