This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2023-11-08
We are still getting exceptions often when trying to get a database value (with d/db) for the first time after a Datomic Cloud Ion deployment:
Execution error (ExceptionInfo) at datomic.core.anomalies/throw-if-anom (anomalies.clj:94).
Loading database
clojure.lang.ExceptionInfo: Loading database #:cognitect.anomalies{:category :cognitect.anomalies/unavailable, :message "Loading database"}
at datomic.core.anomalies$throw_if_anom.invokeStatic(anomalies.clj:94)
at datomic.core.anomalies$throw_if_anom.invoke(anomalies.clj:88)
at datomic.core.anomalies$throw_if_anom.invokeStatic(anomalies.clj:89)
at datomic.core.anomalies$throw_if_anom.invoke(anomalies.clj:88)
at datomic.cloud.client.local.Client$thunk__29787.invoke(local.clj:175)
at datomic.cloud.client.local$create_db_proxy.invokeStatic(local.clj:282)
at datomic.cloud.client.local$create_db_proxy.invoke(local.clj:280)
at datomic.cloud.client.local.Connection.db(local.clj:103)
at datomic.client.api$db.invokeStatic(api.clj:181)
at datomic.client.api$db.invoke(api.clj:170)
Is there any way to avoid this?
Is this a known issue?
Do we really need to put a retry around the d/db call too, like we did for d/connect, as recommended by https://github.com/Datomic/ion-starter/blob/master/src/datomic/ion/starter.clj#L21-L24?
I could use some advice on best practices for sensitive personal account data in #C03RZMDSH. I'm currently using CouchDB because it's very easy to keep user account information segregated via one database per user. Now, I know Nubank uses Datomic for banking info, so it's clearly good enough for the job. The only thing that concerns me a bit is that it seems a little dicey to leave all of the work of data segregation and access controls up to the application layer. So, for instance: would you make a database per user, or would you simply be very careful in the application layer and try to build an abstraction/middleware chain that prevents you from doing anything stupid?
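The retry pattern asked about earlier in the thread (mirroring ion-starter's d/connect retry) could be adapted for d/db roughly like this. A minimal sketch using the Client API; the function name, retry count, and sleep interval are all arbitrary choices, not anything from ion-starter itself:

```clojure
;; Sketch: retry d/db while the server reports the database is still
;; loading (anomaly category :cognitect.anomalies/unavailable, as seen
;; in the stack trace above). Retry count and sleep are assumptions.
(require '[datomic.client.api :as d])

(defn db-with-retry
  "Calls (d/db conn), retrying on an :unavailable anomaly."
  [conn & {:keys [retries sleep-ms] :or {retries 10 sleep-ms 500}}]
  (loop [n retries]
    (let [res (try
                {:db (d/db conn)}
                (catch clojure.lang.ExceptionInfo e
                  (if (and (pos? n)
                           (= :cognitect.anomalies/unavailable
                              (:cognitect.anomalies/category (ex-data e))))
                    ::retry
                    (throw e))))]
      (if (= ::retry res)
        (do (Thread/sleep sleep-ms)
            (recur (dec n)))
        (:db res)))))
```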
A database per tenant isn’t realistic--databases are fairly high-cost abstractions and aren’t designed for having large numbers of them.
I suspect most people do it in the app layer entirely. However, if you are careful about assigning tenant ownership to entities in your schema in a consistent and discoverable way, and being explicit about which ref attributes can cross tenant boundaries and which cannot, you can install a lot of guardrails and affordances into your system
At one extreme, you can pass a d/filter predicate over a database to filter datoms by tenant, but that has a performance cost which you may not want to bear at scale; alternatively, you could use application code and db/ensure to enforce tenancy boundaries on write, so your queries can trust that they were enforced
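For concreteness, the two approaches might look roughly like this. Note that d/filter is a Peer API function (datomic.api), not part of the Client API, and every attribute and predicate name below is an assumption:

```clojure
;; Read-side (Peer API): a filtered db view that keeps only datoms
;; whose entity is owned by the given tenant, via an assumed
;; :entity/tenant ref attribute.
(require '[datomic.api :as d])

(defn tenant-db
  "Returns a view of db restricted to tenant-eid's datoms."
  [db tenant-eid]
  (d/filter db
            (fn [db' datom]
              (= tenant-eid
                 (get-in (d/entity db' (:e datom))
                         [:entity/tenant :db/id])))))

;; Write-side: an entity spec, so transactions can assert :db/ensure
;; and have a predicate reject cross-tenant refs. The spec name and
;; the predicate var are hypothetical.
(def tenant-entity-spec
  [{:db/ident        :tenant/owned-entity
    :db.entity/attrs [:entity/tenant]
    :db.entity/preds 'myapp.preds/refs-stay-in-tenant?}])
```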
an offline job could double-check tenancy boundaries periodically; if you find boundary violations and put enough metadata on your transactions, you should be able to squash those bugs fairly easily, or at least have an audit trail of the write.
this is a more out-there idea, but you could encode tenancy into the entity partition (which you probably should do anyway for performance), and that would give you a blast-damage-limiting, very cheap d/filter that would at least ensure a bug can’t break out of the tenants in a partition
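Under that scheme the cheap filter reduces to an arithmetic check on the entity id, since the partition is encoded in it. A Peer API sketch; the function name is an assumption:

```clojure
;; Peer API sketch: if every entity for a tenant lives in that tenant's
;; partition, filtering needs no index lookups at all -- d/part just
;; decodes the partition from the entity id.
(require '[datomic.api :as d])

(defn partition-scoped-db
  "Returns a view of db restricted to datoms in tenant-partition-id."
  [db tenant-partition-id]
  (d/filter db
            (fn [_db datom]
              (= tenant-partition-id (d/part (:e datom))))))
```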
anyway, I think the most important thing is to make sure tenancy is very clearly expressed in your schema, and then you can build on that to the degree you’re willing to trade safety for performance
you're using some terms I'm not familiar with and I just want to make sure that I understand what you're implying:
tenancy -- would this translate to a db.type/ref for a user identity?
partitions -- I thought about this. I have no idea if a partition per tenant is feasible or not, but that makes a lot of sense.
metadata -- do you mean literal metadata in the Clojure sense, or database metadata encoded via schema?
these are great suggestions, btw, thank you
tenancy: Roughly “ordinary scopes of read and write”--in your example each user would be a tenant. You might have smaller units or larger ones, or even hierarchical ones. I’m just using a general term for this problem.
partitions: you probably want to assign partitions via a hash function rather than explicitly. Partitions are really a performance optimization to increase locality and my suggestion is kind of an abuse of them. note also you can’t change the partition of an entity once assigned (it’s part of the entity id) so if you make a mistake this could be a real inconvenience
metadata: I mean attributes asserted on the transaction entity itself. This is just schema.
things you might assert as metadata: a reference to the authenticated user that performed the operation; the name of the operation in your code; a reference to the tenant, etc
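Asserting such metadata just means including the transaction entity's reserved tempid, "datomic.tx", in the tx-data. A Client API sketch; all the :audit/* attribute names are assumptions:

```clojure
;; Reified transaction metadata: extra datoms asserted on the
;; transaction entity itself via the reserved "datomic.tx" tempid.
;; The :audit/* attributes are hypothetical schema.
(require '[datomic.client.api :as d])

(defn transact-with-audit
  [conn tx-data {:keys [user-eid op tenant-eid]}]
  (d/transact conn
              {:tx-data (conj (vec tx-data)
                              {:db/id           "datomic.tx"
                               :audit/user      user-eid
                               :audit/operation op
                               :audit/tenant    tenant-eid})}))
```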
Yeah, makes sense. You know, what I think is interesting here is using Clojure in conjunction with Datomic for something like banking. I would imagine you'd want something more structured, like Java classes, to get some compile-time enforcement as a backup.
but obviously the case study says that it works. And from what I understand, non-Datomic bank code is a huge mess anyway
I’m not sure how any compile-time approach would work for what is fundamentally a data reference problem
the enforcement has to live over the data at runtime; the question is just how much is lowered into the database runtime and how much perf are you willing to trade for enforcement.
My biggest concern is inappropriate egress, which could occur from inappropriately transacting combined user data, which in turn could occur from merging hashmaps that should not be merged. Can this be handled without classes? Yes. But it's harder to inappropriately merge objects than it is to inappropriately merge hashmaps.
Similarly, it's harder to egress "too much" data when it's first marshalled from a hashmap into a class and then serialized.
are you talking about queries that unavoidably cross tenants and aggregate somehow? otherwise I’m not sure how classes could enforce that e.g. every entity is from user A.
And yes, I know you could use spec to do this, but I try to avoid anything that relies too much on discipline if that discipline can be outsourced to a different layer like the compiler or the database
if you’re just talking about what attributes on an entity are exposed, this isn’t really a tenancy problem. You control that with d/pull on the read-from-db side and explicit serialization code (i.e. just pulling certain fields, renaming fields, etc) on the write-to-wire side
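In other words, the pull pattern acts as an explicit whitelist on the read side. A sketch with assumed attribute names:

```clojure
;; Egress control via an explicit pull pattern: only whitelisted
;; attributes ever come back from the read path. All attribute names
;; here are assumptions.
(require '[datomic.client.api :as d])

(def public-user-pattern
  ;; deliberately omits sensitive attrs (e.g. an assumed :user/ssn)
  [:user/id :user/display-name])

(defn public-user
  "Reads only the public projection of a user entity."
  [db user-eid]
  (d/pull db public-user-pattern user-eid))
```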
And maybe this is what I need to hear -- that ultimately it is an application layer problem, no matter which way you slice it, and that I need to evaluate if the benefits of having datomic outweigh my concerns about data tenancy.
well it doesn’t have to be application layer--some databases lower quite complex authentication and row-and-column-level authorization controls into their database, but arguably the database is functioning as an application layer at this point
yeah but who wants to use other databases besides datomic 😛
thanks, as always, @U09R86PA4. You probably have no idea how many times you've pulled my @$$ out of the fire regarding Datomic over the years 😅
A database per tenant isn’t realistic--databases are fairly high-cost abstractions and aren’t designed for having large numbers of them.
it's perhaps worth noting that there are people trying to address this (but I can't say how successfully), e.g. with Postgres https://www.thenile.dev/blog/introducing-nile
However, any system like this still needs to grapple with the fact that shared-nothing means things like schema aren’t shared, so migrations involve a lot of orchestration
I use namespaced keywords per tenant. Many tenants in the same db, such as:
:tenant-1.order/total
:tenant-2.order/total
etc.
oh interesting @U0GE6JTKK. does that imply a schema per tenant?
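If it does imply a schema per tenant, one way to keep that manageable is a helper that stamps out the same logical schema under each tenant's namespace. A sketch; the attribute list and value types are assumptions:

```clojure
;; Builds the per-tenant variant of a shared logical schema by
;; namespacing each attribute under the tenant's name. The single
;; attribute shown is an assumed example.
(defn tenant-schema
  "Returns tx-data installing the order schema for one tenant,
  e.g. (tenant-schema \"tenant-1\") => :tenant-1.order/total"
  [tenant]
  [{:db/ident       (keyword (str tenant ".order") "total")
    :db/valueType   :db.type/bigdec
    :db/cardinality :db.cardinality/one}])
```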