Good Morning 🙂. I vaguely remember coming across a discussion at one point mentioning that when using multiple databases in a Datomic Cloud system, there is some overhead cost for each database, which can impact the system when there are very many such databases. I was wondering if the source of this overhead is the idents, which are stated in the Datomic Cloud documentation to be available in-memory on all compute nodes. If so, is there any way to mitigate this cost for a set of databases which will ostensibly have the exact same idents (logically)? I’m asking because I’m trying to figure out the feasibility of using a separate db per tenant in a multi-tenanted system (with roughly same-sized tenants), in an effort to avoid having too many datoms for any one system.


I think it's about cache locality @okocim. If you have a lot of different databases on the same query group, usage of one might evict others from cache


you may want to spin up several query groups and use specific DBs in specific query groups


or if you have a particularly oversized tenant db, you put it in its own query group
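
A minimal sketch of the routing idea above, assuming the application decides which query group to connect through for each tenant database. The query group names, tenant ids, and the `query_group_for` helper are all hypothetical, and no Datomic client call is shown; this only illustrates pinning each tenant to one group so its cache stays warm there.

```python
import hashlib

# Hypothetical names: these query groups and tenant ids are illustrative
# and are not taken from any real Datomic system.
SHARED_QUERY_GROUPS = ["tenants-qg-0", "tenants-qg-1", "tenants-qg-2"]
DEDICATED_QUERY_GROUPS = {"big-tenant": "big-tenant-qg"}


def query_group_for(tenant_id: str) -> str:
    """Return the query group that should serve this tenant's database."""
    # An oversized tenant gets its own query group.
    if tenant_id in DEDICATED_QUERY_GROUPS:
        return DEDICATED_QUERY_GROUPS[tenant_id]
    # Everyone else is pinned by a stable hash, so the same tenant always
    # lands on the same shared group across processes and restarts.
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(SHARED_QUERY_GROUPS)
    return SHARED_QUERY_GROUPS[index]


print(query_group_for("big-tenant"))
print(query_group_for("acme"))
```

The application would then use the returned name to pick the endpoint it connects through. Note that a plain modulo hash like this reshuffles most tenants whenever the list of shared groups changes, so an explicit tenant-to-group table may be the better choice if groups are added often.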


Thanks, that makes sense. However, I would expect the cache locality issue to also be there to some degree if I had all of my tenant data in a single database. What I mean is that I’d expect one tenant’s data to be evicted in favor of another’s. Of course this probably has more to do with segments, so I expect that you’re right in that this situation is more likely to occur with different databases in the same query group.


I wonder if this will be how we stop using Lambdas (i.e. cold starts) with Ions?
