architecture 2023-09-07 | Slack Archive

timo13:09:04

Hi there, is anyone using anything different than RBAC? I wonder if anyone experienced an explosion of roles and if there are good alternatives out there that you recommend. (I know of OPA and Keycloak but prefer lightweight solutions)

danieroux13:09:58

We're doing ABAC, and so far it has worked out well. This standard is surprisingly readable: https://csrc.nist.gov/pubs/sp/800/162/upd2/final

👍 4

timo13:09:56

did you implement it yourself or are you using some lib or service or anything?

danieroux13:09:00

Implemented ourselves. Using Fulcro-RAD, and Pathom3 as "Policy Decision Points" in ABAC language.

👍 2

pithyless13:09:29

IMO, Pathom/EQL is an unsung hero in the Clojure sphere for ABAC. (Even if you don't expose it directly, but wrap it in REST, GraphQL, or whatever for the end user.)

👍 4

lukasz14:09:39

I spoke to a couple of people using this for RBAC: https://github.com/ory/keto - not sure if you need to run the rest of the Ory stack (which also looks pretty good)

👍 2

krzsztf22:09:24

There is a new AWS offering, https://docs.aws.amazon.com/verifiedpermissions/latest/userguide/what-is-avp.html, for RBAC+ABAC, and a related open source project: https://github.com/cedar-policy/cedar. Really interesting introduction here: https://www.youtube.com/watch?v=k6pPcnLuOXY

👍 2

Thomas Moerman09:09:11

Great 🧵. I have a question: how do you manage the interplay between authorization rules and queries on large datasets? E.g. "show me all items under this parent item for which I have write permissions". With Pathom alone this can become quite expensive to compute, so in my current approach, certain queries (and their parent->child resolvers) need to be authorization-aware. That in turn forces me to model authorization concepts in the domain model so that the query engine (xtdb in our case) can respond efficiently, with pagination, sorting etc. I was wondering how you guys tackle these issues. Thanks.
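
A minimal sketch of what "authorization-aware" means for such a query, in Python for brevity. An in-memory list stands in for the query engine, and the `writers` field, the sample data, and `page_of_writable` are assumptions made for illustration, not how xtdb or Pathom model this. The point is only that the permission predicate is part of the query, so sorting and pagination see just the rows the user may write.

```python
# Hypothetical domain data: each item carries the set of users allowed to write it.
items = [
    {"id": 1, "parent": "p1", "name": "delta", "writers": {"tom"}},
    {"id": 2, "parent": "p1", "name": "alpha", "writers": {"ann"}},
    {"id": 3, "parent": "p1", "name": "charlie", "writers": {"tom", "ann"}},
    {"id": 4, "parent": "p2", "name": "bravo", "writers": {"tom"}},
    {"id": 5, "parent": "p1", "name": "echo", "writers": {"tom"}},
]


def page_of_writable(user, parent, offset, limit):
    """Return one page of items under `parent` that `user` may write, sorted by name."""
    # Filter on parent AND permission first, so every page is full
    # (except possibly the last one).
    matching = [i for i in items if i["parent"] == parent and user in i["writers"]]
    matching.sort(key=lambda i: i["name"])
    return matching[offset:offset + limit]


first_page = [i["id"] for i in page_of_writable("tom", "p1", offset=0, limit=2)]
second_page = [i["id"] for i in page_of_writable("tom", "p1", offset=2, limit=2)]
```

Filtering after pagination instead would return short or empty pages whenever a page happens to contain items the user cannot write.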

lukasz14:09:25

That’s one of the reasons I wouldn’t implement this myself - the more complicated the entity and permission graph grows, the harder it is to have a solid solution, especially if it’s not the core offering of what you’re building. Open Policy Agent, Keto and others are designed to handle exactly that problem for you

lukasz15:09:29

As it happens, I stumbled upon this just now: https://github.com/authzed/spicedb looks really good

👍 2

pithyless15:09:11

I don’t understand how these services - as interesting as they look, especially all the oauth integration etc - solve the problem raised by @U052A8RUT: improved querying of datasets with granular permissions.

lukasz15:09:08

That's what SpiceDB is for - you define your schemas for various entities and store relationships and permissions there. From there you can query it to check if user A has access to document B and so on

lukasz15:09:59

(well, not only SpiceDB, other things I mentioned do this too)

pithyless15:09:10

OK, I'm going to go out on a limb and show how naive my understanding is. Apologies in advance, but I am interested in understanding how this works in practice. 🙇 I first assumed all these permission languages define rules such as "<principal can read resource, such that resource has owner and resource.owner = principal>". But then I would need to fetch N documents from the db and make N queries: "Can user X see document Y?" But it's starting to sound like systems such as Zanzibar (which sounds like a paper I need to add to my reading list) assume that you will write to the permission DB things like "<User X has read-access to Doc Y>", so that you can presumably ask the permission system with a single query: "Please give me all Docs that User X has read access to".
1. Am I missing the forest for the trees or are we at least in the ballpark?
2. Everywhere I said resource "document", I'm assuming we can replace it with attribute "document-title", because it's just a question of granularity?
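
A minimal sketch of the two approaches contrasted in this message, in Python for brevity. All names (`can_read`, `RelationStore`, `list_objects`) are hypothetical and do not correspond to the API of Zanzibar, SpiceDB, or Keto; the store is a toy in-memory stand-in.

```python
from collections import defaultdict


# Approach 1: a rule evaluated per resource ("resource.owner = principal").
def can_read(principal, doc):
    return doc["owner"] == principal


docs = [
    {"id": "doc-1", "owner": "alice"},
    {"id": "doc-2", "owner": "bob"},
    {"id": "doc-3", "owner": "alice"},
]

# Listing means fetching N documents and running N checks.
readable_by_rule = [d["id"] for d in docs if can_read("alice", d)]


# Approach 2: permissions stored as explicit facts, indexed by subject.
class RelationStore:
    def __init__(self):
        # (subject, relation) -> set of object IDs
        self._objects = defaultdict(set)

    def write(self, subject, relation, obj):
        self._objects[(subject, relation)].add(obj)

    def check(self, subject, relation, obj):
        return obj in self._objects.get((subject, relation), set())

    def list_objects(self, subject, relation):
        return sorted(self._objects.get((subject, relation), set()))


store = RelationStore()
store.write("alice", "reader", "doc-1")
store.write("alice", "reader", "doc-3")
store.write("bob", "reader", "doc-2")

# Listing is a single lookup; no documents need to be fetched first.
readable_by_lookup = store.list_objects("alice", "reader")
```

Both approaches give the same answer here; the difference is that the second one answers "all docs user X can read" without touching the documents themselves.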

lukasz15:09:04

Yes, all these systems are designed to handle pretty complicated relationship graphs, types of resources with nesting, user groups etc - so the answer to both questions is yes

lukasz15:09:00

and no worries, I didn't get the point of this in the beginning either! 4 projects later (which equals 6 attempts at building permission systems) and I learned to never build this again myself if the relationships get very complicated (multiple roles, assertions for groups, etc etc)

lukasz15:09:37

if you ever used something like AWS IAM, you basically know how this stuff works in the day to day usage

pithyless15:09:38

OK, so basically it's kind of a distributed database problem. When we update some data in our primary DB storage, we also have to ping the permissions DB to tell it "this new Document X was created" and btw "User A and User B now have read access to X".

lukasz15:09:22

Yes, that's the only downside I guess - you have to write your data to one store, and the entity info to another one - but again, that might be a worthy tradeoff given the reduction in complexity
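
The dual write described in the last two messages can be sketched as follows, in Python for brevity. Both stores are plain in-memory structures, and `create_document` is a hypothetical helper, not part of any of the tools mentioned in this thread.

```python
primary_db = {}           # stand-in for the application's main data store
permission_facts = set()  # stand-in for the external permissions store


def create_document(doc_id, body, readers):
    # Write 1: the business data goes to the primary store.
    primary_db[doc_id] = {"id": doc_id, "body": body}
    # Write 2: one fact per reader goes to the permissions store.
    # The two writes are not atomic in this sketch; a real system has to
    # decide what happens if the second write fails after the first succeeded.
    for user in readers:
        permission_facts.add((user, "reader", doc_id))


create_document("doc-x", "hello", readers=["user-a", "user-b"])
```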

pithyless15:09:30

So the alternative would be to have custom index queries that materialize these custom access patterns. (Where you wouldn't have to rely on yet another DB and permission language, but as you say you would be in a way reinventing the wheel and it is probably not your core product)

pithyless15:09:07

Thanks for the explanation @U0JEFEZH6 - the bit that was missing for me is that in case of ABAC, you need to reify and store all the attribute-identities in this external permissions DB.

lukasz15:09:42

Yes, although keep in mind that a lot of these projects use your off-the-shelf storage already (Postgres), so it's not like you're running a whole separate stack. Your point is still valid though: you're reinventing the wheel and most likely gonna end up with the same thing, but worse because it didn't have that many eyes on it :-) (as I did many times before)

pithyless16:09:08

I suppose we could take this further, and actually treat your Zanzibar et al storage as primary (since it knows about all the relationships of things) and perhaps just treat your previously primary storage like a big document/sha store.

lukasz16:09:01

yes, that's how I've seen it done before - of course it depends on the access patterns

lukasz16:09:08

I've used this strategy for search: the engine would only return IDs of matched documents, and everything else was pulled from the main database (I don't want to derail the discussion too much, just giving more examples of having multiple storage engines for different purposes)
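
A minimal sketch of that search strategy, in Python for brevity: the search engine returns only IDs, and the full records come from the main database. The dictionaries and the function names `search_ids` and `search` are assumptions for illustration, not any particular engine's API.

```python
# Stand-in for the main database: ID -> full record.
main_db = {
    "d1": {"id": "d1", "title": "Roles explained"},
    "d2": {"id": "d2", "title": "Attributes in ABAC"},
    "d3": {"id": "d3", "title": "ABAC vs RBAC"},
}

# Stand-in for the search engine: term -> ranked list of matching IDs only.
search_index = {
    "abac": ["d3", "d2"],
    "roles": ["d1"],
}


def search_ids(term):
    return search_index.get(term.lower(), [])


def search(term):
    # Hydrate the IDs from the main database, keeping the engine's ranking
    # and skipping IDs that no longer exist there.
    return [main_db[doc_id] for doc_id in search_ids(term) if doc_id in main_db]


titles = [doc["title"] for doc in search("ABAC")]
```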

pithyless16:09:17

once you start having to mix and match granular permissions with other search patterns (think full-text storage or custom sorting), we probably end up swinging back: now the permission storage is not particularly great at giving you good pagination

lukasz16:09:06

yeah, it gets very complex at that point - you have to store some basic ownership info everywhere, especially when it comes to search

pithyless16:09:43

OK, this discussion was informative (at least for me; hopefully for others as well 😅). :spock-hand:

lukasz16:09:21

🙏 I hope so, I just want to save people's time :D

timo17:09:59

Thanks for the discussion. Something I am looking out for is to NOT use a client-server model again. I would love to have something that runs in my JVM process like good ol' RBAC via buddy. All these solutions have all the bells and whistles for Google-scale that I'd rather not take on. I've seen https://github.com/borgeby/jarl and it could be interesting, but I still have to use a new language for something that could be easily expressed with some EQL or Datalog in Clojure.

👀 2

lukasz17:09:16

oh, this is great, TIL

fuad23:09:27

I was faced with the question of how to do RBAC/ReBAC recently and I came across https://www.osohq.com/post/why-authorization-is-hard which helped me understand these authorization systems a bit better. I also took a deep look at OSO, the service that published that article, to understand how they solved some of these problems. In the end, my use case ended up being relatively simple to implement in house, so it didn't justify using an expensive third party and making authorization a distributed systems problem, but I took a lot of inspiration from their approach and "unbundled" authorization data from business data as much as possible in my app. This made it pretty easy to answer questions like "what are the resources this user can do X to", "what are the actions this user can do to this resource", "what are the users that can do Y to any resource", etc. I can see where a more robust service could be handy if your model is too complex: deep hierarchical relationships of resources or actors (groups, organizations, etc), centralization of authorization logic when data is distributed across microservices, etc. I wasn't aware of some of the tools posted on this thread and now I'm curious to look them up!
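
A minimal sketch of how keeping authorization data separate from business data makes those three questions easy to answer, in Python for brevity. The `Grant` record, the sample data, and the function names are assumptions for illustration; this is neither the implementation described in the message nor OSO's API.

```python
from typing import NamedTuple


class Grant(NamedTuple):
    user: str
    action: str
    resource: str


# Authorization data lives in its own collection, apart from business records.
grants = {
    Grant("ana", "read", "report-1"),
    Grant("ana", "write", "report-1"),
    Grant("ana", "read", "report-2"),
    Grant("ben", "read", "report-1"),
}


def resources_for(user, action):
    """What are the resources this user can do `action` to?"""
    return sorted(g.resource for g in grants if g.user == user and g.action == action)


def actions_for(user, resource):
    """What are the actions this user can do to this resource?"""
    return sorted(g.action for g in grants if g.user == user and g.resource == resource)


def users_who_can(action):
    """What are the users that can do `action` to any resource?"""
    return sorted({g.user for g in grants if g.action == action})
```

Each question is a filter over the same set of grants, with no need to load the reports themselves.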

👀 4

fuad23:09:22

While implementing my little thing I also had the feeling that Datalog and triple stores would be great tools to express the relationships between entities, roles and policies, but I didn't go that far down the rabbit hole. I'd be curious to see something like that (although I tend to agree with the point made above that building such a generic system is very difficult and involves some reinventing of the wheel given the existing solutions).

Thomas Moerman14:09:23

Great reference @U3RBA0P4L!

pithyless15:09:07

The links from this https://news.ycombinator.com/item?id=33317597 are perhaps a good index of competing products and approaches in the OSS/platform space. Thanks @U3RBA0P4L for linking to the OSO article - it helped me better understand the decision matrix these libraries are taking, which is helpful in further evaluation.
