This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2022-08-19
Channels
- # aws (2)
- # babashka (4)
- # babashka-sci-dev (7)
- # beginners (92)
- # biff (7)
- # calva (64)
- # cider (2)
- # cljsrn (14)
- # clojure (8)
- # clojure-australia (5)
- # clojure-europe (14)
- # clojure-norway (8)
- # clojure-spec (36)
- # clojurescript (19)
- # component (15)
- # cursive (1)
- # data-science (6)
- # girouette (5)
- # hyperfiddle (3)
- # juxt (5)
- # leiningen (10)
- # lsp (7)
- # malli (12)
- # nbb (90)
- # polylith (1)
- # portal (11)
- # rdf (7)
- # reagent (6)
- # reitit (40)
- # remote-jobs (1)
- # shadow-cljs (21)
- # specter (5)
- # squint (83)
- # tools-deps (17)
- # vim (7)
I’ve heard Rich saying a couple of times that map specs should be open. Does a similar advice apply for APIs of REST endpoints? Should we allow extra keys (not mentioned in the payload schema) in request payloads? If yes, should we drop extra keys or should we pass them down the stream? If no, why?
IMO definitely don't pass anything extra by default from untrusted sources, it's a security risk (e.g. allowing a :credit-checked true
to be set to fresh orders). In Java, as it's mostly closed/classes and the JSON-libs have either "strip extras" or "fail on extras". With clojure & open maps, you need an extra tool to do that automatically, given a spec/schema. I recall reitit-ring defaults to "strip extras" with all of spec, schema & malli (supports also "failing on extras" & "explicitly open" for all three)
I would use "explicitly open" only from internal (trusted) sources, and when the use case benefits from it.
What do you think about rejecting request with extra keys?
That's a bad idea I believe- with distributed systems you need flexibility and openness in what you accept. That gives you more freedom in how you do deployments, etc
I think Rich has slightly changed his opinion on that matter, since Spec 2 allows closed map specs. I've had conversations about it with the core team, especially around security use cases, and to validate maps from the producer side. And they seem to recognize those use cases. But there's a time and place for both.
Personally, on an API, I would drop them. If someone sent a well formed payload that just has more stuff on it, why reject the request?
I’ve heard people say that if you tolerate an extra field, you might have a forward compatibility issue in case you make this extra field part of the schema.
At the end of the day, you got to decide what scenario you'd rather make annoying to your clients. Do you annoy them by erroring on a request that has everything it needs, but also has more stuff? Do you annoy them in the extremely rare scenario that you didn't error when they were giving you extra stuff, for years, letting them make calls without issues, and one day you might decide to make a field of the same exact name and break their superfluous data calls?
I prefer the former personally, because in practice I find there are more scenarios where that comes in useful. Say there's a field you no longer need? Or say you're doing a deployment, and the client and server are upgrading to support a new field at the same time, if the client deploys first and calls with additional fields on server that still don't support that behavior, etc. But honestly, I find this whole situation kind of unimportant, all these are pretty rare. Most clients tend to call with exactly the fields that the API asked for most of the time.
Well, actually I take that back. The upgrade situation is pretty common. Let me talk about it a bit more.
You want to add a mandatory new field to the API... How would you do that? The new code requires the new field, it's not optional, you want to make it mandatory. But the current API doesn't take that field, and clients are calling that API without the field. How do you time the deployment of new client code which now always include this new mandatory field, and the server code that mandates it?
If you deploy the server code first, your validation will error at all current clients which have not upgraded to pass the new field. And even if you didn't validate, well the API wouldn't work, cause your new logic needs that field, it's mandatory.
But if you have all clients upgrade first, they'll see errors, because the current server code doesn't allow additional fields.
If your current code though simply ignored additional fields, you're out of the pickle. Tell all clients that a new server code will be deployed on some date, and require a mandatory additional field. Clients can start passing the data even if the current code ignores it. And when they all do, you can deploy the new server code.
Very insightful scenario @U0K064KQV!
I’ve summarized your insights in a https://twitter.com/viebel/status/1561228623331397632?s=20&t=xJGKWSvqC6QDn42N8rgNvA @U0K064KQV.
There's one more common case: silently stripping extra params and returning success may hide incorrect usage of an API (which may or may not be obvious and discovered till much later).
Consider if we have an http endpoint where I added an optional filter query-param: /search?status=pending
If I'm incorrectly using the filter param (e.g. calling it with status
not current_status
) it would strip the invalid param and still return results (just not the ones I expected). Now you're relying on tests, etc. which may or may not fail; but also may fail much later and it may take longer to understand what went wrong.
I find it useful to define public APIs with strip-extras behavior (for all the reasons @U0K064KQV mentioned above), but to define them as strict and fail-fast when in development mode (for better developer experience and faster identification of contract inconsistencies)
Yeah. There is also an intent concern
I wonder if APIs should return warnings. Succeed, but warn :thinking_face: I also wonder if clients should be allowed to choose the behavior they prefer, can the api take "mode=strict" for example.
Interesting.
Sorry for the late reply but I just came across this. I would like to add one more perspective: The scenario @U0K064KQV gave is a breaking change (i.e. requiring more from the client). In this case, I would rather provide a new API which requires the new param (possibly just with a v2
suffix. Maybe remove the old version after a transitional period if it's desirable to not keep it around indefinitely. Basically do the same thing as Rich recommends for library APIs here: https://youtu.be/oyLBGkS5ICk?t=1944
In practice, you end up with a similar flow as in the solution suggested by @U0K064KQV but arguably it's a bit more transparent what's going on, e.g. it's very easy to determine how many clients are using the new API already. Also, you're still protected from breaking clients who accidentally were sending this newly required param before.
Yes, it's a breaking change where you will no longer handle the prior case. Upgrading to V2, generally implies V1 is still supported. If I do that, I too prefer versioning over making the new field optional. But sometimes, especially for microservices, you will not continue to support the old way, and need a path to migration that doesn't require downtime or cause issues.
But you have a transitional period of supporting both in your case, too (i.e. when some clients already send the to-be-required new param and others don't yet), it's just less explicit. I guess if you're sitting in the same organization as all consumers, it's not much of an issue as you can just tightly coordinate the upcoming breaking change. But if you're offering a service that's consumed by third parties outside of your control, I'd definitely prefer the explicit approach.
P.S.: Just re-watched the rest of the spec-ulation talk and found that it even has a section about (web) services, too, discussing that exact situation: https://youtu.be/oyLBGkS5ICk?t=3958
What does Rich say about that? I cannot listen to the talk right now
Basically the same thing: Don't break an existing operation foo
, make a foo2
instead.
(break by requiring more)
In case the new argument is mandatory, Rich advises us to give a new name to the API. But what about the case where the new argument is optional and of a different type that the one sent by the client?
Ah right, he doesn't discuss the forward compatibility issue. I for one prefer to not allow extra keys on that level for that reason, especially when we're talking plain JSON, i.e. non-namespaced keys. Also, it helps guard against typos for optional params (i.e. clients intending to provide an optional param but because it's misspelt, it's silently ignored). Of course, the latter could also be achieved by other means e.g. have a "warning" backchannel as suggested by @U0K064KQV above.
> But you have a transitional period of supporting both in your case, too (i.e. when some clients already send the to-be-required new param and others don't yet), it's just less explicit. I guess if you're sitting in the same organization as all consumers, it's not much of an issue as you can just tightly coordinate the upcoming breaking change. But if you're offering a service that's consumed by third parties outside of your control, I'd definitely prefer the explicit approach. > Ya, we agree. I'm talking in micro-service architecture, multiple services are part of a coherent system, maybe owned by one or more internal teams. You don't need to support multiple versions of each microservices internally, that's just too much to commit. You can force the consumers to take an update project for the new behavior. Once they did it they can go to prod, there's no transitory period where we support both, clients now pass extra fields, but the current server ignores them. Then you do deployment of the server and it now starts using those additional fields. With paying 3rd party customers to a service, you probably can't force them all to migrate early, and probably need to support both behaviors for a while.
Indeed - context matters 😄
I still don't think that erroring on additional fields has advantages in either of those cases though, at least can't think of any here.
@U0K064KQV @U06GVE6NR Did you know that the HTTP Prefer Header could be set to either “handling=strict” or “handling=lenient”? https://twitter.com/viebel/status/1565174064821551105?s=20&t=ijcp8FtsCUVa34sCvoCOAA