clj-otel release 0.2.8 is out
Bumped OpenTelemetry deps to 1.50.0
Added logs panel to the Grafana dashboards in the microservices examples
Deprecated non-escaping exception events, to follow the changed OpenTelemetry guidance
Updated attributes locating program source in spans
Generalised context propagation functions for use in server applications other than Ring HTTP servers
See the CHANGELOG at https://github.com/steffan-westcott/clj-otel/blob/master/CHANGELOG.adoc for more information on changes in this release
This is mainly a maintenance release. OpenTelemetry Java has introduced several minor breaking changes since the last release of clj-otel, so please take care if you migrate your existing applications. In particular, it appears that io.opentelemetry/opentelemetry-api version 1.50.0 moved some classes to io.opentelemetry/opentelemetry-api-incubator, a change that breaks OpenTelemetry agents earlier than 2.16.0. I tried messaging the maintainers for confirmation, but have yet to hear anything back.
If you use Pedestal, you should know that Pedestal 0.7.0 has its own OpenTelemetry integration. You may want to assess whether Pedestal's support for OpenTelemetry is sufficient for your needs without clj-otel. Pedestal support in clj-otel will stay in its current form; I have no plans to remove it.
https://github.com/steffan-westcott/clj-otel
Hi!
I have a question regarding the deprecation of steffan-westcott.clj-otel.api.trace.http/wrap-exception-event. We used that middleware to capture information about an exception that a later middleware would transform into a 500 Internal Server Error HTTP response. From the deprecation note it appears as if that was wrong (or at least OpenTelemetry recommends not doing that now).
In a literal sense I agree that we consume the exception and it does not escape the HTTP server span. But it's still a failure we want to record, and at the same time we don't want HTTP-server-generated 500 error pages that we have no control over.
Could you please explain what problem this caused? (I searched a bit, including in https://github.com/open-telemetry/semantic-conventions/blob/main/CHANGELOG.md, but could not find an explanation.) And how should I approach this instead, given that I want information about what happened when a 500 occurs?
OpenTelemetry no longer recommends recording non-escaping exceptions as span events. Therefore, use of wrap-exception-event is no longer recommended and should be phased out. You could provide your own middleware to perform the same job as wrap-exception-event, though note that telemetry backends may not recognise the deprecated exception.escaped attribute in future. See https://opentelemetry.io/docs/specs/semconv/registry/attributes/exception/#exception-escaped
The semantic conventions have changed as OpenTelemetry matures. I'm not involved in the development of OpenTelemetry (other than submitting some issues), so I'm unable to comment on why span events for non-escaping exceptions have been deprecated. If this is important for your application, I suggest either using a "plain" span event using add-event! or a log statement. You would implement this in your own middleware, placed in the middleware stack such that it executes before your middleware which transforms the exception into a 500 response.
The ordering of the middleware stack is crucial. There is an example in the 0.2.7 release here: https://github.com/steffan-westcott/clj-otel/blob/0.2.7/examples/microservices/manual-instrument/middleware/puzzle-service/src/example/puzzle_service/server.clj#L46-L63 Note that should an exception occur during request handling, trace-http/wrap-exception-event (adding span data on the exception) executes before exception/exception-middleware (transforming the exception to an HTTP response).
Lastly, to clarify an important point, wrap-exception-event does not swallow exceptions. As noted in its docstring, it is intended for use in applications where exceptions will be caught in subsequent middleware, such as the example I highlighted.
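A minimal sketch of the ordering described above, written as plain Ring-style middleware with no clj-otel dependency. The names wrap-record-exception, wrap-exception->500 and record-event! are hypothetical; in a real application record-event! might be a thin wrapper around add-event! or a log statement, and the event name and attribute keys shown are only illustrative choices.

```clojure
;; Hypothetical middleware: records a plain event for any exception and rethrows it,
;; so it does not swallow the exception.
(defn wrap-record-exception
  [handler record-event!]
  (fn [request]
    (try
      (handler request)
      (catch Throwable e
        ;; Event name and attribute keys are assumptions for illustration only
        (record-event! "handler.exception"
                       {"exception.type"    (.getName (class e))
                        "exception.message" (str (ex-message e))})
        (throw e)))))

;; Hypothetical middleware: transforms any exception into a controlled 500 response.
(defn wrap-exception->500
  [handler]
  (fn [request]
    (try
      (handler request)
      (catch Throwable _
        {:status 500 :headers {} :body "Internal Server Error"}))))

;; Stand-in recorder that collects events in an atom instead of a span.
(def recorded (atom []))

;; With ->, middleware listed first wraps closest to the handler, so an exception
;; reaches wrap-record-exception before wrap-exception->500 converts it.
(def app
  (-> (fn [_request] (throw (ex-info "boom" {:id 1})))
      (wrap-record-exception (fn [event-name attrs]
                               (swap! recorded conj [event-name attrs])))
      wrap-exception->500))
```

Calling (app {:request-method :get :uri "/"}) returns the 500 response map, and recorded then holds one entry describing the exception.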
> though note that telemetry backends may not recognise the deprecated exception.escaped attribute in future
I.e. I should have set the exception.escaped attribute to false in the middleware that caught exceptions and turned them into 500s?
> Lastly, to clarify an important point, wrap-exception-event does not swallow exceptions.
Thanks. I see that me writing "that we consume the exception" might be misunderstood. I meant "we" as "the custom middleware in our application that catches exceptions and turns them into 500 responses".
exception.escaped is deprecated, so in future all exception events will effectively be treated as if exception.escaped is always true
In other words, you need to find some other way to signal "here is an exception that happened to be caught"
It's very normal to transform exceptions to controlled HTTP responses. I'm not a fan of this change in the semantic conventions.
Ah, thanks for explaining this!
So this means I cannot just call steffan-westcott.clj-otel.api.trace.span/add-span-data! with {:ex-data {:exception e}} either. Instead I would have to translate ex-msg, ex-data, ex-cause and the stack trace into {:event {:attributes ...}} in some way, mangling keywords, and transforming the values into something OTel understands?
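One possible shape for such a translation, as a sketch only. The function exception->attributes, the "ex-data." key prefix and the "exception.cause.*" keys are hypothetical and not part of clj-otel or the semantic conventions. It assumes attribute values should be strings, booleans, longs or doubles, and falls back to pr-str for anything else.

```clojure
;; Render the stack trace of an exception as a string.
(defn- stack-trace-str
  [^Throwable e]
  (let [sw (java.io.StringWriter.)]
    (.printStackTrace e (java.io.PrintWriter. sw true))
    (str sw)))

;; Turn an ex-data key into a string attribute key, e.g. :user/id -> "ex-data.user/id".
(defn- attr-key
  [k]
  (str "ex-data." (if (keyword? k) (subs (str k) 1) (str k))))

;; Keep simple scalar values as they are; print everything else as EDN text.
(defn- attr-value
  [v]
  (if (or (string? v) (boolean? v) (instance? Long v) (instance? Double v))
    v
    (pr-str v)))

;; Hypothetical helper: flatten an exception into a map of string keys to scalar values.
(defn exception->attributes
  [^Throwable e]
  (let [cause (ex-cause e)]
    (cond-> (into {"exception.type"       (.getName (class e))
                   "exception.message"    (str (ex-message e))
                   "exception.stacktrace" (stack-trace-str e)}
                  (map (fn [[k v]] [(attr-key k) (attr-value v)]))
                  (ex-data e))
      ;; Only the immediate cause is included in this sketch
      cause (assoc "exception.cause.type"    (.getName (class cause))
                   "exception.cause.message" (str (ex-message cause))))))
```

The resulting map could then be supplied as the attributes of a plain event.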
Yes. I suppose another way would be to add a middleware at the bottom of the stack which simply wraps a span around the handler. Then you get the proper exception reporting, as exceptions would not be caught.
That seems a lot better.
I'll have a bit more of a think on it. If this holds water, I'll update the examples.
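The bottom-of-stack idea above might look roughly like this. It is a sketch under assumptions: wrap-handler-span, run-in-span and wrap-500 are hypothetical names, and run-in-span stands in for whatever creates the span (for example, a function built on clj-otel's with-span! macro). A fake runner is used here so the snippet runs without any tracing dependency.

```clojure
;; Hypothetical middleware: runs the handler inside a span created by run-in-span,
;; a function of a span name and a zero-argument function.
(defn wrap-handler-span
  [handler run-in-span]
  (fn [request]
    (run-in-span "handler" #(handler request))))

;; Hypothetical middleware: transforms any exception into a controlled 500 response.
(defn wrap-500
  [handler]
  (fn [request]
    (try
      (handler request)
      (catch Throwable _
        {:status 500 :headers {} :body "Internal Server Error"}))))

;; Stand-in for a real span: notes exceptions that escape the wrapped function
;; and rethrows them unchanged.
(def escaped (atom []))

(defn fake-run-in-span
  [span-name f]
  (try
    (f)
    (catch Throwable e
      (swap! escaped conj [span-name (ex-message e)])
      (throw e))))

;; The span wrapper sits next to the handler, so the exception escapes the inner
;; span before the outer middleware turns it into a 500 response.
(def span-app
  (-> (fn [_request] (throw (ex-info "boom" {})))
      (wrap-handler-span fake-run-in-span)
      wrap-500))
```

Because the inner span never catches the exception for good, the exception genuinely escapes that span while the client still receives the controlled response.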
If we set up our router correctly, then the wrap-route in https://github.com/steffan-westcott/clj-otel/pull/22 should already do that, right?
No, that just calls add-route-data!, but does not wrap the handler in a span.
... maybe it should? (And then steffan-westcott.clj-otel.api.trace.http/wrap-reitit-route should do the same?)
It would be interesting to see how the official OTel instrumentation of Java routers does this...
I've tried to avoid prescribing middleware stacks in clj-otel itself, and instead sketch out possibilities in the examples.