I'm having a hard time getting my ring/compojure (sweet) data into xray on aws
I have this in my middleware (middleware [trace-http/wrap-compojure-route]) and I'm using the java agent for automatic instrumentation. I am seeing xray data for other things, like sns/sqs and database access, but just nothing when it comes to routes that I want to trace. I'm sure I have something set up incorrectly!
Any help would be appreciated! 🙂
I don't have experience using AWS X-Ray, but I'll ask some questions as we need a fuller explanation. Are you not seeing any HTTP server traces (for your HTTP application) at all in AWS X-Ray, or are they showing up without HTTP route data?
I'm unsure, as it looks to me that the current context from the agent isn't bound in a synchronous route.
if create-span? is false, there's a call to wrap-existing-server-span
but the agent context is only bound on an asynchronous route, not a synchronous route.
But if I modify the code a bit, to this:
(defn- wrap-existing-server-span
  [handler]
  (fn
    ([request]
     (let [context (context/dyn)]
       (handler (assoc request :io.opentelemetry/server-span-context context))))
    ([request respond raise]
     (let [context (context/dyn)]
       (handler (assoc request :io.opentelemetry/server-span-context context)
                respond
                (fn [e]
                  (span/add-exception! e {:context context})
                  (raise e)))))))
then the agent context is bound to the request
I'm sorry, I'm not at all clear on the problem you are reporting. Can we please take a step back and identify the scope of your issue? First, are you seeing any server traces at all in the X-Ray console?
Apologies. Yes, I am seeing xray reports, but only for lower level stuff, like database access, sns, sqs requests and so on.
I do not see any route traces.
So I don't know if /foo/bar/baz is being called.
When I modified the wrap-existing-server-span, as above, I was then able to query (locally running jaeger) for url.path.
However, I'm not sure yet whether that is just a red herring that I'm chasing
OK, so your issue isn't to do with the Compojure integration, as that just decorates server spans with extra detail.
What HTTP server library does your application use e.g. Jetty, http-kit...?
Jetty 12.0.14
OK, Jetty is supported by the OpenTelemetry Java instrumentation agent. Are you using that, or some other agent?
opentelemetry
OK, are you able to run your application (or something similar to your application) without clj-otel at all, but with the agent? You should be getting traces without any manual instrumentation.
I can attempt that. I have a locally running jaeger (as deploying to aws ecs is tedious each time). I can spin up my application and see what I get with the agent running
As you may appreciate, there are a lot of configuration details to nail down. Manual instrumentation (the major use case for clj-otel) is the icing on the cake, so to speak.
Yeah, it's quite tricky for sure
Establishing traces in X-Ray through automatic instrumentation for your application would be a significant milestone towards your ultimate goal of a manually instrumented application.
....manual is best?
Okay, so application running with agent and no clj-otel. Jaeger running. I can see traces in jaeger of database requests, subscriptions to sns and sqs going on.
Recalling the clj-otel tutorial (https://github.com/steffan-westcott/clj-otel/blob/master/doc/tutorial.adoc), automatic instrumentation is good, but enriched instrumentation (automatic plus manual) enables extra insight.
As expected, attempting to query this url.path=/foo/bar/baz shows no traces
I'm confused again. Are you deploying locally or on AWS? Are you using X-Ray to view, or deploying Jaeger on AWS?
I am deploying to AWS. However, as mentioned above, deploying to AWS ECS is tedious each time, it takes minutes to do. So, in order to simulate the behaviour, I'm using jaeger, which respects the otlp protocol (like aws xray). So what I can get in jaeger should be close enough to what I see in xray
I am running locally for this experimentation to see why with the middleware (wrap-server-span) set in my application, I do not see route requests - both on aws xray and on jaeger
Your issue is more related to networking, rather than clj-otel functionality. This is why I believe you should focus on getting the automatic instrumentation working first.
It is working, I do see traces of everything except route requests.
Maybe partially working? 🙂
The instrumentation agent should be providing trace telemetry for your HTTP application. From what you've said, it looks like only AWS API calls are getting instrumented, so I think there's a problem with the configuration of the agent.
Indeed, tis a head scratcher.
I'll dig some more...
Make sure you can see `err`/`out` from the application. The agent will complain if it's unable to export OTLP data.
In case you are not aware, AWS provides a distro for OpenTelemetry. As part of this, they offer a customised agent. https://aws-otel.github.io/docs/getting-started/java-sdk
Yes I use that
it's my sidecar
One important detail is that the AWS X-Ray propagator is not enabled by default in the vanilla agent. I imagine (but do not know) it's enabled by default in the ADOT version.
I have an open telemetry yml which enables it
on the otel collector (aws distro)
now I'm confused
An update. Only by adding (wrap-server-span {:create-span? true}) do I now get lovely paths and lots of juicy data from requests in amazon xray. It looks like, for whatever reason, the aws-opentelemetry-agent.jar is not instrumenting jetty(?) to get the paths out.
I'm happy enough to allow clj-otel collector to create the span for me - it'll do and seems to work fine. I can look into the java agent later.
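For anyone following along, here is a rough sketch of what that workaround might look like, using the wrap-server-span middleware mentioned earlier in the thread. It assumes the trace-http alias refers to the steffan-westcott.clj-otel.api.trace.http namespace; the example.app namespace and the handler are hypothetical stand-ins for the real application and its Compojure routes.

```clojure
(ns example.app
  (:require [steffan-westcott.clj-otel.api.trace.http :as trace-http]))

;; Hypothetical Ring handler standing in for the real Compojure routes.
(defn handler
  [_request]
  {:status  200
   :headers {"Content-Type" "text/plain"}
   :body    "ok"})

;; With :create-span? true, the middleware creates the server span itself
;; rather than relying on one already created by the Java agent.
(def app
  (trace-http/wrap-server-span handler {:create-span? true}))
```

If the agent later starts instrumenting Jetty itself, this would presumably need to go back to the default (:create-span? false) so that the same request does not end up with two server spans.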
I'm glad you've got something working, I know it is tough to get off the ground 😄 Your experiments so far have verified that networking between your application, agent, OpenTelemetry Collector and AWS X-Ray is working. That's a lot!
Thanks for your guidance and help 🙂
As you've noted, getting the agent to automatically instrument the application would be the next step. My guess would be some setting in ADOT needs attention. Perhaps the traces signal is disabled by default?
I'll look into it and if I find something, I'll report back to add to the collective knowledge 🙂
Yes, please do. It would be most useful for folk in general to give some experience reports of using clj-otel in the wild, like deployments in AWS and GCP. Many of the questions I get are related to configuration and deployment.
https://github.com/open-telemetry/opentelemetry-java-instrumentation/pull/10575/files Looks like jetty12 instrumentation was added earlier in the year. Not sure how you'd check if the AWS agent has that code or not
Interesting, it may not be in the 1.X release branch (as the commit seems to be on main, which looks like a 2.X branch). Amazon AWS otel java instrumentation is only on 1.X still.
https://github.com/aws-observability/aws-otel-java-instrumentation/releases that references quite an old version of the otel java one
guess you could downgrade the jetty server to validate if you wanted to check
https://github.com/aws-observability/aws-otel-java-instrumentation/blob/main/dependencyManagement/build.gradle.kts#L29-L34
In case you have any appetite to upgrade it. Though that could be a rabbit hole.
For reference, here's the supported lib page for the version it's using v1.32.1
https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/v1.32.1/docs/supported-libraries.md#application-servers
which doesn't seem to have the changes for jetty 12
thank you! 🙂
mucho appreciated! I'll devote some cycles tomorrow to look at this.
Beware, there are some breaking changes between agent 1.x and 2.x so be sure to check the release notes. Off the top of my head, I remember some changes to default OTLP settings, and a load of semantic conventions.
Thank you. I'll stick to 1.X at the moment, as that is the version that aws-opentelemetry-agent supports. It appears that support for Jetty 12 is only in the 2.X branch for sure (2.6.0 onwards).
I can live without the spans for the moment 🙂