clj-otel

Daniel Jomphe 2022-03-31T11:12:44.060219Z

Hi! Here we're starting to implement observability in our Datomic Cloud Ion-based app. clj-otel looks like a very fine wrapper on top of OTel for Java. I'm pretty sure Cognitect won't allow us to load a Java Agent when it boots its Datomic Cloud. Given that, our Ions won't benefit from auto-instrumentation. I'm not yet clear on what this implies: • Will we lose a lot of important OTel features or not? • Will we be able to activate the pertinent ones manually? (OTel for Java has the concept of a programmatic Builder.) The OTel for Java docs, the Honeycomb docs and the clj-otel docs seem to suggest that auto-instrumentation adds things we can't programmatically activate any other way. Not sure...

steffan 2022-03-31T11:37:42.702549Z

Hello Daniel, thanks for your interest 🙂 If you are not able to use automatic instrumentation, then you will be using manual instrumentation exclusively. This mode of usage is absolutely supported by OpenTelemetry and clj-otel, though it requires design-time work in the library or application to apply the instrumentation where it is needed. The examples at https://github.com/steffan-westcott/clj-otel/blob/master/doc/examples.adoc#microservices-run-with-manual-instrumentation-only show this in the context of Ring and Pedestal HTTP server applications.

steffan 2022-03-31T11:44:22.024509Z

Manual instrumentation is the main use case for clj-otel.

Daniel Jomphe 2022-03-31T11:55:38.936099Z

Thanks Steffan. I'm implementing this today. I'll report on progress just like you ask in the documentation. BTW the documentation looks very well written, thank you! With that said, reading your answer, I'm not sure: by not being able to use auto-instrumentation, are we losing things that we won't be able to easily activate using manual instrumentation?

steffan 2022-03-31T12:02:43.202359Z

Thank you for the kind words on the documentation 🙂 You will be able to create the same telemetry data manually as is done automatically, but this will involve work on your part to achieve. You can go about this incrementally, targeting the parts of your application that need observability the most.

Daniel Jomphe 2022-03-31T12:04:16.159399Z

Thanks for the clarification. If you ever wanted to include this bit of knowledge somewhere in your documentation, this might be helpful to other surveyors.

steffan 2022-03-31T12:08:49.794219Z

Adding observability to established systems does raise questions on overall strategy. This isn't particular to OpenTelemetry or clj-otel. Perhaps I can add something to the Concepts docs to point to resources that address this.

Daniel Jomphe 2022-03-31T12:10:02.447939Z

OTOH even if we started this app from scratch, we'd still have this same question, since we'd be starting in an environment where we don't own the startup and so can't add a Java Agent. The Datomic Cloud Ions environment is quite a PaaS. I feel a bit like I'm back in the Google App Engine days.

steffan 2022-03-31T12:16:27.678989Z

OK, so I should add a few words on using manual instrumentation exclusively 👍🏻 The automatic instrumentation use case is pushed quite hard by the OpenTelemetry community working on the Java implementation, and it's quite impressive what they've achieved so far. It's a shame it's off limits for certain execution environments.

Daniel Jomphe 2022-03-31T12:19:07.835279Z

I contacted Cognitect for their official statement about it. I know they have their own internal Clojure-based implementation for Honeycomb beelines that doesn't depend on the Java beelines. So they surely see the value of instrumentation. Who knows if they could welcome OTel someday to Datomic Cloud... 🙂 I'll let you know if they respond with something useful or hopeful.

Daniel Jomphe 2022-03-31T12:22:05.515159Z

Oh, also - a question many Clojurists might have is: how useful is the Java auto-instrumentation to our Clojure processes, and at what cost? It surely depends on whether our processes (transitively or not) depend on typical Java libs or standard APIs.

steffan 2022-03-31T12:27:07.144589Z

Datomic seems to target AWS CloudWatch, and given its AWS focus I'd be surprised if they see a need to generalise that to OpenTelemetry. Perhaps feedback from customers might change that. AWS are themselves supporting OpenTelemetry, so things may change.

steffan 2022-03-31T12:32:09.892699Z

On your other question, the utility of the agent for a Clojure application depends on which Java libraries and frameworks are used by the application. clj-otel docs point to the list of supported libraries: https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/docs/supported-libraries.md

Daniel Jomphe 2022-03-31T12:47:46.683869Z

That's a nice illustration of why I asked my initial question. They say: > We automatically instrument and support a huge number of libraries, frameworks, and application servers... right out of the box! But they don't write about what users can expect to be able to do if they can't boot with the Java Agent. Is it possible to use their dynamic Builder to manually activate any one of these magic instrumentations and get the same result, or must the user do a lot more work than that...

steffan 2022-03-31T12:58:38.714189Z

What is the "dynamic builder" you are referring to? I don't believe I've come across that.

Daniel Jomphe 2022-03-31T13:00:33.841049Z

Oh, I realize this is https://docs.honeycomb.io/getting-data-in/opentelemetry/java-distro/#using-the-honeycomb-sdk-builder and unrelated to your direct dependency on the official OTel libs.

steffan 2022-03-31T13:22:32.312909Z

The opentelemetry-java-instrumentation repository does have some standalone library instrumentation (https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/main/docs/standalone-library-instrumentation.md) for instrumenting without the agent. You will need to examine the ones you wish to use, as they all differ in nature.

steffan 2022-03-31T19:33:28.656959Z

The Honeycomb SDK Builder you refer to does the same job as the OpenTelemetry SDK Builder, with a couple of very minor tweaks to make use with Honeycomb easier. Both of these builders are a way of programmatically configuring the SDK. clj-otel-sdk (a module of clj-otel) is a Clojure wrapper of the OpenTelemetry SDK builder. For further info on programmatic SDK configuration, see the guide at https://github.com/steffan-westcott/clj-otel/blob/master/doc/guides.adoc#_run_with_programmatically_configured_sdk , the API docs at https://cljdoc.org/d/com.github.steffan-westcott/clj-otel-sdk/0.1.1/api/steffan-westcott.clj-otel.sdk.otel-sdk and the example at https://github.com/steffan-westcott/clj-otel/blob/master/doc/examples.adoc#manually-instrumented-application-run-with-programmatic-sdk-configuration . If Honeycomb is your telemetry backend of choice, my recommendation is to use autoconfiguration rather than programmatic configuration. There are hints on how to do this in the examples, both with ( https://github.com/steffan-westcott/clj-otel/blob/e1a93ae985ee098df14f9acf5bfb246e10ffc07a/examples/auto-instrument-agent/middleware/word-length-service/deps.edn#L32-#L42 ) and without ( https://github.com/steffan-westcott/clj-otel/blob/e1a93ae985ee098df14f9acf5bfb246e10ffc07a/examples/manual-instrument/middleware/puzzle-service/deps.edn#L34-L46 ) the agent.

Daniel Jomphe 2022-03-31T19:36:03.435829Z

Thanks. Following your example and guide, that's what I'm doing. For now I'm trying to find my way around

Mar 31, 2022 2:33:51 PM io.opentelemetry.sdk.internal.ThrottlingLogger doLog
SEVERE: Failed to export spans. Server is UNAVAILABLE. Make sure your collector is running and reachable from this network. Full error message:io exception

steffan 2022-03-31T19:40:04.705639Z

I don't have first-hand experience using this, but you may find this useful when setting up Collector instances in AWS: https://aws-otel.github.io/

Daniel Jomphe 2022-03-31T19:40:49.735559Z

Thanks but for now I don't plan on using a Collector. I'm still wondering why the error mentions a collector, though.

steffan 2022-03-31T19:42:11.586029Z

My guess would be that using a Collector would be the expectation. I have found that, when starting off with OpenTelemetry, it is easier to get going without a Collector at first.

Daniel Jomphe 2022-03-31T19:43:17.733809Z

Ok, so I'd need to find the param to tell it to not seek a collector. I saw your docs linked to many pages, I'll review them all.

steffan 2022-03-31T19:43:17.895079Z

That error message to me suggests a network connectivity issue, rather than anything specific to do with Collectors.

steffan 2022-03-31T19:45:37.553469Z

Assuming you're using OTLP, you'll likely have more joy using HTTP transport than gRPC.

Daniel Jomphe 2022-03-31T19:46:22.568179Z

Oh? I thought gRPC was required for Honeycomb (I did see they have a v1 HTTP endpoint, though, which I assumed is deprecated).

Daniel Jomphe 2022-03-31T19:48:17.872119Z

Based on messages like this https://docs.honeycomb.io/getting-data-in/opentelemetry/java-distro/#grpc-transport-customization: > A gRPC transport is required to transmit OpenTelemetry data.

steffan 2022-03-31T19:48:37.177869Z

Honeycomb added support for OTLP over HTTP a while ago, you can see my example config https://github.com/steffan-westcott/clj-otel/blob/e1a93ae985ee098df14f9acf5bfb246e10ffc07a/examples/auto-instrument-agent/middleware/word-length-service/deps.edn#L38-#L42

Daniel Jomphe 2022-03-31T20:04:41.038429Z

There seems to be some kind of mismatch. It looks to me like I added the right deps and props, but the logs suggest it tried to use gRPC anyway:

Mar 31, 2022 2:55:57 PM io.opentelemetry.api.GlobalOpenTelemetry maybeAutoConfigureAndSetGlobal
SEVERE: Error automatically configuring OpenTelemetry SDK. OpenTelemetry will not be enabled.
io.opentelemetry.sdk.autoconfigure.spi.ConfigurationException: OTLP gRPC Trace Exporter enabled but opentelemetry-exporter-otlp not found on classpath. Make sure to add it as a dependency to enable this feature.

Daniel Jomphe 2022-03-31T20:05:29.906519Z

The screenshot doesn't show the headers for honeycomb nor the service.name, but they are set.

Daniel Jomphe 2022-03-31T20:07:03.133709Z

Hmm it may be that :jvm-opts isn't specified properly at the root of our deps.edn. Looking into this...

Daniel Jomphe 2022-03-31T20:09:55.294869Z

Yes, I believe JVM properties can't be set at the project level of the deps.edn. Looks like they can only be set by aliases.

steffan 2022-03-31T20:13:08.937769Z

That's correct, JVM options can only be set by aliases in deps.edn files. This is why all the examples use aliases.
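
As a minimal sketch of that point, an alias carrying the autoconfiguration properties for OTLP over HTTP to Honeycomb might look like the fragment below. The alias name `:otel`, the service name, the API key and the dataset name are hypothetical placeholders; the property names are the ones used by OpenTelemetry Java SDK autoconfiguration, and the exporter dependencies that must also be on the classpath are not shown.

```clojure
;; deps.edn (fragment) - a sketch under the assumptions stated above
{:aliases
 {:otel ; hypothetical alias name
  {:jvm-opts
   ["-Dotel.resource.attributes=service.name=my-ion-app" ; placeholder service name
    "-Dotel.traces.exporter=otlp"
    ;; Select OTLP over HTTP; without this the exporter defaults to gRPC
    "-Dotel.exporter.otlp.traces.protocol=http/protobuf"
    "-Dotel.exporter.otlp.traces.endpoint=https://api.honeycomb.io:443/v1/traces"
    ;; Placeholder credentials, replace with your own
    "-Dotel.exporter.otlp.headers=x-honeycomb-team=YOUR_API_KEY,x-honeycomb-dataset=YOUR_DATASET"]}}}
```

The options only take effect when the alias is selected at startup, for example with `clojure -M:otel`. A `:jvm-opts` key placed at the root of deps.edn is not applied, which would explain the process falling back to the gRPC exporter.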

Daniel Jomphe 2022-03-31T20:16:31.682449Z

I've now been able to receive data in Honeycomb (with OTLP HTTP). I'm left wondering why you say I might prefer the HTTP protocol over the gRPC one?

steffan 2022-03-31T20:18:22.067469Z

Punching through firewalls, essentially. I didn't know if you were testing locally or from inside a Datomic Ion.

steffan 2022-03-31T20:19:38.941939Z

Anyhow, I'm glad you're over the initial hurdle of successfully exporting telemetry and viewing the result in a backend! 🎉

Daniel Jomphe 2022-03-31T20:21:13.970839Z

Yes, and thanks for the support! I hope this didn't overburden you but it sure helped me triangulate faster towards a solution.

steffan 2022-03-31T20:21:32.356179Z

By all means try gRPC with Honeycomb. It should perform better, though I don't have data to support that assertion.

steffan 2022-03-31T20:22:16.280239Z

As ever, getting computers to talk to each other is tricky. You have achieved a lot!

Daniel Jomphe 2022-03-31T20:24:15.920199Z

Yes, there's always a trick or two or three to fiddle with before succeeding. 🙂 I rebooted with gRPC and it did work (locally for now).

steffan 2022-03-31T20:24:22.785279Z

The meat and potatoes of clj-otel is clj-otel-api, so I'll be very interested to know how you get on with that. Good luck!

Daniel Jomphe 2022-03-31T20:25:05.252529Z

Yes, the API is where you developed with-span and so on. 🙂 I'll report on that with usage.

Daniel Jomphe 2022-03-31T11:59:08.892109Z

clj-otel migrates to Clojure 1.11

Daniel Jomphe 2022-03-31T12:01:23.571509Z

Here, our Datomic Cloud Ion-based app mandates that we use Clojure 1.10.3 until Cognitect releases a new version running 1.11. This could be months away since they'll want to make sure to not break their clients. Not sure yet; we've only been using them for something like a year. But meanwhile, clj-otel evolves to depend on 1.11 right now. We'll either depend on your latest and exclude its dependency on 1.11 (as long as your code doesn't start using 1.11 features it'll be fine), or we'll depend on your previous version that still compiles with 1.10.

Daniel Jomphe 2022-03-31T12:03:11.666549Z

Please don't read this as a request for you to ignore 1.11's new features. 🙂 It's just to let you know some of us are in this position.

steffan 2022-03-31T12:04:19.543119Z

I'm not planning to use Clojure 1.11.0 features any time soon. It should be fine to exclude the Clojure dependency as you suggest.
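
A minimal sketch of that exclusion in deps.edn, assuming the `clj-otel-api` module at version 0.1.1 (the version is an assumption for illustration; use whichever release you target):

```clojure
;; deps.edn (fragment) - a sketch, not a verified project configuration
{:deps
 {;; Pin the Clojure version the Ion environment mandates
  org.clojure/clojure {:mvn/version "1.10.3"}
  com.github.steffan-westcott/clj-otel-api
  {:mvn/version "0.1.1" ; assumed version
   ;; Drop clj-otel's transitive dependency on Clojure 1.11
   :exclusions [org.clojure/clojure]}}}
```

With tools.deps a top-level dependency already takes precedence over a transitive one, so the pin alone should select 1.10.3; the explicit `:exclusions` entry just makes the intent visible.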

Daniel Jomphe 2022-03-31T12:05:21.732589Z

Good to know! Thank you, we'll do it this way, then. When you eventually start using 1.11 features, let's hope by then we can pin or already use 1.11 on our side too.

steffan 2022-03-31T12:05:39.427259Z

Thanks for raising this point, it is useful to know of conditions that potential users of clj-otel are working with!
