2024-07-01
📣 clj-otel
release 0.2.7 is out
🔭 Bumped OpenTelemetry deps to 1.39.0
🌡️ Support for creating and setting gauges that take measurements synchronously
📊 Option to set explicit bucket boundaries advice for histograms
🧹 Removed wrapper modules for obsolete Jaeger exporters, since Jaeger natively supports OTLP
📈 Added Grafana visualization for microservices examples, with pre-configured dashboards
⭐ Major refactor of microservices examples: Embedded nREPL, state management, simpler & faster build
🐘 Added example with instrumented PostgreSQL database access
🗒️ See the CHANGELOG for more information on the changes in this release: https://github.com/steffan-westcott/clj-otel/blob/master/CHANGELOG.adoc
https://github.com/steffan-westcott/clj-otel
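(For a sense of what the new synchronous gauge support looks like at the API level, here is a minimal sketch using Clojure interop against the underlying OpenTelemetry Java API, which stabilized synchronous gauges in 1.38. The meter and gauge names are made up for illustration, and clj-otel itself provides its own wrapper for this - see the CHANGELOG for the idiomatic form.)
```clojure
;; Minimal interop sketch of a synchronous gauge against the OpenTelemetry
;; Java API (synchronous gauges are stable since 1.38). Names are illustrative;
;; clj-otel 0.2.7 has its own support for this - check the CHANGELOG.
(ns example.sync-gauge
  (:import (io.opentelemetry.api GlobalOpenTelemetry)))

;; With the javaagent attached, GlobalOpenTelemetry is already configured.
(def meter
  (GlobalOpenTelemetry/getMeter "example.app"))

(def queue-depth-gauge
  (-> meter
      (.gaugeBuilder "example.queue-depth")
      (.setDescription "Current depth of the work queue")
      (.build)))   ; synchronous gauge: no callback required

;; Record a measurement at the moment the value is known:
(.set queue-depth-gauge 42.0)
```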
Right now I'm having trouble getting the basic examples to work. I checked out tag 0.2.7 and downloaded the 2.5.0 version of the agent (for the 1.39.0 SDK) into the examples folder. I cleaned out any Docker images I had lying around and tried starting cube, divisor, et al. from the command line (via up.sh), but it looks like I'm getting:
[otel.javaagent 2024-07-02 13:52:50:205 -0500] [main] INFO io.opentelemetry.javaagent.tooling.VersionLogger - opentelemetry-javaagent - version: 2.5.0
Clojure 1.11.3
user=> [otel.javaagent 2024-07-02 13:52:55:355 -0500] [OkHttp ] ERROR io.opentelemetry.exporter.internal.http.HttpExporter - Failed to export metrics. The request could not be executed. Full error message: unexpected end of stream on ...
java.io.IOException: unexpected end of stream on ...
at okhttp3.internal.http1.Http1ExchangeCodec.readResponseHeaders(Http1ExchangeCodec.kt:210)
at okhttp3.internal.connection.Exchange.readResponseHeaders(Exchange.kt:110)
at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.kt:93)
at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.kt:109)
at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.kt:34)
Ubuntu 22.04.4 LTS, openjdk version "17" 2021-09-14, Clojure CLI version 1.11.3.1463
I checked prior to starting compose up - nothing was running or listening on that port. 🤷‍♂️
That is odd. As a sanity check, I've just tried the cube and divisor examples and both are working as expected. Can you show the output of ./up.sh?
For reference, this is the output of ./up.sh for the divisor-app example:
❯ ./up.sh
[+] Running 3/3
 ✔ Network divisor-app_default  Created
 ✔ Container divisor-app-prometheus-1  Started
 ✔ Container divisor-app-jaeger-1  Started
Same here:
✔ Network divisor-app_default Created 0.2s
✔ Container divisor-app-prometheus-1 Started 0.5s
✔ Container divisor-app-jaeger-1 Started
The docker containers look OK:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
29a51a0dc6bf prom/prometheus:latest "/bin/prometheus --c…" 24 seconds ago Up 24 seconds 0.0.0.0:9090->9090/tcp, :::9090->9090/tcp divisor-app-prometheus-1
78f07e4bb026 jaegertracing/all-in-one:latest "/go/bin/all-in-one-…" 24 seconds ago Up 24 seconds 4318/tcp, 5775/udp, 5778/tcp, 9411/tcp, 14250/tcp, 0.0.0.0:4317->4317/tcp, :::4317->4317/tcp, 0.0.0.0:16686->16686/tcp, :::16686->16686/tcp, 14268/tcp, 6831-6832/udp divisor-app-jaeger-1
oh this is weird.... it stopped crapping out. hold on
cube is still a problem ... hmm
Note that the OTLP port and protocol differ when using agent 2.x.x. The default was changed from gRPC on port 4317 to HTTP/protobuf on port 4318. All example apps have been updated to assume agent 2.x.x is used. Confusingly, the default OTLP port when using the SDK is unchanged at 4317.
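(For anyone who would rather pin this explicitly than rely on the agent 2.x defaults, the standard OpenTelemetry environment variables can be set on the app container. A sketch is below - the "my-app" service name and the collector hostname are illustrative; only the two OTEL_EXPORTER_* variables and their values come from the OpenTelemetry configuration spec.)
```yaml
# Sketch of a docker-compose service entry that pins the agent's OTLP
# settings explicitly instead of relying on the agent 2.x defaults.
services:
  my-app:
    environment:
      OTEL_EXPORTER_OTLP_PROTOCOL: http/protobuf        # agent 2.x default
      OTEL_EXPORTER_OTLP_ENDPOINT: http://otel-collector:4318
      # To keep the old agent 1.x behaviour (gRPC on 4317) instead:
      # OTEL_EXPORTER_OTLP_PROTOCOL: grpc
      # OTEL_EXPORTER_OTLP_ENDPOINT: http://otel-collector:4317
```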
The cube app is not able to connect, for the reasons stated above. divisor comes up and is able to export metrics, but it does complain about:
> 2024-07-02T14:20:40,510 WARN ...etry.exporter.internal.grpc.GrpcExporter Failed to export spans. Server responded with gRPC status code 2. Error message: Broken pipe
And as expected, I can't find the span in the local Jaeger (the service is not listed).
None. I made sure no process was listening on any of those ports either. I could go through it again.
I wonder if the Docker images are up to date. Before doing ./up.sh, try docker compose pull
pulled - returned almost instantaneously.
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
43c65356179c prom/prometheus:latest "/bin/prometheus --c…" About a minute ago Up About a minute 0.0.0.0:9090->9090/tcp, :::9090->9090/tcp divisor-app-prometheus-1
7fc82707b374 jaegertracing/all-in-one:latest "/go/bin/all-in-one-…" About a minute ago Up About a minute 4318/tcp, 5775/udp, 5778/tcp, 9411/tcp, 14250/tcp, 0.0.0.0:4317->4317/tcp, :::4317->4317/tcp, 0.0.0.0:16686->16686/tcp, :::16686->16686/tcp, 14268/tcp, 6831-6832/udp divisor-app-jaeger-1
eek! hold on. sorry - long day - fighting (other) fires. 🫢
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
19f3ed6b89ea otel/opentelemetry-collector:latest "/otelcol --config=/…" 4 minutes ago Up 2 seconds 4317/tcp, 55678-55679/tcp, 0.0.0.0:4318->4318/tcp, :::4318->4318/tcp cube-app-otel-collector-1
46072a33b68d prom/prometheus:latest "/bin/prometheus --c…" 4 minutes ago Up 2 seconds 0.0.0.0:9090->9090/tcp, :::9090->9090/tcp cube-app-prometheus-1
ba2f7e644be6 jaegertracing/all-in-one:latest "/go/bin/all-in-one-…" 4 minutes ago Up 2 seconds 4317-4318/tcp, 5775/udp, 5778/tcp, 9411/tcp, 14250/tcp, 14268/tcp, 6831-6832/udp, 0.0.0.0:16686->16686/tcp, :::16686->16686/tcp cube-app-jaeger-1
Looks very strange. The agent running in cube-app should be exporting OTLP to port 4318. The docker ps output shows that this should be directed to port 4318 of the collector instance. But for some reason, this communication is blocked.
Is this the first time you've tried running with agent 2.x.x? Also, what happens if you try to run one of the microservices/auto-instrument examples?
2nd time. Before this I was on 0.2.6 using agent v2.0.0 - I haven't run the examples in several weeks.
I'll try the microservice example now
docker ps for the auto-instrument example:
5240b0df1ca5 clojure:temurin-22-tools-deps-alpine "entrypoint clojure …" 42 seconds ago Up 19 seconds 0.0.0.0:8092->8092/tcp, :::8092->8092/tcp middleware-sentence-summary-load-gen-1
0705a70177f2 clojure:temurin-22-tools-deps-alpine "entrypoint clojure …" 42 seconds ago Up 30 seconds (healthy) 0.0.0.0:8090->8090/tcp, :::8090->8090/tcp middleware-sentence-summary-service-1
aaa17c6929f9 clojure:temurin-22-tools-deps-alpine "entrypoint clojure …" 42 seconds ago Up 41 seconds (healthy) 0.0.0.0:8091->8091/tcp, :::8091->8091/tcp middleware-word-length-service-1
0225611309c9 grafana/grafana-oss:11.1.0 "/run.sh" 42 seconds ago Up 41 seconds 0.0.0.0:3000->3000/tcp, :::3000->3000/tcp middleware-grafana-1
47d8ea08dbdc otel/opentelemetry-collector-contrib:latest "/otelcol-contrib --…" 42 seconds ago Up 41 seconds 4317/tcp, 55678-55679/tcp middleware-otel-collector-1
c2d64c834bcf prom/prometheus:v2.53.0 "/bin/prometheus --c…" 43 seconds ago Up 41 seconds 9090/tcp middleware-prometheus-1
ff3eb2e42207 grafana/tempo:2.5.0 "/tempo -config.file…" 43 seconds ago Up 41 seconds middleware-tempo-1
Looks good so far. Now navigate to http://localhost:3000/ and you should see (among other things) a list of traces. Try clicking on one for route GET /summary and you should see a trace.
Grafana's up but empty - (No data). Not even JVM metrics. Weird.
I don't have Docker Desktop installed at all either - it messed me up a while ago.
Something seems to be fundamentally affecting docker on your machine. Can you recall any other work with docker compose that is working?
(on a side note: For the microservices examples, it takes around a minute of uptime for meaningful output to appear in Grafana)
yeah - we use docker (with compose) at work and I can stand up our application (it comes in 3 parts) - I could try some sample app.
In any of the microservices apps, after starting with ./up.sh, you could try tailing the logs of the container running image otel/opentelemetry-collector-contrib:latest. Use the command docker logs -f <containerid>
Normal output for the collector should settle down to something like this:
2024-07-02T20:16:07.345Z info MetricsExporter {"kind": "exporter", "data_type": "metrics", "name": "debug", "resource metrics": 1, "metrics": 19, "data points": 50}
2024-07-02T20:16:08.219Z info TracesExporter {"kind": "exporter", "data_type": "traces", "name": "debug", "resource spans": 1, "spans": 40}
2024-07-02T20:16:08.822Z info TracesExporter {"kind": "exporter", "data_type": "traces", "name": "debug", "resource spans": 1, "spans": 53}
2024-07-02T20:16:09.553Z info MetricsExporter {"kind": "exporter", "data_type": "metrics", "name": "debug", "resource metrics": 3, "metrics": 15, "data points": 45}
2024-07-02T20:16:12.968Z info MetricsExporter {"kind": "exporter", "data_type": "metrics", "name": "debug", "resource metrics": 1, "metrics": 18, "data points": 46}
2024-07-02T20:16:13.241Z info TracesExporter {"kind": "exporter", "data_type": "traces", "name": "debug", "resource spans": 1, "spans": 22}
2024-07-02T20:16:13.843Z info TracesExporter {"kind": "exporter", "data_type": "traces", "name": "debug", "resource spans": 1, "spans": 26}
If there are no messages like this, then no messages are reaching the Collector, which would be very odd.
Yeah, nothing is hitting the otel-collector. I can see the generator making requests and getting responses ... still digging.
found something.
> ... 18 more
> [otel.javaagent 2024-07-02 20:23:05:616 +0000] [OkHttp http://otel-collector:4318/...] ERROR io.opentelemetry.exporter.internal.http.HttpExporter - Failed to export metrics. The request could not be executed. Full error message: Failed to connect to otel-collector/172.20.0.5:4318
> java.net.ConnectException: Failed to connect to otel-collector/172.20.0.5:4318
>   at okhttp3.internal.connection.RealConnection.connectSocket(RealConnection.kt:297)
>   at okhttp3.internal.connection.RealConnection.connect(RealConnection.kt:207)
>   at okhttp3.internal.connection.ExchangeFinder.findConnection(ExchangeFinder.kt:226)
>   at okhttp3.internal.connection.ExchangeFinder.findHealthyConnection(ExchangeFinder.kt:106)
>   at okhttp3.internal.connection.ExchangeFinder.find(ExchangeFinder.kt:74)
>   at okhttp3.internal.connection.RealCall.initExchange$okhttp(RealCall.kt:255)
>   at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.kt:32)
on: middleware-sentence-summary-service-1
Same on the middleware-word-length-service-1 too.
oh...
> 1b84f1ed180c   otel/opentelemetry-collector-contrib:latest   "/otelcol-contrib --…"   3 minutes ago   Up 3 minutes   4317/tcp, 55678-55679/tcp   middleware-otel-collector-1
I'm trying something locally in the otel collector config.
I fixed it I think.
"Fixed" is a stretch of course 😉 its publishing metrics to grafana now.
in the meantime - that extra line in the otel-collector config fixed cube-app.
want me to put up a PR on the otel-collector config(s) change?
I'd like to investigate further as there are several issues at play. I've just discovered that the latest Jaeger build breaks things, but the most recent tagged build works.
sure thing np.
I've got 0.2.7 in our staging env - testing it out now.
got to use the synchronous gauge - that was a win.
and the new histogram bucket advice (without having to override it with a hack)
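(For reference, a minimal sketch of bucket boundaries advice via interop with the underlying OpenTelemetry Java API. The instrument name, unit and boundary values are made up for illustration; clj-otel 0.2.7 adds its own option for this, so check the CHANGELOG for the idiomatic form.)
```clojure
;; Minimal interop sketch of explicit bucket boundaries advice on a histogram,
;; using the OpenTelemetry Java API directly. Names, unit and boundary values
;; are illustrative; clj-otel 0.2.7 has its own option for this (see CHANGELOG).
(ns example.histogram-advice
  (:import (io.opentelemetry.api GlobalOpenTelemetry)))

(def request-duration
  (-> (GlobalOpenTelemetry/getMeter "example.app")
      (.histogramBuilder "example.request-duration")
      (.setUnit "ms")
      (.setExplicitBucketBoundariesAdvice [5.0 10.0 25.0 50.0 100.0 250.0 500.0])
      (.build)))

;; Record a measurement in milliseconds:
(.record request-duration 37.0)
```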
The Collector (and contrib Collector) had a breaking change introduced yesterday, where the OTLP receiver now binds to localhost by default instead of 0.0.0.0: https://github.com/open-telemetry/opentelemetry-collector/releases/tag/v0.104.0
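(For anyone hitting the same issue, the workaround is to set the receiver endpoints explicitly in the collector config. A sketch of the relevant receivers section is below - the surrounding exporters and pipelines are omitted, and the exact config file differs per example.)
```yaml
# Sketch of the OTLP receiver section of an otel-collector config file.
# Binding explicitly to 0.0.0.0 restores the pre-0.104.0 behaviour, so that
# other containers on the compose network can reach the collector.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
```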
Many thanks for finding the fix! 🙇🏼
I've pushed your fix, along with a fix for the broken Jaeger image. You should be fine using clj-otel 0.2.7, as those fixes address just the examples and not the clj-otel modules.
@U2C6SPLDS you are very welcome! Thank you for taking the time to 'plod' through this with me. Cheers!