#clojure
2024-05-09
agigao13:05:00

Hello Clojurians, has anyone dealt with zipping csv files in Clojure (in-memory) and responding to the client? I'd very much appreciate any pointers.

dpsutton13:05:37

i’ve got to run to other things right now, but i always start with grep.app and see how people use things there: https://grep.app/search?q=ZipOutputStream&filter[lang][0]=Clojure

agigao13:05:48

Thank you!

👍 1
agigao14:05:24

I guess Java interop is the way to go

dpsutton14:05:35

yeah. lots of stuff exists for you that is quite battle tested and built into the jvm. You’d end up either finding a library that rewrites lots of stuff for probably no reason, or it would just be a wrapper over the interop, at which point you have to read two sets of documentation and figure out what they allow you to do and how to express it. Interop is fantastic
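
A minimal sketch of the interop route suggested above, written in Java since the same `java.util.zip` classes are what Clojure would call through interop. The class and method names (`InMemoryZip`, `zipCsvFiles`, `unzip`) are hypothetical and only for illustration. Wiring the bytes into an HTTP response is left out because it depends on the web library in use.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: builds a zip archive of CSV files entirely in memory.
class InMemoryZip {

    // Takes file name -> CSV text and returns the bytes of a zip archive.
    static byte[] zipCsvFiles(Map<String, String> csvByName) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the ZipOutputStream writes the central directory into the buffer.
        try (ZipOutputStream zip = new ZipOutputStream(buffer, StandardCharsets.UTF_8)) {
            for (Map.Entry<String, String> file : csvByName.entrySet()) {
                zip.putNextEntry(new ZipEntry(file.getKey()));
                zip.write(file.getValue().getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        }
        return buffer.toByteArray();
    }

    // Reads an archive back into file name -> text, to check the round trip.
    static Map<String, String> unzip(byte[] archive) throws IOException {
        Map<String, String> files = new LinkedHashMap<>();
        try (ZipInputStream in = new ZipInputStream(
                new ByteArrayInputStream(archive), StandardCharsets.UTF_8)) {
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                // readAllBytes stops at the end of the current entry.
                files.put(entry.getName(),
                          new String(in.readAllBytes(), StandardCharsets.UTF_8));
            }
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> files = new LinkedHashMap<>();
        files.put("users.csv", "id,name\n1,Ada\n");
        files.put("orders.csv", "id,total\n7,9.50\n");
        byte[] archive = zipCsvFiles(files);
        System.out.println(archive.length + " bytes, entries: " + unzip(archive).keySet());
    }
}
```

The resulting byte array can be used as a response body with a `Content-Type` of `application/zip`. For large exports, streaming the `ZipOutputStream` straight to the response would avoid holding the whole archive in memory.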

agigao14:05:49

Gotcha, thanks again! 🤘

Noah Bogart14:05:33

might be more suited for #C03RZGPG3

Noah Bogart14:05:37

but this is cool!

Samuel Ludwig14:05:46

quite varying results 😆

cvic15:05:52

A sort by date would be useful, heh...

grzm19:05:14

'lo Clojurians. I'm seeing the clojure cli tools hanging when using the opentelemetry javaagent when it needs to generate a classpath. When no classpath needs to be generated (e.g., no deps.edn, or it's already cached), there are no issues. This is inside a container on Kubernetes. Details in 🧵

grzm19:05:30

$ clojure --version
Clojure CLI version 1.11.1.1435
$ rm deps.edn 
$ rm -r .cpcache
$ JAVA_TOOL_OPTIONS="-javaagent:opentelemetry-javaagent.jar" clojure
Picked up JAVA_TOOL_OPTIONS: -javaagent:opentelemetry-javaagent.jar
OpenJDK 64-Bit Server VM warning: Sharing is only supported for boot loader classes because bootstrap classpath has been appended
[otel.javaagent 2024-05-09 19:51:24:664 +0000] [main] INFO io.opentelemetry.javaagent.tooling.VersionLogger - opentelemetry-javaagent - version: 1.33.2
Clojure 1.11.1
user=> 

# now try again with a trivial deps.edn
$ echo '{}' > deps.edn
$ JAVA_TOOL_OPTIONS="-javaagent:opentelemetry-javaagent.jar" clojure
Picked up JAVA_TOOL_OPTIONS: -javaagent:opentelemetry-javaagent.jar
OpenJDK 64-Bit Server VM warning: Sharing is only supported for boot loader classes because bootstrap classpath has been appended
[otel.javaagent 2024-05-09 19:52:05:384 +0000] [main] INFO io.opentelemetry.javaagent.tooling.VersionLogger - opentelemetry-javaagent - version: 1.33.2
^C
# ^ hangs

grzm19:05:27

At this point, the classpath for the empty deps.edn has been cached.

$ ls -la .cpcache/
total 20
drwxr-xr-x 2 root root 4096 May  9 19:52 .
drwxr-xr-x 1 root root 4096 May  9 19:52 ..
-rw-r--r-- 1 root root 1703 May  9 19:52 1015551756.basis
-rw-r--r-- 1 root root  230 May  9 19:52 1015551756.cp

And it doesn't hang:

$ JAVA_TOOL_OPTIONS="-javaagent:opentelemetry-javaagent.jar" clojure
Picked up JAVA_TOOL_OPTIONS: -javaagent:opentelemetry-javaagent.jar
OpenJDK 64-Bit Server VM warning: Sharing is only supported for boot loader classes because bootstrap classpath has been appended
[otel.javaagent 2024-05-09 19:52:36:237 +0000] [main] INFO io.opentelemetry.javaagent.tooling.VersionLogger - opentelemetry-javaagent - version: 1.33.2
Clojure 1.11.1
user=> 

grzm20:05:39

I'd like to be able to leverage JAVA_TOOL_OPTIONS to toggle whether the opentelemetry javaagent is used. I tried "pre-caching" the classpath in the image, but it ended up recalculating it anyway. I can of course use a different environment variable to optionally include a jvm option to use the javaagent, but I'd like to understand better what's going on here, if anyone knows.

grzm20:05:15

Like always, I'm pretty sure I'm just doing something wrong, so any enlightenment is more than welcome. 😄

grzm20:05:57

> I tried "pre-caching" the classpath in the image, but it ended up recalculating it anyway.
That's probably because I didn't include the -m com.grzm.foo.main arg when I pre-cached it: https://github.com/clojure/brew-install/blob/1.11.3/src/main/resources/clojure/install/clojure#L328 I'd rather not start up the actual service when building the image.

p-himik20:05:21

Can't reproduce, even when removing the cp cache before every attempt. First, an attempt at reproduction leads to everything working except for OpTel: it just can't connect to the server, but the REPL proceeds fine. After setting up the default server, it connects and proceeds to the REPL, but the "greeting" is printed twice for some reason:

$ rm -r .cpcache
$ JAVA_TOOL_OPTIONS="-javaagent:opentelemetry-javaagent.jar" clojure
Picked up JAVA_TOOL_OPTIONS: -javaagent:opentelemetry-javaagent.jar
OpenJDK 64-Bit Server VM warning: Sharing is only supported for boot loader classes because bootstrap classpath has been appended
[otel.javaagent 2024-05-09 23:56:56:972 +0300] [main] INFO io.opentelemetry.javaagent.tooling.VersionLogger - opentelemetry-javaagent - version: 2.3.0
Picked up JAVA_TOOL_OPTIONS: -javaagent:opentelemetry-javaagent.jar
OpenJDK 64-Bit Server VM warning: Sharing is only supported for boot loader classes because bootstrap classpath has been appended
[otel.javaagent 2024-05-09 23:56:59:248 +0300] [main] INFO io.opentelemetry.javaagent.tooling.VersionLogger - opentelemetry-javaagent - version: 2.3.0
Clojure 1.11.2
user=>

p-himik20:05:32

The OpTel JAR is 2.3.0. The server is 0.100.0.

grzm21:05:01

Thanks for taking a look.

grzm21:05:25

> proceeds to the REPL but the "greeting" is printed twice for some reason:
It's printed once when it's loaded by the JVM that the clojure cli tools start to build the classpath, and a second time when the application (in this case the REPL) is run.

grzm21:05:20

Are you running this locally or from within a pod?

p-himik21:05:59

Ah, you're right. Locally.

grzm21:05:08

Thanks for confirming. Yeah, there's clearly some piece of info I'm missing. I thought it might be some kind of resource constraint (still might be), but I've given it 3 cpus and 10Gi of memory. I should hope that's enough!

p-himik21:05:26

Note also that there might be a ~/.clojure/deps.edn. There is one on my machine, and I forgot to ask Clojure to run without it by using -Srepro.

hiredman21:05:36

When there isn't a cached class path clj ends up launching two jvms, one to compute the cp and one to use it

👍 1
hiredman21:05:08

So you need to not set the java tool options for the first jvm

grzm21:05:01

@hiredman I couldn't think of a way to do that using the environment variable. Can you?

grzm21:05:44

My workaround has been to include it as a command-line jvm-opt arg (empty if I don't want to include the javaagent)

hiredman21:05:53

I forget, the clj script may have an env variable you can use to bypass the first jvm

grzm21:05:44

You still need that first jvm if you need the classpath calculated, correct?

hiredman21:05:01

Yes, if there is no cache

hiredman21:05:12

Hmm, other way around, there is an env var you can set to pass options to the first

grzm21:05:30

Are you thinking of CLJ_JVM_OPTS? (Trying to determine it from the source: haven't seen it in the documentation)

grzm21:05:46

I'm currently going with abandoning JAVA_TOOL_OPTIONS and using

clojure ${OTEL_JAVAAGENT:+"-J-javaagent:${OTEL_JAVAAGENT}"} -M:service

where OTEL_JAVAAGENT is the path to the agent jar.

grzm21:05:00

(gotta love bash)

valerauko05:05:28

Does it hang hang, or does it just take a ridiculous amount of time? For example, in my work env, a clojure (aleph) server is ready in 5-10sec. With the datadog agent (which includes opentelemetry) it takes well over a minute. It goes and instruments everything on the classpath. Could your problem be similar?

grzm13:05:37

@valerauko The examples above are for a completely empty deps.edn: there are no dependencies, and no source code. Whatever the issue is, it's not the application: it's some interaction between the otel java agent and the jvm process when the classpath is being calculated.

grzm13:05:03

I haven't been patient enough yet to wait indefinitely for it to try to complete. I'll try that now.

grzm13:05:11

Given that I don't know why it's happening yet (I only have a workaround), I'm sure I'm missing something, but I'm pretty confident it's narrowed to the clojure tools jvm process and not the application jvm process.

p-himik13:05:58

You can try attaching to the hanging process with a profiler and/or debugger and see where the time is being spent. That's what I was going to do initially, but I couldn't reproduce it.

grzm13:05:31

@p-himik Attaching a profiler/debugger isn't something I have a lot of experience with. Do you have some recommendations or a recipe I could use?

grzm13:05:17

Well, I've narrowed it down a bit further. I believe it's because I'm using the prometheus exporter: I think the prometheus server process is starting up and not getting shut down.

grzm13:05:24

Yup. That appears to be it.

$ export OTEL_METRICS_EXPORTER=none
$ date +"%Y-%m-%dT%H:%M:%S%:z" ; clojure -P ; date +"%Y-%m-%dT%H:%M:%S%:z"
2024-05-10T13:27:46+00:00
Picked up JAVA_TOOL_OPTIONS: -javaagent:opentelemetry-javaagent.jar
OpenJDK 64-Bit Server VM warning: Sharing is only supported for boot loader classes because bootstrap classpath has been appended
[otel.javaagent 2024-05-10 13:27:46:684 +0000] [main] INFO io.opentelemetry.javaagent.tooling.VersionLogger - opentelemetry-javaagent - version: 2.3.0
2024-05-10T13:27:48+00:00

grzm13:05:17

I think I'll still need to pass some configuration via args instead of environment variables, whether it's -javaagent or otel.metrics.exporter, to avoid these settings being seen by the clojure tools jvm process. Probably cleanest to do it the way I currently am (`-javaagent`) as that toggles everything related to the instrumentation.

p-himik13:05:59

Regarding the debugger - your IDE must support it. I use IDEA with the Cursive plugin, it has Run -> Attach to Process in its menu. And regarding the profiler - it would work only if the process hangs due to some busy work. Nothing complicated, you just download something like VisualVM (or you can use IDEA), attach to the process, and see what's going on. If something hangs on e.g. IO or in a deadlock, profiling won't help (debugging might), but you can use jconsole, which is shipped with the JDK, to see the stacktrace of every thread. E.g. for a REPL started by Cursive, I see that there's a main thread:

Name: main
State: TIMED_WAITING
Total blocked: 0  Total waited: 3

Stack trace: 
java.base@<version>/java.lang.Thread.sleep0(Native Method)
java.base@<version>/java.lang.Thread.sleep(Thread.java:509)
nrepl.cmdline$dispatch_commands.invokeStatic(cmdline.clj:452)
nrepl.cmdline$dispatch_commands.invoke(cmdline.clj:436)
nrepl.cmdline$_main.invokeStatic(cmdline.clj:459)
nrepl.cmdline$_main.doInvoke(cmdline.clj:454)
[...]

It's probably not the thread that's actually waiting for the REPL input though, since Thread/sleep can't wake up upon an input and I doubt nREPL does busy looping while waiting for an input.
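
For a rough idea of what such a per-thread view contains, here is a small Java sketch that prints the name, state, and stack of every live thread in its own JVM via `Thread.getAllStackTraces()`. It is not how jconsole attaches to another process. The `ThreadDump` class and its `dump` method are hypothetical names used only for illustration.

```java
import java.util.Map;

// Hypothetical helper: formats a thread dump of the current JVM.
class ThreadDump {

    // Returns one block per live thread: name, state, then its stack frames.
    static String dump() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            out.append("Name: ").append(thread.getName()).append('\n');
            out.append("State: ").append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

A thread stuck in `WAITING` or `BLOCKED` with the same frames across several dumps is usually the first place to look when a process appears to hang.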

p-himik13:05:13

> I think I'll still need to pass some configuration via args instead of environment variables
Another way would be to make sure that the classpath is cached. You can run a step with the -P flag, which will also download all the necessary dependencies, as long as that step itself doesn't get mixed up with OpTel.

p-himik13:05:04

I myself am not an expert here by any means, but there are a lot of nice, if a bit outdated in terms of UI, tools shipped with JDKs.

grzm13:05:40

Blessing or a curse, a lot of my time is spent fixing/maintaining pretty straightforward stuff where I haven't had to reach for profilers and debuggers. It's been at least a couple of years since I've thought about it. Thanks!

grzm14:05:10

Oh, and for the record, I ended up waiting for over 15 minutes before my patience ran out:

$  date +"%Y-%m-%dT%H:%M:%S%:z" ; clojure -P ; date +"%Y-%m-%dT%H:%M:%S%:z"
2024-05-10T13:05:57+00:00
Picked up JAVA_TOOL_OPTIONS: -javaagent:opentelemetry-javaagent.jar
OpenJDK 64-Bit Server VM warning: Sharing is only supported for boot loader classes because bootstrap classpath has been appended
[otel.javaagent 2024-05-10 13:05:58:005 +0000] [main] INFO io.opentelemetry.javaagent.tooling.VersionLogger - opentelemetry-javaagent - version: 2.3.0
^C2024-05-10T13:23:32+00:00

Ingy döt Net22:05:44

Is there a fairly performant way to determine the approximate allocated memory size of a mapping or vector (or any object)?

hiredman00:05:56

The jvm doesn't provide a way, I am sure there are tools, but size accounting is complicated

didibus00:05:47

Not sure how accurate.

hiredman00:05:02

Clojure maps and vectors are actually trees of objects. For a single thing you can traverse and add up the subtrees to get a size, but once you have multiple vectors and maps you start getting structural sharing, which can result in significant overlap in the trees, so sizes don't simply add up any more.
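
A toy Java sketch of why sizes stop adding up under structural sharing. It counts distinct nodes by identity instead of measuring bytes, and the `Node` tree is a stand-in, not Clojure's actual persistent data structure implementation. All names here (`SharedSize`, `Node`, `distinctNodes`) are hypothetical.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

// Hypothetical illustration: two trees that share a subtree.
class SharedSize {

    // Minimal immutable binary tree node standing in for a persistent structure.
    static final class Node {
        final Node left;
        final Node right;

        Node(Node left, Node right) {
            this.left = left;
            this.right = right;
        }
    }

    // Walks a tree, recording each node once by identity.
    static void collect(Node node, Set<Node> seen) {
        if (node == null || !seen.add(node)) {
            return; // already counted: this subtree is shared
        }
        collect(node.left, seen);
        collect(node.right, seen);
    }

    // Number of distinct nodes reachable from all of the given roots.
    static int distinctNodes(Node... roots) {
        Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        for (Node root : roots) {
            collect(root, seen);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        Node shared = new Node(new Node(null, null), new Node(null, null)); // 3 nodes
        Node a = new Node(shared, new Node(null, null)); // 5 nodes reachable
        Node b = new Node(shared, new Node(null, null)); // 5 nodes reachable
        System.out.println(distinctNodes(a));    // 5
        System.out.println(distinctNodes(b));    // 5
        System.out.println(distinctNodes(a, b)); // 7, not 10
    }
}
```

Measured alone each tree reaches 5 nodes, but together they occupy 7, because the 3-node shared subtree is only stored once.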

oyakushev05:05:27

> Not sure how accurate.
On standard JVM implementations, if you don't change ObjectAlignmentInBytes, it is very accurate.

Ingy döt Net16:05:53

Thanks. I'll try it out.