leiningen

Hassan El-Mehalawi 2022-03-12T21:40:54.463349Z

Hopefully this is the right place to ask: I have an issue with leiningen where it times out when I run lein repl. It just waits the timeout period (set to 120000 in my project.clj), then says

jumar 2022-03-15T08:30:18.645799Z

@elmehalawi vemv and mikerod had a good point: sometimes, 2 minutes isn't enough - especially if that's the first time you are running lein repl. Maybe try lein deps first too. About the CPU proflie: it could be interesting but also misleading. It depends on where the process is stuck (or if it's just really taking too much time to load the code). About the stacktrace: it seems to me that you just reposted the stacktrace for pastebin. That's I believe the very first process the lein starts, not the process loading your code in the repl. I've tried to start the repl and the stacktrace was changing quite a bit but looked something like this:

jumar 2022-03-15T08:30:30.150349Z

Hassan El-Mehalawi 2022-03-15T08:56:09.987619Z

Thanks so much for your help guys

Hassan El-Mehalawi 2022-03-15T08:57:22.124059Z

I tried commenting out things in my project.clj and interestingly, there was one thing that got lein to boot when it was gone: the dev dependency ring/ring-devel

Hassan El-Mehalawi 2022-03-15T08:57:49.030189Z

Of course when lein booted, it just shows a stack trace because I think the project needs that to work

Hassan El-Mehalawi 2022-03-15T08:58:37.773159Z

I'm not sure if this is the culprit, or if ring-devel loads so early that it causes a crash before the hang has a chance to happen

Hassan El-Mehalawi 2022-03-15T09:09:20.722209Z

> Visualvm > cpu sampler > take snapshot. Then you can view and export the snapshot as an nps file. I like the hot spot filter that typically falls out key function names I tried this but the jvisualvm command isn't working on my jvm, I will investigate further

Hassan El-Mehalawi 2022-03-15T09:11:05.181149Z

@jumar I think that's what mine looks like too, it didn't have any project code so I figure it wasn't relevant. I did paste it above though

Hassan El-Mehalawi 2022-03-15T09:45:17.418629Z

So I don't think the ring-devel dep was really it. When I removed all references to it in the project so it would build, lein ended up hanging again anyway

Hassan El-Mehalawi 2022-03-15T09:45:40.402179Z

So I think that just caused something earlier in the process to fail, but it didn't fix the issue

jumar 2022-03-15T14:27:21.789109Z

That's strange. It might be some weird code loading issue. What if you keep the dependency there but don't require it anywhere in your code? You can also try to enable verbose ns loading to narrow it.

;; alternatively, you can add :verbose to require statements.
(alter-var-root #'clojure.core/*loading-verbosely* (constantly true)) 

Hassan El-Mehalawi 2022-03-16T02:16:03.092489Z

@jumar Just tried that, it does the same thing unfortunately

Hassan El-Mehalawi 2022-03-16T02:17:31.306829Z

I think that loading seems to happen before whatever causes the hang

jumar 2022-04-01T04:03:44.329349Z

@elmehalawi did you find what was the problem?

vemv 2022-03-12T21:43:09.110489Z

Ctrl+\ while it's hanging, you'll get a thread dump so maybe you'll see a stacktrace of whatever your program is doing

Hassan El-Mehalawi 2022-03-12T22:01:15.553269Z

@vemv thanks, that did give me a stack trace (among other things) and it looks like maybe there's something going on with the ring handler, although I'm not sure. I will continue to investigate.

🙂 1
2022-03-13T01:31:46.279429Z

It could also be failing to fetch dependencies if you needed a vpn or something for a private repo

2022-03-13T01:32:02.435749Z

You maybe can find more with DEBUG=true

Hassan El-Mehalawi 2022-03-13T01:57:26.509199Z

@mikerod when I run DEBUG=true lein repl, I get: > Leiningen's classpath: :/usr/local/Cellar/leiningen/2.9.8/libexec/leiningen-2.9.8-standalone.jar > Applying task repl to [] > Applying task javac to nil > Running javac with [@/var/folders/xb/0v6prcw13331hhfyw8qskhqc0000gn/T/.leiningen-cmdline13237562225446695570.tmp] > Applying task compile to nil > All namespaces already AOT compiled. > REPL server launch timed out.

Hassan El-Mehalawi 2022-03-13T01:59:33.815269Z

I'm not totally sure what that means but I guess that's normal?

2022-03-13T02:01:39.752519Z

I don't see anything that stands out there

2022-03-13T02:01:49.627059Z

Yes looks normal. Just showing the steps it went through.

2022-03-13T02:03:03.235539Z

You could try getting stack as was mentioned. Or hook a cpu sampling profiler to the lein process while it is hanging to see where it is hanging/spending the time at for diagnosis

Hassan El-Mehalawi 2022-03-13T02:12:03.033749Z

Thank you for your help, that is definitely new for me, but I will try it out

2022-03-13T03:07:23.680379Z

Ok. Yeah I don't have any simpler way unless there is just something common like this you can find via search. I've used the profiling method before to diagnose though

jumar 2022-03-13T07:21:19.393739Z

Async Profiler is good and relatively easy to use for cpu profiling. strace might give you dome details too

jumar 2022-03-13T07:22:29.015269Z

It's useful to check whether it's a problem for other lein-based projects on your machine and whether it works with this project somewhere else

Hassan El-Mehalawi 2022-03-14T22:16:22.053369Z

I looked at the stack traces from the other process and it didn't have any of my code in it

Hassan El-Mehalawi 2022-03-14T22:18:13.395649Z

clojure-agent-send-off-pool-0  Waiting CPU usage on sample: 0ms
  jdk.internal.misc.Unsafe.park(Native Method)
  java.util.concurrent.locks.LockSupport.park(LockSupport.java:211)
  java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:715)
  java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1047)
  java.util.concurrent.CountDownLatch.await(CountDownLatch.java:230)
  clojure.core$promise$reify__8524.deref(core.clj:7111)
  clojure.core$deref.invokeStatic(core.clj:2324)
  clojure.core$deref.invoke(core.clj:2310)
  nrepl.ack$wait_for_ack$fn__10144.invoke(ack.clj:28)
  clojure.core$binding_conveyor_fn$fn__5772.invoke(core.clj:2034)
  clojure.lang.AFn.call(AFn.java:18)
  java.util.concurrent.FutureTask.run(FutureTask.java:264)
  java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
  java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
  java.lang.Thread.run(Thread.java:833)

clojure-agent-send-off-pool-1  Runnable CPU usage on sample: 0ms
  sun.nio.ch.Net.accept(Native Method)
  sun.nio.ch.NioSocketImpl.accept(NioSocketImpl.java:755)
  java.net.ServerSocket.implAccept(ServerSocket.java:675)
  java.net.ServerSocket.platformImplAccept(ServerSocket.java:641)
  java.net.ServerSocket.implAccept(ServerSocket.java:617)
  java.net.ServerSocket.implAccept(ServerSocket.java:574)
  java.net.ServerSocket.accept(ServerSocket.java:532)
  nrepl.server$accept_connection.invokeStatic(server.clj:53)
  nrepl.server$accept_connection.invoke(server.clj:49)
  nrepl.server$start_server$fn__11039.invoke(server.clj:176)
  clojure.core$binding_conveyor_fn$fn__5772.invoke(core.clj:2034)
  clojure.lang.AFn.call(AFn.java:18)
  java.util.concurrent.FutureTask.run(FutureTask.java:264)
  java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
  java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
  java.lang.Thread.run(Thread.java:833)

clojure-agent-send-off-pool-2  Waiting CPU usage on sample: 0ms
  jdk.internal.misc.Unsafe.park(Native Method)
  java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:252)
  java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:401)
  java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:903)
  java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1061)
  java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1122)
  java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
  java.lang.Thread.run(Thread.java:833)

ClosingPipeCloser 1449847400  Waiting CPU usage on sample: 0ms
  java.lang.Object.wait(Native Method)
  java.lang.Thread.join(Thread.java:1304)
  java.lang.Thread.join(Thread.java:1372)
  com.hypirion.io.Pipe.join(Pipe.java:120)
  com.hypirion.io.ClosingPipe$OutCloser.run(ClosingPipe.java:111)
  java.lang.Thread.run(Thread.java:833)

Common-Cleaner  Waiting CPU usage on sample: 0ms
  java.lang.Object.wait(Native Method)
  java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:155)
  jdk.internal.ref.CleanerImpl.run(CleanerImpl.java:140)
  java.lang.Thread.run(Thread.java:833)
  jdk.internal.misc.InnocuousThread.run(InnocuousThread.java:162)

main  Waiting CPU usage on sample: 0ms
  jdk.internal.misc.Unsafe.park(Native Method)
  java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:252)
  java.util.concurrent.FutureTask.awaitDone(FutureTask.java:444)
  java.util.concurrent.FutureTask.get(FutureTask.java:203)
  clojure.core$deref_future.invokeStatic(core.clj:2306)
  clojure.core$future_call$reify__8477.deref(core.clj:6976)
  clojure.core$deref.invokeStatic(core.clj:2328)
  clojure.core$deref.invoke(core.clj:2310)
  nrepl.ack$wait_for_ack.invokeStatic(ack.clj:29)
  nrepl.ack$wait_for_ack.invoke(ack.clj:15)
  leiningen.repl$server.invokeStatic(repl.clj:309)
  leiningen.repl$server.invoke(repl.clj:287)
  leiningen.repl$repl.invokeStatic(repl.clj:410)
  leiningen.repl$repl.doInvoke(repl.clj:337)
  clojure.lang.RestFn.invoke(RestFn.java:425)
  leiningen.repl$repl.invokeStatic(repl.clj:395)
  leiningen.repl$repl.invoke(repl.clj:337)
  clojure.lang.AFn.applyToHelper(AFn.java:154)
  clojure.lang.RestFn.applyTo(RestFn.java:132)
  clojure.lang.Var.applyTo(Var.java:705)
  clojure.core$apply.invokeStatic(core.clj:669)
  clojure.core$apply.invoke(core.clj:662)
  leiningen.core.main$partial_task$fn__7340.doInvoke(main.clj:284)
  clojure.lang.RestFn.invoke(RestFn.java:410)
  clojure.lang.AFn.applyToHelper(AFn.java:154)
  clojure.lang.RestFn.applyTo(RestFn.java:132)
  clojure.lang.AFunction$1.doInvoke(AFunction.java:31)
  clojure.lang.RestFn.applyTo(RestFn.java:137)
  clojure.core$apply.invokeStatic(core.clj:669)
  clojure.core$apply.invoke(core.clj:662)
  leiningen.core.main$apply_task.invokeStatic(main.clj:334)
  leiningen.core.main$apply_task.invoke(main.clj:320)
  leiningen.core.main$resolve_and_apply.invokeStatic(main.clj:343)
  leiningen.core.main$resolve_and_apply.invoke(main.clj:336)
  leiningen.core.main$_main$fn__7429.invoke(main.clj:453)
  leiningen.core.main$_main.invokeStatic(main.clj:442)
  leiningen.core.main$_main.doInvoke(main.clj:439)
  clojure.lang.RestFn.applyTo(RestFn.java:137)
  clojure.lang.Var.applyTo(Var.java:705)
  clojure.core$apply.invokeStatic(core.clj:667)
  clojure.main$main_opt.invokeStatic(main.clj:514)
  clojure.main$main_opt.invoke(main.clj:510)
  clojure.main$main.invokeStatic(main.clj:664)
  clojure.main$main.doInvoke(main.clj:616)
  clojure.lang.RestFn.applyTo(RestFn.java:137)
  clojure.lang.Var.applyTo(Var.java:705)
  clojure.main.main(main.java:40)

Notification Thread  Runnable CPU usage on sample: 0ms
  Stack trace is not available

PipeThread 1449847400  Waiting CPU usage on sample: 0ms
  java.lang.Object.wait(Native Method)
  java.lang.Object.wait(Object.java:338)
  com.hypirion.io.Pipe$PipeOutputStream.run(Pipe.java:235)
  java.lang.Thread.run(Thread.java:833)

PipeThread 1498588749  Runnable CPU usage on sample: 0ms
  java.io.FileInputStream.readBytes(Native Method)
  java.io.FileInputStream.read(FileInputStream.java:276)
  java.io.BufferedInputStream.fill(BufferedInputStream.java:244)
  java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
  java.io.BufferedInputStream.read(BufferedInputStream.java:343)
  java.io.FilterInputStream.read(FilterInputStream.java:106)
  com.hypirion.io.Pipe$PipeOutputStream.run(Pipe.java:239)
  java.lang.Thread.run(Thread.java:833)

PipeThread 618181595  Runnable CPU usage on sample: 0ms
  java.io.FileInputStream.readBytes(Native Method)
  java.io.FileInputStream.read(FileInputStream.java:276)
  java.io.BufferedInputStream.fill(BufferedInputStream.java:244)
  java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
  java.io.BufferedInputStream.read(BufferedInputStream.java:343)
  java.io.FilterInputStream.read(FilterInputStream.java:106)
  com.hypirion.io.Pipe$PipeOutputStream.run(Pipe.java:239)
  java.lang.Thread.run(Thread.java:833)

Thread-2  Waiting CPU usage on sample: 0ms
  java.lang.Object.wait(Native Method)
  java.lang.Thread.join(Thread.java:1304)
  java.lang.Thread.join(Thread.java:1372)
  com.hypirion.io.Pipe.join(Pipe.java:120)
  leiningen.core.eval$sh.invokeStatic(eval.clj:191)
  leiningen.core.eval$sh.doInvoke(eval.clj:173)
  clojure.lang.RestFn.applyTo(RestFn.java:137)
  clojure.core$apply.invokeStatic(core.clj:667)
  clojure.core$apply.invoke(core.clj:662)
  leiningen.core.eval$fn__7103.invokeStatic(eval.clj:263)
  leiningen.core.eval$fn__7103.invoke(eval.clj:261)
  clojure.lang.MultiFn.invoke(MultiFn.java:234)
  leiningen.core.eval$eval_in_project.invokeStatic(eval.clj:368)
  leiningen.core.eval$eval_in_project.invoke(eval.clj:358)
  leiningen.repl$server$fn__11198.invoke(repl.clj:300)
  clojure.lang.AFn.applyToHelper(AFn.java:152)
  clojure.lang.AFn.applyTo(AFn.java:144)
  clojure.core$apply.invokeStatic(core.clj:667)
  clojure.core$with_bindings_STAR_.invokeStatic(core.clj:1977)
  clojure.core$with_bindings_STAR_.doInvoke(core.clj:1977)
  clojure.lang.RestFn.invoke(RestFn.java:425)
  clojure.lang.AFn.applyToHelper(AFn.java:156)
  clojure.lang.RestFn.applyTo(RestFn.java:132)
  clojure.core$apply.invokeStatic(core.clj:671)
  clojure.core$bound_fn_STAR_$fn__5767.doInvoke(core.clj:2007)
  clojure.lang.RestFn.invoke(RestFn.java:397)
  clojure.lang.AFn.run(AFn.java:22)
  java.lang.Thread.run(Thread.java:833)

Hassan El-Mehalawi 2022-03-14T22:18:35.694979Z

It looks fairly generic

Hassan El-Mehalawi 2022-03-14T22:18:54.466309Z

At this point, I'm considering moving my project to something non-leiningen like kit

Hassan El-Mehalawi 2022-03-14T22:19:00.771109Z

And just moving source files over one by one

Hassan El-Mehalawi 2022-03-14T22:19:18.716849Z

I think that actually might take less time than diagnosing this issue

vemv 2022-03-14T22:43:22.969209Z

Leiningen generally works very nicely. Fixing this might be pretty enlightening I'd try to bisect the issue by removing stuff: • :repl-options • any Lein plugins • dev/ dir if any

👍 1
vemv 2022-03-14T22:45:43.875259Z

btw 120000 is 2 minutes which is not a lot for work projects! Clojure projects scale linearly in terms of compilation size. I've had many work projects that load in the 5-8 minute range. So you can increase it to 10m for good measure

2022-03-15T01:33:42.690079Z

Yeah. Good point to see if something is just taking a long time. Then try to figure out why it's so slow once you know it does start

2022-03-15T01:34:00.338929Z

I also agree lein usually works well and I'd say it's rare to see this sort of issue.

2022-03-15T01:34:30.371489Z

Also one thing I've often done when stuck enough is to actually clone leiningen repo and launch an repl against the lein version tag you are using.

2022-03-15T01:35:00.450089Z

You can then do repl driven debugging within lein which lets you add things like println or whatever.

2022-03-15T01:35:45.574029Z

Lastly I id still prefer hot spots from a profiler shared. I can't parse these thread dumbs that well.

2022-03-15T01:36:33.816009Z

Visualvm > cpu sampler > take snapshot. Then you can view and export the snapshot as an nps file. I like the hot spot filter that typically falls out key function names

Hassan El-Mehalawi 2022-03-13T09:32:37.689989Z

@jumar thank you I will check those out. Lein seems to be working normally outside of this project, so I suspect it's something wrong with my config or deps, but I'm not sure which yet

Hassan El-Mehalawi 2022-03-13T10:00:40.747769Z

So I profiled lein repl using async profiler as suggested, and I got this result:

Profiling for 60 seconds
Done
--- Execution profile ---
Total samples       : 9

--- 30000000 ns (33.33%), 3 samples
  [ 0] __psynch_cvwait
  [ 1] os::PlatformMonitor::wait(long)
  [ 2] Monitor::wait_without_safepoint_check(long long)
  [ 3] G1ServiceThread::sleep_before_next_cycle()
  [ 4] G1ServiceThread::run_service()
  [ 5] ConcurrentGCThread::run()
  [ 6] Thread::call_run()
  [ 7] thread_native_entry(Thread*)
  [ 8] _pthread_start
  [ 9] thread_start

--- 20000000 ns (22.22%), 2 samples
  [ 0] __psynch_cvwait
  [ 1] os::PlatformMonitor::wait(long)
  [ 2] Monitor::wait_without_safepoint_check(long long)
  [ 3] WatcherThread::sleep() const
  [ 4] WatcherThread::run()
  [ 5] Thread::call_run()
  [ 6] thread_native_entry(Thread*)
  [ 7] _pthread_start
  [ 8] thread_start

--- 20000000 ns (22.22%), 2 samples
  [ 0] __psynch_cvwait
  [ 1] os::PlatformMonitor::wait(long)
  [ 2] Monitor::wait_without_safepoint_check(long long)
  [ 3] VMThread::wait_for_operation()
  [ 4] VMThread::loop()
  [ 5] VMThread::run()
  [ 6] Thread::call_run()
  [ 7] thread_native_entry(Thread*)
  [ 8] _pthread_start
  [ 9] thread_start

--- 10000000 ns (11.11%), 1 sample
  [ 0] __gettimeofday
  [ 1] os::PlatformMonitor::wait(long)
  [ 2] Monitor::wait_without_safepoint_check(long long)
  [ 3] WatcherThread::sleep() const
  [ 4] WatcherThread::run()
  [ 5] Thread::call_run()
  [ 6] thread_native_entry(Thread*)
  [ 7] _pthread_start
  [ 8] thread_start

--- 10000000 ns (11.11%), 1 sample
  [ 0] __psynch_cvwait
  [ 1] os::PlatformMonitor::wait(long)
  [ 2] Monitor::wait_without_safepoint_check(long long)
  [ 3] MonitorDeflationThread::monitor_deflation_thread_entry(JavaThread*, JavaThread*)
  [ 4] JavaThread::thread_main_inner()
  [ 5] JavaThread::run()
  [ 6] Thread::call_run()
  [ 7] thread_native_entry(Thread*)
  [ 8] _pthread_start
  [ 9] thread_start

          ns  percent  samples  top
  ----------  -------  -------  ---
    80000000   88.89%        8  __psynch_cvwait
    10000000   11.11%        1  __gettimeofday

Hassan El-Mehalawi 2022-03-13T10:01:10.058599Z

It looks like it's spending most of its time waiting and about 11% on "gettimeofday"

Hassan El-Mehalawi 2022-03-13T10:01:25.714359Z

Not sure what that means as far as my project is concerned

vemv 2022-03-13T16:49:53.871139Z

Those don't mean much without a mapping from namespaces (or Java packages) to class/method calls. I don't know much about async profiler but maybe it can give it to you. YourKit definitely can (although it might take some effort to properly run it for this specific task) In all honesty I think Ctrl+\ woud give you all info you need

2022-03-13T16:51:19.257279Z

Yeah I need java stack trace in profiling to be useful for me

2022-03-13T16:52:02.990009Z

Your kit is good. Visualvm is fine too. I think it comes built in with modern jdks even

2022-03-13T16:52:14.505309Z

I use the app though downloadable from the site.

jumar 2022-03-13T17:04:12.715449Z

For Async Profiler you want to create a flamegraph and look for the main thread (not sure what that means in terms of the lein repl process(es)). But yeah, if you capture stacktraces via kill -3, jcmd Thread.print or some other means that's probably the most useful thing you can do.

Hassan El-Mehalawi 2022-03-13T21:18:41.941699Z

I will try yourkit next

Hassan El-Mehalawi 2022-03-13T21:19:25.799589Z

I've been overanalyzing the stack traces I get with ctrl+\ and I'm struggling to find an issue

Hassan El-Mehalawi 2022-03-13T21:19:43.683669Z

I see my code but it looks like it's just loading it (all the lines numbers are 1)

Hassan El-Mehalawi 2022-03-13T21:20:27.219109Z

And there doesn't seem to be an exception thrown

Hassan El-Mehalawi 2022-03-13T21:53:34.862989Z

Here's what the top of the yourkit flamegraph shows

Hassan El-Mehalawi 2022-03-13T21:54:18.694479Z

It looks like the highest function that's not completely generic is compiler.eval

Hassan El-Mehalawi 2022-03-13T21:54:37.194009Z

It looks like it just hangs on compiler.eval

Hassan El-Mehalawi 2022-03-13T21:54:44.866279Z

Not sure why that is

jumar 2022-03-13T21:55:04.972509Z

Can you share the stack traces?

Hassan El-Mehalawi 2022-03-13T21:58:16.882989Z

@jumar of course, here they are: https://pastebin.com/dMjAB9u9

Hassan El-Mehalawi 2022-03-13T22:08:27.348899Z

Does anything stand out that I might have missed?

jumar 2022-03-14T06:09:36.405339Z

I would say the most interesting one is the main thread. That one seems to be simply waiting the repl to start Now, I don't know much details about leiningen internals, but lein repl typically launches two jvm processes. So it might be helpful to gather stacktraces for the other process too.

Hassan El-Mehalawi 2022-03-12T21:40:56.820149Z

"REPL server launch timed out."

Hassan El-Mehalawi 2022-03-12T21:41:41.428509Z

This does not happen anywhere besides in my project

Hassan El-Mehalawi 2022-03-12T21:42:04.687069Z

A few questions: • Is there a way to figure out what the issue is? Maybe a verbose option or something? • Are there common issues I should look at when this sort of thing happens?