At Nubank, we want to measure how developers use the REPL, like how much time it takes to start, finish the first eval, run tests etc, any suggestion on how to achieve that?
I thought about a middleware we could inject on repl, and have access on that somehow
You can use (.getUptime (java.lang.management.ManagementFactory/getRuntimeMXBean)) to tell for how long the JVM has been running, but the "first eval" will depend on the user, unless user.clj is requiring the main ns which is bad practice.
Out of curiosity, is the repl startup to first eval important? On one side I think it highly depends on project + hardware and on the other I tend to keep repls running for a long time. I feel it would be more useful to measure repl restarting frequency per project to identify projects that for whatever reason force the user to restart the repl or "sub optimal" worklflows that end up restarting the repl constantly.
But I think a nrepl middleware + the getUptime could work to tell you how long it took for the repl to start, if you check the uptime when the middleware sees the response of the nrepl describe op for example, or some other of the cider-nrepl "bootstrap messages"
we have some kind of machines standard on nubank, but the idea is to measure both REPL JVM startup time + the load-file s requests, so we understand if our devs are wasting so much time there, we have +1000 devs running repl daily so metrifying the used machine (System/getProperty "os.etc") + those times would be quite valuable for us
thanks for the getUptime tip, looks promissing, now I need to check how to "know" when user requested a nrepl op like load-file to then measure that, I suspose this is possible via middleware?
metrifying the used project is also something we have in mind
yeah, if your middleware is on the "outside", you should be able to see every message passing by. On emacs you can start a repl and see all the messaging, so you can pick a message there. Then on your middleware when you see that message going in, you can grab the id, wait for the response, and check the uptime on the response of that message
When I start a repl, this are the "bootstrap messages" I see :
(-->
id "1"
op "clone"
time-stamp "2025-03-31 15:28:11.561035937"
)
(<--
id "1"
session "f80d1a39-802c-4677-9d68-912dd0bfa252"
time-stamp "2025-03-31 15:28:11.564755185"
new-session "0f33b636-b853-42b4-b376-2abefbbce2c8"
status ("done")
)
(-->
id "2"
op "clone"
time-stamp "2025-03-31 15:28:11.581342241"
)
(<--
id "2"
session "d9619d15-f20a-4026-ad79-39590dd54e3b"
time-stamp "2025-03-31 15:28:11.581869955"
new-session "99683920-96d8-48c9-a342-12bc4c547d2a"
status ("done")
)
(-->
id "3"
op "describe"
session "0f33b636-b853-42b4-b376-2abefbbce2c8"
time-stamp "2025-03-31 15:28:11.601538134"
)
(<--
id "3"
session "0f33b636-b853-42b4-b376-2abefbbce2c8"
time-stamp "2025-03-31 15:28:11.604556147"
aux (dict ...)
ops (dict ...)
status ("done")
versions (dict ...)
)
(-->
id "4"
op "eval"
session "0f33b636-b853-42b4-b376-2abefbbce2c8"
time-stamp "2025-03-31 15:28:11.622607059"
code "(when-let [requires (resolve 'clojure.main/repl-requires)]
..."
column 1
file "*cider-repl flow-storm-testers/big-demo:localhost:40729(clj)..."
inhibit-cider-middleware "true"
line 11
nrepl.middleware.print/buffer-size 4096
nrepl.middleware.print/options (dict ...)
nrepl.middleware.print/print "cider.nrepl.pprint/pprint"
nrepl.middleware.print/quota 1048576
nrepl.middleware.print/stream? "1"
)
(<--
id "4"
session "0f33b636-b853-42b4-b376-2abefbbce2c8"
time-stamp "2025-03-31 15:28:11.639408963"
value "nil"
)
(<--
id "4"
session "0f33b636-b853-42b4-b376-2abefbbce2c8"
time-stamp "2025-03-31 15:28:11.679779089"
ns "user"
)
(<--
id "4"
session "0f33b636-b853-42b4-b376-2abefbbce2c8"
time-stamp "2025-03-31 15:28:11.679950375"
status ("done")
)
(-->
id "5"
op "out-subscribe"
session "0f33b636-b853-42b4-b376-2abefbbce2c8"
time-stamp "2025-03-31 15:28:11.680146648"
)
(-->
id "6"
op "init-debugger"
session "0f33b636-b853-42b4-b376-2abefbbce2c8"
time-stamp "2025-03-31 15:28:11.680298187"
nrepl.middleware.print/buffer-size 4096
nrepl.middleware.print/options (dict ...)
nrepl.middleware.print/print "cider.nrepl.pprint/pprint"
nrepl.middleware.print/quota 1048576
nrepl.middleware.print/stream? "1"
)
(-->
id "7"
op "classpath"
session "0f33b636-b853-42b4-b376-2abefbbce2c8"
time-stamp "2025-03-31 15:28:11.681887882"
)
(<--
id "5"
session "0f33b636-b853-42b4-b376-2abefbbce2c8"
time-stamp "2025-03-31 15:28:11.743715412"
out-subscribe "0f33b636-b853-42b4-b376-2abefbbce2c8"
status ("done")
)
(<--
id "7"
session "0f33b636-b853-42b4-b376-2abefbbce2c8"
time-stamp "2025-03-31 15:28:11.866315103"
classpath ("/home/jmonetta/my-projects/flow-storm-testers/big-demo/src" ...)
status ("done")
)
(-->
id "8"
op "ns-list"
session "0f33b636-b853-42b4-b376-2abefbbce2c8"
time-stamp "2025-03-31 15:28:11.884222171"
exclude-regexps ("^cider.nrepl" "^refactor-nrepl" "^nrepl")
)
(<--
id "8"
session "0f33b636-b853-42b4-b376-2abefbbce2c8"
time-stamp "2025-03-31 15:28:11.908004439"
ns-list ("amalloy.ring-buffer" ...)
status ("done")
)
(-->
id "9"
op "version"
session "0f33b636-b853-42b4-b376-2abefbbce2c8"
time-stamp "2025-03-31 15:28:11.933486131"
debug "false"
insert-newline-after-require "true"
prefix-rewriting "false"
)
(<--
id "9"
session "0f33b636-b853-42b4-b376-2abefbbce2c8"
time-stamp "2025-03-31 15:28:11.934670455"
status ("done")
version "3.10.0"
)
(-->
id "10"
op "version"
session "0f33b636-b853-42b4-b376-2abefbbce2c8"
time-stamp "2025-03-31 15:28:11.953797495"
debug "false"
insert-newline-after-require "true"
prefix-rewriting "false"
)
(<--
id "10"
session "0f33b636-b853-42b4-b376-2abefbbce2c8"
time-stamp "2025-03-31 15:28:11.954309198"
status ("done")
version "3.10.0"
)
(-->
id "11"
op "artifact-list"
session "0f33b636-b853-42b4-b376-2abefbbce2c8"
time-stamp "2025-03-31 15:28:11.974408629"
debug "false"
force "false"
insert-newline-after-require "true"
prefix-rewriting "false"
)
(<--
id "11"
session "0f33b636-b853-42b4-b376-2abefbbce2c8"
time-stamp "2025-03-31 15:28:12.414016236"
artifacts ("side-fx" ...)
status ("done")
)that is without doing anything, without any eval
then if I go and load a file, I see :
(-->
id "12"
op "load-file"
session "0f33b636-b853-42b4-b376-2abefbbce2c8"
time-stamp "2025-03-31 15:29:14.131369526"
file "(ns big-demo.web-app
(:require [org.httpkit.server :as htt..."
file-name "web_app.clj"
file-path "/home/jmonetta/my-projects/flow-storm-testers/big-demo/src/b..."
)
....
....
(<--
id "12"
session "0f33b636-b853-42b4-b376-2abefbbce2c8"
time-stamp "2025-03-31 15:29:15.226128860"
status ("done")
)but that load-file is being fired by hand, so it will be different every time I start my repl
awesome, I do know this emacs feature, I used so much to develop https://github.com/afucher/clojure-repl-intellij, good to know I have access to that on middleware side
I'm aware of those bootstrap messages, we will have to do some kind of filtering indeed, or just send everything and do the filter on dashboards, but looks promissing
I guess you can also use the time-stamp fields on the messages
yeah, makes sense
Cool, looks it's possible so, I was not sure it was possible to make that middleware work, thanks for the tips!
I guess you can report a "total time" by waiting for the load-file response and doing total-time = uptime-at-last-nrepl-boot-response + (load-file-resp-ts - load-file-req-ts)
I want to have the JVM startup metric even if the user started a repl but not evaluated nothing or so, I don't want to couple those things
got it
That workaround seems to work tho, but would be nice to nrepl support notifications like the LSP protocol and not only requests
yeah, I created this issue a while back https://github.com/nrepl/nrepl/issues/315
yeah, that looks really interesting
ah nice, I took a look and saw these dynamic vars configs which seems valid to complement indeed, thanks for the help, I'll go ahead and support both configs and this config via file
I've love for someone to do some exploration work on this issue, as I also agree that would be valuable. cider-nrepl has a limited way to notify clients, but for a proper solution we'd have to tackle the problem in nREPL itself.
I known very little about nrepl, but do you think would be feasible/makes sense to have a concept of server notification? or you mean solve this problem in some other way?
I'm not an nrepl expert by any means, but what I find special about nrepl is the request->multi-response protocol, with a special response supposedly marking the "end" of the response (status done). This can be used I guess for SSE, like a request (subscribe-to-events) that will receive responses (events) without ever receiving status done. This isn't possible to consume with the current nrepl client api but shouldn't be hard to extend. Also not sure if a SSE implementation should implement multi-session events, in case the server wants to communicate something to every connected client.
Last time I check for the specific events of something in the server writing to *out* or *err* the Cider elisp nrepl client, and Reply client (used by lein) does something that is not possible with the nrepl client, which is to watch every message independent of the request and if they see something responding with :out or :err generate this event client side, which is pretty hacky.
hum, that idea is interesting indeed of not sending done ever
Kind of like how it works in HTTP, a GET request that returns a piece of the response (`Content-Type: text/event-stream` header) and never closes the connection.
Although that would work, I feel like having a concept of notification where the server could send messages to client would be easier to understand and maybe evolve, but it may require more work on nrepl to support that
yeah, given nrepl works always over a bi-directional connection (TCP or Unix sockets) I guess a SSE that doesn't require a "subscribe request" to initiate notifications can be implemented.
the downside being complexity. Currently nrepl is about sending async messages with :op, :id and :session, and getting possibly multiple responses matching the request id with a well defined middleware system over this. Making the server able to send something not related to requests, makes it more complex, since you need another message system, which will probably don't play with the middleware system, and starts changing nrepl into something else.
yeah, makes really sense, would love to hear @bozhidar feelings about which way we should explore more
maybe also @alexyakushev (from https://clojurians.slack.com/archives/C04CAKAGADU/p1744114269657929?thread_ts=1743442655.555199&cid=C04CAKAGADU so you don't have to read everything)
Hum, I just found that there is no nrepl event or similar to tell it's ready to accept ops, and confirmed that starting the repl is not enough as describe and other ops are sent only when user manually eval or do it's first move.
I wanted to metrify the startup time, so I ended up with this ugly code:
;; Workaround to export the startup JVM time
;; as close as possible to REPL ready to user.
(defonce ^:private _export-startup-metric!!!
(metrics/metrify
:task/jvm-started
{:time-ms (.getUptime (ManagementFactory/getRuntimeMXBean))}))
as I want to send the metric as soon as the nrepl is ready, the first op is only trigerred with a user action which is not enough(I added a first-op-requested even too though)
I also agree that middleware is the best way to go for something like this. I recall we had some discussions in the past about having a standard request timing middleware in nREPL, but I'm not sure I'll be able to find them.
yeah, I'm experiment creating a OSS middleware lib to do that, good to know
The timestamps we introduced on the client-side were mostly to avoid differences from the network communication and time spent encoding/decoding messages.
That'd be great - I'm sure others would benefit from something like this.
@ericdallo You might also find the middleware here interesting as a source of inspiration https://github.com/jonpither/cider-spy-nrepl
It was supposed to track multiple REPLs
ah yeah, definitely will help as I'm new on building middleware, thanks!
@bozhidar, given your experience, do you have any suggestion on how to "config" a middeware?
I want to support different kind of appenders for this lib, like metrify to stdout, opentelemetry, file, etc.
cider-spy-nrepl for example expects a .cider-spy-nrepl.edn file which it should slurp and get the config, that's similar how clojure-lsp and other tools do, is there any good other way, like passing via nrepl args when passing the middleware or something, WDYT?
Yeah, thatโs good way, at least for establishing the initial configuration. Iโm guessing another one would be to use configuration from .nrepl.edn directly, if you want to avoid a new configuration file. (I think cider-spy was created before nREPL got modernized) In general itโs a good idea to support passing configuration options via requests as thatโs the most dynamic form of configuration - basically every config options becomes some (optional) request flag. I think this always complements well a config file.