clojure

pieterbreed 2025-10-13T07:12:42.744889Z

Hi everyone, I am busy experimenting with using Clojure inside a Lambda (deployed as a container). The container has the Corretto JDK 21, and I am using an up-to-date version of Clojure. My app is packaged as an uberjar and the Lambda function starts with java -cp /path/to/uber.jar ns.application/lambda-handler. The Lambda has 2 GB of RAM assigned, but is using less than 200 MB. My cold-start times are around 9-10 seconds, and I often see the very first invocations terminated when init takes longer than 10 seconds. My assumption going into this was: sure, it will be slow, but 3-4 seconds is tolerable. I did not expect 9-10 seconds, to be honest, and it confuses my (internal) users so much that I may have to change it. Before I just throw everything out and start over with something else... My question: what can I realistically do to speed up this initial load? When I look at the logs, about half of the time goes to loading the namespace for the entry point (`ns.application`); at that point another namespace is loaded via requiring-resolve to get the symbol that will actually handle the call, and this takes another few seconds. I don't expect magic here; "it is what it is" is a valid answer. However, I am hoping that I'm doing something wrong and that someone who has been through this can tell me to tweak some setting, pre-generate some cache, or something along those lines. I also read recently about changes in JDK 25 that address cold start-up times, but I'm unsure what that means or how to start taking advantage of it.

igrishaev 2025-10-13T07:21:54.467239Z

Yes, Java cold start in a Lambda is horrible: 6-8 seconds for the first invocation is normal for the JVM. What you can do:
• Prewarm your lambda: make a cron job that pings it to keep it warm. That will cost you something, of course.
• Compile your code with native image to get a single binary file. Its start time is instant, like Golang or similar languages.
• If you don't want to compile, use babashka + .clj files. Babashka starts pretty fast as well.

igrishaev 2025-10-13T07:24:14.429879Z

I've got a couple of lambdas compiled with native image: one for a Telegram bot, and another for handling HTML forms. They work pretty nice. Both are done with this repo: https://github.com/igrishaev/lambda (check out the readme file)

igrishaev 2025-10-13T07:27:27.307079Z

Oh I forgot: there is still a public lambda I made for demo purposes! You're welcome to bench it: https://kpryignyuxqx3wwuss7oqvox7q0yhili.lambda-url.us-east-1.on.aws/

pieterbreed 2025-10-13T07:38:43.800459Z

Thank you @igrishaev for confirmation of startup times. At least I feel less incompetent now... 😅 General knowledge question regarding clojure -> compile your code with native image: are there certain kinds of clojure code/libraries/functions which, if you use them, mean pre-compilation does not work?

igrishaev 2025-10-13T07:42:40.489549Z

Most libraries work with native image nowadays. I recall I had problems with Docjure because it's based on Apache POI, and that's a total Java mess inside. Also, the official AWS SDK fails to compile due to various shenanigans in the Java runtime.

👍🏽 1
igrishaev 2025-10-13T07:43:34.703119Z

There is a repo with a list of Clojure libs tried once with native image. This list is incomplete of course: https://github.com/clj-easy/graalvm-clojure

igrishaev 2025-10-13T07:44:31.801789Z

But as you can see, the most needed stuff is there: json, csv, sql/postgres, hickory, nippy, http clients...

igrishaev 2025-10-13T07:46:14.134399Z

There is also another lambda-oriented project, and the most interesting part is the Makefile to compile and deploy: https://github.com/igrishaev/teleward/blob/master/Makefile

🙏🏽 1
valtteri 2025-10-13T07:49:07.244909Z

Check out Lambda Snapstart https://docs.aws.amazon.com/lambda/latest/dg/snapstart.html

👍🏽 1
igrishaev 2025-10-13T07:52:44.438319Z

Wow I didn't know about Snapstart, thank you @valtteri for sharing this. By the way, the page says that container images are not supported while @pieterbreed has mentioned he has a container

mpenet 2025-10-13T08:13:05.183689Z

I have used crac also for a demo recently: you can basically snapshot the VM state after startup and have it resume in tens of milliseconds. But that means packaging your app as a docker container, and there are some subtleties wrt what should or should not be done before snapshotting (e.g. networking)

👍🏽 1
pieterbreed 2025-10-13T08:28:17.599749Z

I love the name crac... it is exactly what it sounds like. 👍🏽

dpsutton 2025-10-13T08:29:40.704449Z

Are you AOT'ing your code? Unsure how you are building, but if you are just copying source into your uberjar you might get "easy" wins from just compiling

👍🏽 1
☝️ 1
Steven Lombardi 2025-10-13T08:52:52.501489Z

Yes, second the AOT point. Definitely look into that. I recommend, before going the graal route, to try your code AOT'd on JDK 24 or 25 and see where that gets you.

Steven Lombardi 2025-10-13T08:55:17.274359Z

Make sure you bump the Java version for both the container's JDK and whatever JDK you develop with or use to AOT.

🙏🏽 1
mkvlr 2025-10-13T10:31:22.645839Z

still AOT will probably get you nowhere near the startup times of native image or crac

dpsutton 2025-10-13T10:32:36.674359Z

Of course not. But I think the goal here is for it to successfully run in the cloud and not be killed after 10 seconds. And trying AOT is certainly the first waypoint for that if it's not already done

mpenet 2025-10-13T10:33:51.849649Z

I'd start with AOT as well. Usually you can get in the 1s range just using that (for hello world)

dpsutton 2025-10-13T10:34:48.998049Z

Also, if you use core.async there's a recent release that might even further speed up compilation times if you have access to virtual threads

pieterbreed 2025-10-13T10:38:04.502219Z

I've always run under the assumption that AOT "in general" is an anti-pattern, and have therefore avoided it except for the occasional -main-containing namespace. Is there any downside to pre-compiling all namespaces?

mpenet 2025-10-13T10:38:44.667889Z

generally it's a bit useless, but in some cases it's a necessary thing

mpenet 2025-10-13T10:39:10.121339Z

e.g. with lambda (when startup time matters) or if you have to be careful about resource usage at startup (e.g. with kube, depending on what your org allows or not)

dpsutton 2025-10-13T10:40:30.163589Z

A fun experiment can be to start up a REPL with the normal classpath and then do (time (require 'main.ns :verbose))

👍🏽 1
dpsutton 2025-10-13T10:41:04.329859Z

If you are macro heavy, compilation can take up a decent amount of time. And having class files around is an easy win in that case
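
The timing experiment suggested above can be applied to the handler lookup as well, since the original question mentions a second delay at the requiring-resolve step. A minimal sketch, assuming a REPL on the application classpath; timed-resolve is a hypothetical helper name, and clojure.set/union only stands in for the real handler symbol:

```clojure
;; Hypothetical helper (not from the thread): measure how long it takes to
;; load a namespace and resolve one var from it via requiring-resolve.
(defn timed-resolve
  [sym]
  (let [start   (System/nanoTime)
        v       (requiring-resolve sym)            ; loads the namespace if needed
        elapsed (/ (- (System/nanoTime) start) 1e6)]
    (println (format "resolved %s in %.1f ms" sym elapsed))
    v))

;; Stand-in symbol; replace with the fully qualified handler symbol.
(timed-resolve 'clojure.set/union)
```

Running this once against source only and once with AOT-compiled classes on the classpath should show whether compilation is the dominant cost.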

Steven Lombardi 2025-10-13T10:46:27.887419Z

It might not be comparable to native image or crac, but keep in mind that JDK 24 and 25 have done a lot to improve start times. So I think with both that and AOT together, you'll see meaningful improvement.

Steven Lombardi 2025-10-13T10:53:44.652619Z

Regarding the trade-offs of AOT, there may be some issues depending on how the compiler is configured. But I don't expect you'd hit those issues with default settings. This might be worth a read: https://clojure.org/reference/compilation

👍🏽 1
Steven Lombardi 2025-10-13T10:54:49.392699Z

Only word of caution I would give is to not use AOT when building library artifacts. Only application artifacts.

💯 1
2025-10-13T15:28:42.213689Z

in my experience the AOT problems happen when libraries use AOT. it is reliable when used as an application deploy step

2025-10-13T15:30:14.673819Z

(I will never forget the hours, days lost to errors that could not be deciphered when viewing source, because an AOT artifact was being loaded instead of the source and they did not match)

pieterbreed 2025-10-14T07:21:53.957099Z

I have one more question on this thread: am I doing AOT correctly? In my deps.edn I have :paths ["src" "resources" "target/classes"] and in my build.clj I have

(build-api/compile-clj {:basis     project-basis
                        :class-dir "target/classes"
                        :src-dirs  ["src"]})
(build-api/uber {:basis     project-basis
                 :class-dir "target/classes"
                 :main      main-namespace
                 :uber-file uberjar-file})
This has made 0 difference to the startup time: This log shows the timestamp when the java command is exec'd and the first log statement, still nearly 7 seconds.
2025-10-14T07:14:17.697Z + exec java -cp /opt/gometro/gometro-bridge.jar spicyrun.ninja.lambda.runtime
2025-10-14T07:14:24.141Z 169.254.38.181 DEBUG [spicyrun.ninja.lambda.runtime:98] - Starting lambda main process...
Am I missing something?
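
For comparison, a fuller build.clj along the same lines might look like the sketch below. It assumes tools.build is available through a :build alias; my.app.main, target/app.jar and the directory names are placeholders, and :direct-linking is an optional compiler option that the thread does not mention. It is an illustration, not a confirmed fix for the startup time above.

```clojure
(ns build
  (:require [clojure.tools.build.api :as b]))

;; Placeholder names for illustration only.
(def class-dir "target/classes")
(def uber-file "target/app.jar")
(def main-ns   'my.app.main)               ; assumed to declare (:gen-class)

(def basis (delay (b/create-basis {:project "deps.edn"})))

(defn uber [_]
  ;; Start from a clean target dir so stale classes cannot leak into the jar.
  (b/delete {:path "target"})
  ;; Copy sources and resources next to the compiled classes.
  (b/copy-dir {:src-dirs ["src" "resources"] :target-dir class-dir})
  ;; AOT-compile every namespace found under src.
  (b/compile-clj {:basis        @basis
                  :class-dir    class-dir
                  :src-dirs     ["src"]
                  ;; Optional assumption: direct linking skips var
                  ;; indirection at call sites, at the cost of runtime
                  ;; redefinition.
                  :compile-opts {:direct-linking true}})
  (b/uber {:basis     @basis
           :class-dir class-dir
           :uber-file uber-file
           :main      main-ns}))
```

With an alias named :build, this would be invoked as clj -T:build uber.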

dpsutton 2025-10-14T08:06:39.284729Z

Can you inspect the jar and verify you have lots of class files?

dpsutton 2025-10-14T08:07:09.386439Z

And it's totally possible that compilation speed isn't what is holding you back. But it's for sure the first thing to check

pieterbreed 2025-10-14T08:08:14.390779Z

Yes, it has tons of .class files. 👍🏽

👍 1
Steven Lombardi 2025-10-14T11:39:46.298709Z

I'm not sure the mere presence of class files is enough to ensure they are being used. Does your main ns have a genclass? Something like this?

(ns clojure.examples.hello
    (:gen-class))

👍🏽 1
Ben Sless 2025-10-15T04:38:05.661459Z

You can create a CDS archive with your uberjar https://gist.github.com/bsless/fb79601eb2bfdee85ebf4663dbc7bb1b

pieterbreed 2025-10-15T10:22:52.717799Z

Status update:
• I've upgraded everything: JDK 25, Clojure 1.12.3, all of the Maven dependencies.
• I think I've always had AOT classes baked into the uberjar, and so far I've not seen any improvement I can say was caused by AOT.
• This is a big project. The uberjar is now ~130 MB.
• I've used the suggestion to make a CDS archive and packaged it into the container that serves as the runtime image for the lambda.
• The CDS archive is a further ~100 MB.
• The very first cold start (right after a function update) takes around 25 seconds.
  ◦ I'm not sure exactly what happens with the process space the lambda runs in. Even though the initial events time out on init, and the logs make it seem like a new process starts up, the JVM itself gets going at some point and starts successfully handling lambda events.
• Subsequent requests take <200 ms.
• Subsequent cold starts take around 9 seconds.
  ◦ I believe the difference between the initial cold start and subsequent cold starts is the container layers being moved around, which also takes non-zero time.

pieterbreed 2025-10-15T10:26:22.674469Z

I'm learning a lot along the way. Lambda is a weird execution environment, generally speaking...

Steven Lombardi 2025-10-15T13:53:51.611549Z

Thanks for the update. I'm not super familiar with the implications of the JVM running on lambda, but it's slightly disappointing that JDK 25 plus AOT didn't get you better results. Did you notice any startup improvement when running your lambda container on your local machine?

reefersleep 2025-10-13T09:31:11.372399Z

I love using e.g.

(log/info (format "My %s
            multiline print
            is great!" "delicious"))
, but I find that my editor's desire to indent the multiline string means that the first line has one indentation level in the log, and the rest have another. This is also true for e.g. (clojure.test/testing) output; it's not specific to whatever log library I'm using. One solution is to do it like
(log/info (format
              "
              My %s
              multiline print
              is great!" "delicious"))
This gives all of the lines the same indentation level, which is better than them being unaligned, but not quite as good as controlling the leading spaces independently of code indentation. And it's a bit curious for the next programmer touring the code, who'll probably join the solitary starting " and the subsequent line and never realise the consequences. Also, I can't use a literal "\n" as part of my format string argument without that next line having no indentation at all. Is there some nice built-in formatting solution (e.g. proper use of format 😅) that is independent of indentation in the code, while also being easily read in the code? I know I could do (-> ["My %s" " multiline print" "is great!"] (#(clojure.string/join "\n" %)) (format "delicious")), but that doesn't read so well and seems overly complicated.

igrishaev 2025-10-13T09:38:04.085779Z

fyi: perhaps you should switch to (log/infof ...) and other macros that end with "f", debugf, errorf, and so on:

(log/infof "something has happened, id: %s, user: %s, context: %s"
  id current-user context)

reefersleep 2025-10-13T09:39:10.561339Z

Nice, thanks! I'm using something other than clojure.tools.logging, though

igrishaev 2025-10-13T09:39:44.899129Z

Also, it's better to avoid multi-line logs, as most log parsers follow the rule: 1 line = 1 log. Some loggers replace \n with a soft break so the log entry stays a single line

reefersleep 2025-10-13T09:41:49.860229Z

Good to know. That also implies that developers can still convert \n in their editor and get something readable. (I've done so for logs I wanted to read and didn't control myself)

igrishaev 2025-10-13T09:41:57.960259Z

I see, so just create a macro like this:

(defmacro infof [template & args]
  `(log/info (format ~template ~@args)))
That might solve issues with indenting and so on

reefersleep 2025-10-13T09:44:55.130619Z

Do you mean that the macro expansion ensures similar indentation across lines?

igrishaev 2025-10-13T09:47:15.883849Z

I mean, the macro will look like this in your code:

(infof "something has happened, id: %s, user: %s, context: %s"
  id current-user context)
which is pretty nice I think

reefersleep 2025-10-13T09:48:18.347629Z

That's a good idea for more easily formatted logs, but I don't think it solves my problem of trying to achieve easily readable code that prints easily readable multiline content 🙂

reefersleep 2025-10-13T09:49:59.280789Z

In my current use case, the print is a result of

(testing (format "This editor-indented code works improperly
           with multiline strings")
  (is (= true false) (format "and so
                       does this")))
, so we're not in log land anyways (at the moment).

p-himik 2025-10-13T10:30:15.795749Z

> trying to achieve easily readable code that prints easily readable multiline content
I'd argue that maybe it shouldn't be done at all. A human-friendly description of something doesn't have to be printed, doesn't have to be logged, doesn't even have to be a part of the code: it can be a docstring or a comment. A log entry semantically is "this happened here, with this context". The "here" and the "context" parts are pretty much solved: the first one via logging macros that add the location, and the second one by passing relevant run-time values to the logging functionality. The only purpose of the "this" part is to easily distinguish log entries from each other and maybe to make it easier to locate the original logging statement (line numbers are nice but aren't unchangeable). So instead of "My %s multiline print is great!" "delicious" I would use some structured logging tool with ::multiline-print {:kind :delicious}. Mulog has great docs on the topic and generalizes logging and tracing, if you're interested.

pieterbreed 2025-10-13T10:58:16.784079Z

I'll also weigh in on this. I've found that log statements with "patterns" in them (like you get with string concatenation or format) make things harder downstream than they need to be. The useful parts of a log statement:
• a string, targeted at a human
• context, in the form of a map of data (timestamp, line-nr, namespace, log level, also request-id, customer-id &c.)
I've found it much better to make every log statement produce data (json, gelf &c.), with the human-targeted string being a constant and the pieces of context that would normally have gone into format just being in the log map directly. Since the constant string is really a kind of "log statement identifier", it also benefits from being short (and not multi-line). This unwinds a lot of complexity when you analyze or query the logs in bulk. An example of this is integrating your logs with Sentry, which will choke if every message is unique but knows how to handle a map of data just fine. Anyway, good luck 🙂
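
The "constant message plus data map" shape described above can be sketched without any logging library. The names log-entry and log! and the field names are hypothetical; a real setup would hand the map to whatever logging backend is in use rather than print it.

```clojure
;; Hypothetical helpers illustrating a log statement as data:
;; a short constant message plus a map of context.
(defn log-entry
  "Builds a log entry map. The core fields are merged last, so they win if
  `context` happens to contain the same keys."
  [level message context]
  (merge context
         {:level     level
          :message   message                        ; constant, greppable identifier
          :timestamp (str (java.time.Instant/now))}))

(defn log!
  "Prints the entry as EDN on a single line."
  [level message context]
  (prn (log-entry level message context)))

;; The dynamic parts live in the map, not in the message string.
(log! :info "payment failed" {:request-id "r-123" :customer-id 42})
```

Because the message never changes, every occurrence groups under one identifier in tools that aggregate by message, and the string is easy to grep for in the code base.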

reefersleep 2025-10-13T12:09:47.388219Z

Good point, @pieterbreed. I tend to make very short log statements with data as a separate map. Our logging library takes care of how to handle that. Another reason to avoid hyperactive string concatenation in general for producing logs or similar is that it makes the statement harder to find in the code base! I love greppable code. But occasionally I try to create something that has good dev-ex by virtue of telling you everything you need to know, narrated cohesively through prose. And one of these exceptions prompted my question today. I rarely make highly dynamic `testing` texts and `is` texts like I'm trying to today, but the test in question sort of acts as documentation and, preferably, should make it very easy for the reader to understand what needs to be done and to actually do it.

👍🏽 1
hrtmt brng 2025-10-14T09:55:22.005369Z

I also from time to time have the same question.

[:pre "int f(int a){
         return a
       }"]
Something like a reader macro for this use case would be helpful in my opinion.

p-himik 2025-10-14T10:04:30.476349Z

A data reader that does that would be trivial to implement. Or a function, or a macro.
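
A minimal sketch of the function and macro variants mentioned above. The names strip-indent and dedent are hypothetical. Unlike trimming every line, this version removes only the indentation shared by the continuation lines, so relative indentation inside the text survives; it counts spaces only, not tabs.

```clojure
(require '[clojure.string :as str])

(defn strip-indent
  "Removes the smallest space indentation shared by all non-blank lines
  after the first one. The first line is left untouched."
  [s]
  (let [[head & tail] (str/split-lines s)
        indent (fn [line] (count (re-find #"^ *" line)))
        shared (->> tail
                    (remove str/blank?)
                    (map indent)
                    (reduce min Integer/MAX_VALUE))
        cut    (fn [line] (subs line (min shared (count line))))]
    (str/join "\n" (cons head (map cut tail)))))

(defmacro dedent
  "Strips indentation at macroexpansion time for string literals and
  falls back to a runtime call for any other expression."
  [s]
  (if (string? s)
    (strip-indent s)
    `(strip-indent ~s)))

[:pre (dedent "int f(int a){
         return a;
       }")]
;; => [:pre "int f(int a){\n  return a;\n}"]
```

For literals the work happens once at compile time, so the editor can indent the string however it likes without affecting the output.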

reefersleep 2025-10-15T14:12:54.045519Z

An update: for my purposes, a function that splits the input by \n, calls triml on each line, joins the lines again, and then calls format with any additional args was sufficient. Easy enough to make, no worries there, but not built in, which is what I was asking about. And fair enough if there isn't anything built-in. I love the conciseness of clojure.core, I just wanted to make sure I wasn't missing out on anything 🙂
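
The function described in this update could look roughly like the sketch below. The name format-lines is hypothetical, and the details (for example, that clojure.string/split-lines drops trailing empty lines) are assumptions rather than the poster's actual code.

```clojure
(require '[clojure.string :as str])

(defn format-lines
  "Left-trims every line of `template`, joins the lines with \\n,
  then applies `format` with `args`."
  [template & args]
  (let [trimmed (->> (str/split-lines template)
                     (map str/triml)
                     (str/join "\n"))]
    (apply format trimmed args)))

;; The editor may indent the continuation lines freely.
(println (format-lines "My %s
                        multiline print
                        is great!" "delicious"))
;; prints:
;; My delicious
;; multiline print
;; is great!
```

The trade-off is that all leading whitespace is removed, so deliberate indentation inside the message is lost as well.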