Hi, I've just been porting a Python script to Clojure, and I need to package it up into a Docker image to run as an AWS ECS task. Are there any guidelines for building Docker images for Clojure projects? This is the Dockerfile I came up with, but I'd welcome any pointers if there's a better way:
FROM AS builder
WORKDIR /build
COPY build.clj deps.edn /build/
COPY src /build/src
RUN clojure -T:build ci
FROM
RUN apk add git git-lfs aws-cli
WORKDIR /srv/bitbucket-backups
COPY --from=builder /build/target/com.metail/bitbucket-backups.jar .
RUN adduser -D default
USER default
CMD ["/opt/java/openjdk/bin/java", "-jar", "bitbucket-backups.jar"]
Just a note to confirm that the Cognitect AWS API is working just fine for my ECS task. I added debugging to the code to check the environment and cross-referenced that with the credentials impl in the Cognitect API, everything looked like it should work, so I added an explicit HTTP GET request for the credentials to debug the problem but found...no problem! I've no idea what caused the error I saw yesterday: put it down to a wetware issue.
FWIW it looks reasonable to me. There's not much to it.
Thanks, that's encouraging!
I don't know how ECS tasks work and what their requirements are, but my Clojure apps that use Docker run directly from sources inside a container, so I don't even have build.clj. :) No AOT at all.
looks good, I've been shipping uberjars in Docker to ECS more or less the same way
@lukaszkorecki have you used the Cognitect AWS API in ECS? My ECS task is failing to retrieve credentials and I'm wondering if ECS credentials are supported or if I did something wrong.
it was a few years ago, so that's important to keep in mind - we were using a mix of Cognitect libs for APIs, and the AWS SDK for interacting with S3 and a couple of other services that needed better IO perf
Hmm, looking at the source code it appears to be supported so maybe an error at my end.
I had to write a custom authorizer to use ECS (and AWS SSO locally) creds
let me dig it out - it probably doesn't work anymore, but might be salvageable - I don't work with AWS anymore
https://gist.github.com/lukaszkorecki/120008f7832e23702e94f4205b8e3df5 here you go
Thanks. I'm comparing your code with the Cognitect library (which now has support for ECS credentials) and it looks similar, but there might be some subtlety in the env vars. I'll add some debugging to my code to make sure the expected environment variables are set.
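For anyone following along, a minimal sketch of that kind of environment check. The two variable names are the ones AWS documents for container credentials; whether aws-api consults exactly these is an assumption to verify against its source. The namespace and function names are made up for illustration, and the check reports only whether each variable is set, so no secret values end up in logs.

```clojure
(ns ecs-credentials-debug)

;; Environment variables AWS documents for container (ECS) credentials.
;; Assumption: these are the ones relevant to the failing lookup.
(def ecs-credential-vars
  ["AWS_CONTAINER_CREDENTIALS_RELATIVE_URI"
   "AWS_CONTAINER_CREDENTIALS_FULL_URI"])

(defn ecs-credential-env
  "Returns a map of variable name -> true/false, indicating whether each
  variable is present. Values are deliberately not returned."
  ([] (ecs-credential-env (into {} (System/getenv))))
  ([env]
   (into {} (map (fn [k] [k (contains? env k)])) ecs-credential-vars)))

(comment
  ;; At task startup, log which variables the container actually sees.
  (println (ecs-credential-env)))
```

If both come back false inside the task, the problem is more likely the task definition (no task role attached) than the client library.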
Make sure you're using a recent version of aws-api, I seem to remember bumping into this ECS/container problem and it ended up w/ a patch to aws-api.
I just looked at the changelog and can't seem to find the specific issue, but AWS_EC2_METADATA_SERVICE_ENDPOINT being related to IMDSv2 seems to ring a bell
On that note, I guess you'd want to do a RUN apk upgrade too, the eclipse-temurin images aren't always up to date and seem a bit slow to patch vulnerable libraries as you may have seen https://hub.docker.com/layers/library/eclipse-temurin/26-jre-alpine/images/sha256-f6227038f5b89d45a98ebf69c964d689a123baa06dc74247e77b8dfeefabcf19
might not matter here since based on dockerfile it looks like an internal thing, but for prod apps def agree - best if you roll your own internal base images that can be updated without waiting for public ones
Yes, this is internal only, but I will address the security issues. First I need to get it working in ECS.
just wrote two of these conflict handlers. One for vertica shading this in, another for hive bringing in some jakarta activation classes.
(def ^:private gson-conflict-handler
  "vertica-jdbc (and potentially other fat JARs) bundle their own copy of gson classes.
  When these overwrite the correct version from com.google.code.gson/gson, BigQuery's
  error-handling path crashes with NoSuchMethodError on JsonWriter.value(float),
  introduced in gson 2.9.0. This handler ensures the pinned gson version always wins
  regardless of JAR processing order. See #73736."
  {"com/google/gson/.*"
   (fn [{:keys [lib path in]}]
     (if (= lib 'com.google.code.gson/gson)
       {:write {path {:stream in}}}
       nil))})
Found it pretty worthwhile to add a test that builds an uberjar into a temp directory and captures any class conflicts
(let [conflicts (atom [])]
  (clean!)
  (b/uber {:class-dir class-dir
           :uber-file uberjar-filename
           :conflict-handlers (merge conflict-handlers
                                     {:default (fn [{:keys [lib path]}]
                                                 (when (str/ends-with? path ".class")
                                                   (swap! conflicts conj {:path path :lib lib}))
                                                 nil)})
           :basis basis
           :exclude dependency-ignore-patterns})
  @conflicts)
really helpful to see what kind of nasty jars you might have
Nice. Super useful in practice for ruling out potential root causes.
Reminds me of the old days when I was verifying the state of osgi containers. Fun times.
Yeah it's surprising what some drivers can bring in. And it's easy to check once. And then four years later you're not so sure. I added a test that asserts no more class conflicts than the ones we currently have. And then we can start reducing it over time
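A rough sketch of what such a "no new conflicts" test could look like. This is not the actual test from the project: `known-conflicts`, `collect-conflicts`, and the sample library name are hypothetical, and `collect-conflicts` is stubbed with sample data where a real project would run the `b/uber` call shown earlier and return the collected conflicts.

```clojure
(ns uberjar-conflicts-test
  (:require [clojure.set :as set]
            [clojure.test :refer [deftest is]]))

;; Hypothetical baseline: class paths we already know conflict today.
(def known-conflicts
  #{"com/google/gson/JsonWriter.class"})

(defn collect-conflicts
  "Placeholder. A real version would build the uberjar into a temp
  directory and return the recorded {:path .. :lib ..} maps.
  Here it returns sample data so the sketch runs on its own."
  []
  [{:path "com/google/gson/JsonWriter.class" :lib 'example/fat-jar}])

(defn new-conflicts
  "Returns the set of conflicting paths that are not in the baseline."
  [baseline conflicts]
  (set/difference (into #{} (map :path) conflicts) baseline))

(deftest no-new-class-conflicts
  ;; Fails as soon as a dependency bump introduces an unseen conflict.
  (is (empty? (new-conflicts known-conflicts (collect-conflicts)))))
```

Comparing against a fixed baseline set means the test stays green for existing conflicts, and the baseline can be shrunk as each one gets a proper handler.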
Run time is such a target rich environment for stuff like this, not sure why mainstream is so obsessed with compile time. It certainly has its place but you'd never be able to write something like this.
for instance, did you know databricks jdbc shipped arrow classfiles unshaded? https://github.com/databricks/databricks-jdbc/tree/main/src/main/java
Actually, yes, I'm pretty sure I've hit that before when trying to use that same library.
At least, trying to use Databricks as a read source was scrapped early in favor of just routing a Kafka stream my way so I can maintain my own materialized view.
I don't recall the specifics but there was a lot more to it than just janky jar files.
You won't need to defend your conflict detector from me. I've felt the pain myself. It's justified, especially if you're using a ton of Java native libs.
Honestly though this is one of those things that I'd adapt to my liking and make it part of my standard tool kit.
Same. Finding overlapping prefixes in classpath jars is super interesting
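One way to poke at overlaps outside of an uberjar build, as a sketch only: scan a list of jar files with `java.util.jar.JarFile` and report every `.class` entry that appears in more than one of them. The namespace and function names here are made up, and the jar paths are whatever you pass in (for example, the entries of your classpath that end in `.jar`).

```clojure
(ns classpath-overlaps
  (:require [clojure.string :as str])
  (:import (java.util.jar JarFile)))

(defn class-entries
  "Returns the names of all .class entries in the jar at jar-path."
  [^String jar-path]
  (with-open [jar (JarFile. jar-path)]
    ;; into [] realizes everything before the jar is closed.
    (into []
          (comp (map (fn [entry] (.getName entry)))
                (filter (fn [n] (str/ends-with? n ".class"))))
          (enumeration-seq (.entries jar)))))

(defn duplicate-classes
  "Returns a map of class entry name -> vector of jars containing it,
  limited to entries found in more than one jar."
  [jar-paths]
  (->> (for [jar jar-paths
             entry (class-entries jar)]
         [entry jar])
       (group-by first)
       (keep (fn [[entry pairs]]
               (when (> (count pairs) 1)
                 [entry (mapv second pairs)])))
       (into {})))
```

This only looks at exact duplicate entries; grouping by package prefix instead of full entry name would be a small change to the `group-by` key.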
If you published it as a configurable docker image, I'm using it.
It gets even more interesting when there's more than one classloader involved 😆