#docker
2023-03-15
orestis14:03:04

I re-started my Docker excursion; this time Docker Desktop works just fine on both macOS ARM and AWS amd64/x86. I managed to get a proof of concept running in AWS. However, I noticed that the Clojure image (even the bullseye-slim variant) is quite big. The alpine images are not available for both platforms, I guess it's an upstream issue? Should I be worried about Docker image sizes in practice?

orestis14:03:52

Presumably when pushing new Docker images from the CI, only the changed layers will be pushed, so a base debian / JVM will not be moved across the network all the time.

lispyclouds14:03:21

can i ask what would be the need to deploy using the clojure image itself? are you using the clojure tools in the app too? normally the way to do this would be to make the uberjar and deploy using a dedicated jdk based image. that makes it quite small and purpose built

lispyclouds14:03:24

the jdk image just has the jvm in it and can have an entrypoint like java -jar app.jar

orestis14:03:42

Oh I’m not there yet. We’re currently building an uberjar for faster startup so presumably we will be doing this inside Docker as well.

lispyclouds14:03:18

right so you can go for a multistage build maybe? first stage is from the big clojure image building the uberjar, second one is the jdk one?
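
A minimal sketch of the multistage build described above, not a definitive setup. The image tags, the `build.clj` file, the `:build` alias with an `uber` task, and the `target/app.jar` path are all assumptions for illustration; adjust them to your project.

```dockerfile
# Stage 1: build the uberjar using the (large) Clojure CLI image.
# Tag is an assumption; pick whichever clojure tag matches your JDK.
FROM clojure:temurin-17-tools-deps-bullseye-slim AS builder
WORKDIR /build

# Copy dependency config first so the dependency layer stays cached
# when only source code changes.
COPY deps.edn build.clj ./
RUN clojure -P

COPY src ./src
# Hypothetical tools.build task that is assumed to write target/app.jar
RUN clojure -T:build uber

# Stage 2: runtime image with just a JDK and the uberjar.
FROM amazoncorretto:17
WORKDIR /app
COPY --from=builder /build/target/app.jar app.jar
ENTRYPOINT ["java", "-jar", "app.jar"]
```

Only the second stage ends up in the deployed image, so the size of the Clojure image affects build time but not what is shipped.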

orestis14:03:30

Yes, that’s the way Practicalli recommends.

lispyclouds14:03:22

yeah essentially the clojure images being big shouldn't be much of an issue as that generally isn't what's deployed

orestis14:03:28

I also see that Amazon provides a cross-platform JDK build based on Amazon Linux 2 and Corretto, which is what we use today and would also make auditors happy

practicalli-johnny15:03:54

Amazon Linux 2023 should be released later this year, although it should be an easy switch from Amazon Linux 2: https://docs.aws.amazon.com/linux/al2023/ug/compare-with-al2.html
I was quite amazed how good the Docker layer cache is; most of my builds were amazingly fast on the CI as it skipped layers that didn't need rebuilding.
Let me know if anything is not clear (or could be improved) in the article: https://practical.li/blog/posts/build-and-run-clojure-with-multistage-dockerfile/

orestis16:03:46

I actually googled “Clojure Docker practicalli” :)

orestis16:03:18

I would add a hello world example running Clojure from source as that’s the first step to validate the whole pipeline

orestis16:03:09

Actually, the alpine recommendation doesn’t work with Apple silicon - it’s not available for that architecture.

orestis19:03:31

The bulk of the image is actually the JDK:
• 200MB for Amazon Linux 2
• 315MB for the Amazon Corretto JDK
• 5MB for a "hello world" http-kit based uberjar

practicalli-johnny20:03:26

If Amazon Corretto has a JRE then that should save a couple of hundred MB. I assume the question is which is more important: keeping auditors happy or saving some megabytes 🙂
Once a Docker image is built, hopefully that image is promoted through the environments (e.g. dev, test, stage, prod) rather than being rebuilt, typically using a central registry like Amazon Elastic Container Registry (ECR).
If not, and the uberjar is added to a repository, that uberjar could be promoted through the environments instead.
Either way, this limits builds to development changes (changes to the code or to how that code is built).

lispyclouds20:03:25

if reducing size is of utmost importance while keeping the compliance, id recommend looking into jlink: https://www.baeldung.com/jlink thats the recommended way to shrink the jdk footprint. can start with corretto here. This is the approach used in https://github.com/bob-cd/bob/blob/main/apiserver/Containerfile#L9 to get final images < 100M.

lispyclouds20:03:28

But for these things I'd strongly recommend finding the "this is good enough" line before spending too much time. They are quite the rabbit holes without much return in value.

practicalli-johnny20:03:59

Is the java.desktop module needed? I would assume not for a backend service, unless it still has something essential in there. Although I guess I should google what each Java module actually provides before I start shaving the yak

lispyclouds20:03:51

i quite forget what it was for but something was looking for it, but also this was a while back, maybe i should do some spring cleaning

lispyclouds20:03:28

its quite hard to impossible to figure out what a clojure uberjar needs. hence the rabbit hole too

lispyclouds20:03:09

that list was arrived upon mostly by trial and error

practicalli-johnny20:03:55

The Eclipse temurin docker instructions suggest the following to create a jre

RUN $JAVA_HOME/bin/jlink \
         --add-modules java.base \
         --strip-debug \
         --no-man-pages \
         --no-header-files \
         --compress=2 \
         --output /javaruntime
Which I assume uses these java.base packages: https://docs.oracle.com/en/java/javase/17/docs/api/java.base/module-summary.html
This should create a JRE of around 100MB

practicalli-johnny20:03:37

Then as much time as is useful can be spent cherry-picking the modules that are needed, or removing ones that are probably not needed.

lispyclouds20:03:39

Yeah real world clojure apps need quite a lot more than the base

practicalli-johnny20:03:35

if that's true then there seems little value in exploring a custom JRE; simply use the JDK and get on with more important things.

lispyclouds20:03:11

Yep, hence my rabbit hole comment

lispyclouds20:03:17

It is useful for languages making use of the JDK module system and publishing that info for statically figuring out what’s needed. Clojure doesn’t and it’s quite hard and often not worth it
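
For reference, a rough sketch of how the jlink step from the Eclipse Temurin instructions could be combined with a second stage that carries only the trimmed runtime. The module list beyond `java.base` is purely illustrative (an assumption, not a verified set for any Clojure app); as noted above, the real list usually has to be found by trial and error. The base images and the `target/app.jar` path are also assumptions.

```dockerfile
# Stage 1: use a full JDK image to assemble a trimmed runtime with jlink.
FROM eclipse-temurin:17 AS jre-build
# Module list is a guess for illustration; extend it whenever the app
# fails at runtime because a class from another module is missing.
RUN $JAVA_HOME/bin/jlink \
    --add-modules java.base,java.sql,java.naming,java.management,jdk.unsupported \
    --strip-debug \
    --no-man-pages \
    --no-header-files \
    --compress=2 \
    --output /javaruntime

# Stage 2: small base image plus the custom runtime and the uberjar.
FROM debian:bullseye-slim
ENV JAVA_HOME=/opt/java/openjdk
ENV PATH="${JAVA_HOME}/bin:${PATH}"
COPY --from=jre-build /javaruntime $JAVA_HOME
# Assumes the uberjar was already built to target/app.jar
COPY target/app.jar /app/app.jar
ENTRYPOINT ["java", "-jar", "/app/app.jar"]
```

Whether the saved megabytes justify maintaining the module list is the trade-off discussed above.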

orestis07:03:17

Yeah I think I'm going to give up. It seems in practice the layer caching does the right thing and it's even deduplicated so you don't pay for storage either.

orestis14:03:03

All the tutorials and quick starts always mention the :latest tag - surely that’s not how people version their images? My hunch would be to tag an image with the git-sha of its repo. Are there other places to put metadata into?

practicalli-johnny16:03:26

Metadata can be added to the Dockerfile using LABEL, which I add just after the FROM of the openjdk image in the image that will be deployed.

LABEL org.opencontainers.image.authors=""
LABEL io.github.practicalli.service="Gameboard API Service"
LABEL io.github.practicalli.team="Practicalli Engineering Team"
LABEL version="1.0"
LABEL description="Gameboard API service"
I think version here is the application version though, rather than the image version.
There are some standard label names it seems, but I assume anything meaningful could be added. It depends how you wish to use the version information.

practicalli-johnny16:03:13

I assume the CI workflow would add a relevant tag when pushing an image to a registry, e.g. Amazon Elastic Container Registry (ECR).
I assume it would add versioning using docker tag: https://docs.docker.com/engine/reference/commandline/tag/

orestis17:03:40

Yes it seems tags are the way to do it, even though they have no semantic meaning.
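
One possible way to carry the git SHA inside the image as well as in the tag, sketched under assumptions: the build argument name `GIT_SHA` and the image name `myapp` are hypothetical, and `org.opencontainers.image.revision` is one of the standard OCI label names for a source control revision.

```dockerfile
FROM amazoncorretto:17

# GIT_SHA is a hypothetical build argument supplied by CI, for example:
#   docker build \
#     --build-arg GIT_SHA="$(git rev-parse HEAD)" \
#     -t myapp:"$(git rev-parse --short HEAD)" .
ARG GIT_SHA=unknown

# Record the revision as image metadata, alongside the tag.
LABEL org.opencontainers.image.revision="${GIT_SHA}"
```

The label travels with the image even if it is later re-tagged, and can be read back with `docker inspect`.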