This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2021-06-17
Please upvote or join this discussion: https://github.com/oracle/graal/discussions/3476
Can anyone explain some of the finer details here?
I know java static initializers are code blocks that execute at class initialization (when the class is first used, not once per instance); so I’m assuming for native image that --initialize-at-build-time just means those blocks are executed when the native image is compiled (hence for example any side effects etc there will happen at compile time not runtime).
What I don’t fully understand is how the clojure compiler uses static initializers, and exactly what the implications of this are for clojure.
I’m assuming it’s because clojure’s a lisp, and even aot’d clojure code still has to apply effects at runtime. e.g. ns initialisation presumably runs as a static initializer etc.
so presumably that’s why this impacts all clojure code
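To make the static initializer behaviour concrete, here is a minimal plain-Java sketch. It is not Clojure compiler output, and the names StaticInitDemo and Holder are made up for illustration. It suggests that a static block runs once, when the class is first initialized, no matter how many instances are created afterwards:

```java
public class StaticInitDemo {
    // Counts how many times Holder's static initializer has run.
    static int initRuns = 0;

    static class Holder {
        static {
            // Executed once by the JVM, on first initialization of Holder.
            initRuns++;
        }

        // Calling a static method is one of the triggers for class initialization.
        static void touch() {
        }
    }

    public static int demo() {
        Holder.touch();   // first call initializes Holder; later calls do not
        new Holder();     // creating instances does not re-run the static block
        new Holder();
        return initRuns;
    }

    public static void main(String[] args) {
        System.out.println("initializer runs: " + demo()); // prints "initializer runs: 1"
    }
}
```

On an ordinary JVM that one-time run happens at runtime. With --initialize-at-build-time, as discussed here, the assumption is that it happens while the image is being built instead.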
I've tried several times to build without this option. It's educational, but I can't really put words to this to explain it in detail, but due to how Clojure is set up / compiles things, it's needed.
I would say: try without the option and perhaps come up with a better explanation, I think that would be very useful.
so build a clojure hello world, with and without this option? And inspect the clojure generated class files with javap to see if we can explain it?
Clojure emits these kinds of things for functions:
static {
    const__0 = RT.var("clojure.core", "println");
}
yeah I’ve seen those before… presumably static linking changes those too?
Try to build a graalvm hello world program without the option and see how far you get
I've done such a process here: https://github.com/oracle/graal/issues/3251#issuecomment-842305171
If you don't initialize at build time then you will get:
Caused by: java.io.FileNotFoundException: Could not locate clojure/core__init.class, clojure/core.clj or clojure/core.cljc on classpath.
at clojure.lang.RT.load(RT.java:462)
which might be coming from here: https://github.com/clojure/clojure/blob/b1b88dd25373a86e41310a525a21b497799dbbf2/src/jvm/clojure/lang/RT.java#L338
interesting
Firstly it’s unsurprising that the line in RT is in a static initializer 🙂
but that uses the dynamicclassloader, etc, which won't work in a native-image anymore anyway
yeah, was going to say something similar, though hopefully you can clear it up for me if I’m inaccurate/wrong… My reasoning was essentially:
1. clojure is a single pass compiler… i.e. essentially your whole program is a flattened require tree / repl session… all deps “essentially concatenated”.
2. therefore if we’re loading clojure/core there, at some point after that we’ll be loading all of your app’s dependencies in a similar initializer block.
3. hence we can quit here
Though I guess the dynamic class loader essentially just implements what I said
i.e. resolving clj / class files, and compiling clj into .class etc… essentially controlling “Read (compile) Eval”
yeah. another way to put it: you can't "dynamically load classes" at runtime in GraalVM native-image, but Clojure does this in static initializer blocks, hence these must be initialized at build time.
Yeah that’s a good way to put it
Perhaps this could be resolved if you make a Java class which does all the loading in a static initializer block and you only initialize that one at build time, and the rest of your classes could be initialized at runtime, but this would probably require changes to Clojure itself
But interestingly these changes can be accomplished using substitutions as well perhaps.
Here is an example: https://github.com/borkdude/clj-reflector-graal-java11-fix#the-solution
I was wondering why the clojure compiler needs to use the dynamic class loader for AOT’d code? Presumably it could (at least in a graal compilation context) avoid that? Or is that essentially what you’re describing?
in an AOT-ed (native-image) setting you know which namespaces you want, so you could just write that out explicitly
Backing up for a second to the graal issue: Re thomaswue’s point:
> Specifying the option for all classes in a specific jar file seems quite reasonable. Would it be OK for this to only work in such broad manner if an uber jar is created first or is that too limiting?
Why do we need to bundle into an uberjar? Could we not also just give them a classpath?
yes, you can already do this, but the original problem in that topic is that they want to get rid of the option without explicitly specifying the classes for which you want build time initialization
and here he offers some kind of compromise to be able to say for which .jar you want it. so if you provide an uberjar you will get all the classes again
Yeah… I’m just trying to understand the toing and froing of conversation.
So to summarise the thread / they don’t want people to mark every class for build-time-initialization; because for some possible classes it’ll screw things up.
For most idiomatic clojure code we need build-time-initialization. Though for some clojure code that also won’t work (e.g. an ns with (def data (fetch-data-from-postgres ,,,)) will need to either be rewritten or opt in to runtime initialisation).
If we default a classpath into build-time-initialization we may build invalid “build time” state into the runtime (adinn’s point) for java deps etc.
Is that the general gist of it?
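The concern about build-time state leaking into the runtime can be illustrated with a plain-Java analogy. This is only a sketch on an ordinary JVM; BuildTimeStateDemo and Config are hypothetical names, and System.nanoTime() stands in for something like the fetch-data-from-postgres call mentioned above. A value computed in a static initializer is captured once and then reused, so if that class were initialized during the image build, the captured value would presumably be the one from the build machine:

```java
public class BuildTimeStateDemo {
    static class Config {
        // Computed once, when Config is initialized, then frozen.
        static final long CAPTURED_AT = System.nanoTime();

        static long capturedAt() {
            return CAPTURED_AT;
        }
    }

    // Contrast: evaluated on every call, so it always reflects "now".
    static long readAtCallTime() {
        return System.nanoTime();
    }

    // True when two reads of the eagerly captured value agree.
    public static boolean capturedValueIsStable() {
        long first = Config.capturedAt();
        long second = Config.capturedAt();
        return first == second;
    }

    public static void main(String[] args) {
        System.out.println("captured value stable: " + capturedValueIsStable());
        System.out.println("call-time value: " + readAtCallTime());
    }
}
```

Moving the computation out of the initializer and into a method, as readAtCallTime does, is one way to read the "rewrite it" option from the summary above.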
correct. but imo taking away this option will make it harder on Clojure developers since for most CLIs I built this stuff worked fine (or I was able to work around it). Occasionally a library like httpkit would give problems: https://github.com/http-kit/http-kit#native-image
But perhaps listing all clojure-related namespaces through some script is possible, I was just trying to make sure Clojure projects would still be able to run
Yeah I agree with that. The default for .clj(c) files should be build time, because of the nature of clojure
@U04V15CAJ: Yeah I was literally typing: presumably we could use something like mranderson to move all clj code under a new top level package/ns, and then flag that to default as build time
how do you avoid mixing java library / classes into the uberjar though?
@U06HHF230 well you don't have to, you could of course just make a jar with your project code + clojure libs and put the Java code into another jar
(I agree the mranderson thing would be a hack)
but personally I would just go with all build at runtime for everything and figure out the exceptions
the tools I build are usually CLI tools and not huge micronaut web server things which I think the issue is more concerned about
Perhaps we can figure out a good pattern to build only clojure classes at build-time
> but personally I would just go with all build at runtime for everything and figure out the exceptions
well to be fair that is how any approach in clj will eventually end up working; the main difference would be starting from a point where you didn’t pick the wrong default for java libs.
@U04V15CAJ: Yeah I was going to say the issue is that there’s no tooling that knows what a clojure lib is vs a java lib.
We’d need something that knew how to biject clj files into their class files…. essentially mapping munge over the .clj(c) classpath.
indeed
it would be nice to avoid having to have another step
btw, I'm trying these flags:
"--initialize-at-build-time=clojure."
"--initialize-at-build-time=clojure.core.server"
but I'm still getting errors about clojure.core.server with graal 22?
yeah that’s essentially equivalent to listing all of the top level namespaces you use there.
it’s good to prove what they’re suggesting will work for us… it’s just a shame it’s more clunky.
@U06HHF230 yeah, so this works with httpkit (2.5.3):
"--initialize-at-build-time=clojure,refl,org.httpkit"
"--initialize-at-run-time=org.httpkit.client"
so you have to use the package name refl to get all the related classes refl.main__init, etc.
so perhaps a "simple" all-ns with some munging/post-processing could be all that's needed
@U06HHF230 Something like this:
user=> (->> (map ns-name (all-ns)) (remove #(str/starts-with? (str %) "clojure")) (map #(str/split (str %) #"\.")) (keep butlast) (map #(str/join "." %)) distinct (map munge) (cons "clojure"))
("clojure" "refl" "org.httpkit")
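The same post-processing can be sketched in plain Java, for anyone who wants to see the steps outside a REPL. This is an illustrative approximation, not the script used for the builds: InitPrefixes and flagFor are made-up names, the input namespaces are examples, and munging is simplified to replacing "-" with "_" (Clojure's real munge handles more characters).

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class InitPrefixes {
    // Turns namespace names into a single --initialize-at-build-time flag
    // listing the distinct package prefixes, with "clojure" always first.
    public static String flagFor(List<String> namespaces) {
        Set<String> prefixes = new LinkedHashSet<>();
        prefixes.add("clojure");
        for (String ns : namespaces) {
            if (ns.startsWith("clojure")) {
                continue; // already covered by the "clojure" prefix
            }
            int lastDot = ns.lastIndexOf('.');
            if (lastDot < 0) {
                continue; // single-segment namespace: no package part to keep
            }
            // Drop the last segment, then apply the simplified munge.
            prefixes.add(ns.substring(0, lastDot).replace('-', '_'));
        }
        return "--initialize-at-build-time=" + String.join(",", prefixes);
    }

    public static void main(String[] args) {
        List<String> namespaces = List.of(
                "clojure.core", "refl.main", "org.httpkit.server", "org.httpkit.client", "user");
        // prints "--initialize-at-build-time=clojure,refl,org.httpkit"
        System.out.println(flagFor(namespaces));
    }
}
```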
for babashka:
("clojure" "sci.impl" "selmer" "babashka.nrepl" "babashka.impl.clojure.java" "babashka.impl" "rewrite_clj.node" "bencode" "rewrite_clj.parser" "babashka.impl.clojure" "org.httpkit" "rewrite_clj.custom_zipper" "rewrite_clj.zip" "borkdude.graal" "babashka.nrepl.impl" "babashka.pods" "cognitect" "babashka" "edamame.impl" "cheshire" "rewrite_clj" "hiccup" "sci" "borkdude" "flatland.ordered" "babashka.pods.impl" "clj_yaml" "babashka.impl.clojure.core" "datascript" "hf.depstar" "babashka.impl.tools" "sci.addons" "babashka.impl.clojure.test")
https://github.com/babashka/babashka/commit/207e22a6fa04184e609f2aa5af73a382efddc19a
ok, that leads to:
Exception raised in scope ForkJoinPool-2-worker-25.ClosedWorldAnalysis.AnalysisGraphBuilderPhase: org.graalvm.compiler.java.BytecodeParser$BytecodeParserError: com.oracle.graal.pointsto.constraints.UnsupportedFeatureException: No instances of com.fasterxml.jackson.core.io.SerializedString are allowed in the image heap as this class should be initialized at image runtime. To see how this object got instantiated use --trace-object-instantiation=com.fasterxml.jackson.core.io.SerializedString.
kind of demonstrating that it would be painful to have to do this exercise for every graalvm project
"--initialize-at-build-time=clojure,sci.impl,selmer,babashka.nrepl,babashka.impl.clojure.java,babashka.impl,rewrite_clj.node,bencode,rewrite_clj.parser,babashka.impl.clojure,org.httpkit,rewrite_clj.custom_zipper,rewrite_clj.zip,borkdude.graal,babashka.nrepl.impl,babashka.pods,cognitect,babashka,edamame.impl,cheshire,rewrite_clj,hiccup,sci,borkdude,flatland.ordered,babashka.pods.impl,clj_yaml,babashka.impl.clojure.core,datascript,hf.depstar,babashka.impl.tools,sci.addons,babashka.impl.clojure.test"
"--initialize-at-build-time=com.fasterxml.jackson"
Sorry was afk for lunch 🙂
> unfortunately there the clojure and java package overlaps
What do you mean? Clojure and java code inhabiting the same package/ns? Meaning the java classes are defaulted into build time init?
I’m guessing for babashka you just ran that at a repl and pasted the output into the shell script; but would plan to automate it at some point (or convince the graal folk to do something different)
@U06HHF230 are you on linux btw?
macos
ok. in #babashka-circleci-builds there are new binaries compiled on the init-at-build-time branch. I wonder if this would impact startup time
> if this would impact startup time
In which direction were you thinking?
Shouldn’t we be expecting for essentially the same coverage? i.e. all clojure code (except the few exceptions) to be initialised at build time?
yeah assuming both builds behave the same wrt correctness, I’d expect there not to be a significant difference in startup time… If there were it’d probably mean we weren’t covering everything we needed to.
Do you think any of this changes how the graal thread has been left?
> Specifying the option for all classes in a specific jar file seems quite reasonable. Would it be OK for this to only work in such broad manner if an uber jar is created first or is that too limiting?
ah thanks — just refreshed
What are the use cases for the uberjar case thomaswue is pushing for? I’m not even sure for clojure it’s sufficient
I usually tend to compile and collect all the code into an uberjar first and then feed that to graalvm
you don't have to do this, but I find this easier, since you just know what code you're dealing with after the uberjar step
also I distribute the uberjars so people who want to make nixos derivations etc can use them
I could also say in case of an issue to a graalvm dev: here you have the uberjar, I do this to compile it, but it doesn't work
Yeah I get that it’s useful for your other requirements (you want uberjars anyway etc). But an uberjar is just a reified/flattened classpath… so why can’t they just take a classpath?
I should probably ask them 🙂
Just want to check that I’m not arguing against what you want 🙂
but if you have a fat jar, you're not a library owner saying this, you are the end user
yeah ok
that makes sense
(actually I was meaning to ask you about this for another reason… I’ll start another thread on the channel for it though as it’s a change of topic)
I think this also relates to clojure startup time. Perhaps we attempt a clojure-side compilation flag that solves (or makes progress towards) both issues at AOT time?
meaning when this flag is in effect the clojure compiler generates different byte code and this byte code both starts up faster and works with graal native without needing --initialize-at-build-time.
@UDRJMEFSN do you have any concrete ideas of what can be done differently?
I will look a lot more closely; I just know those are two related things and my profilers always show var initialization as one of the startup issues so somehow compiling that data down into something perhaps more concrete that loads faster is an interesting issue that seems related.
delaying var initialization to build time will make native images slower to start up right?
I don't want to delay anything, I want AOT to produce data as a side effect that can be quickly loaded to initialize vars during runtime initialization.
ok, but now these are already initialized in the image heap, so that work has already been done when starting the image
Yes, I agree and that is not what I am suggesting. const__0 being initialized to a static class instance in your example above would make things faster as it would bypass the RT.var mechanism.
it could directly reference the AOT-ed class which represents the println var right?
Yes, in this case. You also have the case where something is initialized via a complex function that produces a persistent datastructure and in this case the data can be saved in resources and found via a hashtable lookup or straight array lookup in constant time eliding the generating code. I haven't looked at this in huge depth but for instance I was extremely careful with dtype-next and it still takes some time even after an AOT run to pull in, for instance, the ND system via require. This is a solvable problem.
My thought is more of the form move --initialize-at-build-time into the clojure compiler and allow anything that it did in the graal vm system to be done during the AOT step. Then --init-at-build-time should be a noop if done during graal vm compilation.
As clojure.core is AOT-ed by default anyway, I guess Compiler could be instrumented in such a way that it can reference these classes directly when generating more code. For core vars only it would already be a win
This is complicated by the fact that bytecode files aren't general data storage mechanisms (at least as far as I know) which means you need some level of sidecar file generated at build time for pure data.
It would be nasty and error prone. Definitely a YMMV pathway but with time it could work well.