#graalvm
2020-03-02
Aaron Cummings13:03:08

I have a command line program I have compiled with native-image on Windows 10 (x64). I have found that on a fresh Windows install, it won't run because of missing 'vcruntime140.dll'. If I redistribute this program, is it sufficient (and legal?) to include this dll with my distribution?

borkdude13:03:38

@aaron383 Could it be related to the Visual C/C++ Redistributable 2010? Which is pretty much always required if you run a dynamically linked executable compiled with GraalVM on Windows I think

borkdude13:03:19

A lot of times this is already installed because a lot of other programs need it too. The scoop maintained by @ales.najmann suggests this: https://github.com/littleli/scoop-clojure/blob/a0b245db60d824177ea60cb9616239bb875baf50/clj-kondo.json#L13

littleli13:03:16

yes, it's quite a common issue

borkdude13:03:56

When in doubt, it's best to ask in the native-image channel of the GraalVM slack (https://app.slack.com/client/TN37RDLPK/CN9KSFB40) and if you have any news, contribute it to https://github.com/lread/clj-graal-docs

Aaron Cummings13:03:06

My users are corporate folks who don't necessarily have admin access to their laptops, so to be complete I'd need to include any dependencies. The Win10 image on my work machine does include this dll, so things "just work", but I'm concerned that this won't generally be the case.

Aaron Cummings13:03:44

I'll try the Slack you suggested; thanks for the pointer.

littleli14:03:47

scoop in general uses the user's profile, where admin rights are not necessary. But the vcredist20XX packages usually come with installers, so it's hard to tell.

borkdude16:03:26

@ghadi You were asking about the JDK11 client recently. I've got a fork of hato, a clj-http-inspired lib that is based on JDK11's client working here: https://github.com/borkdude/hato-native The only thing I had to do is remove some conditional requires/resolves.

borkdude16:03:14

Compilation time of babashka went up from 2 minutes to 8 minutes though when I included it, so I'll likely not include it

ghadi17:03:31

yeah... I'm not 100% a fan of the approach in hato

ghadi17:03:41

compilation time 📈 because it pulls in way too many dependencies

[cheshire.core]
   [cognitect.transit]

borkdude17:03:21

cheshire.core and cognitect.transit are not problematic in jet, this still compiles within a minute on my macbook

borkdude17:03:18

same on circleci, about 1 minute: https://circleci.com/gh/borkdude/jet/746 (the Build binary step)

littleli18:03:21

I suspect there is some megamorphic call site in the hato project where native image has to go through too many branches of the code.

borkdude20:03:15

@alexmiller
> The latest Graal (20) still failed on this example, but with different errors.
Care to share what 20 gave? Which JDK did you use, 8?

alexmiller20:03:25

might have been 11 - is there a separate install for 8?

alexmiller20:03:33

ah, I see. will check there

alexmiller20:03:19

works on graal 20 java 8

alexmiller20:03:37

here's the graal 20 java 11 failure

borkdude20:03:40

@alexmiller GraalVM has problems with the reflective use of the method handle. There is an "easy" workaround for that if you can rely on JDK11 and forget about JDK8: https://github.com/lread/clj-graal-docs#jdk11-and-clojurelangreflector

borkdude20:03:20

@alexmiller I think you're logging some stuff which makes this output very large. Also probably makes compilation slower.

alexmiller20:03:22

so those are the options I'm using

borkdude20:03:55

fwiw I try to avoid --report-unsupported-elements-at-runtime and I've never used -H:+ReportExceptionStackTraces.

lread20:03:55

There are so many GraalVM options, it is hard to keep track, you seem to be using one of these for clj-kondo https://github.com/borkdude/clj-kondo/blob/63414903822d523c5bc154f7cf75035c287c5879/script/compile#L16

borkdude20:03:09

ah, I am ok.

lread20:03:10

I frankly don’t know how/when/where/why we started using --report-unsupported-elements-at-runtime for clj-graal-docs

borkdude20:03:24

maybe @lee can enlighten me why this is needed 🙂

lread20:03:53

oh hey there! 👋

lread20:03:28

I just took our compile script @borkdude

lread20:03:53

So… even though we thought excluding graal on jdk11 from reproduction steps was a good idea… maybe we should update to ensure it works.

lread20:03:18

Core team will likely verify for both jdk8 and jdk11

alexmiller20:03:41

I added the patch around the methodhandle, fails with:

alexmiller20:03:43

Error: com.oracle.svm.hosted.substitute.DeletedElementException: Unsupported method java.lang.ClassLoader.defineClass1(ClassLoader, String, byte[], int, int, ProtectionDomain, String) is reachable
To diagnose the issue, you can add the option --report-unsupported-elements-at-runtime. The unsupported element is then reported at run time when it is accessed the first time.
Detailed message:
Trace:
	at parsing java.lang.ClassLoader.defineClass(ClassLoader.java:1016)
Call path from entry point to java.lang.ClassLoader.defineClass(String, byte[], int, int, ProtectionDomain):
	at java.lang.ClassLoader.defineClass(ClassLoader.java:1014)
	at java.lang.ClassLoader.defineClass(ClassLoader.java:877)
	at clojure.lang.DynamicClassLoader.defineClass(DynamicClassLoader.java:46)
	at clojure.core$get_proxy_class.invokeStatic(core_proxy.clj:288)
	at clojure.core$get_proxy_class.doInvoke(core_proxy.clj:276)
	at clojure.lang.RestFn.applyTo(RestFn.java:137)
	at spec.main(Unknown Source)
	at com.oracle.svm.core.JavaMainWrapper.runCore(JavaMainWrapper.java:151)
	at com.oracle.svm.core.JavaMainWrapper.run(JavaMainWrapper.java:186)
	at com.oracle.svm.core.code.IsolateEnterStub.JavaMainWrapper_run_5087f5482cc9a6abc971913ece43acb471d2631b(generated:0)

lread20:03:38

I verified steps against graal 19.3.1 - iirc that’s the LTS version, right @borkdude?

alexmiller20:03:38

well, I re-ran with --report-unsupported-elements-at-runtime and it worked

alexmiller20:03:54

so that's weird

ghadi20:03:50

it's saying that get-proxy-class calls ClassLoader.defineClass, which it does

alexmiller20:03:47

if I add the flag --report-unsupported-elements-at-runtime, it reports nothing, and produces a working executable

ghadi20:03:00

that makes sense to me

ghadi20:03:27

surprised that graal jdk8 doesn't say the same

borkdude21:03:17

fwiw, here is a working jdk11 project example: https://github.com/borkdude/sci/tree/jdk11 the compile script is in script/compile. it uses a patched clojure.lang.Reflector in src-java.

lread21:03:44

So, I’m wondering under what graalvm env core team should verify that this is fixed.

alexmiller21:03:18

I don't see the CLJ-1472 problem in any env, so I consider it verified from that perspective

alexmiller21:03:55

works on graal 19 java 8, graal 20 java 8, graal 20 java 11 with --report-unsupported-elements-at-runtime, and see different problems under graal 20 java 11 w/o that flag

borkdude21:03:00

it doesn't use --report-unsupported-elements-at-runtime, maybe because I'm using some java opts: https://github.com/borkdude/sci/blob/ebb7b03230dc3a0808606482eb118613c2283e7b/project.clj#L29-L30 not sure ¯\_(ツ)_/¯

ghadi21:03:26

@alexmiller did you get that get-proxy-class error with direct-linking enabled?

alexmiller21:03:49

direct linking enabled when compiling what?

alexmiller21:03:00

clojure? spec? example code?

ghadi21:03:31

all of those

alexmiller21:03:32

the way I'm testing clojure should be direct linked, the others not

borkdude21:03:51

I'm compiling the uberjar with direct linking, just for performance

ghadi21:03:03

I'm just looking for a differential on why that error is there.

borkdude21:03:18

and the spec macro setting is to avoid the locking issue

borkdude21:03:32

without that I've had it pop up more often
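
The two settings mentioned above (direct linking for the uberjar, and skipping spec macro checks) are ordinary JVM system properties. A minimal sketch of how they might be set in a Leiningen project.clj, assuming a hypothetical project named spec-test; the linked sci project.clj may be laid out differently:

```clojure
;; Hypothetical project.clj fragment; the project name and version are assumptions.
(defproject spec-test "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.10.1"]]
  :main spec-test.core
  :profiles {:uberjar {:aot :all
                       ;; Applied while AOT-compiling the uberjar:
                       ;; direct linking for performance, and skipping spec's
                       ;; macro syntax checks to avoid the locking issue.
                       :jvm-opts ["-Dclojure.compiler.direct-linking=true"
                                  "-Dclojure.spec.skip-macros=true"]}})
```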

borkdude21:03:18

anyway, I think the focus is on CLJ-1472 and if that issue is solved with your patch, I think that is nice progress. other issues should be addressed another time probably

alexmiller21:03:39

are there other issues to address in clojure 1.11 timeframe?

borkdude21:03:08

The only issue I know of is the one with clojure.lang.Reflector on JDK11

lread21:03:15

We never did figure out pprint did we?

borkdude21:03:28

it's relatively easy to address. I wouldn't know how to fix that one properly without making separate releases for separate JDK versions, to avoid the runtime check on Java versions (this was about clojure.lang.Reflector)

lread21:03:43

oh yeah it is

borkdude21:03:46

ah right, I run with a patched version of that too, forgot about it

lread21:03:56

fix versions: 1.11

ghadi21:03:27

do we have a bug for the "Reflector thing"? needs clarity

alexmiller21:03:01

no jira that I know of, but I'm not sure that it's something we should address in clojure itself

lread21:03:51

here’s what @borkdude logged on pprint on clj-graal-docs: https://github.com/lread/clj-graal-docs/issues/19 - we closed because it was stale

lread21:03:31

perhaps at least logging pprint issue as a symptom in jira would be helpful?

alexmiller21:03:29

it's not helpful unless there's more information than that

lread21:03:36

if I find the time to diagnose why pprint does not work, I will log

borkdude21:03:08

> do we have a bug for the "Reflector thing"? needs clarity
The JDK11 support for GraalVM is relatively new, I think most people are still using JDK8, that's why it probably never showed up. If you want, I can make an issue for it.

borkdude21:03:34

@alexmiller Following the exact same repro from lread but with this code:

(ns spec-test.core
  (:gen-class)
  (:require [clojure.pprint :as pprint]))

(defn -main [& _args]
  (println *clojure-version*)
  (pprint/pprint (range 100)))
also triggers the monitor mismatch error.

borkdude21:03:28

Removing the references to pprint will produce a working binary with 1.10.1

alexmiller21:03:57

related to locking?

alexmiller21:03:33

with --report-unsupported-elements-at-runtime, works with the patch

borkdude21:03:44

@alexmiller confirmed!

$ ./spec-test
{:major 1, :minor 10, :incremental 1, :qualifier patch_38bafca9_clj_1472_5}
(0
 1
 2
 3
 4

ghadi21:03:31

cool, but I need an explanation!

ghadi21:03:38

pprint doesn't use locking AFAIK

borkdude21:03:58

@ghadi this might be triggered by the Clojure compiler invoking spec on macro syntax checking?

borkdude21:03:22

and that uses locking in dynaload

ghadi21:03:22

what macro?

ghadi21:03:12

if that were the case, any form would be affected

borkdude21:03:12

I don't know, but somehow that code is reachable by the analyzer. I don't know the details. I've been able to get around this using "-Dclojure.spec.skip-macros=true" which is why I thought that would be somehow related to it.

borkdude22:03:23

Note that the output can get quite large.

ghadi22:03:00

well -- in any case clj-1472 takes care of it

ghadi22:03:14

I don't have the ability to make that report at the moment

ghadi22:03:38

but I am idly curious still 🙂 . I don't like "spooky" / "magic"

borkdude22:03:42

btw @lee - that script... pure awesome, so easy to test out a clj-1472 patch 🙂

borkdude22:03:02

I'm able to compile clj-kondo with graalvm 20 jdk11 and 1472 patch 5. I consider clj-kondo to be a non-trivial example

borkdude22:03:45

@alexmiller The spec repro example works for me in GraalVM 20 java11:

$ ./spec-test
{:major 1, :minor 10, :incremental 1, :qualifier patch_38bafca9_clj_1472_5}
11.0.6
GraalVM 20.0.0 CE
true

borkdude22:03:13

This is the program:

(ns spec-test.core
  (:gen-class)
  (:require [clojure.spec.alpha :as s]))

(s/def ::g int?)

(defn -main [& _args]
  (println *clojure-version*)
  (println (System/getProperty "java.version"))
  (println (System/getProperty "java.vm.version"))
  (println (s/valid? ::g 1)))

alexmiller22:03:41

it works for me with --report-unsupported-elements-at-runtime, but not without it

alexmiller22:03:33

rich ok'ed clj-2502 too

alexmiller22:03:05

and we've vetted the clj-1472 patch on a much larger test bed with datomic too

borkdude22:03:34

I'm happy clj-1472 is working out, but I have no idea why it worked 🙂 I did read the explanation and tried understanding it. Why is there a try inside a try, is the outer try still doing anything?

alexmiller22:03:42

well, it helps if you're looking at the bytecode :)

borkdude22:03:07

> Why is there a try inside a try, is the outer try still doing anything?
Remember that we talked recently about there being no legit use case for a try without a catch / finally for a linter feature? 🙂 I wonder if this matters for the bytecode?

alexmiller01:03:40

so, it is important in this case because of how the body of the try gets lifted into a function, but it should not matter in 99.999% of normal cases.

alexmiller01:03:21

what was happening before is that the locking macro expanded to a try not in the tail position, and that lifts the try body into a function (around which you can do catch/finally kinds of stuff), which means the monitorenter was in the original body and the monitorexit was in the lifted function (with the lockee as a closed over field in the function object). because these are in different methods and in a field, which gets loaded and cleared due to locals clearing, the graal analyzers can't connect that the monitorexit is attached to the monitorenter and it looks unbalanced.

alexmiller01:03:10

the outer let in the new locking macro evaluates the lock object, which goes into the next try block. the inner let puts it in a local, which will not be cleared due to the outer try (as it might be needed if there is a finally), so the lock object then ends up on the stack for both enter and exit.

alexmiller01:03:33

so the outer try here influences locals clearing just to make the lockee tracking easier for the graal analyzer.

alexmiller01:03:20

needless to say, this is super subtle
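
The macro shape described above can be sketched roughly as follows. This is a minimal illustration under assumptions, not necessarily the exact CLJ-1472 patch: the name locking-sketch is hypothetical, chosen so it does not shadow clojure.core/locking.

```clojure
;; Sketch of the described structure: outer let, outer try with no
;; catch/finally, inner let binding the lock to a local, then enter/exit.
(defmacro locking-sketch
  [x & body]
  `(let [lockee# ~x]                     ; outer let evaluates the lock object once
     (try                                ; outer try: only there to influence locals clearing
       (let [locklocal# lockee#]         ; inner let keeps the lock in a local
         (monitor-enter locklocal#)
         (try
           ~@body
           (finally
             (monitor-exit locklocal#)))))))

;; Usage: behaves like locking from the caller's point of view.
(def lock (Object.))
(locking-sketch lock
  (+ 1 2))
```

The point of the nesting is that monitor-enter and monitor-exit both refer to the same local in the same method, which is what the analyzer needs in order to see them as balanced.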

bherrmann08:03:52

I recommend copy/pasting that explanation right above the try block.

borkdude08:03:34

yeah, adding a comment seems good to me. maybe the linter feature will still be good because in 99.999% of the cases forgetting a catch is probably something in the category of silly errors (and the rule can be disabled)

richhickey13:03:46

this is not a good explanation

alexmiller22:03:41

it might actually matter here for the scope of the exception table, I will look into it a little more

alexmiller22:03:13

or maybe it could actually just be a do

alexmiller22:03:43

this is like v5 of this patch and it was doing something in the prior version

borkdude22:03:42

I remember seeing an empty let (let [] ...) somewhere in core once

alexmiller01:03:00

I think this could easily also be a do - not sure if there is any good reason to use let [] there.

alexmiller01:03:06

could just be historical

borkdude23:03:39

maybe it's there for a similar reason

borkdude23:03:41

I have a repro for the clojure.lang.Reflector issue on JDK11:

(ns spec-test.core
  (:gen-class)
  (:import [clojure.lang Reflector])
  (:require [clojure.java.io :as io]))

(defn -main [& _args]
  (println *clojure-version*)
  (println (System/getProperty "java.version"))
  (println (System/getProperty "java.vm.version"))
  (println (Reflector/invokeInstanceMethod (io/file ".") "exists" (object-array []))))
reflection.json:
[{
  "name" : "java.io.File",
  "allPublicMethods" : true,
  "allPublicFields" : true,
  "allPublicConstructors" : true
}]
With 20.0.0 java8:
$ ./spec-test
{:major 1, :minor 10, :incremental 1, :qualifier patch_38bafca9_clj_1472_5}
1.8.0_242
GraalVM 20.0.0 CE
true
With 20.0.0 java11:
$ ./spec-test
{:major 1, :minor 10, :incremental 1, :qualifier patch_38bafca9_clj_1472_5}
11.0.6
GraalVM 20.0.0 CE
Exception in thread "main" com.oracle.svm.core.jdk.UnsupportedFeatureError: Invoke with MethodHandle argument could not be reduced to at most a single call or single field access. The method handle must be a compile time constant, e.g., be loaded from a `static final` field. Method that contains the method handle invocation: java.lang.invoke.Invokers$Holder.invoke_MT(Object, Object, Object, Object)
	at com.oracle.svm.core.util.VMError.unsupportedFeature(VMError.java:101)
	at clojure.lang.Reflector.canAccess(Reflector.java:49)
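
In application code, one way to stay off this reflective path, independent of any change to clojure.lang.Reflector, is to type-hint the call site so the compiler emits a direct method call. A minimal sketch, assuming a hypothetical helper named file-exists?; it does not help when a library invokes Reflector itself, as the repro above does on purpose:

```clojure
(require '[clojure.java.io :as io])

;; Makes the compiler warn wherever it would fall back to reflection.
(set! *warn-on-reflection* true)

;; The ^java.io.File hint lets the compiler resolve .exists at compile time,
;; so no reflective lookup happens at run time.
(defn file-exists?
  [^java.io.File f]
  (.exists f))

(println (file-exists? (io/file ".")))
```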

borkdude23:03:14

Let me know if you want an issue for it.