Fork me on GitHub
#graalvm
<
2019-11-09
>
borkdude09:11:28

@lee and me are collection information on GraalVM and most notably CLJ-1472 here: https://github.com/lread/clj-graal-docs

borkdude09:11:35

welcome to add / improve stuff

borkdude14:11:46

weird observation. using a manual compile I have to set --report-unsupported-elements-at-runtime while when I'm building an uberjar with lein, I don't. https://github.com/lread/clj-graal-docs/blob/master/doc/hello-world.md#todo

borkdude14:11:41

Hmm, I think the difference is in leiningen AOT-ing every single namespace that's transitively required. How do we get this behavior using tools.deps?

Alex Miller (Clojure team)14:11:22

That is the normal aot behavior

Alex Miller (Clojure team)14:11:50

Really it’s the only aot behavior

borkdude14:11:36

@alexmiller when I do: jar -tf target/hello-world-0.1.0-SNAPSHOT-standalone.jar | less, I see for example:

clojure/data$diff_associative.class
clojure/reflect$fn__12091.class
clojure/core$derive$tf__6674.class
clojure/inspector$list_model$fn__9524.class
clojure/pprint$make_pretty_writer.class
while when I do a manual compile:
clojure -e "(require '[hello-world.core]) (compile 'hello-world.core)"
I only see:
$ find classes
classes
classes/hello_world
classes/hello_world/core$loading__6721__auto____16.class
classes/hello_world/core$fn__153.class
classes/hello_world/core$fn__18.class
classes/hello_world/core$loading__6721__auto____151.class
classes/hello_world/core__init.class
classes/hello_world/core.class
classes/hello_world/core$_main.class

borkdude14:11:06

where are all the other classes?

Alex Miller (Clojure team)14:11:35

Lein uberjar deletes them I believe

borkdude14:11:49

I did lein uberjar with :aot :all

Alex Miller (Clojure team)14:11:59

Or lein compile, not sure which part it is

borkdude14:11:47

I'm not using lein in between:

[email protected] /tmp/hello-world $ $ clojure -e "(require '[hello-world.core]) (compile 'hello-world.core)"
hello-world.core
[email protected] /tmp/hello-world $ find classes
classes
classes/hello_world
classes/hello_world/core$loading__6721__auto____16.class
classes/hello_world/core$fn__153.class
classes/hello_world/core$fn__18.class
classes/hello_world/core$loading__6721__auto____151.class
classes/hello_world/core__init.class
classes/hello_world/core.class
classes/hello_world/core$_main.class

Alex Miller (Clojure team)14:11:55

AOT is inherently transitive - if you look at the impl, it’s just loading plus a flag that says “also, put the class file on disk”

Alex Miller (Clojure team)14:11:30

So your compile example makes sense

borkdude14:11:14

not understanding. transitive to me means: also the dependencies (like clojure.core, etc)

borkdude14:11:09

but like you said in #clojure, maybe I need tools.namespace to do this

Alex Miller (Clojure team)14:11:57

Lein I presume is probably capturing more transitively, don’t know

Alex Miller (Clojure team)14:11:02

If the code you call compile on doesnt load dependencies, they won’t be compiled

borkdude15:11:11

ok, mystery solved. AOT-ing all namespaces using tools.deps gives the same result as lein now.

borkdude15:11:06

cool, now I finally have a way to do this all "manually" or "in code"

borkdude20:11:27

Some performance tests of the current situation + 2 patches:

$ clj
Clojure 1.10.1
user=> (def o (Object.))
#'user/o
user=> (time (dotimes [i 100000000] (locking o nil)))
"Elapsed time: 5846.896771 msecs"
nil
user=>
clj-1472-3.patch
$ clj
Clojure 1.11.0-master-SNAPSHOT
user=> (def o (Object.))
#'user/o
user=> (time (dotimes [i 100000000] (locking o nil)))
"Elapsed time: 243.317501 msecs"
CLJ-1472-reentrant-finally2
$ clj
Clojure 1.11.0-master-SNAPSHOT
user=> (def o (Object.))
#'user/o
user=> (time (dotimes [i 100000000] (locking o nil)))
"Elapsed time: 256.507628 msecs"
Either patch is much faster than the current locking macro and seem similar in performance.

Alex Miller (Clojure team)20:11:32

the modern jvm is smart enough to know the locked code is doing nothing and elide the lock entirely in code like this, so I wouldn't trust any of these numbers

borkdude20:11:32

I'd be happy to test other more realistic examples if you have any

Alex Miller (Clojure team)20:11:36

not at hand, but certainly doing some work under the lock that updates a mutable var would be better. even the single-threaded version of that may be optimized away, would need to be careful to check (or turn off lock elision)

Alex Miller (Clojure team)20:11:50

-XX:-EliminateLocks might be sufficient

borkdude20:11:14

I'll try that together with:

$ clj
Clojure 1.10.1
user=> (def o (Object.))
#'user/o
user=> (def mut (volatile! -1))
#'user/mut
user=> (time (dotimes [i 100000000] (locking o (vswap! mut #(* % -1)))))
"Elapsed time: 17435.702364 msecs"
nil
user=> @mut
-1

Alex Miller (Clojure team)20:11:03

really, the best is to have some actual program that uses locks where you can make actual task measurements. however, that's probably hard to find in the clojure world.

borkdude20:11:43

right. the places where locking is used is in loading namespaces or starting socket servers, which is both in the order of 1-100, probably not 1000000

Alex Miller (Clojure team)20:11:27

really, clojure programs rarely employs locks (in Clojure user space), and even much more rarely use locks for fine-grained state changes, so it's unlikely that this kind of change is going to affect anyone's overall system perf

Alex Miller (Clojure team)20:11:46

and the difference would really have to be significant to make an impact, so it's really just making sure we're not doing something super dumb

Alex Miller (Clojure team)20:11:30

maybe having a few threads bump a shared counter while hitting the lock would be good

Alex Miller (Clojure team)20:11:09

the code above is all single-threaded so that opens a lot of space for the jvm to do smart things

Alex Miller (Clojure team)20:11:27

make sure to record the jvm you're using too, this stuff is actively improving over jvms

Alex Miller (Clojure team)20:11:44

I care most about 8 and 11

borkdude20:11:11

clj -J-XX:-EliminateLocks is how you pass it right?

Alex Miller (Clojure team)21:11:46

cleaning all this info up and making a really readable ticket would be really helpful. this is basically what I would do on this ticket (https://clojure.org/dev/creating_tickets) and is a huge boost to getting it ready to screen.

borkdude21:11:00

(println "Java version:" (System/getProperty "java.version"))
(println "Clojure version:" (clojure-version))

(def o (Object.))

(def mut (volatile! 0))

(defmacro do-parallel [n f]
  (let [fut-bindings
        (for [i (range n)
              sym [(symbol (str "fut_" i))
                   `(future (locking o (vswap! mut ~f)))]]
          sym)
        fut-names (vec (take-nth 2 fut-bindings))]
    `(let [[email protected]] ;; start all futures
       (doseq [f# ~fut-names] ;; wait for all futures
         @f#))))

;; measure time running 10k times incrementing mut 1k times in parallel
(time (dotimes [_ 10000] (do-parallel 1000 inc)))

(println @mut) ;; should be 10000000


;; $ clj -J-XX:-EliminateLocks
;; Clojure 1.10.1
;; user=> (load-file "/tmp/parallel.clj")
;; Java version: 1.8.0_212
;; Clojure version: 1.10.1
;; "Elapsed time: 19619.464243 msecs"
;; 10000000
;; nil

borkdude21:11:51

CLJ-1472-reentrant-finally2:

$ clj -J-XX:-EliminateLocks
Clojure 1.11.0-master-SNAPSHOT
user=> (load-file "/tmp/parallel.clj")
Java version: 1.8.0_212
Clojure version: 1.11.0-master-SNAPSHOT
"Elapsed time: 20173.415374 msecs"
10000000
nil

borkdude21:11:01

clj-1472-3.patch:

$ clj -J-XX:-EliminateLocks
Clojure 1.11.0-master-SNAPSHOT
user=> (load-file "/tmp/parallel.clj")
Java version: 1.8.0_212
Clojure version: 1.11.0-master-SNAPSHOT
"Elapsed time: 19793.283815 msecs"
10000000
nil

borkdude21:11:14

It seems neither patch cause a performance regression

borkdude21:11:57

And the numbers are more realistic now

Alex Miller (Clojure team)21:11:35

presuming those are repeatable numbers, I don't think any of that is concerning

borkdude21:11:18

right, it's probably because the locking itself isn't the overhead here, it's probably dealing with boxed integers

borkdude21:11:51

or future overhead

Alex Miller (Clojure team)21:11:54

yeah, and same is probably true of nearly all such uses of locking

Alex Miller (Clojure team)21:11:18

it's just a check that we didn't invent a new problem :)

borkdude21:11:47

that's very good to check

borkdude21:11:32

I have a similar check on CI for clj-kondo... I've seen dumb mistakes make it 1.5-2x as slow

sogaiu22:11:37

am i following the instructions (https://github.com/lread/clj-graal-docs/tree/master/CLJ-1472#scripts) appropriately to try this out? i cloned the clj-graal-docs repo, ran the build-clojure-with-1472-patch.sh script, and populated /tmp/parallel.clj. then made a deps.edn file and in a repl session beginning clj -J-XX:-EliminateLocks did a load-file:

$ cat deps.edn 
{
 :deps {org.clojure/clojure {:mvn/version "1.10.1-patch1472"}}
}

$ clj -J-XX:-EliminateLocks
Clojure 1.10.1-patch1472
user=> (load-file "/tmp/parallel.clj")
Java version: 11.0.5
Clojure version: 1.10.1-patch1472
"Elapsed time: 19499.429168 msecs"
10000000
nil

sogaiu22:11:28

this is on manjaro linux

borkdude22:11:50

seems good!

borkdude22:11:21

this is on Java 11, good to have those numbers too

sogaiu22:11:21

i can do java 8 as well if that'd be helpful btw, any hints on how to do the 1.11.0-master-SNAPSHOT-based things listed at: https://github.com/lread/clj-graal-docs/blob/master/CLJ-1472/README.md#performance

sogaiu22:11:40

or may be not necessary?

borkdude22:11:00

I don't get your question?

sogaiu22:11:48

the link to the performace section lists:

CLJ-1472-reentrant-finally2:

$ clj -J-XX:-EliminateLocks
Clojure 1.11.0-master-SNAPSHOT
user=> (load-file "/tmp/parallel.clj")
Java version: 1.8.0_212
Clojure version: 1.11.0-master-SNAPSHOT
"Elapsed time: 20173.415374 msecs"
10000000
nil

clj-1472-3.patch:

$ clj -J-XX:-EliminateLocks
Clojure 1.11.0-master-SNAPSHOT
user=> (load-file "/tmp/parallel.clj")
Java version: 1.8.0_212
Clojure version: 1.11.0-master-SNAPSHOT
"Elapsed time: 19793.283815 msecs"
10000000
nil
both of those appear to be 1.11.0-master-SNAPSHOT-based runs, right?

borkdude22:11:09

those are my patched versions of clojure master

borkdude22:11:23

with the respective patches

sogaiu22:11:08

ok, so as far as collecting data from various os platforms and java versions is concerned, is working with the script at the repository enough?

borkdude22:11:44

there are two relevant patches which I applied manually.

borkdude22:11:08

@lee's script does it automated and he is working at making the patch selection configurable

sogaiu22:11:33

ah, ok. will wait for that then. thanks for the explanation.

lread22:11:22

yep, in progress...

sogaiu22:11:51

on a separate note, is the skip-macros thing mentioned somewhere already or worth mentioning?

borkdude22:11:51

@sogaiu I am using that in graalvm but I'm skeptical that it does anything as macros are compile-time

sogaiu22:11:29

ah, i see. so it remains to be seen whether there is any startup-time benefit.

sogaiu22:11:39

thanks for the links.

borkdude22:11:50

@sogaiu to build those branches, the flow is this: - build a checkout of spec.alpha - then build the clojure branch using the local spec.alpha version - then re-build spec.alpha using the local clojure branch version - then re-build the local clojure branch

borkdude22:11:04

and be sure to clear target before doing all of those at every step

sogaiu22:11:33

thanks for the tips -- i'll see if i can manage

lread22:11:00

@sogaiu, I think any option that we use that is not graal specific is worth explaining. and sometimes the graal specific ones, if they are only needed for clojure are worth some words.

borkdude22:11:37

it's good to check clj -Spath when testing a clojure branch to make sure you're running the SNAPSHOT of spec, I've had a few times that it didn't work. Clear .cpcache as well to be sure

sogaiu22:11:03

@lee yes, that makes a lot of sense -- i was quite confused with figuring out what was and what was not necessary when trying to get things to work

sogaiu22:11:26

@borkdude thanks for those tips

borkdude22:11:02

@sogaiu These are my "old" notes: https://gist.github.com/borkdude/dd0857cf1958b25496fddbdbf359ca59 which are being ported to the repo.

👍 4
borkdude22:11:34

and are partially automated by lread

sogaiu23:11:25

CLJ-1472-reentrant-finally2.patch

$ clj -J-XX:-EliminateLocks
Clojure 1.11.0-master-SNAPSHOT
user=> (load-file "/tmp/parallel.clj")
Java version: 11.0.5
Clojure version: 1.11.0-master-SNAPSHOT
"Elapsed time: 17843.179634 msecs"
10000000
nil

borkdude23:11:16

looks normal

sogaiu23:11:59

@borkdude still awake? working on the other patch -- i get:

$ git apply clj-1472-3.patch
error: patch failed: src/jvm/clojure/lang/Util.java:242
error: src/jvm/clojure/lang/Util.java: patch does not apply

sogaiu23:11:52

don't mind manual patching, just curious if you encountered this

lread04:11:51

we hit this too. resolved by installing git-extras and running git rebase-patch clj-1472-3.patch.

sogaiu04:11:58

ah, very nice -- that worked :thumbsup: i had installed git-extras already because the script required it -- but i didn't know about rebase-patch, only apply 🙂

lread04:11:24

yay, glad it worked! simple_smile

sogaiu23:11:39

clj-1472-3.patch:

$ clj -J-XX:-EliminateLocks
Clojure 1.11.0-master-SNAPSHOT
user=> (load-file "/tmp/parallel.clj")
Java version: 11.0.5
Clojure version: 1.11.0-master-SNAPSHOT
"Elapsed time: 18813.726689 msecs"
10000000
nil