#powderkeg
2017-03-27
cgrand08:03:02

Issue #27 is killing me: how can we have not seen it sooner? (okay we are used to cycling the sparkcontext when things get weird)

cgrand08:03:33

The core of the issue is that (defn foo …) creates a class named ns$foo and when you redefine it you get another class going by the same name. Once packaged and sent to workers, workers have two copies of the class and use the first one (the older one).

viesti08:03:05

hmm so on every redefine, workers get another jar onto classpath?

cgrand08:03:59

not on every redef, on every barrier (eg a call to rdd)

cgrand08:03:38

so if you have several redefs between two barriers only the last one is put in the jar

viesti09:03:07

thinking of a test for this :)

cgrand09:03:32

not hard if you allow yourself to use eval

viesti09:03:38

should read more about spark worker classpath management

viesti09:03:42

Clojure's dynamic classloader is probably not available on workers?

viesti09:03:30

err, how does spark repl do this?

cgrand09:03:03

spark repl = spark scala shell ?

viesti09:03:59

yep, forgot the name

cgrand09:03:14

a mix of private scala-specific stuff and not solving all issues

viesti09:03:14

scala is not as repl centric as clojure, from what I remember, app in the repl is a bit foreign (keeping state not so straightforward)

viesti09:03:45

I'd probably look into ClojureDynamicClassloader a bit, can't say immediate answer

cgrand09:03:16

some options:
• automatically cycle context on redefs 😞
• or document cycling cases (along with protocol redefs)
• rename classes (either at broadcast time or by hotpatching clojure)

viesti09:03:07

would class renaming work?

cgrand09:03:04

If you want to be 99.99% certain it’s hard work

cgrand09:03:04

because, before putting classes into the jar you have to transform the class to change its name and change all callers too to refer to the new name and...

cgrand09:03:54

it’s not even going to work because at the other end we can’t change Kryo's ClassResolver.

cgrand09:03:11

so hotpatching

viesti10:03:07

this is just the kind of exciting way to learn Clojure internals 🙂

viesti10:03:44

first thought that since this is related to loading classes, I’d look into DynamicClassLoader, but now that I did so, it seems to be more of a cache of compiled classes, for which the name munging has to be done first

cgrand10:03:43

what was your general plan with DynClassLoader?

viesti11:03:53

firstly to understand how Clojure class redefining works

cgrand11:03:00

user=> (eval '(do (defn x [] 1) (def x1 x) (defn x [] 2) (def x2 x) [x1 x2]))
[#object[user$x 0xbe35cd9 "user$x@be35cd9"] #object[user$x 0x44821a96 "user$x@44821a96"]]
user=> (map #(%) *1)
(1 2)
user=> (map class *2)
(user$x user$x)
user=> (map #(.getClassLoader %) *1)
(#object[clojure.lang.DynamicClassLoader 0x72cde7cc "clojure.lang.DynamicClassLoader@72cde7cc"] #object[clojure.lang.DynamicClassLoader 0x696da30b "clojure.lang.DynamicClassLoader@696da30b"])
user=> (map #(.getParent %) *1)
(#object[clojure.lang.DynamicClassLoader 0x2accdbb5 "clojure.lang.DynamicClassLoader@2accdbb5"] #object[clojure.lang.DynamicClassLoader 0x2accdbb5 "clojure.lang.DynamicClassLoader@2accdbb5"])

cgrand11:03:44

so despite classes having same name they have different classloaders (which share common ancestor)

cgrand11:03:07

so they are different classes from a JVM point of view
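(Editor's sketch, not part of the original conversation.) The point made in the REPL session above can be checked directly: the two function objects report the same class name, yet their Class objects are distinct. This assumes a stock Clojure REPL or script where each top-level form is evaluated on its own; the names x, x1 and x2 simply mirror the session above.

```clojure
;; Define x, keep a reference to the first fn object, then redefine x.
(defn x [] 1)
(def x1 x)
(defn x [] 2)
(def x2 x)

;; Both classes go by the same name...
(= (.getName (class x1)) (.getName (class x2)))   ;=> true

;; ...but they are two distinct Class objects, loaded by different loaders.
(identical? (class x1) (class x2))                ;=> false

;; So an instance of the newer class is not an instance of the older one.
(instance? (class x1) x2)                         ;=> false
```

A consumer that resolves classes by name only, as described earlier for the workers, has no way to tell the two apart.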

viesti15:03:14

right so re-defining also creates a classloader by which the class is loaded

cgrand15:03:27

Looks like a step forward

user=> (class foo)
user$foo__1596
user=> (defn foo [])
#'user/foo
user=> (class foo)
user$foo__1601

viesti16:03:39

would we need to patch other Exprs than FnExpr too?

viesti16:03:31

so normally when one does defn foo in ns1, then uses ns1/foo in ns2/bar, in order to make ns2/bar use re-defined ns1/foo, one has to redefine ns2/bar, if I’m correct

viesti16:03:32

just wondering that to keep similar behaviour, only new class name would be needed

cgrand16:03:04

This is unrelated. You are conflating class names and var names.

cgrand16:03:32

So when var A uses var B one doesn't have to redef in cascade. This is not true when it's the var value (instead of the var) that is closed over. The latter is more of a special case.

viesti17:03:21

ah because one gets to fetch the current binding of the var
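(Editor's sketch, not part of the original conversation.) The difference between using a var and closing over its value can be seen in a few lines. This assumes default compiler settings, in particular that direct linking is not enabled; the names b, a and a-captured are made up for illustration.

```clojure
(defn b [] 1)

;; a refers to the var #'b, so the current value of b is fetched on each call.
(defn a [] (b))

;; a-captured closes over the fn object that b held when this form was evaluated.
(def a-captured
  (let [f b]
    (fn [] (f))))

;; Redefine b only; neither a nor a-captured is touched.
(defn b [] 2)

(a)           ;=> 2, sees the new definition through the var
(a-captured)  ;=> 1, still calls the old fn object
```

Only the second form needs a cascading redefinition, which matches the "special case" described above.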