Fork me on GitHub
#clojure-dev
<
2019-01-17
>
cfleming20:01:28

I think there’s a problem in 1.10 😞

cfleming20:01:54

Debugging under JDI seems to be broken because of a change in the behaviour of DynamicClassLoader.

cfleming20:01:58

Although I can’t figure it out.

cfleming20:01:22

I’ve been working with the debugger guy at JetBrains, and he said: > For some reason java.lang.Object is not reported as visible from the DynamicClassLoader used in evaluation, so jdi method args types checks fail. In 1.09 it does not happen. Definitely some changes in 1.10 trigger that.

cfleming20:01:11

I’ve asked him for some clarification on what exactly he means by “not visible”, but he’s inconveniently asleep.

cfleming20:01:04

Looking at the change history, there were only two changes to the DCL in 1.10, one making a change for CLJ-1550 and the other reverting it, so AFAICT it shouldn’t have changed. Could there be something with the bytecode changes that affect how it responds?

Alex Miller (Clojure team)20:01:26

other than the update in min jdk / asm, can’t think of anything else that changed

ghadi20:01:42

@cfleming you can test that bytecode theory by testing before and after the change, which was.... beta4?

Alex Miller (Clojure team)20:01:53

certainly the classloader structure of java changed in java 9

cfleming20:01:11

This is still under Java 8.

cfleming20:01:36

@ghadi Ok, I’ll try that, hopefully that doesn’t overlap the range where the DCL had actually changed.

ghadi20:01:03

the change was between beta4 and beta5

cfleming20:01:03

The DCL change, or the bytecode one?

ghadi20:01:34

the bytecode one

cfleming20:01:44

Ok, so beta4 has the old code?

cfleming20:01:51

I’ll test that.

ghadi20:01:43

I'm sorry @cfleming , I meant between alpha4-alpha5

Alex Miller (Clojure team)20:01:21

@ghadi any possibility of the custom getCommonSuperClass() being related?

cfleming20:01:33

Ok, I can probably do a bisect on 1.10 versions actually, it’s relatively quick to test, but I’ll try alpha4/alpha5 first.

ghadi20:01:49

I don't know @alexmiller, but that call is only called during bytecode emission, not runtime

Alex Miller (Clojure team)20:01:14

yeah, I can’t construct a theory of how it would be in play, just thinking of the edge stuff

cfleming20:01:40

That might affect this - when evaluating expressions, I construct an fn, compile it and invoke it.

cfleming20:01:33

The problem is that Egor reproduced the problem, but I don’t have any more information from him than just that message from earlier.

ghadi20:01:56

waiting with bated breath

cfleming20:01:25

It’ll be in a little bit, I’m still doing morning childsitting and multitasking 🙂

ghadi20:01:29

i don't think getCommonSuperClass is related either...

Alex Miller (Clojure team)20:01:59

yeah, I can’t see how

cfleming20:01:05

Works with alpha4, busted with alpha5

cfleming20:01:44

I’ll bisect the changes in alpha5 when I get in to the office.

cfleming22:01:36

@ghadi First bad commit is 38705b49fd3dbae11e94c576ef49ff3eb1c47395, which is the getCommonSuperClass change.

ghadi22:01:41

Ok hmmm. Let us know what the debugger expert says, too

cfleming22:01:14

Looking at it, that also seems to be the commit when opcodes were changed from 1.5 to 1.8, so it’s possible that change affected something else, i.e. that causes JDI to work differently with the newer bytecode version.

cfleming22:01:55

@ghadi Was that the main commit which changed the bytecode version, or was the version also changed in other places in other commits?

ghadi22:01:07

that was it

ghadi22:01:20

one commit for the asm bump, one commit for emitting jdk8+

dpsutton22:01:56

and asm bump happened in two separate commits due to a bug in the first version of the asm library

dpsutton22:01:52

1650f9c2546a6fc71c8d74b29e18547d6696fc54 "CLJ-2367 Incorporate ASM merge request 189 and add case tests" right before the alpha6 release

cfleming22:01:35

@dpsutton I can’t see anything in the diff for that changing opcode versions.

seancorfield22:01:10

yeah, alpha 5 broke something for us, as I recall and that was my patch to apply the fix from the ASM project to the ASM code in Clojure... so we went from alpha 4 to alpha 6...

dpsutton22:01:43

I just brought it up because it changes the asm library which emits bytecode. but it happened after alpha5 which you say is already broken for debugging. so it appears to be after the root cause anyways

cfleming22:01:46

Yeah - I can’t see where that commit changes the asm lib, but it’s definitely after the root cause anyway.

dpsutton22:01:20

it adjusted one of the asm files itself:

CLJ-2367 Incorporate ASM merge request 189 and add case tests

Signed-off-by: Stuart Halloway <[email protected]>

2 files changed, 9 insertions(+), 2 deletions(-)
src/jvm/clojure/asm/commons/GeneratorAdapter.java | 8 ++++++--

dpsutton22:01:43

src/jvm/clojure/asm/ ... is the asm lib.

cfleming22:01:44

Oh, I see - right, I forgot ASM is vendored in. Thanks!

cfleming22:01:16

Ok, I’ll ask Egor about those two changes, and I’ll get some detail from him to repro the problem myself.

cfleming22:01:44

I’m not sure I have the knowledge of the gory bits of JDI to understand it, but I’ll see what he says.

ghadi23:01:38

dpsutton you're muddling the issue... the asm bump had a minor emission bug the first time around

ghadi23:01:47

shouldn't affect anything @cfleming is looking into

ghadi23:01:02

AFAHCT, the salient change is between alpha4 and alpha5

ghadi23:01:34

that's already been bisected

Alex Miller (Clojure team)23:01:43

I think the thing to look at is how the compiled bytecode differs before/after

Alex Miller (Clojure team)23:01:27

This could also easily be a bug in asm

cfleming23:01:37

@alexmiller good idea, I’ll check that.

Alex Miller (Clojure team)23:01:58

That getCommon... is used for computing stack maps so maybe there is something weird with those. the before doesn’t have stack maps as those were added after the old bytecode version

cfleming23:01:30

Yeah, if I had to bet a beer, it would be on something with the stack maps.

cfleming23:01:35

Those things are really tricky.

cfleming23:01:51

But the verifier is usually good at catching problems AFAIK.

Alex Miller (Clojure team)23:01:21

I did actually do a fair amount of bytecode diffing before/after trying to catch anything obvious

Alex Miller (Clojure team)23:01:39

@ghadi has a script to revendor asm - could just revendor to latest and see if it works

ghadi23:01:23

JDI and classloaders are runtime -- this doesn't sound like a bytecode problem

Alex Miller (Clojure team)23:01:46

Well the class loader loads bytecode

ghadi23:01:46

I'm curious what the debug-guru says

ghadi23:01:57

yeah but how could you be missing Object?

Alex Miller (Clojure team)23:01:28

I don’t have a theory

Alex Miller (Clojure team)23:01:43

Other than that the bytecode is different across the change

cfleming23:01:42

The issue is that JDI doesn’t load the Object class - I got that far, basically the debugger loops endlessly on expression evaluation since it receives a ClassNotLoadedException. JDI normally takes care of that, so I guess JDI doesn’t like the class for some reason. I don’t understand how the DCL interaction happens though.

ghadi23:01:52

does it work on other JVMs? like 11

cfleming23:01:30

I don’t know, I can try that now.

cfleming23:01:17

That’s the bytecode diff. It looks reasonable to me, but I’m no expert.

cfleming23:01:00

Some constant pool differences, the version numbers and the StackMapTable entries.

cfleming23:01:07

Still broken on OpenJDK 11