Fork me on GitHub
#polylith
<
2022-11-11
>
seancorfield00:11:39

(unfortunately, after all that work, I'm still getting the IncompatibleClassChangeError in CI only -- but I've also confirmed that poly test produces a differently-ordered classpath on different machines which may be contributing to this...)

1
seancorfield18:11:51

One nice side-effect of switching to the external test runner is that I can reduce the amount of heap I need to run the poly tool and because all the memory is released for each project's test run, we can get through our entire test suite with much less memory in CI -- so I might even be able to reduce us back to a 1x BitBucket instance instead of the 2x instance we've been on for ages. Maybe. It also means all threads are cleaned up after each project's test run, instead of potentially hanging around (still running!) during subsequent project's test runs, which is a lot cleaner. I'm interested in feedback on whether it should be part of the core Polylith project or an external test runner (like the Kaocha one) -- if I split it out, I'll have to copy the colorized component into the external runner. Right now, the process API is very simple: color-mode project-name setup-fn-name? test-nses* teardown-fn-name? and it is run with the classpath with the project's test classpath, as constructed by poly test. See https://github.com/seancorfield/polylith/blob/process-test-runner/bases/external-test-runner-cli/src/polylith/clj/core/external_test_runner_cli/main.clj You can pass JVM options via the POLY_TEST_JAVA_OPTS environment variable (perhaps a better name could be used and/or perhaps a JVM property might be better or a good addition?). It currently assumes java is on your PATH -- it should probably use a similar approach to the Clojure CLI script for that, so it honors JAVA_CMD and/or JAVA_HOME. POLY_TEST_JVM_OPTS is probably a better/more consistent name for the env var, now I look at the clojure script. The ns of the test runner main is returned by the test runner instance -- I added the ExternalTestRunner protocol with an external-process-namespace method, and added is-external-test-runner? to the test runner contract interface with a satisfies? check (so it won't break existing TestRunner's).

imre19:11:32

Sean I just had a rogue idea. Would it be possible to implement this as an optional wrapper around a classloader-based test runner?

imre19:11:39

It could live in poly core then, and one could configure, say, the default test runner but wrapped in this.

imre19:11:15

That way, if, say I want kaocha+separate processes I don't have to copy code over and reimplement the process forking

imre19:11:45

and you wouldn't have to duplicate the default test runner's test discovery/execution/color coding

seancorfield19:11:10

The classloader is the problem I need to avoid.

imre19:11:51

I mean the wrapper that creates the new process could just pass the process' own classloader along

seancorfield19:11:15

Classpath. You can't pass a classloader across processes.

seancorfield19:11:31

And the process needs to be started with that classpath.

imre19:11:55

wrapper->classpath->newprocess->classloader->testrunner

imre19:11:11

I meant that the subprocess could pass its own classloader to the testrunner. Not sure how to properly put it into words

seancorfield19:11:58

But then that new process becomes pretty heavyweight since it has to duplicate all of the classloader and test orchestration logic from Polylith itself.

imre20:11:08

Not the classloader stuff, the process (which has the entire test classpath now) could just pass its own (default?) classloader

imre20:11:53

Orchestration: probably yes but I'd say that's probably lighter weight

imre20:11:58

so basically the default-test-runner implementation would have to be added to the subprocess classpath

imre20:11:02

this would separate concerns though: poly assembles classpath and eventually a classloader which is by default in-process but if the wrapper is used then it's in a subprocess, testrunner works off that classloader and figures out which tests to execute and how

seancorfield20:11:50

I think you'd have to drag enough of Polylith into that new process that it wouldn't be worth doing this at all. It would slow the process down and reintroduce the exact problem I'm trying to avoid: classloader isolation which doesn't work properly.

seancorfield20:11:20

See all my previous posts about problems with classloader isolation -- it's a flawed approach.

imre20:11:30

I'm not suggesting classloader isolation

seancorfield20:11:12

You want minimal dependencies in your test runner process. You do not want a whole bunch of Polylith code (and its dependencies) in that process.

seancorfield20:11:46

Building a classloader and trying to execute just the tests in it is a flawed approach -- it does not work in large, multi-threaded situations like ours.

imre20:11:57

I understand that and I agree that your suggestion of using a subprocess per project-test-run is most probably a good solution to this. What I'm saying is that whether or not the tests run in only a separate classloader or a separate process is one concern, and how and what tests are run is a different one. If you can separate the two then a developer can choose either the default or the kaocha runner, plus they can choose whether they want to stay in-process or use subprocesses. Right now: • poly ◦ classloader1 ▪︎ test-runner ▪︎ test1 ◦ classloader2 ▪︎ test-runner ▪︎ test2 suggestion: • poly ◦ subprocess1 ▪︎ classloader of subprocess1 • test-runner • test1 ◦ subprocess2 ▪︎ classloader of subprocess2 • test-runner • test2

imre20:11:10

The test runner doesn't need to have lots of polylith dependencies. The kaocha one only uses the testrunner interface for example

imre20:11:00

It does depend on kaocha on the other hand but that would be the case with a barebones single depsedn project tested with kaocha as well

seancorfield20:11:43

Having a subprocess that is passed the classpath and then builds a classloader and then runs the tests inside it, is not a solution to the problem we have at World Singles.

seancorfield20:11:13

It might be an interesting, elegant way to structure things, but it is the wrong solution to this problem.

seancorfield20:11:48

If someone else wants to build that "solution" and submit a PR, that's fine. But it won't be something World Singles' folks do.

seancorfield20:11:33

If the Polylith team don't want a simple, clean subprocess test runner, then we'll continue to use our fork indefinitely.

imre20:11:15

I can't speak for the Polylith team but I understand your reasons

seancorfield20:11:02

Trying to run tests in a classloader inside an already running process breaks for some of our code at work. We have reported several problems with it, we've had one very small PR accepted that mitigates one small aspect of the problems, and we've already had to modify our code in... unfortunate... ways to work around another aspect of the problems. But, fundamentally, having the test runner create a classloader and attempt to run arbitrary test code inside it is broken for us.

1
seancorfield20:11:27

Cognitect's test-runner works for us: it has very minimal dependencies and runs the tests in-process without a custom classloader. I don't know how the Kaocha test runner works but I see it uses dynapath and add-classpath stuff so I suspect we'd have problems with that too...

imre20:11:21

Sure I understand, won't be pressing further

imre20:11:39

(it still isn't Cognitect's test runner that's used though)

seancorfield20:11:53

I know that. Prior to using poly test, we were using Cognitect's test-runner. And we still use it for legacy subprojects, where it works for our complex, multi-threaded test code.

seancorfield20:11:35

When we migrated our chat server to a Polylith base, that's when we first started seeing problems with the Polylith test runner -- we hadn't had problems up until then. We worked around those problems. Then we migrated our rebilling engine to Polylith and ran into more problems. Hence this whole discussion and the subprocess test runner.

seancorfield20:11:31

Both of those projects' tests ran just fine with the Cognitect test-runner (no classloader stuff). Both of those projects' tests failed somewhat randomly with Polylith's test runner (often only in CI).

seancorfield20:11:59

We still have a few, equally complex, equally multi-threaded legacy projects to migrate (whose tests run just fine with the Cognitect runner and may or may not run fine with Polylith's test runner but I'm going to be a lot more confident about them running with the subprocess test runner).