Fork me on GitHub
#cljs-dev
<
2016-08-19
>
mfikes04:08:41

I’ve been able to isolate attempts to write non-supported values into the Transit analysis cache as being related to :parallel-build true, at least for me (I'm hoping this will lead to a minimal repro so we can arrive at a definitive explanation): https://gist.github.com/mfikes/e3313ec059b604674f60d326e701a7f1

dnolen11:08:53

@mfikes interesting, haven’t seen that before

mfikes13:08:19

@dnolen Emperically, if I place a course grained mutex around the transit write call, it works fine. A theory is that transit has a thread-safety bug, perhaps around the WriteHandlerMap stuff down in transit-java. I’ve locally upgraded the compiler to the latest transit-clj to check and the defect still occurs.

mfikes13:08:54

(The WriteHandlerMap is mutative.)

dnolen14:08:47

@mfikes huh, but we construct a new writer/reader every time

mfikes14:08:18

@dnolen Good point. I added a a tiny bit of logic to try to capture the number of concurrent threads writing to see if it is correlated with the failures (essentially (util/debug-prn 'starting (swap! write-concurrency inc))), and it every time it occurs, the there are 3 writers in flight.

dnolen14:08:50

@mfikes are you saying they share handlers?

mfikes14:08:28

@dnolen No… just a desperate and quick attempt to try to find a root cause. Not good analysis.

mfikes14:08:43

@dnolen All, I know is: 1) It fails only if parallel builds enabled 2) When so, it fails, maybe 30 to 50% of the time 3) If I put a coarse-grained mutex around the write calls, it succeeds 4) It appears to occur under heave write load (the 3 above)

dnolen14:08:24

@mfikes ok so patch for 3) is acceptable to me - would like to get that in this coming release if you have it

mfikes14:08:34

OK… will do 🙂

mfikes14:08:14

When testing the patch for http://dev.clojure.org/jira/browse/CLJS-1759 I encountered a self-host error which I'm assuming is unrelated:

ERROR in (test-cljs-1757) (TypeError:NaN:NaN)
expected: (s/exercise-fn (quote cljs.spec-test/cljs-1757-x))
  actual: #object[TypeError TypeError: Cannot read property 'call' of undefined]
Gonna re-run the tests off master to be sure.

mfikes14:08:42

Confirmed the above is unrelated; we’re good to go. Captured it as http://dev.clojure.org/jira/browse/CLJS-1760

anmonteiro15:08:20

@mfikes working on fixing CLJS-1760

anmonteiro15:08:59

not a valid excuse, but I assumed self-parity tests were passing and didn’t bother to run them because CLJS-1756 wasn’t there yet 🙂

mfikes15:08:37

Cool. The cljs.spec$macros/gen warning is really odd if it is referring to the gen symbol on this line: https://github.com/clojure/clojurescript/blob/master/src/main/cljs/cljs/spec.cljc#L77

mfikes15:08:21

@anmonteiro For what it’s worth, I quickly tried with http://dev.clojure.org/jira/browse/TCHECK-105 and that didn’t fix anything

anmonteiro15:08:58

@mfikes it seems that my patch didn’t introduce the bug, it just happens that exercise-fn had no tests before 🙂

anmonteiro15:08:52

attached patch to CLJS-1760

anmonteiro15:08:33

probably something you missed when you added a patch to qualify symbols & keywords

rohit15:08:38

I’ve got a docker container running Clojurescript tests on Shippable. Some of the tests are failing for some reason. But atleast its a start. My Dockerfile and the config file for Shippable are here: https://github.com/ducky427/clojurescript

anmonteiro15:08:51

@rohit if you apply CLJS-1756 (or rebase on current master) failures should go away

rohit15:08:12

oh cool. let me do that.

rohit15:08:53

@anmonteiro: it looks much better now though some tests are still failing.

FAIL in (test-reader) (at builds/out-adv/core-advanced-test.js:1106:263)
Character Literals
expected: (= ["\t" "\r" "\n" " " "\b" "\f" "?"] (reader/read-string "[\\tab \\return \\newline \\space \\backspace \\formfeed \\u1234]"))
  actual: (not (= ["\t" "\r" "\n" " " "\b" "\f" "?"] ["\t" "\r" "\n" " " "\b" "\f" "ሴ"]))

rohit15:08:07

some charset related issue I am guessing.

rohit15:08:25

Here’s another one:

FAIL in (test-819) (at builds/out-adv/core-advanced-test.js:1106:263)
Testing reading, CLJS-819
expected: (= m " ?")
  actual: (not (= nil " ?"))

rohit15:08:39

I’ll investigate these.

anmonteiro15:08:37

@rohit btw there’s 1 more test to run

anmonteiro15:08:40

lein test is also a thing

rohit15:08:44

@anmonteiro: oh cool. I didn’t realise that. I need to install lein in that image in that case. cheers!

mfikes15:08:25

There is also script/test-simple. (But those may not be passing right now.)

rohit15:08:39

I’ve added this to the list of tests as well. Its a good starting point.

anmonteiro15:08:21

@rohit the self parity test has just been fixed on master 🙂

anmonteiro15:08:29

so I suppose only encoding issues remain

rohit15:08:47

@anmonteiro: thats awesome! 🙂

mfikes15:08:27

Looks like script/test-simple is passing. 🙂

rohit15:08:16

@mfikes: neat. just added lein and triggered a new build.

anmonteiro15:08:52

@mfikes probably worth tweaking script/test to compile with :parallel-build enabled

anmonteiro16:08:16

re: your issue with transit & parallel build

anmonteiro16:08:31

maybe we could spot regressions that way

mfikes16:08:34

Right… otherwise we really don’t have anything testing :parallel-build I’d guess

rohit16:08:31

Thats awesome!

anmonteiro16:08:50

thought it might be private

rohit16:08:55

lein test - all a-ok!

rohit16:08:15

I am guessing public logs for public repos.

rohit16:08:57

Update: thanks to @anmonteiro all tests are now passing!

dnolen17:08:02

Thanks everyone!

mfikes22:08:42

I wonder if there is some way to do something akin to https://github.com/clojure/clojurescript/releases/latest that could avoid having to PR like https://github.com/clojure/clojurescript-site/pull/22 for releases. (The front page of http://clojurescript.org seems to have a dynamically-updated link.)

mfikes22:08:02

@favila With respect to CLJS-1761 I was reproducing it via stuff in https://clojurians.slack.com/archives/cljs-dev/p1471581041001784

mfikes22:08:26

@favila Unfortunately, I failed to create a minimal repro 😞

favila22:08:56

@mfikes If my hunch is correct, then if transit-write-opts and transit-read-opts :handlers map are wrapped in transit-clj/write-handler-map call, issue will be solved without locking

Alex Miller (Clojure team)22:08:22

@mfikes the http://clojurescript.org page is doing git describe --abbrev=0 --tags | sed "s/^r//“ during the build

favila22:08:24

read-handler-map and write-handler-map pre-merge the default handlers and bypass all the handler caching

favila22:08:39

(where the races are)

Alex Miller (Clojure team)22:08:11

@mfikes I tried to make that something that could make a variable for the quick start page but that turned out to be annoyingly difficult. could be done with just some hacking on the files though. If you could file an issue on clojurescript-site I’ll get to it

mfikes22:08:41

@favila: cool. Since I can easily repro, willing to try things. Will see if I can test your idea. :)