Clojurians
#cljs-dev
<
2018-03-07
>

This page is not created by, affiliated with, or supported by Slack Technologies, Inc.

mfikes01:03:28

Patch in CLJS-2623 (updated browser REPL copy) renders as

mfikes01:03:09

Uh, oh, when I pasted that image it indicated: "Your file was uploaded — it’s safe and sound in Slack. Unfortunately your workspace doesn’t have any storage space left. To get more space, you can upgrade to a paid account or delete some of your older files."

mfikes01:03:31

(Deleted some older browser render pics in this channel and was able to post.)

mfikes01:03:36

There is a patch covering the src="/main.js" in the rendering above in https://dev.clojure.org/jira/browse/CLJS-2624

mfikes04:03:20

^ Actually, you need src="/main.js" or src="main.js" to make things work. Not sure if there is any correct way to address this, other than to leave it as is.

mfikes12:03:20

Perhaps if index.html exists then :output-dir defaults to "out"?

mfikes12:03:01

An approach along these lines seems to work out; updated patch in CLJS-2624

mfikes13:03:07

Three pieces of feedback received regarding the template: - Make the DOCTYPE declaration in lower-case - Omit the script type tag - Use `<textarea/> to facilitate select / copy I have no opinion on these.

jannis13:03:16

@dnolen Patch with tests added for the :npm-deps {"/path/to/tarball.tgz" "..."} feature: https://dev.clojure.org/jira/browse/CLJS-2622

juhoteperi13:03:06

@jannis so they only difference is that npm install is called without version

jannis13:03:59

Pretty much, since the version is implicitly provided by the tarball contents.

juhoteperi13:03:14

I wonder how much is makes sense to try to support different cases with :npm-deps, I would maybe just remove the whole option and make everyone use package.json

jannis13:03:45

Yep, that'd be worth considering.

jannis13:03:52

Until we do that, this feels like a feature worth adding to support e.g. unpublished packages.

juhoteperi13:03:18

Or we could just document "if you need to do this, use package.json or call npm install yourself"

mfikes14:03:09

Perhaps newly-entered https://dev.clojure.org/jira/browse/CLJS-2625 is related to the browser races David was fighting with yesterday

bhauman14:03:38

@dnolen my comments yesterday about the browser REPL were spurred by my attempts to use the Browser REPL to work on a more isolated figwheel, I was at first pleasantly surprised and hopeful, everything worked! I was getting feedback in the browser, and my work was clipping along nicely but then after 15-30 minutes of use the Browser REPL connection died :disappointed:. Then in another session the watcher inexplicably died. This was ultimately very disheartening. As this issue has persisted from the very early days of CLJS.

bhauman14:03:46

my ranting tone wasn't helpful

dnolen14:03:28

@mfikes we don’t fully wait, but I’m less concerned about that, I think I have an idea (heartbeat thing) which might address a bunch of things at once

mfikes14:03:30

Oh, so that points to a problem using a long-lived connection. That's unexplored by me.

bhauman14:03:04

its good to see work is happening in this direction

dnolen14:03:08

@bhauman heh, I think we’re on the same page here - like I said I’d like to actually start using browser REPL on stuff so these kind of problems can get sorted out

dnolen14:03:22

and of couse happy to take patches to push it along from anyone who is interested

potetm14:03:05

Yeah my immediate reaction was: might this be addressed by automatic reconnects? https://dev.clojure.org/jira/browse/CLJS-1147

potetm14:03:22

but a heartbeat would likely tackle a bunch of things at once

bhauman14:03:42

well my gut says the protobuffer stuff is fragile in general and that HTTP polling is simpler and more robust

bhauman14:03:56

but I haven't messed with it

dnolen14:03:18

could also switch to WebSockets

potetm14:03:23

It’s complicated (because of the extra layer of async), but it should work. But yeah, I think removing dynamic vars from the server and ajax polling might be better.

bhauman14:03:39

well that would be extremely robust

potetm14:03:40

websockets is preferred imo

bhauman14:03:36

I personally haven't found any resources that would help us include websocket support minimally

dnolen14:03:04

yes we would need to build WebSocket support from scratch

bhauman14:03:12

yeah that's intense

bhauman14:03:24

and could be buggy

dnolen14:03:37

It’s possible, but I would actually be very surprised

dnolen14:03:47

we don’t have to do everything, just enough to make our thing work

potetm14:03:52

oh on the server side?

dnolen14:03:07

@mfikes a lot of the stuff out there use existing web servers w/ websocket support

potetm14:03:12

yeah weasel uses httpkit

dnolen14:03:13

we don’t want to pull in that kind of dep

bhauman14:03:18

@mfikes we can't pull it in

bhauman14:03:21

:slightly_smiling_face:

mfikes14:03:29

Not suggesting pulling in a dep, but learning how it works.

dnolen14:03:34

in general we don’t need to play the full webserver game

bhauman14:03:36

I wish fighweel didn't have httpkit in it

dnolen14:03:41

just the tiniest thing to make it work

dnolen14:03:07

i.e. browser REPL is toy webserver + real websocket thing

dnolen14:03:43

then you can integrate into anyold project

dnolen14:03:53

and ignore the webserver part

bhauman14:03:48

I found it a while back it's a decent resource

bhauman14:03:10

but HTTP polling would more than likely work very well

potetm14:03:23

yeah I’m weighing the diff

dnolen14:03:34

@bhauman actually that looks pretty nice - and maybe a reasonable dep?

bhauman14:03:41

I haven't used it

dnolen14:03:47

the problem with browser REPL is actually that stuff like SSL isn’t supported

bhauman14:03:48

so I have no idea

potetm14:03:53

websockets certainly more efficient, but not necessarily more robust

dnolen14:03:05

those are the kinds of things you would want from a “real” browser REPL

bhauman14:03:24

an http endpoint can be embedded in anything

bhauman14:03:30

so ssl is handled

dnolen14:03:08

@bhauman but you would have to configure that at the web server level no?

bhauman14:03:56

I'm saying the user could embed our endpoint in their server

bhauman14:03:23

but when it comes to ssl you are going to be doing shenanigans anyway

dnolen14:03:40

@bhauman so httpkit gave you wss:// support?

bhauman14:03:02

not really no

bhauman14:03:30

I make users go through a tunnel

bhauman14:03:26

wow that project seems more active than I thought it was

potetm14:03:48

java-websocket?

bhauman14:03:10

yeah, and very few dependencies

bhauman14:03:18

junit and json

potetm14:03:33

Even w/ it, wss will be a hassle, right? Almost seems easier to provide instructions for running over SSL.

potetm14:03:10

and both of those are test scope

bhauman14:03:12

it says something about ssl support

potetm14:03:13

so not real deps

potetm14:03:19

yeah generating your own certs

potetm14:03:01

Just a hassle. Not sure if we can make that a seamless experience.

bhauman14:03:44

its funny though, tunnels works without a hitch

bhauman15:03:07

so the certificate problem can't be real

potetm15:03:16

That would work over http as well though, right?

bhauman15:03:25

it doesn't require creating a cert

bhauman15:03:44

yeah it tunnels ssl to the http endpoint

bhauman15:03:51

including websockets

potetm15:03:07

Wait, what? You don’t have a proxy serving https?

bhauman15:03:25

yes tunnels creates a proxy

potetm15:03:42

ah okay, but it doesn’t require generating a cert

potetm15:03:45

wonder how they do that

bhauman15:03:24

well I just read the code, they do nothing

potetm15:03:52

sry, the start_tls method

bhauman15:03:07

yeah I was there too

bhauman15:03:38

:fail_if_no_peer_cert false

bhauman15:03:51

it seems they just ignore it completely

potetm15:03:11

yeah, I… I thought cert checking was browser enforced.

bhauman15:03:53

i know sooo very little about this though

bhauman15:03:03

or we could embed a cert as a resource in the jar

potetm15:03:10

I mean, that looks pretty legit. At least worth a shot. Might still require the user to override alerts in the browser.

bhauman15:03:42

that hasn't been my experience at all with tunnels

dnolen15:03:25

the JavaWebSocket looks promising to me, looks very active, looks like a lot of users

dnolen15:03:39

we don’t really care about building a webserver, we do care about building a REPL

dnolen15:03:52

so I think adding a dep for this is definitely on the table

potetm15:03:21

I mean, it has no deps so… this would be the one to add.

potetm15:03:20

And the main upside vs ajax is, this maybe gives us ssl out of the box? It sounds like we may be able to pull off ajax+https via the same trick.

dnolen15:03:28

if we had websocket browser REPL would become way simpler

dnolen15:03:37

can split the static serving, from the REPL bits

dnolen15:03:48

and all this racing stuff goes away

dnolen15:03:28

means dropping IE6 etc., but it is 2018 after all

potetm15:03:18

yeah I think that’s about as safe a break as you will find

potetm15:03:25

We could fire up 2 http servers on different ports to get that same isolation though, right?

potetm15:03:44

just trying to bounce more ideas around, not saying that’s a super idea comparatively

dnolen15:03:26

other option which I also think is fine and less churn is supporting the old interface, and refactoring it a bit and then doing the WebSocket one as the recommended way

dnolen15:03:51

@potetm I would think all threads use the same websocket server, and we tag the messages

potetm15:03:07

Yeah, I meant, as an alternative to websockets, we could have 2 webservers: one that serves static content, one that handles the repl. This is another way to break up the repl from the static serve.

bhauman15:03:17

thats the way to go

dnolen15:03:19

a concurrent queue in front of the WebSocketServer seems fine

roman01la15:03:39

@dnolen not sure if it was discussed before, but it would be helpful to clarify the order of command line flags in cljs.main, perhaps in Quick Start or even in --help?

dnolen15:03:59

@roman01la we do

dnolen15:03:08

in the Quick Start and in --help

roman01la15:03:10

You are right. If only it could shout in the face :slightly_smiling_face: I guess I’m not the only one who will miss this line in help

dnolen15:03:08

@roman01la there’s not really much we can do because -m -r -c must come last

dnolen15:03:24

but in -m and -r case they can appear again because we forward remaining args

dnolen15:03:19

forwarding the remaining args is the big tradeoff in the original design

jannis15:03:45

Interesting. enhanced-resolve (which we’re using in @cljs-oss/module-deps) always takes what’s in the main entry fields, regardless of what extensions or moduleExtensions you give it through its resolver configuration. Those just define preference/order in case there is no package.json.

jannis15:03:12

I wonder if I can write a basic plugin for it that catches resolved .mjs files and translates them to .js if possible.

jannis15:03:18

Yep, this could work.

bhauman15:03:29

@dnolen I'm assuming that the :sources in the compiler env is in topological order

mfikes15:03:29

Here is a browser REPL stress test that provokes a failure fairly quickly. If we get it to survive for more than a few seconds, then it can also serve as a way to simulate a user using a long-lived session. https://gist.github.com/mfikes/765ac8e43d70b743e3bfbd02ff4703a1

mfikes15:03:09

Within a few seconds at most I get

[Error] Failed to load resource: The network connection was lost. (localhost, line 0)

mfikes15:03:49

If you increase the Thread/sleep parameters you can get things to run longer.

bhauman15:03:52

@mfikes did you try that on the other repls?

mfikes15:03:07

No, let me try node. It is real easy to do.

bhauman15:03:18

actually nashorn

bhauman15:03:22

actually comparing it to figwheel would be more accurate

bhauman15:03:44

@mfikes that's really impressive idea

mfikes16:03:08

Yeah, let me try to figure out how to get some parameters that would launch Figwheel. That shouldn't be to hard I suppose.

mfikes16:03:31

Figwheel against Browser REPL cage match!

bhauman16:03:33

lein new figwheel test

bhauman16:03:41

lein figwheel

bhauman16:03:56

browser should open automatically

mfikes16:03:01

ok. trying that

bhauman16:03:18

You may have to set :readline false in the figwheel options

bhauman16:03:41

or simply use 0.5.14

mfikes16:03:05

Oh, yeah @bhauman Figwheel is having no issue with that test.

bhauman16:03:27

yeah so nashorn shouldn't either

mfikes16:03:31

For me browser REPL craps out within seconds.

bhauman16:03:03

yep that's been my experience

mfikes16:03:58

In that test, it does a console log of how many iterations it has done. Browser REPL makes it to 15 or 20. Figwheel is already past 1000 for me.

dnolen16:03:15

@mfikes oh so it’s just rapidly writing forms to the target REPL

bhauman16:03:25

freaking ingenious

mfikes16:03:27

Yes, simulating you typing at the REPL

mfikes16:03:42

And getting very quick cups of coffeee

bhauman16:03:05

:slightly_smiling_face:

mfikes16:03:54

Gotta love how easy Clojure makes it to cobble together a "pasteable" test like that.

mfikes16:03:50

FWIW, the browser REPL survives that test for me on a faster box, so there is perhaps some CPU-related race?

mfikes16:03:16

@bhauman Do you have an ancient computer?

bhauman16:03:09

MacBook Pro (Retina, 15-inch, Mid 2014)

mfikes16:03:16

(The box that is failing for me is a 2009 iMac)

mfikes16:03:35

It is passing on a 2013 iMac

bhauman16:03:29

a processing speed difference of that magnitude shouldn't cause connection corruption??

dnolen16:03:23

@mfikes but when you say browser REPL “craps out” you mean there’s an exception? or it locks up?

mfikes16:03:00

In the JavaScript console, it logs 10 or 15 iterations, and then prints

[Error] Failed to load resource: The network connection was lost. (localhost, line 0)
and stops.

mfikes16:03:20

The URL associated with that is , IIRC. checking...

dnolen16:03:30

but no exception from the Clojure process?

mfikes16:03:20

No... but I can try to ascertain more about precicely how it is failing. Perhaps the simple web server we are running craps out in a way that says something...

dnolen16:03:39

probably an exception in a thread or a deadlock

mfikes16:03:06

Yeah, I'll get to the bottom of it; it is trivial to repro so I'll have more info shortly

dnolen16:03:22

I think the way that the browser REPL works is susceptible to locking up

mfikes16:03:22

The amount of time it can run before craping out is heavily dependent on the parameters you put in the Thread/sleep call in the stress test... if you use larger values, things work OK

bhauman16:03:44

we are, of course, assuming that the stress test is revealing the longevity issue, which occurs in the absence of constant interaction

mfikes16:03:22

Yeah, I was planning on having longer sleeps in it and just letting it run for hour.s

bhauman16:03:48

15 minutes should be enough

bhauman16:03:25

anyway this is super helpful

mfikes16:03:26

@dnolen Yeah, perhaps it is a "livelock" (or promise delivery failure). Anyway, you can see that in the last stack here https://gist.github.com/mfikes/71db4c4a50f314824ac9dd0892d5c134

richiardiandrea16:03:56

Checking on my mobile, it looks like browser/send-for-eval is starved waiting for the promise to a connection to be delivered

dnolen16:03:31

@mfikes boom :slightly_smiling_face:

dnolen16:03:41

if we fix that we’ll probably solves all the problems

dnolen16:03:51

that was my hunch yesterday

mfikes16:03:52

I'm wondering if it is just a downstream derailment after the browser side indicates the network connection was lost. I'm trying a test involving commenting out some of the .close calls in cljs.repl.server to see if something is prematurely closing. Dunno.

dnolen17:03:19

@mfikes I don’t think so

dnolen17:03:25

I think the livelock issue is in the code

dnolen17:03:40

if we get a :result we store the promise for later

dnolen17:03:50

but if we get another result we store the promise for later

dnolen17:03:58

but what happened to the first one?

mfikes17:03:01

Repro is even easier if you do cat repl-input.txt | clj -A:cljs where the repl-input.txt has a lot of lines with keywords to evaluate

mfikes17:03:32

Ahh, OK, interesting :thinking_face:

dnolen17:03:26

but this is great, this is best repro we’ve had so far

dnolen17:03:36

@mfikes and also explains your occasional never connect issue

dnolen17:03:44

which I can’t repro on my newer machines

mfikes17:03:20

Yeah, I wish I had taken better notes, but perhaps my 20% failure rate claim came from working on the 2009 iMac, where on the 2013 iMac things are solid

dnolen17:03:38

all you need is for eval results to come in out of order and I think this will happen

mfikes17:03:24

Cool. I'll check around that code. If you think of any speculative changes, happy to try them on the 2009 iMac.

dnolen17:03:05

I need to review the code more closely, no ideas yet, but I suspect a simple fix

dnolen17:03:25

@mfikes actually I have something for you to try :slightly_smiling_face:

dnolen17:03:37

:promised-conn should be :promised-conns

dnolen17:03:57

a vector of conns, pop and use

jannis17:03:00

Eh. So falling back to .js in module_deps.js works but of course the inter-package dependencies that involve .mjs files are still broken—that’s Closure’s part.

mfikes17:03:05

Yeah, that bit of code looked a little suspect to me as well.

jannis17:03:36

I’m not sure how likely a patch that implements the same fallback in Closure would be accepted.

mfikes17:03:38

I'll see if turning into a "queue" of promises helps

richiardiandrea17:03:54

Yeah I am exploring that part as well, as I wrote above..do you even need promises there?

bhauman17:03:25

just a note: keep in mind that currently you can reload the browser and the connection is maintained

bhauman17:03:49

we don't want to lose that as this is webdev after all

jannis17:03:23

@dnolen @anmonteiro What do you reckon about the above? Looks like the .mjs -> .js fallback would have to be implemented in Closure as well to keep things in sync. Is it realistic at all to get that kind of change into Closure?

jannis17:03:07

Right now, with an enhanced-resolver plugin, both graphql and iterall index fine by using only .js files but the dependencies between graphql and iterall sources are still broken (the graphql source files have module$iterall in a few places and no goog.require('module$...$iterall) at all).

jannis17:03:39

That is Closure’s part, right?

dnolen17:03:01

@mfikes I think queue + a new thing from the client :accept would make this work for all cases

dnolen17:03:22

current thing works, and then if we get a new REPL thread, :accept can let it start

dnolen17:03:03

we store current thread names in state

dnolen17:03:11

if new thread name, we use :accept connection

mfikes17:03:43

I'm currently trying a less aggressive change to simply retain the current code but to properly do things atomically in cljs.repl.server/connection and cljs.repl.server/set-connection (in other words properly use the atom instead of deref, and then swap on it, leaving a window for corruption to sneak in)

richiardiandrea17:03:57

@mfikes lol I am doing the same :smile:

dnolen17:03:18

@mfikes we will still need :accept to make it work for Socket REPL though

richiardiandrea17:03:21

the browser sends two requests on two different ports

mfikes17:03:33

The code just looks like it admits a race, and it might be an easy, less risky change to see what that does for this failure mode

dnolen17:03:37

current browser REPL assumes one REPL

dnolen17:03:13

@richiardiandrea you don’t want different ports

mfikes17:03:18

(I don't understand the bigger picture of all of the code, and it sounds like there is more to be done, but looking at just those two functions in isolation, they could be made thread-safe, so just gonna try an experiment.)

dnolen17:03:19

this is impossible to configure on the client

richiardiandrea17:03:53

yeah David, I know :smile: I am just reporting what is happening, anyways...let me keep digging, interesting stuff :smile:

richiardiandrea17:03:48

this is what happens here:

Conn requested
Promise a connection
Set connection #object[java.net.Socket 0x1e5a28e3 Socket[addr=/0:0:0:0:0:0:0:1,port=48686,localport=9000]]
Queue nil
Deliver a promised connection #object[java.net.Socket 0x1e5a28e3 Socket[addr=/0:0:0:0:0:0:0:1,port=48686,localport=9000]]
Set connection #object[java.net.Socket 0x913d1b8 Socket[addr=/0:0:0:0:0:0:0:1,port=48688,localport=9000]]
Queue nil
Conn requested
Deliver #object[java.net.Socket 0x913d1b8 Socket[addr=/0:0:0:0:0:0:0:1,port=48688,localport=9000]]
Set connection #object[java.net.Socket 0x2e4e0bcb Socket[addr=/0:0:0:0:0:0:0:1,port=48738,localport=9000]]
Queue #object[clojure.lang.PersistentQueue 0x61b4fa49 [email protected]]
Conn requested
Promise a connection (waiting)

dnolen17:03:58

those are just the webserver accepting connections, they will be different

dnolen17:03:20

the frontend uses XHR for each :result

dnolen17:03:37

then client waits for response which is the next thing to eval

dnolen17:03:53

so a long poll hooked to the fact that we evaled something before

dnolen17:03:52

the thing is kicked off by :ready

dnolen17:03:15

which evals bootstrap, this setups the browser REPL loop, because that will send back :result

dnolen17:03:29

eval -> :result -> eval -> :result

dnolen17:03:38

but there’s no place in this loop for a new connection

richiardiandrea17:03:49

will get there, thanks for the explanation, I added some println :smile:

dnolen17:03:03

@mfikes ^ I’m 80% sure the single REPL bug is that you get back two :results before you send a response

dnolen17:03:31

so you could lock / coordinate, or just make a queue and allow more concurrency

mfikes17:03:58

OK. This path I'm going down is probably a waste. I'll wrap it up and revisit the queue idea.

dnolen18:03:25

you shouldn’t have to change very much for the queue thing

dnolen18:03:03

looks like 2 line change to me

dnolen18:03:04

set-connection need to be cleaned up for sure, but it’s small

mfikes18:03:53

single swap! revisions didn't affect the test I'm doing, so gonna look at the two :results concept FWIW here is my revised code that I think might be thread-safe / immune to races https://gist.github.com/mfikes/0c8cf0630b3c86dea5e7423046745b79 perhaps @richiardiandrea wrote something along the same lines

richiardiandrea18:03:53

Yes, something like that, basically using the result of swap! for doing things

mfikes18:03:27

Cool. They have a nice appealing symmetric "dual" look to them when written that way.

dnolen18:03:17

@mfikes so the problem your connection with that is two threads can enter if no connection or it’s already closed

dnolen18:03:41

then :promised-conn is overwritten

mfikes18:03:16

Well, it just looks like a potential "test" then "set" concurrency issue could arise there. No specific reason to believe that's happening.

dnolen18:03:43

losing :promised-conn looks like the issue to me

dnolen18:03:52

now and in your gist

mfikes18:03:18

I'm gonna do some logging to see if I can catch it in the act of two :results (or at least better undertstand what hapens right when it locks up)

dnolen18:03:38

let’s put it this way if we’re doing this without all this swap! stuff

mfikes18:03:43

Oh, my gist has a problem... cool.

dnolen18:03:54

we would just put these promises into a concurrent queue and pull them off

richiardiandrea18:03:39

that's what I am doing :wink:

richiardiandrea18:03:35

I am fine with both getting to the same result, no rush, for these kind of things we need double/triple validation :smile:

mfikes18:03:55

Yep, I agree @dnolen, two threads running through my connection impl would overwrite the single :promised-conn slot.

mfikes18:03:54

I've updated my gist to detect / log if overwrites are occuring (without actually doing anything to address the issue), and I'm not seeing any logging. So, even though that code probably deserves attention, I'm thinking the problem on my 2009 iMac might be two :results coming in or somesuch. Looking there now.

richiardiandrea19:03:56

I am close to a solution I think, I will need to try Mikes' test

mfikes19:03:03

FWIW, the problem is not reproducible on my 2009 iMac with Chrome.

richiardiandrea19:03:58

do I need -c for the test? it basically stays in this forever:

Compiling client js ...
Serving HTTP on localhost port 9000
Listening for browser REPL connect ...

richiardiandrea19:03:10

this is also happening in 1.10.126

richiardiandrea19:03:20

command was clj -m cljs.main -r

richiardiandrea19:03:32

browser opened correctly

richiardiandrea19:03:03

actually lemme check the quick start :smile:

mfikes19:03:28

Which test are you running @richiardiandrea?

mfikes19:03:49

Oh, sorry, that gist refers to an alias, but here is what I have for that alias:

{:extra-deps {org.clojure/clojurescript {:mvn/version "1.10.138"}}
           :main-opts ["-m" "cljs.main"]}

richiardiandrea19:03:53

oh yeah so my command above should work

richiardiandrea19:03:08

ok works let me try the gist

richiardiandrea20:03:56

I will attach the patch to CLJS-2627 and @mfikes if you can confirm that it works it would be awesome

mfikes20:03:25

So far for me, it reproduces on Safari and Safari Technology Preview, but not Chrome or Firefox. And when it fails, there is zero evidence of anything going wrong in our code that I can find. I've added logging prints everywhere. When it fails, Safari just indicates that it lost the connection so we end up with a form that needs to be evaluated but no response.

richiardiandrea20:03:01

uhm, don't have Safari here, so that is even more important that you check for me as well, maybe I am just on the wrong path :smile:

mfikes20:03:59

I think there is definitely ample room for bugs in our code, but for the particular test I'm doing it might be something isolated to Safari. This report, while under completely different circumstances, results in the same error I'm seeing https://stackoverflow.com/questions/33895463/safari-ajax-request-failed-to-load-resource-the-network-connection-was-lost

richiardiandrea20:03:08

oh ok so my patch my be a bit off then, I still think it is an improvement though

richiardiandrea20:03:11

by the way it seems that a FIFO queue on open connections is also good for reusing the web browser connections like @bhauman was hoping for

mfikes21:03:15

If I comment out the Content-Length header here https://github.com/clojure/clojurescript/blob/master/src/main/clojure/cljs/repl/server.clj#L166 then things work reliably in Safari on my 2009 iMac.

richiardiandrea21:03:38

I always thought concurrency was tough...you win! :))

mfikes21:03:04

See https://apple.stackexchange.com/a/107868 A theory could be that (on this slower box) we are failing to serve all the bytes indicated by the Content-Length header and Safari balks. That would explain why removing the Content-Length header would prevent Safari from balking, but it this theory doesn't explain why things generally appear to work without this header. (In other words, you would think that other things would fail if, in fact we were failing to serve all of the bytes.)

richiardiandrea21:03:49

Maybe safari guesses the length?

richiardiandrea21:03:06

Or the actual payload arrives dunno

richiardiandrea21:03:49

If the payload arrives safari knows the length I guess

richiardiandrea21:03:32

But it is weird because the payload should be either cut or going through

mfikes21:03:39

Well, if you set the Content-Length header, and you deliver fewer bytes (or perhaps close the connection before delivering all the bytes), then Safari will indicate

Failed to load resource: The network connection was lost.

mfikes21:03:31

If you leave this header off, then Safari won't indicate this message in that scenario because the scenario is no longer applicable.

richiardiandrea21:03:16

So it really is a fail fast mechanisms

richiardiandrea21:03:12

@mfikes have you tried my patch out of curiosity?

mfikes21:03:34

No. Which ticket is it in?

mfikes21:03:01

Ahh, I see it. CLJS-2627?

mfikes21:03:54

@richiardiandrea Your patch does not appear to solve the particular issue I'm looking at.

richiardiandrea21:03:38

should solve the race

richiardiandrea21:03:30

thanks for trying out, i will try to scroll back and see if I can reproduce the race then

mfikes21:03:32

https://dev.clojure.org/jira/browse/CLJS-2629 captures the issue I'm seeing and mentions the odd Content-Length "solution"

dnolen23:03:53

@mfikes @richiardiandrea I have a fix for 2627 and it’s much simpler :slightly_smiling_face:

richiardiandrea23:03:39

I would say mine is robust :smile: less simple

richiardiandrea23:03:55

also fun to implement :wink:

dnolen23:03:37

well I think you’ll see this version is as robust :slightly_smiling_face:

richiardiandrea23:03:58

you removed the mechanism :smile:

dnolen23:03:08

mostly deletes code

dnolen23:03:27

even if I set the the thread delay in @mfikes gist to something very very short this doesn’t break

richiardiandrea23:03:52

well, it is the same solution, except you store outside state

richiardiandrea23:03:56

same robustness :smile:

dnolen23:03:10

less code :wink:

richiardiandrea23:03:41

better, true, happy that you have chosen two queues there, I think we need both

richiardiandrea23:03:53

the only difference in my patch

potetm23:03:03

defonce on shared state? (in case of accidental reload?)

richiardiandrea23:03:04

is that I filter out closed connections on usage

richiardiandrea23:03:39

I actually stumbled upon a case where I had a Socket Closed exception

dnolen23:03:54

@richiardiandrea well give a shot at breaking this and I’ll think about that

richiardiandrea23:03:56

but I cannot say it is reproducible

dnolen23:03:08

@mfikes https://dev.clojure.org/jira/browse/CLJS-2629 seems like a dupe maybe, try master

mfikes23:03:12

Dang. My son is using that 2009 iMac. Will try soon though :slightly_smiling_face:

richiardiandrea23:03:41

happy that my morning hasn't gone to waste :smile:

richiardiandrea23:03:13

trying Mike's stress test against master, no problems so far

richiardiandrea23:03:55

it also looks faster (eye balling)

richiardiandrea23:03:24

so I totally approve :+1: good job folks!

mfikes23:03:01

Yeah, basing it on ConcurrentLinkedQueue makes it much more easy to reason about.

mfikes23:03:49

I'm exercising that code on Late 2013 iMac and it is fine on that box.

dnolen23:03:22

I had the delay between 10-20ms and it was smooth

dnolen23:03:57

on this machine 2014 iMac 4 core 3.6ghz Xeon at least

dnolen23:03:13

actually I think I’m wrong about the accept problem too

dnolen23:03:49

yes I am :slightly_smiling_face:

dnolen23:03:10

every time you eval, you construct a long waiting connection which can be used

dnolen23:03:21

so a new REPL can just use that one

dnolen23:03:33

but when it’s done it will make a new one

dnolen23:03:04

so you’re chasing a tail, so accept is not a problem, you’ll always have a connection to use

mfikes23:03:15

The stress test is working fine with that code on a Mid-2012 dual hexacore

dnolen23:03:34

hrm or not :slightly_smiling_face:

dnolen23:03:53

I still see issues in the Socket REPL case, digging

dnolen23:03:40

it’s definitely less likely now, so that’s a good since, but there’s clearly still a missing thing

mfikes23:03:56

Ahh, after 19936 requests in the stress test the 2012 Mac appears to have repro’d https://dev.clojure.org/jira/browse/CLJS-2629

mfikes23:03:06

Yeah, jstack shows the same blocked promise deref in browser-eval