I'm having issues with websockets staying alive. Instead, they keep predictably timing out, even for local dev. Is this something that's peculiar to dev mode for some reason... or is this something that will also happen for a prod deployment?
What I see in the browser is "Connection Closed: 1001 Connection Idle Timeout". I was expecting HTMX to auto-retry, but for whatever reason it does not. Setting a logger as per https://htmx.org/docs/#logging gives me the following in my console, which doesn't tell me much, but does show wasClean: false
Logging from biff in connect (websocket connection handler) and printing the status-code and reason gives me 1001, and Connection Idle Timeout
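If the client library does not retry on its own, a manual reconnect policy can be sketched as below. This is only an illustration of full-jitter exponential backoff; the function names, the default delays, and the "retry on everything except 1000" rule are assumptions made for the example, not HTMX's or biff's actual behavior.

```typescript
// Full-jitter exponential backoff: pick a random delay in
// [0, min(capMs, baseMs * 2^attempt)). `rand` is injectable so the
// function can be checked deterministically.
function reconnectDelayMs(
  attempt: number,
  baseMs: number = 1000,
  capMs: number = 30000,
  rand: () => number = Math.random
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(rand() * ceiling);
}

// Hypothetical policy (an assumption for this sketch): reconnect on every
// close code except a normal closure (1000). The 1001 idle timeout seen
// above would therefore trigger a retry.
function shouldReconnect(closeCode: number): boolean {
  return closeCode !== 1000;
}
```

In a browser, the `close` event handler would check `shouldReconnect(event.code)` and, if true, schedule a new connection with `setTimeout` after `reconnectDelayMs(attempt)`, resetting `attempt` to 0 once a connection opens.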
yeah; I threw everything at the wall to get nginx out of the way for testing - will certainly back out anything that isn't necessary. I'm so far from production use, it's not a real concern yet.
it was nginx, in this case; I can't recall whether I was even using it (or if I tested without it) when I was debugging the timeout about a year ago. I could probably retrace my steps, but that wouldn't be productive. The point is, having uttered the correct incantations in nginx.conf[1], and disabled my server-side-ping kludge, the underlying behavior of http-kit and ring-jetty-adapter[2] is revealed:
http-kit 2.8.0 (current): timeout after 3600s as expected with nginx changes
ring-jetty9-adapter[2]: timeout after 500 seconds
1. some subset of this is likely sufficient, or already in your config if you're proxying websockets:
proxy_pass ;
proxy_pass_request_headers on;
proxy_socket_keepalive on;
proxy_read_timeout 3600s;
proxy_connect_timeout 3600s;
proxy_send_timeout 3600s;
send_timeout 3600s;
proxy_http_version 1.1;
proxy_set_header X-Forwarded-By $server_addr:$server_port;
proxy_set_header X-Forwarded-For $remote_addr;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection $connection_upgrade;
proxy_set_header Host $host;
2. or the version of it I've been using recently (not sure when I last updated; I like to keep everything current in the early/mid dev phase): info.sunng/ring-jetty9-adapter {:mvn/version "0.30.2"}
ring-jetty9-adapter, 500 seconds: that's precisely what I've been getting!
I notice I've 'apologized' for my memory a lot; honestly, I should apologize for my laziness - it wouldn't be that hard to dig up the answers from git[hub] and logseq, but my tools-for-thought are dull
cool
just noticed that you mentioned 60s earlier. A lot of timeouts are 60s by default with nginx fyi
you should probably trim down some of that config. I wouldn't put "proxy_connect_timeout 3600s", for example. If you have to do that, something's wrong.
just a little note: it seems like switching out jetty for http-kit has resolved this, so I don't get the websocket timeouts anymore. Fingers crossed, we'll see how this goes.
huh interesting, wasn't aware of this behavior. how long does it take to time out?
Interesting @jf.slack-clojurians - I ran into this[1] and tried all sorts of things; ultimately, I needed to be able to have the client seamlessly catch up to an event stream when disconnected during a session anyway, so I solved that first. Then, by tweaking timeout values (and maybe some stuff I don't recall), I was able to get the idle-timeout up to 5m (iirc). The websocket RFC seems to imply that pings can be initiated by either side, and the jetty server code seemed to support the reply (again iirc; this was about a year ago), but didn't actually 'pong' (even when I implemented the ping callback myself). I currently have server-initiated pings, and the clients don't disconnect under normal circumstances, but I'd love to get rid of that, or at least ping from the client. Was the switch to http-kit pretty painless? Do you know if there's ping/pong'ing going on behind the scenes?
1. in the browser, with and without HTMX, as well as with a [clojure]dart (flutter) client.
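On pinging from the client: the browser WebSocket API does not expose protocol-level ping frames, so a browser-side "ping" has to be an ordinary application message. A minimal sketch of that idea follows. The `SocketLike` interface, the `{type: "keepalive"}` message shape, and the function names are all hypothetical, and it assumes the server resets its idle timer on any incoming message and ignores this one, which has not been verified here for either jetty or http-kit.

```typescript
// Minimal shape of the browser WebSocket that this sketch relies on, so it
// also type-checks outside a browser. A real WebSocket satisfies it.
interface SocketLike {
  readyState: number;
  send(data: string): void;
}

const SOCKET_OPEN = 1; // same value as WebSocket.OPEN

// Sends one application-level keepalive message if the socket is open.
// Returns true if a message was sent.
function sendKeepalive(socket: SocketLike): boolean {
  if (socket.readyState !== SOCKET_OPEN) return false;
  // Hypothetical message shape; the server must recognize and ignore it.
  socket.send(JSON.stringify({ type: "keepalive" }));
  return true;
}

// Starts a periodic keepalive and returns a function that cancels it.
function startKeepalive(socket: SocketLike, intervalMs: number): () => void {
  const timer = setInterval(() => sendKeepalive(socket), intervalMs);
  return () => clearInterval(timer);
}
```

The interval would need to be comfortably shorter than the shortest idle timeout in the path (server and any proxy); call the returned function from the socket's `close` handler so the timer does not outlive the connection.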
@foo what led you to choose the particular ring/jetty lib you did?
I just wanted something mature/supported that had websocket support. this was a good read: https://ericnormand.me/mini-guide/clojure-web-servers
I think the Kit framework uses Undertow instead of Jetty? Could be worth looking into.
If anyone wants to give this a try and see if it solves the websocket thing, that would be pretty interesting: https://github.com/luminus-framework/ring-undertow-adapter
@foo anecdotally, not long. Short enough that I was able to test it and have it occur often enough to conclude that this is a real thing. I didn't bother to time it, but I'll switch things back to jetty and then do some math to get a more precise value (did a quick search; I don't know how to subtract 2 timestamps right now)
I'm deep in other stuff right now, but when that's done, I'll try ring-undertow-adapter and, if that doesn't help, http-kit and report back
@michael403 thank you for your input and helpful info! it's a relief to know that I'm not going crazy and it's not just me. The switch to http-kit was actually pretty painless once I got over the mental hurdle of it all. Credit has to go to @foo for a great presentation ("The Design of Biff", iirc), and emphasizing that biff is hackable and that you can switch things out :)
most recent timeout: 8 min 20s (apologies: it's not 26s)
and again: 8min 20s again!
3rd test: 8min 20s again
4th test: 8min 20s again
I think we can safely conclude that 8min 20s is it. There might be something to this interval: 8min 20s = 500s
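For the "subtract 2 timestamps" step and the 8min 20s = 500s conversion, a small sketch in TypeScript. The two timestamps are made up for illustration (they are not from the actual logs), and the helper names are hypothetical.

```typescript
// Made-up connect/disconnect times, chosen to be 8min 20s apart.
const openedAt = new Date("2024-01-01T10:00:00Z");
const closedAt = new Date("2024-01-01T10:08:20Z");

// Date.getTime() returns milliseconds since the Unix epoch, so subtracting
// two of them gives the elapsed time in milliseconds.
function elapsedSeconds(start: Date, end: Date): number {
  return (end.getTime() - start.getTime()) / 1000;
}

// Formats a whole number of seconds as "<m>min <s>s".
function formatMinSec(totalSeconds: number): string {
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  return `${minutes}min ${seconds}s`;
}

const secs = elapsedSeconds(openedAt, closedAt);
console.log(secs, formatMinSec(secs)); // 500 seconds, i.e. 8min 20s
```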
for the record, in my search for a solution I also tried updating the htmx version (from 1.9.10 to 1.9.12), and also specifying the version of the ws extension (also 1.9.12). That didn't work. It was the switch to http-kit that did. http-kit does pitch itself as being great for websockets... so I decided to give it a try to see if that would help. In terms of performance, perhaps things have changed with http-kit? https://github.com/http-kit/http-kit/wiki/4-Benchmarking
After a bit of a look at both options, I liked http-kit better:
• zero dependencies
• already in my deps tree
• the as-channel abstraction over async http & ws looks nice
• pretty readable code (at least the chunk of the clojure side I skimmed)
Unfortunately, while my quick and dirty refactoring of a slice of my codebase didn't turn up any immediate problems, the slice I chose (a websocket from typescript in browser) does experience the timeout problem as before, and in fact my server-side ping doesn't keep it alive (probably just a shorter default timeout); I might be missing something, though. The big downside of http-kit is that the docs are... sparse.
To clarify, are you saying that switching did not resolve your timeout issues? The docs could be better. What are you missing? There is #http-kit here fyi that you could pop into
Just to check, did you change the system map?
Correct; and my timeouts are 60s, which is interesting. I meant there might be something I did wrong or didn't do, or something different about my setup (are you reverse proxying? I am, with nginx). Good call on the channel; I want to retrace my steps a bit first; I was tired when I did the refactor.
I assume you mean the context map that biff components use; I did put the server in there for clean shutdown. I haven't updated biff for a while, and I know Jacob did some redesigning of the minimalist component system; I'm not sure what it looks like at the moment. I have a vector of use-* function references which each start something, often add something to the context map, and add to the shutdown list. I replaced use-jetty with a new use-http-kit and changed the way the ws endpoint in question handles the upgrade; replaced the jetty/send! calls with http-kit ones, and bingo, the prototype I refactored works. But it times out after 60s. I haven't investigated further yet; it could be client-side, or some vestige of my attempts to get ring-jetty-adapter to pong or have a long timeout; or something I haven't thought of
I am not. With proxying, you (obviously) add another variable into the mix. I am for now in dev mode (so going directly to the application server). I would fix the issues with dev first. Re the changes you did: yes, bingo! that is what I did as well. I haven't had issues since the switch. If you noted down all the changes you did (hopefully you did), it would be easier to track things down. I'm actually surprised at your pre-http-kit timeouts. You can see that mine are at 8min 20s, above the 5 min that you get after all your changes
yeah; it is my dev environment, but i need the nginx proxy so that some of my other services aren't cross-site
I'm already happy I switched (or started to), for the reasons above, but also the timeout thing should be easier to debug, if only by triangulating with the previous server's behavior
I was just off on a "should I port sente (and core.async) to clojuredart?" tangent. I might need to step away from this problem for a bit, before I start thinking about using UDP for the mobile client again
if you're interested in debugging pings and pongs, you can switch to using the Ring WebSocket API (I've got sample code at https://github.com/http-kit/http-kit/pull/572, btw) and then create a handler for on-ping as well (see https://github.com/ring-clojure/ring/wiki/WebSockets)