This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2020-08-22
Channels
- # babashka (2)
- # beginners (81)
- # calva (5)
- # chlorine-clover (3)
- # cider (1)
- # cljsjs (1)
- # cljsrn (24)
- # clojure (67)
- # clojure-europe (3)
- # clojurescript (37)
- # code-reviews (2)
- # conjure (12)
- # core-async (4)
- # datalog (1)
- # datomic (6)
- # emacs (2)
- # figwheel-main (1)
- # graalvm (12)
- # java (4)
- # kaocha (9)
- # meander (3)
- # other-lisps (1)
- # pathom (14)
- # re-frame (2)
- # sci (32)
- # shadow-cljs (77)
- # sql (88)
- # xtdb (54)
Anyone using no.disassemble with a deps.edn project? What does one do to make the bytecode available when not running via leiningen?
@mac01021 I can't speak to no.disassemble but I have used this from a deps.edn project: https://github.com/clojure-goes-fast/clj-java-decompiler (there's an alias for that in my dot-clojure repo's deps.edn file)
Oh, I see you've opened an issue there... so it's about specifying a JVM option pointing at the actual path of the no.disassemble JAR...
@seancorfield Yes, that's right. But thanks for the pointer to clj-java-decompiler! It looks like if I can't get the java agent thing to work, your link will meet my needs.
I answered you in that issue on the repo.
I copy/pasted what worked for me locally.
I'll paste it in here for anyone else:
$ clj -Sdeps '{:deps {nodisassemble/nodisassemble {:mvn/version "0.1.3"}}}' -J-javaagent:$HOME/.m2/repository/nodisassemble/nodisassemble/0.1.3/nodisassemble-0.1.3.jar
Initializing NoDisassemble Transformer
Clojure 1.10.1
user=> (require '[no.disassemble :as nope])
nil
user=> (println (nope/disassemble (fn [])))
// Compiled from NO_SOURCE_FILE (version unknown : 52.0, super bit)
public final class user$eval360$fn__361 extends clojure.lang.AFunction {
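For comparison, here is a sketch of how the clj-java-decompiler alternative mentioned above can be used from a deps.edn project — the version number is an assumption, so check the repo for the latest:

```clojure
$ clj -Sdeps '{:deps {com.clojure-goes-fast/clj-java-decompiler {:mvn/version "0.3.4"}}}'
user=> (require '[clj-java-decompiler.core :refer [decompile]])
nil
user=> (decompile (fn [] (println "Hello")))
;; prints the decompiled Java source for the generated fn class
```

Unlike no.disassemble, this needs no -javaagent flag, which sidesteps the JAR-path problem discussed above.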
Thank you! Sorry I didn't reply earlier. I had to run away at that moment to deal with my 3-year-old.
Hi, I am trying to pinpoint why sending larger WebSocket messages (about 400 kB) is so slow on our production server. All the data being sent are already loaded in memory, so I call Sente's send! function immediately. I am using transit as the message protocol. The data get to the client after multiple seconds, frequently taking over 30 seconds. When I tried to send a similar amount of data from my development server on localhost, it was almost instant. Any tips on what could be going wrong or how to debug this better?
To me this sounds like maybe you have response buffering enabled on an intermediate proxy / load balancer.
@U5RCSJ6BB Why do you think this is responsible? What should we look for? We are using nginx.
Because I've seen similar things before and some proxies by default accumulate packets from the upstream server before sending any to the client. This is not behavior you want when using web sockets. Sounds like this is true of nginx as well: https://serverfault.com/questions/789417/should-proxy-buffering-be-disabled-in-nginx-to-support-sockjs-xhr-streaming
So sounds like you want to disable the proxy_buffering nginx directive
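For reference, disabling buffering for a WebSocket location might look like this — a sketch only; the location path and upstream name are assumptions, not from the thread:

```nginx
location /chsk {
    proxy_pass http://clojure_backend;
    proxy_http_version 1.1;                    # required for the WebSocket upgrade
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    proxy_buffering off;                       # forward upstream data immediately
}
```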
So we were measuring the situation by running tcpdump between our Clojure server and nginx in front of it, and it seems that the delay occurs in the Clojure server; once the data reach nginx, they are forwarded immediately to the client
We tested this just to be sure, but it does not seem to improve the speed (it could probably improve latency, but that is not our problem)
Got it. Curious problem. I'm not familiar enough with sente and that stack to know what might be the cause. I just use ring-jetty9 websocket support and a thin adapter layer for core.async to implement the sending
If core.async is involved via sente and you're using core.async yourselves could there be a dispatch pool starvation problem because of blocking operations being run on dispatch threads? I've seen that manifest itself in strange ways like this. Jstack is useful in that case to understand what the core.async dispatch threads are doing
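To illustrate the starvation scenario above: core.async runs go blocks on a small fixed-size dispatch pool (8 threads by default), so a blocking call inside a go block stalls unrelated work. A minimal sketch, with Thread/sleep standing in for a blocking DB call:

```clojure
(require '[clojure.core.async :as a])

;; Anti-pattern: blocking inside a go block occupies one of the
;; shared dispatch-pool threads for the full 30 seconds.
(a/go (Thread/sleep 30000))

;; Better: a/thread runs the body on its own dedicated thread,
;; leaving the dispatch pool free for cheap, non-blocking work.
(a/thread (Thread/sleep 30000))
```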
Sounds like it; I was looking into the Sente documentation and it seems to be a possible problem. The issue is that the precomputed values I am sending are computed lazily, and realizing them requires accessing a DB on a different machine multiple times. Will rewrite the code so it works faster.
@U0CJ19XAM @U5RCSJ6BB Thanks a lot for your help and for discussing potential issues. There were two problems in the end: computation and blocking database I/O when sending the response, and the blocking of all threads in the Sente pool, so the server stopped responding to otherwise quick messages. I fixed the problems by setting Sente to use its own thread pool (as discussed here: https://github.com/ptaoussanis/sente/issues/265) and by realizing all lazy sequences before sending them to Sente (in the case of administration, we were precomputing data every 15 minutes, but they contained lazy sequences which needed to read further values from the DB for realization). The resulting speed looks like this:
Glad you got things resolved @pavel.klavik!
Cheers!
@pavel.klavik Are you being throttled by your network provider? Can you use something like https://github.com/websockets/wscat to see if it's related to the browser, your network provider (AWS, for example), or your server (can you hit your ws endpoint from within your prod VPC, for example, and see similar timings)?
Is it possible you have a large number of connections in your prod env and you're looping over them to find the right one / broadcasting to all? If you're broadcasting and looping over conns, is it possible your algorithm is just O(n)?
The problem should not be with my internet connection; it is slow everywhere. Also, the app downloads a lot of images and other resources while running and they are very fast. Small WS messages run quickly, as shown in the screenshot.
No, this particular message is sent to just a single connection, and overall I don't expect to have more than ~10 connections active at any moment.
Not sure, setting it up like this:
(sente/make-channel-socket!
  (aleph/get-sch-adapter)
  {:user-id-fn (fn [req] (:client-id req))
   :packer     (sente-transit/get-transit-packer)})
Can you actually identify how many conns you have in prod? Maybe you're not cleaning them up?
so it seems to be :json, judging by the code
is throttling larger WS messages common?
hmm, so by checking :connected-uids, we have 5 connections at the moment; we don't really have high traffic on the server, and the same happens after restarting it
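For anyone else wanting to check this: the map returned by sente/make-channel-socket! includes a :connected-uids atom that can be dereffed. A sketch, where `socket` is an assumed name for that return value:

```clojure
;; :connected-uids holds {:ws #{...} :ajax #{...} :any #{...}};
;; :any is the union of WebSocket and Ajax uids.
(let [{:keys [connected-uids]} socket]
  (count (:any @connected-uids)))
```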
how would you use wscat to pinpoint the problem?
chrome devtools has good support for showing websocket connection info. Have you narrowed down whether the issue is on the browser side or the server side?
What should I look for? Not sure where the problem is. I will definitely try to look at my server traffic to see how long it takes before it is sent.
I got this in headers tab, and there are frames available
@pavel.klavik, check the messages tab. It should give the timing of the messages. If the message finishes sending quickly, then it's probably a server issue; if it's slow, then it's probably the browser.
It takes about 30 seconds there from asking for the data till receiving it.
seems like it's a server side issue
client-side is very unlikely since I am doing very little there, just displaying the data
so it either happens on the server-side, or in the network in between, will try to find out tomorrow when our devops guy comes back from vacation
Is it only for ws, or for any large message? Maybe you have a small tcp_sndbuf (to handle many connections)? Also, as far as I remember, local tests don't use the network adapter at all.
Happens only for large WS messages. Downloading or uploading images or even large files works fine.
Spin up a machine in the same datacenter / VPC as your prod server, install wscat on it, simulate the ws connection like above, and time it using unix time or a stopwatch. You need to bisect the problem.
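A sketch of that bisection step as a terminal session — the endpoint URL and client-id are placeholders, and this obviously needs Node and a reachable server, so the timing is illustrative only:

```
$ npm install -g wscat
$ time wscat -c "wss://example.com/chsk?client-id=test"
```

Running this from inside the prod VPC versus from outside separates server-side delay from network-path delay.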
@pavel.klavik To confirm, when running locally you can send that large payload no problem?
Ya, that is in the second screenshot, it took about 200 ms
Nginx might be your issue. Did this "just start happening" or did you try this in prod for the first time now?
I think it was always slower but became much more noticeable recently as the number of our users/data is growing
It might be something involving our nginx configuration, we will need to test everything to see where the problem could be.
Ok. Bisect the problem by doing the wscat test mentioned above in your higher environment. Then you will know whether the slowness is in your server code or not.
sure, plus we can look into nginx logs and network data there to see how fast it is, thx for pointers
Or clone prod data locally (depending on your industry) and try it again locally with the same amount of data and see if it's still 200ms or not. Good luck!
we also have a staging server, so we can play there, data should not be very different from our testing dev data I have
Btw. Sente or WS itself seems to be merging multiple messages into a single frame. Is there a way to avoid that?
In the figures above, I am sending two messages but get a combined reply in a single frame.
@U0CJ19XAM So we did some digging, and by running tcpdump between our Clojure server and nginx, we found out that the delay is caused by the Clojure server
Further, we did an experiment on our staging server running the same code, and it is much faster there, so I am quite puzzled
By reading the code, it seems that Sente's send-fn! is async; not sure how to get further insight
Add some instrumentation. What are your observability capabilities? Are you running out of memory? Disk? Is your VPS provider throttling that environment?
After more digging, I think it is just related to this https://github.com/ptaoussanis/sente/issues/265 and the fact that the precomputed data are stored lazily, so they are realized when we ask for them, costing the extra delay. I will need to do some further experiments with it.
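The eventual fix for this kind of problem can be sketched like so — the function names here are hypothetical, standing in for whatever builds the precomputed data:

```clojure
;; Hypothetical precompute step: mapv is eager, so enrich-from-db runs
;; here (e.g. on the 15-minute schedule) rather than later, when Sente
;; serializes the reply on its send path.
(defn precompute-rows []
  (mapv enrich-from-db (fetch-rows)))

;; Alternatively, doall forces an already-built lazy seq before caching:
;; (doall (map enrich-from-db (fetch-rows)))
```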