This page is not created by, affiliated with, or supported by Slack Technologies, Inc.
2023-09-13
http://cljdoc.org has been down for a bit today (along with other websites). Not sure if anything needs to happen other than wait but thought I'd post here 🙂
Ah, I was just coming here to mention that http://cljdoc.org seems to be down.
(sorry, after posting that I had a meeting, then lunch, then a doctor's appt, then I worked on a bunch of HTML/JS all afternoon!)
I had a quick look at "server vitals" for the last 24 hours and it's interesting that there was a significant spike in load before the server went offline...
So memory seems to be the culprit. Seeing lots of OOM errors in the logs, which is a bit surprising given the chart only shows 40-50% utilization. But I think it probably has to do with what Nomad allocates for the container.
Thanks for the charts @U050TNB9F! On Sentry I do see a couple of OOM exceptions. A while ago, I added something that should dump the heap on OOM. I'll see if that actually worked.
I guess it is questionable for cljdoc to try to continue after an OOM. But I guess it does right now.
yeah, it also feels like maybe it should restart? I've also occasionally gotten notices that it's down, but it became available a few minutes later... maybe it does restart in some scenarios?
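One way to make "exit instead of limping on" concrete is sketched below. This is not how cljdoc is set up. It only assumes that a supervisor (Nomad here) restarts the process once it exits. The class name `OomGuard` and its methods are hypothetical. HotSpot also offers the flags `-XX:+HeapDumpOnOutOfMemoryError` and `-XX:+ExitOnOutOfMemoryError`, which give similar behaviour without any code.

```java
// Hypothetical helper: halt the JVM when an uncaught OutOfMemoryError reaches
// the top of any thread, so that an external supervisor can restart it.
final class OomGuard {
    private OomGuard() {}

    /** True if the throwable or one of its causes is an OutOfMemoryError. */
    static boolean isOutOfMemory(Throwable t) {
        int depth = 0;
        // The depth bound guards against pathological cause chains.
        for (Throwable c = t; c != null && depth < 32; c = c.getCause(), depth++) {
            if (c instanceof OutOfMemoryError) {
                return true;
            }
        }
        return false;
    }

    /** Installs a default handler; call once at startup. */
    static void install() {
        Thread.setDefaultUncaughtExceptionHandler((thread, error) -> {
            if (isOutOfMemory(error)) {
                // halt() skips shutdown hooks, which may themselves need memory.
                Runtime.getRuntime().halt(1);
            }
            error.printStackTrace();
        });
    }

    public static void main(String[] args) {
        install();
        System.out.println(isOutOfMemory(new RuntimeException(new OutOfMemoryError("demo"))));
    }
}
```

Note the limit of this approach: the handler only sees errors that escape a thread uncaught. An OOM that is caught and logged somewhere in a request handler never reaches it, which is one reason the JVM flags may be the simpler choice.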
I also rarely see OOM exceptions on http://Sentry.io. But maybe it is not capturing all of them?
We got another OOM that blew away the server yesterday. I rebooted cljdoc. (the down alert forwarding worked @U050TNB9F!) My change to save the heap dump seems to have worked, so I might take a peek at it to see if it gives any obvious clues. @U050TNB9F I don't have access to stats on cljdoc JVM heap usage, do you see any evidence of a memory leak in heap usage graphs over time?
I have this chart from DigitalOcean but it sounds like you're maybe looking for something at the JVM level? (this is whole-system)
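For the JVM-level view asked about above, a minimal sketch using the standard `java.lang.management` API follows. It reports heap usage as the JVM itself sees it, which can differ from a whole-system chart when the container's memory allocation is smaller than the host's. The class name `HeapReport` and its methods are hypothetical, and nothing here reflects how cljdoc is actually instrumented.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper: one-line summary of JVM heap usage.
final class HeapReport {
    private HeapReport() {}

    /** Percentage of max heap in use, or -1 when max is undefined. */
    static double usedPercent(long usedBytes, long maxBytes) {
        if (maxBytes <= 0) {
            return -1.0; // MemoryUsage.getMax() returns -1 when no max is defined
        }
        return 100.0 * usedBytes / maxBytes;
    }

    /** Reads the current heap figures from the platform MemoryMXBean. */
    static String describe() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        long mib = 1024L * 1024L;
        return String.format(
            "heap used=%d MiB committed=%d MiB max=%d MiB (%.1f%% of max)",
            heap.getUsed() / mib,
            heap.getCommitted() / mib,
            heap.getMax() / mib,
            usedPercent(heap.getUsed(), heap.getMax()));
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Logging a line like this at a fixed interval would give a heap-over-time series to check for a slow leak. A steadily rising floor after each garbage collection is the usual hint, though the heap dump remains the better source for finding what is being retained.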