#clojureverse-ops
2021-10-27
oxalorg (Mitesh)09:10:45

Hey folks, over the past few months http://Clojureverse.org has been down a handful of times. The first time was mysterious (read here: https://clojureverse.org/t/clojureverse-report-august-22-downtime-planned-changes-to-infra/8083), but the next few times it was an easy fix to get the site back online. It's time to upgrade the infrastructure and bring things up to date for more stability. I'll be spending some time today on the following tasks:
• move from DigitalOcean to Exoscale
• set up and install required software on the new server
• set up Discourse on a staging subdomain and verify that it works
• switch the existing Discourse instance into read-only LOCK mode
• export a fresh backup of the site
• import it into the new instance
• verify that it works
• change the DNS of http://clojureverse.org to point to the new instance
• rebuild Discourse with LetsEncrypt using the changed domain settings
• disable read-only mode
Expect a bit of downtime today! 🙈 🔫

gratitude 4
🤞 1
oxalorg (Mitesh)04:10:58

http://Clojureverse.org is now in READONLY mode while I'm creating a fresh backup

oxalorg (Mitesh)05:10:29

WOW! Discourse isn't importing its backups cleanly. Their recommended solution is to manually delete entries and their links from postgres 🙈 (or to pay someone). Going to delete everything and try rebuilding it again

oxalorg (Mitesh)05:10:37

Argh! Tried to rebuild it once again, and it turns out the DNS changes hadn't fully propagated yet, so now I've hit the LetsEncrypt rate limit 🙈
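
One way to avoid burning certificate attempts is to check that the hostname already resolves to the new server before kicking off a rebuild. The snippet below is only a rough sketch of that idea: `dns_points_to` and `wait_for_dns` are hypothetical helper names, the IP addresses are documentation placeholders, and a local lookup only reflects what this machine's resolver sees, not what LetsEncrypt's resolvers see.

```python
import socket
import time


def dns_points_to(hostname, expected_ip, resolve=socket.gethostbyname):
    """Return True if hostname currently resolves to expected_ip."""
    try:
        return resolve(hostname) == expected_ip
    except OSError:
        # Lookup failures (socket.gaierror is an OSError) count as "not yet".
        return False


def wait_for_dns(hostname, expected_ip, attempts=5, delay_seconds=60.0,
                 resolve=socket.gethostbyname, sleep=time.sleep):
    """Poll until hostname resolves to expected_ip or attempts run out."""
    for attempt in range(attempts):
        if dns_points_to(hostname, expected_ip, resolve):
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False


# Hypothetical usage; 203.0.113.10 is a placeholder, not the real server IP:
# if wait_for_dns("clojureverse.org", "203.0.113.10"):
#     print("DNS looks updated from here, safe to try the rebuild")
```

The `resolve` and `sleep` parameters exist only so the check can be exercised without network access or real waiting.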

oxalorg (Mitesh)06:10:19

Fresh rebuilds didn't work. I've reverted DNS back to the old instance, and after diving deep into what's wrong I found some issues in the data:

ERROR:  could not create unique index "index_incoming_referers_on_path_and_incoming_domain_id"
DETAIL:  Key (path, incoming_domain_id)=(/r/clojure.compact, 1229) is duplicated.
EXCEPTION: psql failed: DETAIL:  Key (path, incoming_domain_id)=(/r/clojure.compact, 1229) is duplicated. 
Then I connected to our old instance pg container and tried to debug this
discourse=# select path, count(path) as cp from incoming_referers group by path, incoming_domain_id having COUNT(*) > 1;
        path         | cp
---------------------+----
 /t:clojure          |  2
 /r/clojure.compact  |  2
 /r/clojure/.compact |  2
 /r/clojure.compact  |  2
(4 rows)
Looks like there are duplicates where these shouldn't exist
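
For anyone hitting the same restore error, a slightly extended version of that query also returns `incoming_domain_id` and the row ids, which makes the offending rows easier to locate. This is a minimal sketch that uses an in-memory SQLite database as a stand-in for the Postgres container; the table and column names come from the output above, the sample rows are made up (row 9001 in particular), and `find_duplicates` is a hypothetical helper name.

```python
import sqlite3

# SQLite stands in for Postgres here so the example runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE incoming_referers ("
    "id INTEGER PRIMARY KEY, path TEXT, incoming_domain_id INTEGER)"
)
# Illustrative rows only, not the real forum data.
conn.executemany(
    "INSERT INTO incoming_referers VALUES (?, ?, ?)",
    [
        (1259, "/r/clojure/.compact", 829),
        (8154, "/r/clojure/.compact", 829),
        (9001, "/t:clojure", 12),
    ],
)


def find_duplicates(conn):
    """Return (path, incoming_domain_id, count, ids) per duplicated key."""
    rows = conn.execute(
        "SELECT path, incoming_domain_id, COUNT(*), GROUP_CONCAT(id) "
        "FROM incoming_referers "
        "GROUP BY path, incoming_domain_id "
        "HAVING COUNT(*) > 1 "
        "ORDER BY path, incoming_domain_id"
    ).fetchall()
    # GROUP_CONCAT gives a comma-separated string; sort the ids for stable output.
    return [
        (path, domain_id, count, sorted(int(i) for i in ids.split(",")))
        for path, domain_id, count, ids in rows
    ]


duplicates = find_duplicates(conn)
print(duplicates)
```

Grouping and reporting on both columns matters because the unique index covers the pair `(path, incoming_domain_id)`, so the same path can legitimately appear under different domains.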

oxalorg (Mitesh)06:10:22

Okay, I manually removed all conflicts from the postgres referrer table. These tables are related to the link tables, so I made sure not to break them:

discourse=# select * from incoming_referers where path LIKE '%/r/clojure/.compact%';
  id  |        path         | incoming_domain_id
------+---------------------+--------------------
 1259 | /r/clojure/.compact |                829
 8154 | /r/clojure/.compact |                829
(2 rows)

discourse=# update incoming_referers set path = '/r/clojure.compact/' where id=8154;
UPDATE 1
discourse=# select path, count(path) as cp from incoming_referers group by path, incoming_domain_id having COUNT(*) > 1;
Creating a new backup and trying to restore that now
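
The same kind of fix can be sketched in a scripted form: keep every row (so anything pointing at those ids stays valid), make the later duplicates unique by changing their path, then create the unique index to confirm the restore would no longer trip over it. This is only an illustration on an in-memory SQLite stand-in. `make_unique` and the `#dup-<id>` suffix are hypothetical choices for the example, not what was actually run on the forum database, where the path was edited by hand.

```python
import sqlite3

# SQLite stands in for Postgres; the rows mirror the pair shown above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE incoming_referers ("
    "id INTEGER PRIMARY KEY, path TEXT, incoming_domain_id INTEGER)"
)
conn.executemany(
    "INSERT INTO incoming_referers VALUES (?, ?, ?)",
    [(1259, "/r/clojure/.compact", 829), (8154, "/r/clojure/.compact", 829)],
)


def make_unique(conn):
    """Rename every row that duplicates a lower-id row; return changed ids."""
    extras = conn.execute(
        "SELECT r.id, r.path FROM incoming_referers AS r "
        "WHERE EXISTS ("
        "  SELECT 1 FROM incoming_referers AS k "
        "  WHERE k.path = r.path "
        "    AND k.incoming_domain_id = r.incoming_domain_id "
        "    AND k.id < r.id) "
        "ORDER BY r.id"
    ).fetchall()
    for row_id, path in extras:
        # Hypothetical suffix; the row id keeps each renamed path distinct.
        conn.execute(
            "UPDATE incoming_referers SET path = ? WHERE id = ?",
            (f"{path}#dup-{row_id}", row_id),
        )
    return [row_id for row_id, _ in extras]


changed = make_unique(conn)

# Raises sqlite3.IntegrityError if any duplicate key is still present.
conn.execute(
    "CREATE UNIQUE INDEX index_incoming_referers_on_path_and_incoming_domain_id "
    "ON incoming_referers (path, incoming_domain_id)"
)
row_count = conn.execute("SELECT COUNT(*) FROM incoming_referers").fetchone()[0]
print(changed, row_count)
```

Updating rather than deleting is the design choice that matters here: no row id disappears, so related link rows are left untouched.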

oxalorg (Mitesh)06:10:58

YES!! This worked 🎉 Going to try and work through the LetsEncrypt rate limit by manually syncing the certs from the previous instance 🙈

oxalorg (Mitesh)07:10:42

Something wrong with SMTP configuration, looking into it!

oxalorg (Mitesh)07:10:00

We're done, the migration completed successfully 😁 sharkdance clojure-spin! Emails are also working now, but there are some background jobs still running for the import process to complete. Because of this, email notifications are disabled until all these tasks are completed. Will re-enable them soon!

🎉 3
oxalorg (Mitesh)17:10:52

Processing of all background tasks is complete, I've enabled email notifications/digest! 🙂

borkdude13:10:51

Thank you for making http://clojureverse.org possible - it's a great forum.

4
gratitude 3