How I moved from QUIC.Cloud to BunnyNet CDN.
99.99% Uptime Goal in 2024
During late Q3, perhaps early Q4, my company introduced us to its new uptime goal of “Five Nines”, which written with numbers looks like 99.999%. This implies that the company would only tolerate 5.26 minutes of downtime all year. Considering we have over 300 locations across the country to manage, that’s what I would call a stretch goal. As it stands, I am not even sure we have a way to measure that, although it is refreshing to have something to work towards as a team.
Since I love what I do for work, I’d like to practice my engineering craft and implement an uptime/availability goal here on my own web server. I am only a one-man operation with limited availability, so I do not think Five Nines is feasible. Instead I will opt for 99.99%.
Availability is generally calculated based on how long a service was unavailable over some period. Assuming no planned downtime, Table 1-1 indicates how much downtime is permitted to reach a given availability level.
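The arithmetic behind those downtime budgets is simple enough to sketch. This is a minimal example, assuming a 365.25-day year and no planned downtime; the function name `allowed_downtime_minutes` is my own and not from any library.

```python
# Minutes in an average year (365.25 days), an assumption for this sketch.
MINUTES_PER_YEAR = 365.25 * 24 * 60


def allowed_downtime_minutes(availability_percent: float,
                             period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Return how many minutes of downtime a target permits in the period."""
    return period_minutes * (1 - availability_percent / 100)


if __name__ == "__main__":
    # Print the yearly and (average) monthly budget for common targets.
    for target in (99.0, 99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target:>7}%  {minutes:10.2f} min/year  {minutes / 12:8.2f} min/month")
```

With these assumptions, Five Nines works out to about 5.26 minutes per year, and my 99.99% target allows roughly 52.6 minutes per year.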
Architecture and Design
On December 25th, 2023, I rebuilt this web server to follow one of my favorite principles: keep it simple. Moving from WordPress to serving plain HTML5/CSS3 greatly reduces the complexity of the system. It removes the backend database requirement and also reduces resource overhead. I had learned that WordPress (being PHP-based) runs a build process for each page requested by a visitor, which means the server does more work to serve the same content. It also implies that a large spike in normal traffic would have the potential to DoS the backend server.
To further reduce the likelihood of a DoS event, I’ll need a Content Delivery Network. During my not-so-long-ago WordPress days, I would have used QUIC.Cloud. They have a plugin with fantastic integration with WordPress and the backend OpenLiteSpeed webserver. However, I’ve learned that QUIC.Cloud seemingly struggles to effectively cache plain HTML. I could go with Cloudflare and their free tier, but it does not have the SLO/SLA that I am looking to achieve. I’m also not willing to shell out $30/month for their CDN. Instead I have opted to use Bunny.net. They have about 40 more PoPs (Points of Presence) than QUIC.Cloud, which should reduce latency to my website in some parts of the world, but the big selling point was their documentation and their ability to integrate painlessly with basic HTML5.
Another important note is the location of my webserver. While it is generally okay to host a website from home, I will not. Instead, this webserver lives, and has always lived, in the Akamai (formerly Linode) data center in Fremont, California. This removes my need to worry about redundant cooling, electricity, and network. It also allows me to scale my server both vertically and horizontally as my needs change. Inside this Fremont data center, I am also performing rolling backups. This way, even if the server is broken beyond repair for whatever reason, I can roll back and restore to a known good state in about 10 minutes.
Now that we are comfortable with our hardware and operational software stack, we need to properly monitor these underlying services. Let’s think about this. I’ll need to have decent visibility into:
- Resource utilization.
- Error rates.
- SSL certificate status.
- Webserver latency.
- CDN latency.
NewRelic will be my primary monitoring solution. Using their locally installed agent, I can monitor resource utilization, error rates, and backend latency. NewRelic can also alert me by email and through a personal PagerDuty account in the event of a full-scale outage. The Linode Cloud Manager will act as my secondary method for alerting on unusually high resource utilization over an extended period of time.
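To illustrate what "unusually high utilization for an extended period" means in practice, here is a minimal sketch of a sustained-threshold check. It is not how NewRelic or the Linode Cloud Manager implement their alerts; the class name `SustainedThresholdAlert` and the example threshold and window values are assumptions of mine.

```python
from collections import deque


class SustainedThresholdAlert:
    """Fire only when every sample in a full window exceeds the threshold."""

    def __init__(self, threshold_percent: float, window: int):
        self.threshold_percent = threshold_percent
        # Keep only the most recent `window` samples.
        self.samples = deque(maxlen=window)

    def observe(self, value_percent: float) -> bool:
        """Record one sample and return True if the alert should fire."""
        self.samples.append(value_percent)
        window_full = len(self.samples) == self.samples.maxlen
        return window_full and all(s > self.threshold_percent for s in self.samples)


if __name__ == "__main__":
    # Hypothetical CPU readings, one per polling interval.
    alert = SustainedThresholdAlert(threshold_percent=90, window=3)
    for cpu in (95, 96, 50, 91, 92, 93):
        print(cpu, alert.observe(cpu))
```

Requiring a full window of high readings keeps a single short spike from paging me in the middle of the night.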
UptimeKuma will be used to monitor SSL certificate status, webserver latency, and CDN latency. It will also alert me to outages via my personal PagerDuty account. This UptimeKuma instance runs in Oracle Cloud Infrastructure, in a region separate from both myself and the webserver. By leveraging this third geographical point of monitoring, I can collect and analyze data to find latency issues and areas of improvement that would otherwise be hidden from me.
2024 is here, and for the first time in its existence this website/project finally has a practical goal. Hope to see you later in the year!
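For a rough idea of what those checks involve, here is a minimal sketch using only the Python standard library. It is not how UptimeKuma works internally; the function names are my own, and `example.com` is a placeholder host rather than my actual origin or CDN hostname.

```python
import socket
import ssl
import time
import urllib.request
from datetime import datetime, timezone
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[datetime] = None) -> float:
    """Days left before a certificate's notAfter time (ssl.getpeercert() format)."""
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return (expires - now).total_seconds() / 86400


def fetch_not_after(host: str, port: int = 443, timeout: float = 10.0) -> str:
    """Open a TLS connection and return the server certificate's notAfter field."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def measure_latency_ms(url: str, timeout: float = 10.0) -> float:
    """Time a full HTTP GET (connect, request, body download) in milliseconds."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()
    return (time.perf_counter() - start) * 1000


if __name__ == "__main__":
    host = "example.com"  # placeholder; swap in the origin or CDN hostname
    print(f"certificate expires in {days_until_expiry(fetch_not_after(host)):.1f} days")
    print(f"GET latency: {measure_latency_ms(f'https://{host}/'):.0f} ms")
```

Running the same latency check against both the origin hostname and the CDN hostname, from a machine in a third region, gives a simple side-by-side comparison of webserver and CDN latency.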