This blog now runs on Kubernetes: Here's my architecture
As I alluded to in my previous post, I've recently been working on moving my blog (and all of my services) to Kubernetes. Up until this point, I'd been running all of my services in containers, but on specific machines with Docker installed. I had a pretty sophisticated and automated deployment process, but it was only ever an interim solution before I delved into Kubernetes.
The decision to move to Kubernetes was further expedited by being given access to the Digital Ocean Kubernetes Limited Availability Preview. I'd often considered moving to K8s, but faced with a choice between paying a small fortune for Google Kubernetes Engine, or deploying it manually, I decided to hold fire.
But with access to the Limited Availability and some careful planning and architecting, I did it! This blog is now running fully in Kubernetes.
The diagram below shows the architecture of this blog:
There are several key components to my setup:
- Managed Kubernetes: Currently running on Digital Ocean Kubernetes - comparable to Google Kubernetes Engine.
- ingress-nginx and cert-manager: ingress-nginx provides an nginx-powered ingress controller. It removes the need for a load balancer per deployment, handing traffic off to the right backend based on host headers. cert-manager is a tool from Jetstack that automatically provisions Let's Encrypt certificates for ingress-nginx.
- Amazon S3 & CloudFront: Amazon S3 stores any images that I upload to my blog posts, which keeps my Pods stateless. CloudFront sits in front of S3 to accelerate image delivery via CDN. I also use S3 as the target for a CronJob that takes a database backup and uploads it to a bucket, with lifecycle policies that change the storage class of the backups before eventually deleting them.
- MySQL Database: The database for this blog is external to Kubernetes, as I want everything running in K8s to remain stateless. The database servers are deployed via Terraform, and configured via Ansible. Backups are taken nightly via a CronJob scheduled in Kubernetes.
- GitHub, Google Cloud Build & Google Container Registry: All of my container images are developed locally on my laptop using Visual Studio Code. The source is then pushed to GitHub. Google Cloud Build has a trigger that watches the GitHub repository: when a push is made to the master branch, it builds the container image and stores it in Google Container Registry. Kubernetes then pulls these images from GCR as part of the Deployment spec.
- Google Cloud DNS: This is used for all of my external DNS Records.
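To give a flavour of how the ingress-nginx and cert-manager pieces fit together, here's a minimal sketch of an Ingress for the blog. The API version reflects recent Kubernetes releases, and the issuer, host, and Service names are illustrative rather than my exact config:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: blog
  annotations:
    # Tells cert-manager to issue a certificate via a ClusterIssuer
    # (the issuer name here is a placeholder)
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - blog.example.com
      secretName: blog-tls       # cert-manager stores the issued cert here
  rules:
    - host: blog.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: ghost      # placeholder Service name
                port:
                  number: 2368   # Ghost's default port
```

Because ingress-nginx routes on host headers, every site I run can hang off the same single load balancer, with cert-manager handling TLS for each host.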
What about those images? We don't want any Persistent Volumes!
As I hinted at above, one of the challenges I had when moving my blog to Kubernetes was that in its old environment, there was a need for volumes to persist across container reboots.
This was due to the way images were handled. I use Ghost as a blogging platform, because I much prefer it to alternatives like Wordpress. When writing posts, Ghost conveniently lets you upload images directly into the editor.
The problem is, the Ghost default storage provider stores the images locally within the web root. When running a single container instance with a Persistent Docker Volume, this isn't an issue. Users are always going to hit the same container, and there is a volume saved to disk that will persist the images across container reboots.
With my Kubernetes architecture, this becomes a problem. I wanted my Ghost Deployment to take advantage of Horizontal Pod Autoscaling, and scale based on resource usage. If a node failed, I wanted the Pods just to be rescheduled onto another node, and to do that I needed my Pods to be stateless.
Luckily, there is a storage adapter for Ghost that integrates with S3. This adapter changes Ghost's behaviour: I can still upload images in the editor, but in the background they are uploaded to an S3 bucket. A link is then inserted into the post referencing the CloudFront distribution that sits in front of the bucket. This removes the need to store any images locally, and means that I can stop caring about individual Pods. If a Pod is terminated and re-scheduled somewhere else, I wouldn't even know or care.
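For anyone wanting to do the same, adapters such as the popular ghost-storage-adapter-s3 are configured via Ghost's config.production.json. The section looks something like this (bucket and CDN hostname are illustrative, and credentials can alternatively be supplied via environment variables):

```json
{
  "storage": {
    "active": "s3",
    "s3": {
      "region": "us-east-1",
      "bucket": "my-blog-images",
      "assetHost": "https://cdn.example.com"
    }
  }
}
```

The assetHost setting is what makes uploaded images resolve through CloudFront rather than directly from the bucket.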
Back it up, Back it up!
One thing I was keen to implement was a proper backup solution for my database. In the old environment I was just taking periodic snapshots of the data volume. This worked but wasn't optimal.
I knew I was going to need to back up more than just my blog database, and so embarked on a mission to design a solution that could back up multiple databases.
In the end, I settled on a custom-written Docker image based on Alpine Linux. This image (which I will release on GitHub at some point) contains aws-cli, s3cmd, and mysql-client. It takes a series of environment variables (AWS credentials, S3 target information, and MySQL database details) and, when run, performs a full dump of the database and uploads it to S3. Once done, the container exits.
With this container created, I next set up a CronJob in Kubernetes, configured to run at 1am each day. When it runs, it connects to my Ghost database, performs a full MySQL dump, and uploads it to my S3 bucket.
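The CronJob spec is straightforward. This is a sketch rather than my exact manifest: the image path, environment variable names, and Secret name are assumptions, since the backup image isn't published yet, and the batch/v1 API shown is from recent Kubernetes versions:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: ghost-db-backup
spec:
  schedule: "0 1 * * *"          # 1am each day
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: backup
              image: gcr.io/my-project/mysql-backup:latest   # placeholder image
              env:
                - name: MYSQL_HOST
                  value: db.example.internal                 # external DB host
                - name: MYSQL_DATABASE
                  value: ghost
                - name: S3_BUCKET
                  value: my-backup-bucket
              envFrom:
                - secretRef:
                    name: backup-credentials   # AWS + MySQL credentials
```

Because the container exits after the dump completes, the Job finishes cleanly and Kubernetes takes care of the scheduling and retry logic for me.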
Lifecycle policies are configured on the backup bucket: backups stay in S3 Standard storage for 30 days, at which point they are moved to S3 Infrequent Access. After 90 days they are automatically moved to Glacier, before being deleted after a year.
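Since the rest of the environment is in Terraform, the lifecycle policy can live there too. A sketch using the standalone lifecycle resource from recent AWS provider versions (the bucket resource name is a placeholder):

```hcl
resource "aws_s3_bucket_lifecycle_configuration" "backups" {
  bucket = aws_s3_bucket.backups.id   # assumes a bucket resource named "backups"

  rule {
    id     = "tier-and-expire"
    status = "Enabled"
    filter {}                          # apply to every object in the bucket

    transition {
      days          = 30
      storage_class = "STANDARD_IA"    # Infrequent Access after 30 days
    }

    transition {
      days          = 90
      storage_class = "GLACIER"        # Glacier after 90 days
    }

    expiration {
      days = 365                       # delete after a year
    }
  }
}
```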
This means I can sleep easy knowing that backups are taking place nightly, are being kept for a year, and won't cost a fortune to keep.
Don't deploy manually!
Everything above is deployed in a fully automated fashion, primarily using Terraform with some Ansible. If someone were to gain access to my accounts and delete my entire environment, I could fully re-deploy it using Terraform and Ansible. All it would take is two commands and about three minutes. Everything would be re-created, DNS records would be updated, and my blog would be back online.
There were some challenges with the Terraform Kubernetes provider, namely its lack of support for some API objects such as Ingress and CronJob. In these cases, I had to write the Kubernetes YAML manually.
To make sure that my entire environment is still re-deployable via Terraform, I linked these configurations to the appropriate Terraform resources via local-exec provisioners. Below is an example of how the Ingress for this blog is created by linking it to the DNS record.
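A simplified sketch of the pattern (the record, zone, variable, and file names here are illustrative, not my exact config):

```hcl
resource "google_dns_record_set" "blog" {
  name         = "blog.example.com."               # placeholder hostname
  managed_zone = google_dns_managed_zone.main.name # assumes a zone resource "main"
  type         = "A"
  ttl          = 300
  rrdatas      = [var.ingress_lb_ip]               # ingress-nginx load balancer IP

  # The Kubernetes provider couldn't manage Ingress objects, so apply the
  # hand-written manifest whenever this DNS record is (re)created.
  provisioner "local-exec" {
    command = "kubectl apply -f ${path.module}/manifests/blog-ingress.yaml"
  }
}
```

Attaching the kubectl apply to the DNS record means the Ingress always exists whenever the record does, so a full terraform apply from scratch still brings the whole site up.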
This has some limitations: if I want to change the Ingress spec, I have to taint the DNS record in Terraform to re-run the local-exec. But I can live with that small inconvenience for the wider benefit it brings.
But there is so much more...!
In setting up this architecture and migrating my blog, there are so many more things I've implemented and learned. I'll cover some of these in future blog posts.
I'd welcome any comments or feedback on my post and/or setup.