Renewing Kubernetes certificates
It has been a year since you initialised your cluster and suddenly things stopped working. Chances are, you only need to renew your certificates.
It has been more than a year since I initialised my bare-metal Kubernetes cluster on my Raspberry Pis, and the experience of deploying most of my work to it has been amazing. Of course, being busy with work left me with little time to maintain the cluster, including checking it for errors.
I tend to monitor my cluster using an iPad app called KubeNav, which I gave access to by handing it a copy of my kubectl config file, since I’m the only user of this cluster. Some time ago, the app stopped working: every time I tried to access my cluster, it showed a generic error message pointing to a malformed JSON response from the API. I didn’t pay much attention to it, as I assumed the error was a bug in the app itself.
Shortly after, one of the cluster nodes went down and I tried to restart it. Meanwhile, I was still able to access my Kubernetes Dashboard via the web (which also explains why I didn’t pay much attention to the app at the time). The dashboard kept reporting the node as down no matter how many times I restarted it.
Brilliantly, I thought I could resolve the issue by restarting the master node, since it would try to reach the other nodes once everything rebooted. I went ahead, only to find that my entire cluster was now down: the master node could not reach any of the worker nodes. It was then I realised that the API server container was stuck in a crash loop, preventing the cluster from initialising.
It took me a while to debug the issue, as the error messages were not descriptive enough (e.g. “permission denied”) and my kubectl client was useless at that point. Hence, I thought I’d document the steps I took to restore my cluster, hoping it might help someone someday (perhaps myself, next year).
Checking the expiry of certificates
Fact: when you initialise your cluster, all client certificates generated by kubeadm are set to expire a year later. The official documentation does mention the 1-year validity, but only once, at the top of their certificate management article. I only wish this were made more obvious.
Client certificates generated by kubeadm expire after 1 year.
Even while the kubectl client is useless, you can still make use of the kubeadm client to check and renew your certificates.
To check whether your certs have expired or are about to, copy and paste the following command into your terminal.
$ sudo kubeadm certs check-expiration
You should get something similar to:
CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf                 Jun 22, 2023 16:03 UTC   326d            ca                      no
apiserver                  Jun 22, 2023 16:03 UTC   326d            ca                      no
apiserver-etcd-client      Jun 22, 2023 16:03 UTC   326d            etcd-ca                 no
apiserver-kubelet-client   Jun 22, 2023 16:03 UTC   326d            ca                      no
controller-manager.conf    Jun 22, 2023 16:03 UTC   326d            ca                      no
etcd-healthcheck-client    Jun 22, 2023 16:03 UTC   326d            etcd-ca                 no
etcd-peer                  Jun 22, 2023 16:03 UTC   326d            etcd-ca                 no
etcd-server                Jun 22, 2023 16:03 UTC   326d            etcd-ca                 no
front-proxy-client         Jun 22, 2023 16:03 UTC   326d            front-proxy-ca          no
scheduler.conf             Jun 22, 2023 16:03 UTC   326d            ca                      no

CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
ca                      May 27, 2031 07:13 UTC   8y              no
etcd-ca                 May 27, 2031 07:13 UTC   8y              no
front-proxy-ca          May 27, 2031 07:13 UTC   8y              no
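If kubeadm is ever unavailable, you can also inspect the certificate files directly with openssl. This is a sketch assuming the default kubeadm certificate directory (/etc/kubernetes/pki); adjust the path if your cluster stores them elsewhere.

```shell
# Print the expiry date of the API server certificate
# (default kubeadm path; adjust if your cluster differs)
sudo openssl x509 -noout -enddate -in /etc/kubernetes/pki/apiserver.crt

# -checkend exits non-zero if the cert expires within the given
# number of seconds -- here, 30 days
sudo openssl x509 -noout -checkend $((30 * 24 * 3600)) \
    -in /etc/kubernetes/pki/apiserver.crt \
  && echo "apiserver cert is valid for at least 30 more days" \
  || echo "apiserver cert expires within 30 days"
```

The same two commands work on any of the certs listed in the table above.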
Renewing your certificates is much easier than you might think. Thankfully, kubeadm comes equipped with a command to do just that. You may copy and paste the following command into your terminal to renew all of your expired certificates.
$ sudo kubeadm certs renew all
Now that you’ve successfully renewed all of your certificates, you can validate this by re-running the check from the previous step. You should see that your certs are now valid for another year.
Doing this alone, however, may not be enough to get your cluster up and running. There is one last step required: restarting the kubelet. You can do so using the following command:
$ sudo systemctl restart kubelet
Your cluster may take a few minutes to initialise and discover the nodes, though I’d recommend a reboot instead.
Replacing the kubectl config file
At this point, I managed to get my cluster up and running again, with all services restored. One problem persisted, however: the kubectl client remained useless.
The error message was slightly clearer this time (“Error: You must be logged in to the server (Unauthorized)”), and that was how I knew the administrator certificate embedded in my kubectl config had expired.
Luckily, when kubeadm renewed all the certificates it also renewed the administrator certificate. So now it’s only a matter of locating the new cert. I found that a new config file was automatically generated, and I could copy it using the command below.
But first, let’s back up the existing config file in case we need it in the future.
$ cp ~/.kube/config ~/.kube/config.bak
Once that is done, we can copy the newly generated file containing the new certificate. Since the copy runs as root, we also fix the file’s ownership so our own user can read it.
$ sudo cp /etc/kubernetes/admin.conf ~/.kube/config
$ sudo chown $(id -u):$(id -g) ~/.kube/config
And there you have it: kubectl should now have access to the API server, and your cluster functionality should be fully restored.
For me, this experience serves as a reminder not to take things for granted, even when everything has worked just fine for months. As a DevOps practitioner, I’d always recommend automating anything you find yourself needing over and over again. These steps can be automated in a number of ways.
You can simply copy the commands we executed into a bash file and have that run on a yearly cronjob schedule.
Personally, I’d write a simple script in a familiar language to check the expiry date of each cert and renew only when something is close to expiring. That script can then run as a cronjob on a weekly basis or so.
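A minimal bash sketch of that idea, assuming the default kubeadm layout under /etc/kubernetes/pki and an arbitrary 30-day threshold (it renews everything at once via kubeadm rather than per-cert, which keeps the script simple):

```shell
#!/usr/bin/env bash
# Hypothetical weekly check: renew the cluster certs only when one of
# them is within THRESHOLD_DAYS of expiry. Run as root.
set -euo pipefail

THRESHOLD_DAYS=30
THRESHOLD_SECONDS=$((THRESHOLD_DAYS * 24 * 3600))
PKI_DIR="${PKI_DIR:-/etc/kubernetes/pki}"

expiring_soon() {
  # openssl -checkend exits non-zero when the cert expires in the window
  ! openssl x509 -noout -checkend "$THRESHOLD_SECONDS" -in "$1" >/dev/null
}

needs_renewal=0
for cert in "$PKI_DIR"/*.crt "$PKI_DIR"/etcd/*.crt; do
  [ -e "$cert" ] || continue
  if expiring_soon "$cert"; then
    echo "$cert expires within $THRESHOLD_DAYS days"
    needs_renewal=1
  fi
done

if [ "$needs_renewal" -eq 1 ]; then
  kubeadm certs renew all
  systemctl restart kubelet
  cp /etc/kubernetes/admin.conf /root/.kube/config
fi
```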
Whichever way you decide to automate the process of renewing the cluster certificates, I hope this article was useful for you and perhaps saved you a few hours of debugging.