Published: July 31, 2022

Dexter Codo

Edited: August 24, 2022

Renewing Kubernetes certificates

It has been a year since you initialised your cluster and suddenly things stopped working. Chances are, you only need to renew your certificates.

Background

It has been more than a year since I initialised my bare-metal Kubernetes cluster on my Raspberry Pis, and deploying most of my work to it has been an amazing experience. Being busy with work, however, left me little time to maintain the cluster, including checking it for errors.

I tend to monitor my cluster using an iPad app called KubeNav. Since I’m the only user of this cluster, I gave the app access by handing it a copy of my kubectl config file. Some time ago the app stopped working: every time I tried to access my cluster, it gave me a generic error message pointing to a problem with the JSON response format from the API. I didn’t pay much attention to it, as I thought the error could be a bug in the app itself.

Shortly after, one of the cluster nodes went down and I tried to restart it. I was still able to access my Kubernetes Dashboard via the web (which also explains why I didn’t pay much attention to the app at that time), but the dashboard kept reporting the node as down no matter how many times I restarted it.

Brilliantly, I thought I could resolve the issue by restarting the master node, as it would try to ping the nodes once everything had rebooted. I went ahead, only to find that my entire cluster had gone down: the master node could not ping any of the worker nodes. It was then that I realised the controller API container was stuck in a crash loop and never finished initialising.

It took me a while to debug the issue, as the error messages (e.g. “permission denied”) were not descriptive enough and my kubectl client was useless at that point. Hence, I thought I'd document the steps I took to restore my cluster, hoping it might help someone someday (perhaps myself next year).

Checking the expiry of certificates

Fact: When you initialise your cluster, the client and server certificates generated by kubeadm are set to expire a year later (the certificate authorities themselves are valid for ten years). The official documentation does mention the one-year validity, but only once, at the top of their article here. I only wish this was made more obvious.

Client certificates generated by kubeadm expire after 1 year.

Even while the kubectl client is useless, you can still use kubeadm to check and renew your certificates.

To check whether your certs have expired or are about to expire, copy and paste the following command into your terminal.

$ sudo kubeadm certs check-expiration

You should get something similar to:

CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf                 Jun 22, 2023 16:03 UTC   326d            ca                      no
apiserver                  Jun 22, 2023 16:03 UTC   326d            ca                      no
apiserver-etcd-client      Jun 22, 2023 16:03 UTC   326d            etcd-ca                 no
apiserver-kubelet-client   Jun 22, 2023 16:03 UTC   326d            ca                      no
controller-manager.conf    Jun 22, 2023 16:03 UTC   326d            ca                      no
etcd-healthcheck-client    Jun 22, 2023 16:03 UTC   326d            etcd-ca                 no
etcd-peer                  Jun 22, 2023 16:03 UTC   326d            etcd-ca                 no
etcd-server                Jun 22, 2023 16:03 UTC   326d            etcd-ca                 no
front-proxy-client         Jun 22, 2023 16:03 UTC   326d            front-proxy-ca          no
scheduler.conf             Jun 22, 2023 16:03 UTC   326d            ca                      no

CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
ca                      May 27, 2031 07:13 UTC   8y              no
etcd-ca                 May 27, 2031 07:13 UTC   8y              no
front-proxy-ca          May 27, 2031 07:13 UTC   8y              no
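If you'd like a second opinion on those dates, they can also be read straight from the certificate files with openssl. The following is only a minimal sketch: it assumes openssl is installed and that the certificates live in kubeadm's default directory, /etc/kubernetes/pki. The function names cert_enddate and list_cert_expiry are my own, not part of any tool.

```shell
# Print the expiry ("notAfter") date of one PEM certificate file.
cert_enddate() {
  # openssl prints "notAfter=<date>"; keep only the part after the "=".
  openssl x509 -noout -enddate -in "$1" | cut -d= -f2
}

# Print "<file>: <expiry date>" for every .crt file under a directory.
# The default directory is an assumption (kubeadm's standard location).
list_cert_expiry() {
  local dir="${1:-/etc/kubernetes/pki}" crt
  find "$dir" -name '*.crt' | sort | while read -r crt; do
    printf '%s: %s\n' "$crt" "$(cert_enddate "$crt")"
  done
}
```

Calling list_cert_expiry without arguments walks the default directory; depending on file permissions on your node, you may need to run it as root.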

Renewing certificates

Renewing your certificates is much easier than you might think. Thankfully, kubeadm comes with a subcommand to do just that. Copy and paste the following command into your terminal to renew all your certificates, expired or not.

$ sudo kubeadm certs renew all

Once the command completes, you can validate the renewal by repeating the previous step. You should now see that your certs expire a year from now.

Doing this alone, however, may not be enough to get your cluster up and running. One last step is required: restarting the kubelet. You can do so using the following command:

$ sudo systemctl restart kubelet

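Your cluster may take a few minutes to initialise and discover the nodes. I’d recommend a reboot instead, though, since the control-plane components also have to restart before they pick up the renewed certificates.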
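Rather than guessing when the cluster is back, you can poll the API server until it answers. Below is a rough sketch of a small retry helper (the name retry and the attempt counts are my own choices). Note that kubectl is pointed at /etc/kubernetes/admin.conf here, because the copy in ~/.kube/config may still hold the expired certificate at this stage.

```shell
# Retry a command up to ATTEMPTS times, sleeping DELAY seconds between tries.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  until "$@"; do
    if [ "$i" -ge "$attempts" ]; then
      return 1
    fi
    i=$((i + 1))
    sleep "$delay"
  done
}

# Example (run on the control-plane node): try every 10 seconds, up to 30 times.
# retry 30 10 sudo kubectl --kubeconfig /etc/kubernetes/admin.conf get nodes
```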

Replacing the kubectl config file

At this point, my cluster was up and running again and all services were restored. One problem persisted, however: the kubectl client remained useless.

The error message was slightly clearer this time, i.e. “Error: You must be logged in to the server (Unauthorized)”, and that was how I knew the administrator certificate in my kubectl config had expired.

Luckily, when kubeadm renewed all the certificates, it also renewed the administrator certificate, so it’s only a matter of locating the new cert. I found that a new config file containing it had been generated automatically, and I could copy it using the command below.

But first, let’s back up the existing config file in case we need it in the future.

$ cp ~/.kube/config ~/.kube/config.bak

Once that is done, we can copy the newly generated file containing the new certificate.

$ sudo cp /etc/kubernetes/admin.conf ~/.kube/config

And there you have it: kubectl should now have access to the controller APIs, and your cluster functionality should be fully restored.
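If you expect to repeat this next year, the backup and copy steps can be wrapped into one small function. This is only a sketch under my own assumptions: the function name replace_kubeconfig is made up, the backup gets a timestamp suffix instead of plain .bak, and sudo is used only when the source file is not readable by the current user.

```shell
# Back up the current kubeconfig (if any) and replace it with a fresh copy.
# Defaults are the paths used in this article.
replace_kubeconfig() {
  local src="${1:-/etc/kubernetes/admin.conf}" dest="${2:-$HOME/.kube/config}" tmp
  mkdir -p "$(dirname "$dest")"

  # Keep a timestamped backup so earlier backups are never overwritten.
  if [ -f "$dest" ]; then
    cp -p "$dest" "$dest.bak.$(date +%Y%m%d%H%M%S)"
  fi

  # Write to a temporary file first so a failed read cannot truncate the
  # existing config. The redirect runs as the current user, so the result
  # stays owned by that user even when sudo is needed to read the source.
  tmp=$(mktemp "$dest.XXXXXX") || return 1
  if [ -r "$src" ]; then
    cat "$src" > "$tmp"
  else
    sudo cat "$src" > "$tmp"
  fi || { rm -f "$tmp"; return 1; }

  chmod 600 "$tmp"
  mv "$tmp" "$dest"
}
```

Running replace_kubeconfig with no arguments uses the same two paths as the commands above.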

Final thoughts

For me, this experience serves as a reminder not to take things for granted, even when everything has worked just fine for months. As a DevOps practitioner, I’d always recommend automating things that you find yourself needing over and over again. Automating these steps can be done in a number of ways.

You can simply copy the commands we executed into a bash file and have that run on a cron schedule (slightly more often than yearly, so that the certificates never lapse before the job runs).

Personally, I’d write a simple script in any familiar language to check the expiry date of each cert and then have it renew that particular cert. The entire script can then run on a cronjob on a weekly basis or so. 
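As a rough illustration of that second idea, here is a sketch in shell. It relies on a few assumptions of mine rather than anything guaranteed: that the output of kubeadm certs check-expiration keeps the column layout shown earlier, that the RESIDUAL TIME column uses suffixes such as "326d" or "8y" (with something like "<invalid>" once a cert has expired), and a 30-day threshold. The function names, the --run flag and the script path in the cron comment are all hypothetical.

```shell
# Read `kubeadm certs check-expiration` output on stdin and print the names of
# the certificates whose RESIDUAL TIME is below MAX_DAYS (first argument).
certs_to_renew() {
  awk -v max="$1" '
    # Certificate rows have 9 whitespace-separated fields with "UTC" in field 6.
    # CA rows have only 8 fields and header rows lack "UTC", so both are skipped.
    NF == 9 && $6 == "UTC" {
      left = $7
      if (left ~ /^[0-9]+y$/)      days = (left + 0) * 365
      else if (left ~ /^[0-9]+d$/) days = left + 0
      else                         days = 0   # hours, minutes, or already expired
      if (days < max) print $1
    }'
}

# Renew only the certificates that are close to expiry.
main() {
  kubeadm certs check-expiration | certs_to_renew "${THRESHOLD_DAYS:-30}" |
  while read -r name; do
    kubeadm certs renew "$name" </dev/null
  done
}

# Only act when called with --run, so the file can be sourced safely.
# Hypothetical weekly entry in root's crontab (Mondays at 03:00):
#   0 3 * * 1 /usr/local/sbin/renew-kubeadm-certs.sh --run
if [ "${1:-}" = "--run" ]; then
  main
fi
```

The names printed by certs_to_renew are passed to kubeadm certs renew as they appear in the first column. The sketch stops at renewal; the kubelet restart (or reboot) and the kubectl config copy described earlier would still need to follow.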

Whichever way you decide to automate the process of renewing the cluster certificates, I hope this article was useful for you and perhaps saved you a few hours of debugging.