cert-manager too old …

Today cert-manager stopped issuing certificates, and all requests said “insecure website”. Uncool, since this affected our Confluence and our sign-in mechanism. So let’s find out what was happening, right? Turns out cert-manager considered itself “too old” (“your ACME client is too old”, literally) and wanted to be updated.

So far, so good. Just perform helm update cert-manager cert-manager, right?

Wrong.

  • First, I had to upgrade to helm3. All right, I could have used helm2, but helm3 was already on here, and it seemed easy. That went (fairly) easy.
  • Then I wanted to upgrade cert-manager. Turns out for that I actually had to upgrade the running k8s version 1.12.x to 1.13.x … otherwise I’d get errors from the helm chart. That just took ages cause AKS is a bit slow.
  • Finally done I wanted to upgrade cert-manager. Until I realized a lot of stateful pods were stuck in “initialization”. Turns out that AKS had issues moving the volumes around, I still don’t know why. (Did I mention I just don’t like pretty much anything about Azure? It’s just so incredibly cumbersome to use, and nothing is where you expect it). So I had to manually mount the volumes on the host the Pod was currently on and have an open TODO now, which just sucks.
  • Finally done I wanted to upgrade cert-manager. The upgrade went just peachy, until I realized that … nothing happened. Turns out they changed pretty much all API versions and annotation keys. So I had to rewrite / upgrade all ingress annotations, update the ClusterIssuer resources and delete the now obsolete former K8S CRDs.

And just like that I had my certificates back. Wasn’t that easy? 😀