
cert-manager too old …

Today cert-manager stopped issuing certificates, and every request was greeted with an “insecure website” warning. Uncool, since this affected our Confluence and our sign-in mechanism. So let’s find out what was happening, right? Turns out cert-manager considered itself “too old” (“your ACME client is too old”, literally) and wanted to be updated.

So far, so good. Just run helm upgrade cert-manager jetstack/cert-manager, right?

Wrong.

  • First, I had to upgrade to helm3. All right, I could have stayed on helm2, but helm3 was already on the machine, and it seemed easy. That went (fairly) smoothly; a migration sketch follows after this list.
  • Then I wanted to upgrade cert-manager. Turns out that for this I actually had to upgrade the running Kubernetes version from 1.12.x to 1.13.x … otherwise I’d get errors from the helm chart (see the upgrade commands after this list). That just took ages because AKS is a bit slow.
  • Finally done with that, I wanted to upgrade cert-manager. Until I realized that a lot of stateful pods were stuck in “initialization”. Turns out that AKS had issues moving the volumes around, and I still don’t know why. (Did I mention I just don’t like pretty much anything about Azure? It’s just so incredibly cumbersome to use, and nothing is where you expect it.) So I had to manually mount the volumes on the host the Pod was currently running on, and I now have an open TODO, which just sucks.
  • Finally, really done, I wanted to upgrade cert-manager. The upgrade went just peachy, until I realized that … nothing happened. Turns out they changed pretty much all API versions and annotation keys. So I had to rewrite / upgrade all ingress annotations, update the ClusterIssuer resources and delete the now obsolete old cert-manager CRDs (a migration sketch follows after this list as well).
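
For the helm2 to helm3 part, the official helm-2to3 plugin does most of the heavy lifting. A minimal sketch only (the release name is just an example, and I’m assuming the helm binary already points at helm3, so this is not a verbatim transcript of what I ran):

helm plugin install https://github.com/helm/helm-2to3

# copy the helm2 configuration and repositories over to helm3
helm 2to3 move config

# convert a tiller-managed helm2 release into a helm3 release (example release name)
helm 2to3 convert cert-manager

# once every release is converted, remove the helm2 data and tiller
helm 2to3 cleanup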
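
The AKS upgrade itself is pleasantly unspectacular on the command line; resource group, cluster name and the exact 1.13 patch version below are placeholders:

# see which Kubernetes versions the cluster can be upgraded to
az aks get-upgrades --resource-group my-rg --name my-aks-cluster --output table

# upgrade control plane and nodes (this is the part that takes ages)
az aks upgrade --resource-group my-rg --name my-aks-cluster --kubernetes-version 1.13.12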
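
And for the last bullet, the annotation / API migration roughly looks like this; ingress and issuer names are examples, and the exact CRD list depends on the old cert-manager version:

# ingress annotations: the key changes from certmanager.k8s.io/... to cert-manager.io/...
kubectl annotate ingress my-ingress \
    certmanager.k8s.io/cluster-issuer- \
    cert-manager.io/cluster-issuer=letsencrypt-prod

# ClusterIssuer / Certificate resources move from apiVersion certmanager.k8s.io/v1alpha1
# to cert-manager.io/v1alpha2 and have to be re-applied under the new API group.

# finally, drop the now obsolete CRDs of the old API group
kubectl delete crd \
    certificates.certmanager.k8s.io \
    certificaterequests.certmanager.k8s.io \
    challenges.certmanager.k8s.io \
    clusterissuers.certmanager.k8s.io \
    issuers.certmanager.k8s.io \
    orders.certmanager.k8s.io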

And just like that I had my certificates back. Wasn’t that easy? 😀



Helm in a kops cluster with RBAC

I created a K8S cluster on AWS with kops.
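
For reference, creating such a cluster looks roughly like this (state store, cluster name and zone are placeholders, not my actual setup):

export KOPS_STATE_STORE=s3://my-kops-state-store
kops create cluster \
    --name=k8s.example.com \
    --zones=eu-central-1a \
    --yes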

I ran helm init to install tiller in the cluster.

I ran helm list to see if it worked.

I got this:

Error: configmaps is forbidden: User "system:serviceaccount:kube-system:default" \ 
    cannot list configmaps in the namespace "kube-system"

That sucked. And Google proved … reluctant. What I could figure out is:

Causes

  • kops sets up the cluster with RBAC enabled (which is good)
  • helm (well, tiller) runs with the default service account for doing things (which might be OK, at least it was on my StackPoint cluster), but in this case, for whatever reason, that account did not have sufficient privileges
  • so we need to prepare a service account with cluster-admin privileges for helm to use

Fixes

Just do exactly what it says in the helm docs 🙂:

  • apply the RBAC YAML which creates the kube-system/tiller service account and binds it to the cluster-admin role (see the snippet after this list).
  • install helm with: helm init --service-account tiller
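
The RBAC part, in concrete terms (this is essentially the snippet from the helm docs, applied inline via a heredoc):

kubectl apply -f - <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tiller
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tiller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: tiller
    namespace: kube-system
EOF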

Is that secure? Not so much. With helm you can still do anything at all to the cluster. I might get to that in a later post.