
Highly available k3s control plane with HAProxy

To control my k3s cluster, I had a kubeconfig file that pointed at the IP address of one of my master nodes:

      server: https://10.0.69.104:6443

This works, except when that particular master node is down, either for maintenance or due to an unexpected failure. In these situations I had to manually edit the kubeconfig to point to another surviving master node.
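
The manual workaround each time was a one-line kubeconfig edit; something like this, assuming the cluster entry keeps k3s's default name:

$ kubectl config set-cluster default --server=https://10.0.69.103:6443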

To make my life better, I decided to put the control plane behind HAProxy.

Environment

  1. Kubernetes: v1.33.4+k3s1
  2. HAProxy: 2.9

Deploying HAProxy

As I had done previously with the Proxmox web interface, I decided to set up HAProxy on my k3s cluster.

This time, I wanted to load balance two things for the master nodes:

  1. Kubernetes API server (port 6443)
  2. SSH access (port 22)

Load balancing SSH lets my Git pipeline grab the kubeconfig from whichever master node is up (see the fetch sketch after the manifests):

config.yml
apiVersion: v1
kind: ConfigMap
metadata:
  name: k8s-master-proxy
  namespace: network
data:
  haproxy.cfg: |
    global
        log stdout format raw local0
        maxconn 4096

    defaults
        log     global
        mode    tcp
        option  tcplog
        option  dontlognull
        timeout connect 5000ms
        timeout client  50000ms
        timeout server  50000ms

    frontend k8s-api
        bind *:6443
        mode tcp
        option tcplog
        default_backend k8s-masters

    backend k8s-masters
        mode tcp
        balance roundrobin
        option tcp-check
        server master1 10.0.69.102:6443 check fall 3 rise 2
        server master2 10.0.69.103:6443 check fall 3 rise 2
        server master3 10.0.69.104:6443 check fall 3 rise 2

    frontend ssh
        bind *:22
        mode tcp
        option tcplog
        default_backend ssh-masters

    backend ssh-masters
        mode tcp
        balance roundrobin
        option tcp-check
        server master1 10.0.69.102:22 check fall 3 rise 2
        server master2 10.0.69.103:22 check fall 3 rise 2
        server master3 10.0.69.104:22 check fall 3 rise 2    
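In the backend checks, fall 3 rise 2 tells HAProxy to mark a server down after three consecutive failed TCP checks and up again after two successful ones.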
deploy.yml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: k8s-master-proxy
  namespace: network
  labels:
    app: k8s-master-proxy
spec:
  replicas: 1
  selector:
    matchLabels:
      app: k8s-master-proxy
  template:
    metadata:
      labels:
        app: k8s-master-proxy
    spec:
      containers:
      - name: k8s-master-proxy
        image: haproxy:2.9-alpine
        ports:
        - name: api
          containerPort: 6443
          protocol: TCP
        - name: ssh
          containerPort: 22
          protocol: TCP
        volumeMounts:
        - name: config
          mountPath: /usr/local/etc/haproxy
        livenessProbe:
          tcpSocket:
            port: 6443
          initialDelaySeconds: 10
          periodSeconds: 10
        readinessProbe:
          tcpSocket:
            port: 6443
          initialDelaySeconds: 5
          periodSeconds: 5
      volumes:
      - name: config
        configMap:
          name: k8s-master-proxy
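The official haproxy image reads its configuration from /usr/local/etc/haproxy/haproxy.cfg, so mounting the ConfigMap over that directory is all the wiring required.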
service.yml
apiVersion: v1
kind: Service
metadata:
  name: k8s-master-proxy
  namespace: network
  labels:
    app: k8s-master-proxy
spec:
  type: LoadBalancer
  ports:
  - name: api
    port: 6443
    targetPort: 6443
    protocol: TCP
  - name: ssh
    port: 22
    targetPort: 22
    protocol: TCP
  loadBalancerIP: 10.0.69.239
  selector:
    app: k8s-master-proxy
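
With the three manifests applied, the pipeline can fetch the kubeconfig through the load balancer instead of a specific master. A sketch of that step, where the root login and local path are placeholders for whatever the pipeline actually uses:

$ kubectl apply -f config.yml -f deploy.yml -f service.yml
$ scp root@10.0.69.239:/etc/rancher/k3s/k3s.yaml ./kubeconfig
$ sed -i 's/127.0.0.1/10.0.69.239/' ./kubeconfig

The sed rewrite is needed because k3s writes its kubeconfig with server: https://127.0.0.1:6443.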

k3s config update

With HAProxy deployed, I updated my kubeconfig to point to the HAProxy IP address:

      server: https://10.0.69.239:6443

However, this was not enough. When I ran kubectl get nodes, I got the following error:

E1115 21:41:13.704202 473851 memcache.go:265] "Unhandled Error" err="couldn't get current server API group list: Get \"https://10.0.69.239:6443/api?timeout=32s\": tls: failed to verify certificate: x509: certificate is valid for [internal_ips], 127.0.0.1, ::1, not 10.0.69.239"
Unable to connect to the server: tls: failed to verify certificate: x509: certificate is valid for [internal_ips], 127.0.0.1, ::1, not 10.0.69.239

To make the certificate also valid for the HAProxy service IP, I added the following to each k3s master node’s /etc/rancher/k3s/config.yaml:

tls-san:
  - 10.0.69.239

And restarted the k3s service on each master node so the API server would pick up the new SAN.
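
k3s installed with the standard script runs as a systemd service on the server nodes, so the restart is one command per master:

$ sudo systemctl restart k3s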

With that, everything was back to normal.

$ k get node
NAME        STATUS   ROLES                       AGE    VERSION
gwork01     Ready    <none>                      88d    v1.33.3+k3s1
gwork02     Ready    <none>                      35d    v1.33.5+k3s1
kmaster02   Ready    control-plane,etcd,master   84d    v1.33.4+k3s1
kmaster04   Ready    control-plane,etcd,master   67d    v1.33.4+k3s1
kmaster05   Ready    control-plane,etcd,master   49d    v1.33.4+k3s1
kwork01     Ready    <none>                      129d   v1.32.6+k3s1
kwork02     Ready    <none>                      129d   v1.32.6+k3s1
kwork03     Ready    <none>                      49d    v1.33.4+k3s1
kwork04     Ready    <none>                      118d   v1.32.6+k3s1
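
To double-check that the serving certificate now carries the HAProxy address, the SANs can be inspected directly; a quick sketch with a reasonably recent OpenSSL (1.1.1+ for the -ext flag):

$ echo | openssl s_client -connect 10.0.69.239:6443 2>/dev/null | openssl x509 -noout -ext subjectAltName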