WireGuard on K8s (road-warrior-style VPN server)

(See comments on Hacker News.)

WireGuard first appeared in Linux kernel 5.6, but Ubuntu 20.04 LTS includes a backport in its 5.4 kernel.

So if your K8s nodes are running Ubuntu 20.04 LTS, they come with WireGuard installed as a kernel module that will automatically load when needed. This means that if you can set CAP_NET_ADMIN on containers, you can run a road-warrior-style WireGuard server in K8s without making changes to the node.
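
You can sanity-check this on the node itself (not inside a container) with modinfo; on Focal it should report something like:

$ modinfo wireguard | grep '^description'
description:    WireGuard secure network tunnel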

Here's my deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: wireguard
  labels:
    app: wireguard
spec:
  replicas: 1
  strategy:
    type: Recreate
  selector:
    matchLabels:
      app: wireguard
  template:
    metadata:
      labels:
        app: wireguard
    spec:
      restartPolicy: Always
      volumes:
        - name: wg0-key
          secret:
            secretName: wg0-key
        - name: wg0-conf
          configMap:
            name: wg0-conf
      containers:
        - name: wireguard
          image: sclevine/wg
          imagePullPolicy: Always
          lifecycle:
            postStart:
              exec:
                command: ["wg-quick",  "up", "wg0"]
            preStop:
              exec:
                command: ["wg-quick",  "down", "wg0"]
          command: ["tail",  "-f", "/dev/null"]
          volumeMounts:
            - name: wg0-key
              mountPath: /etc/wireguard/wg0.key
              subPath: wg0.key
              readOnly: true
            - name: wg0-conf
              mountPath: /etc/wireguard/wg0.conf
              subPath: wg0.conf
              readOnly: true
          ports:
            - containerPort: 51820
              hostPort: 51820
              protocol: UDP
          securityContext:
            capabilities:
              add:
                - NET_ADMIN

I set this up on a single-node K3s cluster, so I used hostPort and the Recreate strategy. On a production cluster, you would want to map 51820/udp to wherever you need it with a Service.
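
For example, a Service along these lines would expose the same port through a load balancer (this is a sketch -- the name and type are assumptions, and UDP support depends on your load balancer):

apiVersion: v1
kind: Service
metadata:
  name: wireguard
  labels:
    app: wireguard
spec:
  type: LoadBalancer
  selector:
    app: wireguard
  ports:
    - name: wireguard
      port: 51820
      targetPort: 51820
      protocol: UDP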

Also, you may have noticed that the container is running tail -f /dev/null. This is intentional -- we're only using the pod to configure the host kernel and hold open a network namespace. Is this a K8s anti-pattern? Maybe. But I think the alternative is to configure WireGuard on the host without K8s.

Here's the Dockerfile for sclevine/wg:

FROM ubuntu:focal as builder

RUN apt-get update && \
  apt-get install -y wireguard && \
  rm -rf /var/lib/apt/lists/*

FROM ubuntu:focal

RUN apt-get update && \
  apt-get install -y iproute2 iptables && \
  rm -rf /var/lib/apt/lists/*

COPY --from=builder /usr/bin/wg /usr/bin/wg-quick /usr/bin/

This gets you the WireGuard userspace utility and setup script without the kernel module and its associated dependencies (like build-essential). You could build a smaller image with Alpine, but I decided to use the same build of wg as the Ubuntu Focal host node.
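
If you'd rather build and push your own copy of the image, something like this works (substitute your own registry and tag):

$ docker build -t <registry>/wg .
$ docker push <registry>/wg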

You'll need to generate a key pair for the server and each peer:

$ wg genkey | tee private.key | wg pubkey > public.key
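
For the peers, the same command works; just keep each pair in separate files (filenames are only a suggestion):

$ wg genkey | tee client1.key | wg pubkey > client1.pub
$ wg genkey | tee client2.key | wg pubkey > client2.pub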

And store the server's private key in a Secret:

$ kubectl -n my-vpn create secret generic wg0-key --from-file=wg0.key=./path/to/private.key

(This assumes you created the Deployment in namespace my-vpn.)
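
If the namespace doesn't exist yet, create it first:

$ kubectl create namespace my-vpn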

Here's an example ConfigMap for the server config (/etc/wireguard/wg0.conf):

apiVersion: v1
kind: ConfigMap
metadata:
  name: wg0-conf
  labels:
    app: wireguard
data:
  wg0.conf: |
    [Interface]
    Address = 10.1.30.1/24,fdb0:5dfe:70d8:7f0b::1/64
    ListenPort = 51820
    PostUp = wg set %i private-key /etc/wireguard/wg0.key; iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
    PostDown = iptables -t nat -D POSTROUTING -o eth0 -j MASQUERADE
    MTU = 1500
    SaveConfig = false

    [Peer]
    # first peer
    PublicKey = <client #1 public key>
    AllowedIPs = 10.1.30.3/32,fdb0:5dfe:70d8:7f0b::3/128

    [Peer]
    # second peer
    PublicKey = <client #2 public key>
    AllowedIPs = 10.1.30.4/32,fdb0:5dfe:70d8:7f0b::4/128

This assumes the VPN subnets will be 10.1.30.0/24 (IPv4) and fdb0:5dfe:70d8:7f0b::/64 (IPv6). I picked the IPv4 subnet arbitrarily and generated the IPv6 ULA with an online generator. You may not need to set the MTU manually. The private key (/etc/wireguard/wg0.key) is mounted from the Secret and loaded by the PostUp command.
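
To load the config, apply the manifest to the same namespace as the Deployment (assuming you saved it as wg0-conf.yaml):

$ kubectl -n my-vpn apply -f wg0-conf.yaml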

For WireGuard clients, I use a config like this (client #1 shown):

[Interface]
PrivateKey = <client #1 private key>
Address = 10.1.30.3/24, fdb0:5dfe:70d8:7f0b::3/64
DNS = 1.1.1.1, 1.0.0.1

[Peer]
PublicKey = <server public key>
AllowedIPs = 0.0.0.0/0, ::/0
Endpoint = <k8s node/LB ip>:51820
PersistentKeepalive = 25
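
On a Linux client, save this as /etc/wireguard/wg0.conf and bring the tunnel up with wg-quick (mobile clients can import the same config in the WireGuard app):

$ sudo wg-quick up wg0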

If you want to debug the server, you can get shell access with:

$ kubectl -n my-vpn exec -it $(kubectl -n my-vpn get pods|tail -n 1|cut -f1 -d' ') -- bash

And check out the server stats:

root@wireguard-6dbf689864-5cnxb:/# wg
interface: wg0
  public key: <server public key>
  private key: (hidden)
  listening port: 51820

peer: <peer 1 public key>
  endpoint: <peer 1 IP>:24189
  allowed ips: 10.1.30.3/32, fdb0:5dfe:70d8:7f0b::3/128
  latest handshake: Now
  transfer: 4.64 MiB received, 53.40 MiB sent

Or confirm WireGuard is listening:

root@wireguard-6dbf689864-5cnxb:/# ss -lun 'sport = :51820'
State              Recv-Q             Send-Q                         Local Address:Port                           Peer Address:Port             Process
UNCONN             0                  0                                    0.0.0.0:51820                               0.0.0.0:*
UNCONN             0                  0                                       [::]:51820                                  [::]:*