Run Docker Swarm over a WireGuard VPN

Last updated: October 06, 2025

Overview

Running Docker Swarm over a VPN isolates cluster control and data traffic from the public internet. This guide shows a practical setup using WireGuard, with Swarm’s advertise address and data path both bound to the VPN interface. You’ll get a minimal working example, the ports to open, common pitfalls, and performance tips.

Prerequisites

  • Two or more Linux hosts with Docker Engine 20.10+ and WireGuard installed
  • Root/sudo access
  • Basic familiarity with networking and Docker Swarm

Quickstart

  1. Create a WireGuard tunnel between all Swarm nodes (site-to-site or mesh).
  2. Ensure each node has a stable VPN IP on wg0 (e.g., 10.66.0.1 and 10.66.0.2 from 10.66.0.0/24).
  3. Enable IP forwarding and bring up wg0 on each node.
  4. Initialize Swarm with the advertise address on wg0, and bind the data path to wg0 as well.
  5. Join workers via the VPN IP and deploy an encrypted overlay network.
  6. Deploy a test stack and verify cross-node reachability.

Minimal working example

The example below uses two nodes: manager (10.66.0.1) and worker (10.66.0.2), with WireGuard interface wg0.

1) Enable forwarding (both nodes)

sudo sysctl -w net.ipv4.ip_forward=1
sudo sysctl -w net.ipv6.conf.all.forwarding=1
# persist across reboots
printf '\nnet.ipv4.ip_forward=1\nnet.ipv6.conf.all.forwarding=1\n' | sudo tee -a /etc/sysctl.conf

2) WireGuard configs

Manager: /etc/wireguard/wg0.conf

[Interface]
Address = 10.66.0.1/24
PrivateKey = <manager-private-key>
ListenPort = 51820

[Peer]
PublicKey = <worker-public-key>
AllowedIPs = 10.66.0.2/32
PersistentKeepalive = 25

Worker: /etc/wireguard/wg0.conf

[Interface]
Address = 10.66.0.2/24
PrivateKey = <worker-private-key>

[Peer]
PublicKey = <manager-public-key>
Endpoint = <manager-public-ip-or-dns>:51820
AllowedIPs = 10.66.0.1/32
PersistentKeepalive = 25
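
The key placeholders in both configs need real values. A minimal sketch of one common way to generate a key pair on each node with the wg tool; the file names wg0.key and wg0.pub are illustrative choices, not something the configs require.

```shell
# Run on each node. The subshell keeps the restrictive umask from leaking
# into the rest of the session; umask 077 makes new files owner-only.
(
  umask 077
  # wg genkey prints a private key; wg pubkey derives the public key from it.
  wg genkey | tee wg0.key | wg pubkey > wg0.pub
)
# Paste the contents of wg0.key into PrivateKey on this node,
# and the contents of wg0.pub into PublicKey on the peer.
```

Only the public key should ever leave the node it was generated on.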

Bring up the tunnel:

sudo wg-quick up wg0
sudo wg

Optional MTU tuning (common for VPN+VXLAN):

# Start with 1420 for WireGuard, then adjust if needed.
# MTU belongs in [Interface]; appending to the end of the file would place it under [Peer], where it is invalid.
sudo sed -i '/^\[Interface\]/a MTU = 1420' /etc/wireguard/wg0.conf
sudo wg-quick down wg0 && sudo wg-quick up wg0
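
The 1420 starting point follows from header arithmetic. As a rough sketch, assuming a 1500-byte path between the nodes' public endpoints: WireGuard adds up to 80 bytes per packet (60 when the outer packet is IPv4), and VXLAN adds another 50 bytes inside the tunnel. The variable names are illustrative.

```shell
PATH_MTU=1500        # assumed MTU between the public endpoints; measure it if unsure
WG_OVERHEAD=80       # worst case: outer IPv6 (40) + UDP (8) + WireGuard framing (32)
VXLAN_OVERHEAD=50    # outer IPv4 (20) + UDP (8) + VXLAN (8) + inner Ethernet (14)

WG_MTU=$(( PATH_MTU - WG_OVERHEAD ))
OVERLAY_MTU=$(( WG_MTU - VXLAN_OVERHEAD ))

echo "wg0 MTU:     $WG_MTU"
echo "overlay MTU: $OVERLAY_MTU (upper bound; --opt encrypted adds ESP overhead on top)"
```

With these inputs the script reports 1420 for wg0 and 1370 for the overlay. To measure the path MTU, a ping with fragmentation disabled, such as ping -M do -c 3 -s 1472 <manager-public-ip-or-dns> from the worker, succeeds only if 1500-byte packets fit (1472 bytes of payload plus 28 bytes of IPv4 and ICMP headers).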

3) Initialize Swarm (manager)

# Bind both advertise and data-path to the VPN interface
sudo docker swarm init \
  --advertise-addr 10.66.0.1 \
  --data-path-addr 10.66.0.1

# Get worker token
sudo docker swarm join-token worker

4) Join the worker (worker)

sudo docker swarm join \
  --token <token-from-manager> \
  --advertise-addr 10.66.0.2 \
  --data-path-addr 10.66.0.2 \
  10.66.0.1:2377

5) Create an encrypted overlay network (manager)

sudo docker network create \
  --driver overlay \
  --attachable \
  --opt encrypted \
  frontnet

6) Deploy a test stack (manager)

stack.yaml

version: "3.8"
services:
  web:
    image: nginx:alpine
    deploy:
      replicas: 2
      placement:
        constraints: ["node.role==worker"]
    ports:
      - "8080:80"
    networks:
      - frontnet
networks:
  frontnet:
    external: true

Deploy and check:

sudo docker stack deploy -c stack.yaml demo
sudo docker stack ps demo
sudo docker service ls

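The routing mesh publishes port 8080 on every node, so http://<any-node-ip>:8080 should return the nginx welcome page from either node; requests are load-balanced across the two replicas.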

Verify

  • Ensure node gossip and VXLAN use wg0: both nodes should show as Ready, and the tasks should reach Running and stay there without flapping.
  • Check that the Swarm ports are reachable over wg0 only and not from public interfaces.
  • Confirm overlay encryption:
sudo docker network inspect frontnet | grep -i encrypted -A2
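
The first two checks can be made concrete with a few commands. This is a sketch, assuming ss and tcpdump are installed; SWARM_PORTS is a helper variable introduced here.

```shell
# Address this node advertises to the swarm; it should be the node's wg0 IP.
sudo docker info --format '{{ .Swarm.NodeAddr }}'

# On a manager: the address recorded for every node
# (the manager you run this on may report 0.0.0.0 for itself).
sudo docker node ls -q | xargs -r sudo docker node inspect \
  --format '{{ .Description.Hostname }} {{ .Status.Addr }}'

# Listening sockets on the Swarm ports.
SWARM_PORTS=':(2377|7946|4789)[[:space:]]'
sudo ss -tuln | grep -E "$SWARM_PORTS"

# Sample data-plane packets on wg0 for up to 10 seconds while traffic flows
# between nodes. Encrypted overlays show up as ESP rather than plain UDP 4789.
sudo timeout 10 tcpdump -ni wg0 -c 5 'udp port 4789 or esp'
```

Wildcard listeners (0.0.0.0 or *) in the ss output are not unusual; in that case the firewall is what keeps the ports off public interfaces.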

Required ports (over the VPN only)

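  • 2377/TCP: Swarm management (worker -> manager, manager <-> manager)
  • 7946/TCP and 7946/UDP: Node discovery and gossip
  • 4789/UDP: VXLAN data plane (overlay networks)
  • IP protocol 50 (ESP): Needed for overlay networks created with --opt encrypted
  • 51820/UDP: WireGuard (tunnel). Unlike the ports above, this one must be reachable on the public interface of any node that peers connect to.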

Example iptables rules to restrict Swarm to wg0:

# Allow Swarm ports only on wg0
sudo iptables -A INPUT -i wg0 -p tcp --dport 2377 -j ACCEPT
sudo iptables -A INPUT -i wg0 -p tcp --dport 7946 -j ACCEPT
sudo iptables -A INPUT -i wg0 -p udp --dport 7946 -j ACCEPT
sudo iptables -A INPUT -i wg0 -p udp --dport 4789 -j ACCEPT
# ESP (IP protocol 50) carries traffic for overlay networks created with --opt encrypted
sudo iptables -A INPUT -i wg0 -p esp -j ACCEPT
# Drop these from other interfaces (adjust policy/firewall as needed)

Common pitfalls

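  • Mixed interfaces: If --advertise-addr is auto-detected or points at a public NIC, control traffic and VXLAN (4789/UDP) can end up outside the VPN. --data-path-addr defaults to the advertise address, but setting both explicitly to the wg0 IP removes the ambiguity.
  • MTU/fragmentation: VPN + VXLAN adds overhead. Symptoms: intermittent packet loss, stalled large transfers, or service flaps. Lower the wg0 MTU (e.g., 1420→1380) and retest, and keep the overlay network MTU at least 50 bytes below the wg0 MTU so the VXLAN header fits.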
  • NAT traversal: If workers sit behind NAT, set PersistentKeepalive and ensure manager’s endpoint is reachable.
  • Firewall scope: Opening Swarm ports on the internet defeats the point. Limit to wg0.
  • Clock skew: TLS certificate validation and Raft can fail when node clocks drift far apart. Keep time synced with NTP (chrony or systemd-timesyncd).
  • DNS leaks: Ensure containers resolve via expected DNS across sites; consider pushing consistent resolvers.
  • Overlapping CIDRs: Avoid overlap between VPN subnet and container subnets.
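
The overlapping-CIDR pitfall is easy to hit here because Swarm's default address pool for overlay networks is 10.0.0.0/8, which contains the example VPN subnet. A small sketch of an overlap check in bash; ip_to_int and cidr_overlap are helper names made up for this example.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  local IFS=. a b c d
  read -r a b c d <<< "$1"
  echo $(( (a << 24) | (b << 16) | (c << 8) | d ))
}

# Exit 0 when two CIDR blocks overlap: they overlap exactly when both
# addresses agree on the shorter of the two prefixes.
cidr_overlap() {
  local ip1=${1%/*} len1=${1#*/} ip2=${2%/*} len2=${2#*/}
  local len=$(( len1 < len2 ? len1 : len2 ))
  local mask=$(( len == 0 ? 0 : (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$ip1") & mask )) -eq $(( $(ip_to_int "$ip2") & mask )) ]
}

VPN_CIDR=10.66.0.0/24
# Sample ranges: Swarm's default overlay pool and Docker's default bridge subnet.
for net in 10.0.0.0/8 172.17.0.0/16; do
  if cidr_overlap "$VPN_CIDR" "$net"; then
    echo "$net overlaps $VPN_CIDR"
  else
    echo "$net is clear of $VPN_CIDR"
  fi
done
```

For the two sample ranges, the check reports that 10.0.0.0/8 overlaps and 172.17.0.0/16 does not. Real subnets can be listed with docker network inspect --format '{{range .IPAM.Config}}{{.Subnet}} {{end}}' <network>. If the overlap is a concern, docker swarm init accepts --default-addr-pool to draw overlay subnets from a different range.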

Performance notes

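  • Overhead: VPN encryption + VXLAN encapsulation costs CPU and MTU. WireGuard uses ChaCha20-Poly1305, which does not depend on AES hardware support; AES acceleration helps only the overlay’s own IPsec encryption (--opt encrypted). Leave NIC offloads such as GRO enabled where possible.
  • Data path pinning: --data-path-addr keeps VXLAN on wg0, so overlay traffic follows the same tunnel as control traffic instead of a public route.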
  • Overlay encryption vs VPN: If the VPN is trusted, you may omit --opt encrypted to reduce double encryption overhead.
  • Placement: Reduce cross-node chatter by co-locating tightly coupled services using constraints or affinities.
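  • Packet sizing: After setting the wg0 MTU, size container networks to fit inside it. The mtu key in daemon.json applies to the default bridge network; overlay networks take their MTU from the com.docker.network.driver.mtu option when the network is created.

Example /etc/docker/daemon.json snippet (default bridge network):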

{
  "mtu": 1400
}

Restart Docker after changes.
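
For overlay networks specifically, the MTU is set per network with the com.docker.network.driver.mtu driver option. A sketch, assuming wg0 is at 1420 so that 1370 leaves room for the VXLAN header; the network name frontnet-mtu is hypothetical, and the value may need to go lower when --opt encrypted is in use.

```shell
OVERLAY_MTU=1370   # assumed: wg0 MTU of 1420 minus 50 bytes of VXLAN overhead

sudo docker network create \
  --driver overlay \
  --attachable \
  --opt encrypted \
  --opt com.docker.network.driver.mtu="$OVERLAY_MTU" \
  frontnet-mtu

# Confirm the option was recorded on the network.
sudo docker network inspect frontnet-mtu \
  --format '{{ index .Options "com.docker.network.driver.mtu" }}'
```

The MTU of an existing overlay network cannot be changed in place, so a network created earlier would have to be removed and recreated with the option.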

Variations

  • Multi-manager: Join additional managers over the VPN using the manager join token. Use an odd number of managers and spread them across failure domains.
  • Multi-site: Use a hub-and-spoke or full-mesh WireGuard; ensure all nodes can reach each other’s wg0 IPs.
  • Rootless Docker: Not recommended for Swarm data-plane; use rootful for predictable networking.
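
The odd-number advice for managers comes from Raft's majority rule: a swarm with N managers needs N/2 + 1 of them reachable (integer division) and tolerates losing the rest. A quick sketch of that arithmetic; quorum is a helper name introduced here.

```shell
# Smallest majority of N managers.
quorum() { echo $(( $1 / 2 + 1 )); }

for managers in 1 2 3 4 5 7; do
  q=$(quorum "$managers")
  echo "managers=$managers quorum=$q tolerated_failures=$(( managers - q ))"
done
```

Going from 3 to 4 managers raises the quorum from 2 to 3 while still tolerating only one failure, which is why even counts add risk without adding resilience.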

Tiny FAQ

  • Can I run managers only on one site? Yes; place workers remotely and keep quorum local to reduce latency risk.
  • Do I still need overlay encryption? Optional. VPN already encrypts; disabling overlay encryption can improve throughput.
  • How do I rotate join tokens? On a manager: docker swarm join-token --rotate worker|manager.
  • Can I expose services on public IPs? Yes; publish ports as usual. Swarm internode traffic stays in the VPN; client traffic can use public interfaces.
  • What if VPN drops? Swarm will try to reconnect. Consider multiple managers and health checks, and ensure WireGuard auto-starts at boot.
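
On systemd-based hosts, the wg-quick template unit covers the auto-start part. A minimal sketch, assuming the interface is named wg0 as in the rest of this guide; WG_IF and UNIT are illustrative variable names.

```shell
WG_IF=wg0
UNIT="wg-quick@${WG_IF}"

# Bring the tunnel up at boot (the unit reads /etc/wireguard/wg0.conf).
sudo systemctl enable "$UNIT"

# Check the result.
systemctl is-enabled "$UNIT"
```

After a reboot, Docker may start before the tunnel is up. If Swarm then has trouble with the wg0 address, ordering docker.service after this unit with a systemd drop-in is one option to consider.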
