Overview
Running Docker Swarm over a VPN keeps cluster control and data traffic off the public internet. This guide shows a practical setup using WireGuard, with Swarm’s advertise and data-path addresses bound to the VPN interface. You’ll get a minimal working example, ports to open, pitfalls, and performance tips.
Prerequisites
- Two or more Linux hosts with Docker Engine 20.10+ and WireGuard installed
- Root/sudo access
- Basic familiarity with networking and Docker Swarm
Quickstart
- Create a WireGuard tunnel between all Swarm nodes (site-to-site or mesh).
- Give each node a stable VPN IP from one subnet (e.g., 10.66.0.1 and 10.66.0.2 from 10.66.0.0/24 on wg0).
- Enable IP forwarding and bring up wg0 on each node.
- Initialize Swarm with --advertise-addr set to the node's wg0 address, and set --data-path-addr to the same address.
- Join workers via the VPN IP and deploy an encrypted overlay network.
- Deploy a test stack and verify cross-node reachability.
Minimal working example
The example below uses two nodes: manager (10.66.0.1) and worker (10.66.0.2), with WireGuard interface wg0.
1) Enable forwarding (both nodes)
sudo sysctl -w net.ipv4.ip_forward=1
sudo sysctl -w net.ipv6.conf.all.forwarding=1
# persist across reboots
printf '\nnet.ipv4.ip_forward=1\nnet.ipv6.conf.all.forwarding=1\n' | sudo tee -a /etc/sysctl.conf
2) WireGuard configs
Manager: /etc/wireguard/wg0.conf
[Interface]
Address = 10.66.0.1/24
PrivateKey = <manager-private-key>
ListenPort = 51820
[Peer]
PublicKey = <worker-public-key>
AllowedIPs = 10.66.0.2/32
PersistentKeepalive = 25
Worker: /etc/wireguard/wg0.conf
[Interface]
Address = 10.66.0.2/24
PrivateKey = <worker-private-key>
[Peer]
PublicKey = <manager-public-key>
Endpoint = <manager-public-ip-or-dns>:51820
AllowedIPs = 10.66.0.1/32
PersistentKeepalive = 25
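The key placeholders in the two configs have to be generated on each node. A minimal sketch, assuming the standard wg tool from wireguard-tools is installed; the function name gen_wg_keys and the file names privatekey and publickey are illustrative choices, not something WireGuard requires:

```shell
# Hypothetical helper: write a WireGuard key pair into the given directory.
gen_wg_keys() {
    dir=${1:-.}
    (
        # Subshell so the restrictive umask does not leak into the caller.
        umask 077   # the private key must not be group/world readable
        wg genkey | tee "$dir/privatekey" | wg pubkey > "$dir/publickey"
    )
}
```

Run it once per node as root, for example gen_wg_keys /etc/wireguard, then copy each node's public key into the other node's [Peer] section. Private keys never need to leave the node that generated them.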
Bring up the tunnel:
sudo wg-quick up wg0
sudo wg
Optional MTU tuning (common for VPN+VXLAN):
# wg-quick reads MTU from the [Interface] section; start with 1420 and adjust if needed.
# Appending a key to the end of the file would place it under [Peer], where it is not valid.
sudo sed -i '/^\[Interface\]/a MTU = 1420' /etc/wireguard/wg0.conf
sudo wg-quick down wg0 && sudo wg-quick up wg0
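To check whether a given MTU actually fits the path, one option is to ping across the tunnel with fragmentation prohibited. The sketch below assumes Linux iputils ping (for -M do) and IPv4; probe_mtu and icmp_payload_for_mtu are hypothetical helper names. The ICMP payload is the MTU minus 28 bytes (20 for the IPv4 header, 8 for the ICMP header).

```shell
# Payload size that exactly fills an IPv4 packet of the given MTU.
icmp_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Hypothetical helper: succeed if packets of size <mtu> reach <host> unfragmented.
probe_mtu() {
    host=$1
    mtu=$2
    ping -M do -c 3 -W 2 -s "$(icmp_payload_for_mtu "$mtu")" "$host" > /dev/null
}
```

For example, probe_mtu 10.66.0.2 1420 && echo "1420 fits". If the probe fails, retry with smaller values until it passes, then use that value as the wg0 MTU.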
3) Initialize Swarm (manager)
# Bind both advertise and data-path to the VPN interface
sudo docker swarm init \
--advertise-addr 10.66.0.1 \
--data-path-addr 10.66.0.1
# Get worker token
sudo docker swarm join-token worker
4) Join the worker (worker)
sudo docker swarm join \
--token <token-from-manager> \
--advertise-addr 10.66.0.2 \
--data-path-addr 10.66.0.2 \
10.66.0.1:2377
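After init or join, it is worth confirming that the node registered with its VPN address rather than a public one. A minimal sketch, assuming docker info exposes the Swarm node address through the .Swarm.NodeAddr template field and that the caller may talk to the Docker daemon (root or docker group); swarm_uses_addr is a hypothetical helper name:

```shell
# Hypothetical check: does this node advertise the expected VPN address?
swarm_uses_addr() {
    expected=$1
    actual=$(docker info --format '{{.Swarm.NodeAddr}}')
    if [ "$actual" = "$expected" ]; then
        echo "ok: node address is $actual"
    else
        echo "mismatch: node address is '$actual', expected $expected"
        return 1
    fi
}
```

On the worker in this example, swarm_uses_addr 10.66.0.2 should report ok. On the manager, sudo docker node ls should list both nodes as Ready.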
5) Create an encrypted overlay network (manager)
sudo docker network create \
--driver overlay \
--attachable \
--opt encrypted \
frontnet
6) Deploy a test stack (manager)
stack.yaml
version: "3.8"
services:
  web:
    image: nginx:alpine
    deploy:
      replicas: 2
      placement:
        constraints: ["node.role==worker"]
    ports:
      - "8080:80"
    networks:
      - frontnet
networks:
  frontnet:
    external: true
Deploy and check:
sudo docker stack deploy -c stack.yaml demo
sudo docker stack ps demo
sudo docker service ls
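Open http://<any-node-ip>:8080 from a host that can reach the nodes. The routing mesh answers on every node and spreads requests across both replicas; sudo docker service logs demo_web shows which task served each request.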
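The reachability check can also be scripted. A minimal sketch, assuming curl is installed; check_nodes is a hypothetical helper that requests the published port on each node IP passed to it and reports per node:

```shell
# Hypothetical helper: request the published port on every node IP given.
# Usage: check_nodes <port> <ip> [<ip> ...]
check_nodes() {
    port=$1
    shift
    failed=0
    for ip in "$@"; do
        # -f: treat HTTP errors as failures; --max-time: do not hang on dead nodes
        if curl -fsS -o /dev/null --max-time 5 "http://$ip:$port/"; then
            echo "$ip ok"
        else
            echo "$ip FAIL"
            failed=1
        fi
    done
    return "$failed"
}
```

For the two-node example, check_nodes 8080 10.66.0.1 10.66.0.2 should print ok for both addresses, including the manager, which runs no replica but still answers through the routing mesh.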
Verify
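- Confirm that gossip and VXLAN work over wg0: both nodes should show Ready in sudo docker node ls, and all tasks should reach Running without flapping. With the placement constraint above, both replicas run on the worker.
- Check that Swarm ports are not reachable on public interfaces. The manager listens on 0.0.0.0:2377 by default, so enforce this with the firewall rules below (or --listen-addr) rather than expecting wg0-only binds.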
- Confirm overlay encryption:
sudo docker network inspect frontnet | grep -i encrypted -A2
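To see which addresses the Swarm ports are bound to, the listener table can be filtered. A minimal sketch, assuming ss from iproute2 with the -H (no header) option; swarm_listeners is a hypothetical helper that reads ss -H -tuln output on stdin:

```shell
# Keep only Swarm-related listeners from `ss -H -tuln` output read on stdin.
# Prints "<netid> <local-address:port>" for ports 2377, 7946 and 4789.
swarm_listeners() {
    awk '{
        n = split($5, parts, ":")   # the port is the last colon-separated field
        port = parts[n]
        if (port == 2377 || port == 7946 || port == 4789) print $1, $5
    }'
}
```

Usage: ss -H -tuln | swarm_listeners. A local address of * or 0.0.0.0 means the socket accepts traffic on every interface, in which case the firewall is what keeps it off the public NIC.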
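Required ports (Swarm ports over the VPN only)
- 2377/TCP: Swarm management (manager <-> worker)
- 7946/TCP/UDP: Node discovery and gossip
- 4789/UDP: VXLAN data plane (overlay networks)
- IP protocol 50 (ESP): Needed between nodes when an overlay is created with --opt encrypted
- 51820/UDP: WireGuard (tunnel). Unlike the ports above, this one must be reachable on the public interface of every node that other peers dial as an Endpoint.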
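Example iptables rules to restrict Swarm to wg0:
# Keep loopback open so local connections are not caught by the DROP rules below
sudo iptables -A INPUT -i lo -j ACCEPT
# Allow Swarm ports only on wg0
sudo iptables -A INPUT -i wg0 -p tcp --dport 2377 -j ACCEPT
sudo iptables -A INPUT -i wg0 -p tcp --dport 7946 -j ACCEPT
sudo iptables -A INPUT -i wg0 -p udp --dport 7946 -j ACCEPT
sudo iptables -A INPUT -i wg0 -p udp --dport 4789 -j ACCEPT
# Needed only for overlays created with --opt encrypted (IPsec ESP)
sudo iptables -A INPUT -i wg0 -p esp -j ACCEPT
# Drop the same ports arriving on any other interface (adjust policy/firewall as needed)
sudo iptables -A INPUT ! -i wg0 -p tcp -m multiport --dports 2377,7946 -j DROP
sudo iptables -A INPUT ! -i wg0 -p udp -m multiport --dports 7946,4789 -j DROP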
Common pitfalls
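- Mixed interfaces: --data-path-addr defaults to the advertise address, so VXLAN (4789/UDP) follows whatever address control traffic uses. On multi-homed hosts, an omitted or wrong --advertise-addr can put both on a public NIC. Set advertise-addr and data-path-addr to the wg0 address explicitly, on init and on join.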
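- MTU/fragmentation: VXLAN adds about 50 bytes on top of WireGuard's own overhead, so overlay packets sized for a 1500-byte link do not fit through wg0. Symptoms: small requests work while large transfers stall, intermittent packet loss, or service flaps. Keep the overlay MTU at least 50 bytes below the wg0 MTU, lower the wg0 MTU itself (e.g., 1420→1380) if the underlying path is smaller than 1500 bytes, and retest after each change.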
- NAT traversal: If workers sit behind NAT, set PersistentKeepalive and ensure manager’s endpoint is reachable.
- Firewall scope: Opening Swarm ports on the internet defeats the point. Limit to wg0.
- Clock skew: TLS certs and Raft can fail with large NTP drift. Sync time (chrony or systemd-timesyncd).
- DNS leaks: Ensure containers resolve via expected DNS across sites; consider pushing consistent resolvers.
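- Overlapping CIDRs: Avoid overlap between the VPN subnet, host LANs, and container subnets. Swarm allocates overlay subnets from 10.0.0.0/8 by default, which contains the 10.66.0.0/24 range used in this example; either choose a VPN subnet outside that pool or pass --default-addr-pool to docker swarm init.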
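The CIDR overlap pitfall is easy to test before choosing subnets. A minimal sketch for IPv4 only, in plain POSIX shell arithmetic; ip_to_int and cidr_overlap are hypothetical helper names. Two networks overlap exactly when their addresses agree on the shorter of the two prefixes.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    o1=${1%%.*}; rest=${1#*.}
    o2=${rest%%.*}; rest=${rest#*.}
    o3=${rest%%.*}; o4=${rest#*.}
    echo $(( (o1 << 24) + (o2 << 16) + (o3 << 8) + o4 ))
}

# Return 0 if the two IPv4 CIDRs overlap, 1 otherwise.
cidr_overlap() {
    ip1=${1%/*}; len1=${1#*/}
    ip2=${2%/*}; len2=${2#*/}
    # Compare both addresses under the shorter prefix length.
    if [ "$len1" -lt "$len2" ]; then len=$len1; else len=$len2; fi
    if [ "$len" -eq 0 ]; then return 0; fi
    mask=$(( (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
    [ $(( $(ip_to_int "$ip1") & mask )) -eq $(( $(ip_to_int "$ip2") & mask )) ]
}
```

For example, cidr_overlap 10.66.0.0/24 10.0.0.0/8 succeeds (the ranges overlap), while cidr_overlap 10.66.0.0/24 172.16.0.0/12 fails.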
Performance notes
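- Overhead: VPN encryption plus VXLAN encapsulation costs CPU and MTU. WireGuard uses ChaCha20-Poly1305, which benefits from SIMD rather than AES-NI; AES acceleration helps only the IPsec layer added by --opt encrypted. Keep NIC offloads such as GRO/GSO enabled where possible.
- Data path pinning: --data-path-addr keeps VXLAN inside the tunnel, so overlay traffic takes the same path as control traffic and never leaves through a public NIC on its own.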
- Overlay encryption vs VPN: If the VPN is trusted, you may omit --opt encrypted to reduce double encryption overhead.
- Placement: Reduce cross-node chatter by co-locating tightly coupled services using constraints or affinities.
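- Packet sizing: After setting the wg0 MTU, lower the overlay MTU to match. Overlay networks take it per network, via --opt com.docker.network.driver.mtu=<value> on docker network create. The "mtu" key in daemon.json applies to the default bridge network, not to overlays.
Example /etc/docker/daemon.json snippet (default bridge only):
{
  "mtu": 1400
}
Restart Docker after changes. Existing overlay networks keep their MTU until they are recreated.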
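The overlay MTU can be derived from the tunnel MTU instead of guessed. A minimal sketch, assuming IPv4 between nodes and the overlay driver option com.docker.network.driver.mtu; overlay_mtu and create_vpn_overlay are hypothetical helper names, and the optional margin argument is an assumption for cases such as encrypted overlays, where IPsec adds further overhead that is not calculated here.

```shell
# VXLAN over IPv4 adds 50 bytes: outer IPv4 (20) + UDP (8) + VXLAN (8) + inner Ethernet (14).
# Usage: overlay_mtu <wg-mtu> [<extra-margin-bytes>]
overlay_mtu() {
    echo $(( $1 - 50 - ${2:-0} ))
}

# Hypothetical helper: create an attachable overlay sized for the tunnel.
# Usage: create_vpn_overlay <network-name> <wg-mtu>
create_vpn_overlay() {
    name=$1
    wg_mtu=$2
    docker network create \
        --driver overlay \
        --attachable \
        --opt com.docker.network.driver.mtu="$(overlay_mtu "$wg_mtu")" \
        "$name"
}
```

With a wg0 MTU of 1420, overlay_mtu 1420 yields 1370. Run create_vpn_overlay on a manager, as root or as a user allowed to talk to the Docker daemon.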
Variations
- Multi-manager: Join additional managers over the VPN using the manager token (docker swarm join-token manager). Keep an odd number of managers and spread them across failure domains.
- Multi-site: Use a hub-and-spoke or full-mesh WireGuard; ensure all nodes can reach each other’s wg0 IPs.
- Rootless Docker: Rootless mode does not support overlay networks, so use rootful Docker on Swarm nodes for predictable networking.
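For a full mesh, every node's wg0.conf needs one [Peer] section per other node, which gets repetitive by hand. A minimal sketch that prints one such section in the same shape as the configs in the minimal example; wg_peer_stanza is a hypothetical helper, and the 25-second keepalive simply mirrors the example configs.

```shell
# Hypothetical helper: print one [Peer] stanza.
# Usage: wg_peer_stanza <public-key> <vpn-ip> [<endpoint-host:port>]
wg_peer_stanza() {
    echo "[Peer]"
    echo "PublicKey = $1"
    echo "AllowedIPs = $2/32"
    # Endpoint is optional: a node behind NAT has none and dials out instead.
    if [ -n "${3:-}" ]; then
        echo "Endpoint = $3"
    fi
    echo "PersistentKeepalive = 25"
}
```

For example, wg_peer_stanza <node3-public-key> 10.66.0.3 <node3-public-ip-or-dns>:51820 | sudo tee -a /etc/wireguard/wg0.conf appends a third node to an existing config. Appending works here because [Peer] sections belong at the end of the file.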
Tiny FAQ
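- Can I run managers only on one site? Yes. Raft is latency-sensitive, so keeping all managers on one site keeps quorum traffic local, and remote sites run workers only. The trade-off is that losing that site loses the control plane.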
- Do I still need overlay encryption? Optional. VPN already encrypts; disabling overlay encryption can improve throughput.
- How do I rotate join tokens? On a manager, run docker swarm join-token --rotate worker (or manager for the manager token). Nodes that already joined are unaffected.
- Can I expose services on public IPs? Yes; publish ports as usual. Swarm internode traffic stays in the VPN; client traffic can use public interfaces.
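- What if VPN drops? Managers mark unreachable nodes as Down and reschedule their tasks; nodes reconnect on their own once the tunnel returns. Consider multiple managers and health checks, and ensure WireGuard auto-starts at boot (sudo systemctl enable wg-quick@wg0).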
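A dropped tunnel can be detected from handshake age before Swarm starts marking nodes Down. A minimal sketch, assuming the output format of wg show wg0 latest-handshakes (one line per peer: public key, then a Unix timestamp, 0 if no handshake has happened); stale_peers is a hypothetical helper, and the 180-second threshold in the usage example is an assumption to tune.

```shell
# Read `wg show wg0 latest-handshakes` output on stdin and print the public key
# of every peer whose last handshake is older than <max-age> seconds.
# Usage: stale_peers <max-age-seconds> [<now-unix-time>]
stale_peers() {
    max_age=$1
    now=${2:-$(date +%s)}
    awk -v max="$max_age" -v now="$now" \
        '$2 == 0 || now - $2 > max { print $1 }'
}
```

Usage: sudo wg show wg0 latest-handshakes | stale_peers 180. Empty output means every peer has completed a handshake recently; any key printed is a peer worth investigating.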