WireGuard
Jason A. Donenfeld
[email protected]
Permission to freely reproduce all or part of this paper for noncommercial purposes is granted provided that copies bear this notice and the full citation on the first page. Reproduction for commercial purposes is strictly prohibited without the prior written consent of the Internet Society or the first-named author (for reproduction of an entire paper only).
NDSS ’17, 26 February – 1 March 2017, San Diego, CA, USA
Copyright 2017 Internet Society, ISBN 1-891562-46-0
http://dx.doi.org/10.14722/ndss.2017.23160
Permanent ID: 4846ada1492f5d92198df154f48c3d54205657bc

Abstract: WireGuard is a secure network tunnel, operating at layer 3, implemented as a kernel virtual network interface for Linux, which aims to replace both IPsec for most use cases, as well as popular user space and/or TLS-based solutions like OpenVPN, while being more secure, more performant, and easier to use. The virtual tunnel interface is based on a proposed fundamental principle of secure tunnels: an association between a peer public key and a tunnel source IP address. It uses a single round trip key exchange, based on NoiseIK, and handles all session creation transparently to the user using a novel timer state machine mechanism. Short pre-shared static keys (Curve25519 points) are used for mutual authentication in the style of OpenSSH. The protocol provides strong perfect forward secrecy in addition to a high degree of identity hiding. Transport speed is accomplished using ChaCha20Poly1305 authenticated-encryption for encapsulation of packets in UDP. An improved take on IP-binding cookies is used for mitigating denial of service attacks, improving greatly on IKEv2 and DTLS’s cookie mechanisms to add encryption and authentication. The overall design allows for allocating no resources in response to received packets, and from a systems perspective, there are multiple interesting Linux implementation techniques for queues and parallelism. Finally, WireGuard can be simply implemented for Linux in less than 4,000 lines of code, making it easily audited and verified.

I. Introduction & Motivation

In Linux, the standard solution for encrypted tunnels is IPsec, which uses the Linux transform (“xfrm”) layer. Users fill in a kernel structure determining which ciphersuite and key, or other transforms such as compression, to use for which selector of packets traversing the subsystem. Generally a user space daemon is responsible for updating these data structures based on the results of a key exchange, generally done with IKEv2 [12], itself a complicated protocol with much choice and malleability. The complexity, as well as the sheer amount of code, of this solution is considerable. Administrators have a completely separate set of firewalling semantics and secure labeling for IPsec packets. While separating the key exchange layer from the transport encryption (or transformation) layer is a wise separation from a semantic viewpoint, and similarly while separating the transformation layer from the interface layer is correct from a networking viewpoint, this strictly correct layering approach increases complexity and makes correct implementation and deployment prohibitive.

WireGuard does away with these layering separations. Instead of the complexity of IPsec and the xfrm layers, WireGuard simply gives a virtual interface (wg0, for example), which can then be administered using the standard ip(8) and ifconfig(8) utilities. After configuring the interface with a private key (and optionally a pre-shared symmetric key as explained in section V-B) and the various public keys of peers with whom it will communicate securely, the tunnel simply works. Key exchanges, connections, disconnections, reconnections, discovery, and so forth happen behind the scenes transparently and reliably, and the administrator does not need to worry about these details. In other words, from the perspective of administration, the WireGuard interface appears to be stateless. Firewall rules can then be configured using the ordinary infrastructure for firewalling interfaces, with the guarantee that packets coming from a WireGuard interface will be authenticated and encrypted. Simple and straightforward, WireGuard is much less prone to catastrophic failure and misconfiguration than IPsec. It is important to stress, however, that the layering of IPsec is correct and sound; everything is in the right place with IPsec, to academic perfection. But, as often happens with correctness of abstraction, there is a profound lack of usability, and a verifiably safe implementation is very difficult to achieve. WireGuard, in contrast, starts from the basis of flawed layering violations and then attempts to rectify the issues arising from this conflation using practical engineering solutions and cryptographic techniques that solve real world problems.

On the other end of the spectrum is OpenVPN, a user space TUN/TAP based solution that uses TLS. By virtue of it being in user space, it has very poor performance (since packets must be copied multiple times between kernel space and user space), and a long-lived daemon is required; OpenVPN appears far from stateless to an administrator. While TUN/TAP interfaces (say, tun0) have similar wg0-like benefits as described above, OpenVPN is also enormously complex, supporting the entire plethora of TLS functionality, which exposes quite a bit of code to potential vulnerabilities. OpenVPN is right to be implemented in user space, since ASN.1 and X.509 parsers in the kernel have historically been quite problematic (CVE-2008-1673, CVE-2016-2053), and adding a TLS stack would only make that issue worse. TLS also brings with it an enormous state machine, as well as a less clear association between source IP addresses and public keys.

For key distribution, WireGuard draws inspiration from OpenSSH, for which common uses include a very simple approach toward key management. Through a diverse set of out-of-band mechanisms, two peers generally exchange their static public keys. Sometimes it is as simple as PGP-signed email, and other times it is a complicated key distribution mechanism using LDAP and certificate authorities. Importantly, for the most part OpenSSH key distribution is entirely agnostic. WireGuard follows suit. Two WireGuard peers exchange their public keys through some unspecified mechanism, and afterward they are
able to communicate. In other words, WireGuard’s attitude toward key distribution is that this is the wrong layer to address that particular problem, and so the interface is simple enough that any key distribution solution can be used with it. As an additional advantage, public keys are only 32 bytes long and can be easily represented in Base64 encoding in 44 characters, which is useful for transferring keys through a variety of different mediums.

Finally, WireGuard is cryptographically opinionated. It intentionally lacks cipher and protocol agility. If holes are found in the underlying primitives, all endpoints will be required to update. As shown by the continuing torrent of SSL/TLS vulnerabilities, cipher agility increases complexity monumentally. WireGuard uses a variant of Trevor Perrin’s Noise [22] (which during its development received quite a bit of input from the authors of this paper for the purposes of being used in WireGuard) for a 1-RTT key exchange, with Curve25519 [6] for ECDH, HKDF [15] for expansion of ECDH results, RFC7539 [16]’s construction of ChaCha20 [7] and Poly1305 [5] for authenticated encryption, and BLAKE2s [2] for hashing. It has built-in protection against denial of service attacks, using a new crypto-cookie mechanism for IP address attributability.

Similarly opinionated, WireGuard is layer 3-only; as explained below in section II, this is the cleanest approach for ensuring authenticity and attributability of the packets. The authors believe that layer 3 is the correct way for bridging multiple IP networks, and the imposition of this onto WireGuard allows for many simplifications, resulting in a cleaner and more easily implemented protocol. It supports layer 3 for both IPv4 and IPv6, and can encapsulate v4-in-v6 as well as v6-in-v4.

WireGuard puts together these principles, focusing on simplicity and an auditable codebase, while still being extremely high-speed and suitable for a modicum of environments. By combining the key exchange and the layer 3 transport encryption into one mechanism and using a virtual interface rather than a transform layer, WireGuard indeed breaks traditional layering principles, in pursuit of a solid engineering solution that is both more practical and more secure. Along the way, it employs several novel cryptographic and systems solutions to achieve its goals.

II. Cryptokey Routing

The interface itself has a private key and a UDP port on which it listens (more on that later), followed by a list of peers. Each peer is identified by its public key. Each then has a list of allowed source IPs.

When an outgoing packet is being transmitted on a WireGuard interface, wg0, this table is consulted to determine which public key to use for encryption. For example, a packet with a destination IP of 10.192.122.4 will be encrypted using the secure session derived from the public key TrMv...WXX0. Conversely, when wg0 receives an encrypted packet, after decrypting and authenticating it, it will only accept it if its source IP resolves in the table to the public key used in the secure session for decrypting it. For example, if a packet is decrypted from xTIB...qp8D, it will only be allowed if the decrypted packet has a source IP of 10.192.122.3 or in the range of 10.192.124.0 to 10.192.124.255; otherwise it is dropped.

With this very simple principle, administrators can rely on simple firewall rules. For example, an incoming packet on interface wg0 with a source IP of 10.10.10.230 may be considered as authentically from the peer with a public key of gN65...Bz6E. More generally, any packets arriving on a WireGuard interface will have a reliably authentic source IP (in addition, of course, to guaranteed perfect forward secrecy of the transport). Do note that this is only possible because WireGuard is strictly layer 3 based. Unlike some common VPN protocols, like L2TP/IPsec, using authenticated identification of peers at a layer 3 level enforces a much cleaner network design.

In the case of a WireGuard peer who wishes to route all traffic through another WireGuard peer, the cryptokey routing table could be configured more simply as:

Configuration 2a
    Interface Public Key    Private Key    UDP Port
    gN65...z6EA             gI6E...fWGE    9182

    Peer Public Key         Allowed IPs
    HIgo...8ykw             0.0.0.0/0

Here, the peer authorizes HIgo...f8yk to put packets onto wg0 with any source IP, and all packets that are outgoing on wg0 will be encrypted using the secure session associated with that public key and sent to that peer’s endpoint.
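The two lookups described in this section, choosing a peer by destination IP when sending and checking the source IP against the decrypting peer when receiving, can be sketched in a few lines of Python. This is only an illustrative sketch, not the kernel implementation: the class and method names are hypothetical, the abbreviated key strings stand in for real 32-byte public keys, and resolving overlapping ranges by most specific prefix is an assumption of this sketch.

```python
import ipaddress


class CryptokeyRoutingTable:
    """Illustrative table mapping allowed IP ranges to peer public keys."""

    def __init__(self):
        self._routes = []  # list of (ip_network, peer_public_key)

    def allow(self, peer_key, cidr):
        """Record that peer_key may use addresses in cidr."""
        self._routes.append((ipaddress.ip_network(cidr), peer_key))

    def lookup(self, ip):
        """Return the peer owning the most specific range containing ip, or None.

        Most-specific-prefix resolution is an assumption of this sketch.
        """
        addr = ipaddress.ip_address(ip)
        best = None
        for net, peer_key in self._routes:
            if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
                best = (net, peer_key)
        return best[1] if best else None

    def peer_for_outgoing(self, dst_ip):
        """Sending: pick the peer whose session encrypts a packet to dst_ip."""
        return self.lookup(dst_ip)

    def accept_incoming(self, session_peer_key, src_ip):
        """Receiving: accept only if src_ip resolves to the decrypting peer."""
        return self.lookup(src_ip) == session_peer_key


# Allowed IPs taken from the examples in the text (keys are abbreviated).
table = CryptokeyRoutingTable()
table.allow("xTIB...p8Dg", "10.192.122.3/32")
table.allow("xTIB...p8Dg", "10.192.124.0/24")
table.allow("TrMv...WXX0", "10.192.122.4/32")
table.allow("TrMv...WXX0", "192.168.0.0/16")
table.allow("gN65...z6EA", "10.10.10.230/32")

outgoing_peer = table.peer_for_outgoing("10.192.122.4")
accepted = table.accept_incoming("xTIB...p8Dg", "10.192.124.17")
spoofed = table.accept_incoming("TrMv...WXX0", "10.192.122.3")
```

Because one table serves both directions, a source address is accepted only from the peer that would also receive replies sent to that address.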
to have the initial endpoint of a peer:

Configuration 2b
    Interface Public Key    Private Key    UDP Port
    gN65...z6EA             gI6E...fWGE    9182

    Peer Public Key         Allowed IPs    Internet Endpoint
    HIgo...8ykw             0.0.0.0/0      192.95.5.69:8746

Then, this host, gN65...z6EA, sends an encrypted packet to HIgo...f8yk at 192.95.5.69:8746. After HIgo...f8yk receives a packet, it updates its table to learn that the endpoint for sending reply packets is, for example, 192.95.5.64:9182:

Configuration 1b
    Interface Public Key    Private Key    UDP Port
    HIgo...8ykw             yAnz...fBmk    8746

    Peer Public Key         Allowed IPs                          Internet Endpoint
    xTIB...p8Dg             10.192.122.3/32, 10.192.124.0/24
    TrMv...WXX0             10.192.122.4/32, 192.168.0.0/16
    gN65...z6EA             10.10.10.230/32                      192.95.5.64:9182

Note that the listen port of peers and the source port of packets sent are always the same, adding much simplicity, while also ensuring reliable traversal behind NAT. And since this roaming property ensures that peers will have the very latest external source IP and UDP port, there is no requirement for NAT to keep sessions open for long. (For use cases in which it is imperative to keep open a NAT session or stateful firewall indefinitely, the interface can be optionally configured to periodically send persistent authenticated keepalives.)

This design allows for great convenience and minimal configuration. While an attacker with an active man-in-the-middle could, of course, modify these unauthenticated external source IPs, the attacker would not be able to decrypt or modify any payload, which merely amounts to a denial-of-service attack, which would already be trivially possible by just dropping the original packets from this presumed man-in-the-middle position. And, as explained in section VI-E, hosts that cannot decrypt and subsequently reply to packets will quickly be forgotten.

III. Send/Receive Flow

The roaming design of section II-A, put together with the cryptokey routing table of section II, amounts to the following flows when receiving and sending a packet on interface wg0 using “Configuration 1” from above.

A packet is locally generated (or forwarded) and is ready to be transmitted on the outgoing interface wg0:
1) The plaintext packet reaches the WireGuard interface, wg0.
2) The destination IP address of the packet, 192.168.87.21, is inspected, which matches the peer TrMv...WXX0. (If it matches no peer, it is dropped, and the sender is informed by a standard ICMP “no route to host” packet, as well as returning -ENOKEY to user space.)
3) The symmetric sending encryption key and nonce counter of the secure session associated with peer TrMv...WXX0 are used to encrypt the plaintext packet using ChaCha20Poly1305.
4) A header containing various fields, explained in section V-D, is prepended to the now encrypted packet.
5) This header and encrypted packet, together, are sent as a UDP packet to the Internet UDP/IP endpoint associated with peer TrMv...WXX0, resulting in an outer UDP/IP packet containing as its payload a header and encrypted inner-packet. The peer’s endpoint is either pre-configured, or it is learned from the outer external source IP header field of the most recent correctly-authenticated packet received. (Otherwise, if no endpoint can be determined, the packet is dropped, an ICMP message is sent, and -EHOSTUNREACH is returned to user space.)

A UDP/IP packet reaches UDP port 8746 of the host, which is the listening UDP port of interface wg0:
1) A UDP/IP packet containing a particular header and an encrypted payload is received on the correct port (in this particular case, port 8746).
2) Using the header (described below in section V-D), WireGuard determines that it is associated with peer TrMv...WXX0’s secure session, checks the validity of the message counter, and attempts to authenticate and decrypt it using the secure session’s receiving symmetric key. If it cannot determine a peer or if authentication fails, the packet is dropped.
3) Since the packet has authenticated correctly, the source IP of the outer UDP/IP packet is used to update the endpoint for peer TrMv...WXX0.
4) Once the packet payload is decrypted, the interface has a plaintext packet. If this is not an IP packet, it is dropped. Otherwise, WireGuard checks to see if the source IP address of the plaintext inner-packet routes correspondingly in the cryptokey routing table. For example, if the source IP of the decrypted plaintext packet is 192.168.31.28, the packet correspondingly routes. But if the source IP is 10.192.122.3, the packet does not route correspondingly for this peer, and is dropped.
5) If the plaintext packet has not been dropped, it is inserted into the receive queue of the wg0 interface.

It would be possible to separate the list of allowed IPs into two lists: one for checking the source address of incoming packets and one for choosing the peer based on the destination address. But, by keeping these as part of the same list, it allows for something similar to reverse-path filtering. When sending a packet, the list is consulted based on the destination IP; when receiving a packet, that same list is consulted for determining if the source IP is allowed. However, rather than asking whether the received packet’s sending peer has that source IP as part of its allowed IPs list, it instead is able to ask a more global question: which peer would be chosen in the table for that source IP, and does that peer match that of the received packet? This enforces a one-to-one mapping of sending and receiving IP addresses, so that if a packet is received from a particular peer, replies to that IP will be guaranteed to go to that same peer.

IV. Basic Usage

Before going deep into the cryptography and implementation details, it may be useful to see a simple command line interface
for using WireGuard, to bring concreteness to the concepts thus far presented.

Consider a Linux environment with a single physical network interface, eth0, connecting it to the Internet with a public IP of 192.95.5.69. A WireGuard interface, wg0, can be added and configured to have a tunnel IP address of 10.192.122.3 in a /24 subnet with the standard ip(8) utilities:

Adding the wg0 interface
    $ ip link add dev wg0 type wireguard
    $ ip address add dev wg0 10.192.122.3/24
    $ ip route add 10.0.0.0/8 dev wg0
    $ ip address show
    1: lo: <LOOPBACK> mtu 65536
        inet 127.0.0.1/8 scope host lo
    2: eth0: <BROADCAST> mtu 1500
        inet 192.95.5.69/24 scope global eth0
    3: wg0: <POINTOPOINT,NOARP> mtu 1420
        inet 10.192.122.3/24 scope global wg0

The cryptokey routing table can then be configured using the wg(8) tool in a variety of fashions, including reading from configuration files:

Configuring the cryptokey routing table of wg0
    $ wg setconf wg0 configuration-1.conf
    $ wg show wg0
    interface: wg0
      public key: HIgo...8ykw
      private key: yAnz...fBmk
      listening port: 8746
    peer: xTIB...p8Dg
      allowed ips: 10.192.124.0/24, 10.192.122.3/32
    peer: TrMv...WXX0
      allowed ips: 192.168.0.0/16, 10.192.122.4/32
    peer: gN65...z6EA
      allowed ips: 10.10.10.230/32
      endpoint: 192.95.5.70:54421
    $ ip link set wg0 up
    $ ping 10.10.10.230
    PING 10.10.10.230 56(84) bytes of data.
    64 bytes: icmp_seq=1 ttl=49 time=0.01 ms

At this point, sending a packet to 10.10.10.230 on that system will send the data through the wg0 interface, which will encrypt the packet using a secure session associated with the public key gN65...z6EA and send that encrypted and encapsulated packet to 192.95.5.70:54421 over UDP. When receiving a packet from 10.10.10.230 on wg0, the administrator can be assured that it is authentically from gN65...z6EA.

processed asynchronously to transport data messages. These messages use the “IK” pattern from Noise [22], in addition to a novel cookie construction to mitigate denial of service attacks. The net result of the protocol is a very robust security system, which achieves the requirements of authenticated key exchange (AKE) security [17], avoids key-compromise impersonation, avoids replay attacks, provides perfect forward secrecy, provides identity hiding of static public keys similar to SIGMA [14], and has resistance to denial of service attacks.

A. Silence is a Virtue

One design goal of WireGuard is to avoid storing any state prior to authentication and to not send any responses to unauthenticated packets. With no state stored for unauthenticated packets, and with no response generated, WireGuard is invisible to illegitimate peers and network scanners. Several classes of attacks are avoided by not allowing unauthenticated packets to influence any state. And more generally, it is possible to implement WireGuard in a way that requires no dynamic memory allocation at all, even for authenticated packets, as explained in section VII. However, this property requires the very first message received by the responder to authenticate the initiator. Having authentication in the first packet like this potentially opens up the responder to a replay attack. An attacker could replay initial handshake messages to trick the responder into regenerating its ephemeral key, thereby invalidating the session of the legitimate initiator (though not affecting the secrecy or authenticity of any messages). To prevent this, a 12-byte TAI64N [4] timestamp is included, encrypted and authenticated, in the first message. The responder keeps track of the greatest timestamp received per peer and discards packets containing timestamps less than or equal to it. (In fact, it does not even have to be an accurate timestamp; it simply must be a per-peer monotonically increasing 96-bit number.) If the responder restarts and loses this state, that is not a problem: even though an initial packet from earlier can be replayed, it could not possibly disrupt any ongoing secure sessions, because the responder has just restarted and therefore has no active secure sessions to disrupt. Once the initiator reestablishes a secure session with the responder after its restart, the initiator will be using a greater timestamp, invalidating the previous one. This timestamp ensures that an attacker may not disrupt a current session between initiator and responder via replay attack. From an implementation point of view, TAI64N [4] is very convenient because it is big-endian, allowing comparisons between two 12-byte timestamps to be done using standard memcmp().
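The replay check on handshake timestamps can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the actual implementation: `pack_timestamp` and `HandshakeReplayGuard` are hypothetical names, the packing simply follows the text's description of 8 big-endian bytes of seconds followed by 4 big-endian bytes of nanoseconds, and the check is assumed to run only after the first message has been decrypted and authenticated.

```python
import struct


def pack_timestamp(seconds, nanoseconds):
    """Pack a 12-byte big-endian timestamp: 8 bytes of seconds, 4 of nanoseconds."""
    return struct.pack(">QI", seconds, nanoseconds)


class HandshakeReplayGuard:
    """Tracks the greatest timestamp accepted per peer (illustrative only)."""

    def __init__(self):
        self._greatest = {}  # peer public key -> greatest 12-byte timestamp seen

    def accept(self, peer_key, timestamp):
        """Return True and record the timestamp if it is strictly greater
        than the greatest one seen for this peer; otherwise return False."""
        if len(timestamp) != 12:
            raise ValueError("timestamp must be exactly 12 bytes")
        last = self._greatest.get(peer_key)
        # Equal-length big-endian byte strings compare lexicographically in the
        # same order as the numbers they encode, which is what memcmp() gives in C.
        if last is not None and timestamp <= last:
            return False
        self._greatest[peer_key] = timestamp
        return True


guard = HandshakeReplayGuard()
first = pack_timestamp(1000, 5)
later = pack_timestamp(1000, 6)

fresh_ok = guard.accept("gN65...z6EA", first)      # first message from this peer
replay_ok = guard.accept("gN65...z6EA", first)     # same bytes replayed
newer_ok = guard.accept("gN65...z6EA", later)      # strictly greater timestamp
```

A freshly constructed guard holds no state, which mirrors the restart case in the text: a replayed old message would be accepted once, but there is no existing session for it to disrupt.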
key is troublesome from a key management perspective and might be more likely stolen, the idea is that by the time quantum computing advances to break Curve25519, this pre-shared symmetric key has been long forgotten. And, more importantly, in the shorter term, if the pre-shared symmetric key is compromised, the Curve25519 keys still provide more than sufficient protection. In lieu of using a completely post-quantum crypto system, which as of writing is not practical for use here, this optional hybrid approach of a pre-shared symmetric key to complement the elliptic curve cryptography provides a sound and acceptable trade-off for the extremely paranoid. In the following sections, “PSK” refers to this 32-byte pre-shared symmetric key.

C. Denial of Service Mitigation & Cookies

Computing Curve25519 point multiplication is CPU intensive, even if Curve25519 is an extremely fast curve on most processors. In order to determine the authenticity of a handshake message, a Curve25519 multiplication must be computed, which means there is a potential avenue for a denial-of-service attack. In order to fend off a CPU-exhaustion attack, if the responder (the recipient of a message) is under load, it may choose to not process a handshake message (either an initiation or a response handshake message), but instead to respond with a cookie reply message, containing a cookie. The initiator then uses this cookie in order to resend the message and have it be accepted the following time by the responder.

The responder maintains a secret random value that changes every two minutes. A cookie is simply the result of computing a MAC of the initiator’s source IP address using this changing secret as the MAC key. The initiator, when resending its message, sends a MAC of its message using this cookie as the MAC key. When the responder receives the message, if it is under load, it may choose whether or not to accept and process the message based on whether or not there is a correct MAC that uses the cookie as the key. This mechanism ties messages sent from an initiator to its IP address, giving proof of IP ownership, allowing for rate limiting using classical IP rate limiting algorithms (token bucket, etc.; see section VII-D for implementation details).

This is more or less the scheme used by DTLS [23] and IKEv2 [12]. However, it suffers from three major flaws. First, as mentioned in section V-A, we prefer to stay silent by not sending any reply to unauthenticated messages; indiscriminately sending a cookie reply message when under load would break this property. Second, the cookie should not be sent in clear text, because a man-in-the-middle could use this to then send fraudulent messages that are processed. And third, the initiator itself could be denial-of-service attacked by being sent fraudulent cookies, which it would then use with no success in computing a MAC of its message. The cookie mechanism of WireGuard, which uses two MACs (msg.mac1 and msg.mac2), fixes these problems, the computations for which will be shown in section V-D4 below.

For the first problem, in order for the responder to remain silent, even while under load, all messages have a first MAC (msg.mac1) that uses the responder’s public key and optionally the PSK. This means that at the very least, a peer sending a message must know to whom it is talking (by virtue of knowing its public key, and possibly the PSK), in order to elicit any kind of response. Under load or not under load, this first MAC (msg.mac1) always is required to be present and valid. While the public key of the responder itself is not secret, it is sufficiently secret within this attack model, in which the goal is to ensure stealthiness of services, and so knowing the responder’s public key is sufficient proof for already knowing of its existence. (And of course, if the PSK is in use, this adds another even stronger layer.)

Likewise, to solve the second problem, that of sending MACs in clear text, we apply an AEAD to the cookie in transit, again using as a symmetric encryption key the responder’s public key, optionally the PSK, and a random public salt. Again, the mostly public values here are sufficient for our purposes within the denial-of-service attack threat model.

Finally, to solve the third problem, we use the “additional data” field of the AEAD that encrypts the cookie in transit to additionally authenticate the first MAC (msg.mac1) of the initiating message that provoked a cookie reply message. This ensures that an attacker without a man-in-the-middle position cannot send torrents of invalid cookie replies to initiators to prevent them from authenticating with a correct cookie. (An attacker with a man-in-the-middle position could simply drop cookie reply messages anyway to prevent a connection, so that case is not relevant.) In other words, we use the AD field to bind cookie replies to initiation messages.

With these problems solved, we can then add the aforementioned second MAC (msg.mac2) using the securely transmitted cookie as the MAC key. When the responder is under load, it will only accept messages that additionally have this second MAC.

In sum, the responder, after computing these MACs as well and comparing them to the ones received in the message, must always reject messages with an invalid msg.mac1, and when under load may reject messages with an invalid msg.mac2. If the responder receives a message with a valid msg.mac1 yet with an invalid msg.mac2, and is under load, it may respond with a cookie reply message, detailed in section V-D7. This considerably improves on the cookie scheme used by DTLS and IKEv2.

In contrast to HIPv2 [19], which solves this problem by using a 2-RTT key exchange and complexity puzzles, WireGuard eschews puzzle-solving constructs, because the former requires storing state while the latter makes the relationship between initiator and responder asymmetric. In WireGuard, either peer at any point might be motivated to begin a handshake. This means that it is not feasible to require a complexity puzzle from the initiator, because the initiator and responder may soon change roles, turning this mitigation mechanism into a denial of service vulnerability itself. Our above cookie solution, in contrast, enables denial of service attack mitigation on a 1-RTT protocol, while keeping the initiator and responder roles symmetric.

D. Messages

There are four types of messages, each prefixed by a single-byte message type identifier, notated as msg.type below:
• Section V-D2: The handshake initiation message that begins the handshake process for establishing a secure session.
• Section V-D3: The handshake response to the initiation message that concludes the handshake, after which a secure session can be established.
• Section V-D7: A reply to either a handshake initiation message or a handshake response message, explained in section V-C, that communicates an encrypted cookie value for use in resending either the rejected handshake initiation message or handshake response message.
• Section V-D6: An encapsulated and encrypted IP packet that uses the secure session negotiated by the handshake.

The initiator of the handshake is denoted as subscript i, and the responder of the handshake is denoted as subscript r, and either one is denoted as subscript ∗. For messages that can be created by either an initiator or responder, if the peer creating the message is the initiator, let (m, m′) = (i, r), and if the peer creating the message is the responder, let (m, m′) = (r, i). The two peers have several variables they maintain locally:

    I∗                    A 32-bit index that locally represents the other peer, analogous to IPsec’s “SPI”.
    S∗^priv, S∗^pub       The static private and public key values.
    E∗^priv, E∗^pub       The ephemeral private and public key values.
    Q                     The optional (sometimes ε, empty) pre-shared symmetric key value from section V-B.
    H∗, C∗, K∗            A hash result value, a chaining key value, and a symmetric key value.
    T∗^send, T∗^recv      Transport data symmetric key values for sending and receiving.
    N∗^send, N∗^recv      Transport data message nonce counters for sending and receiving.

In the constructions that follow, several symbols, functions, and operators are used. The binary operator ‖ represents concatenation of its operands, and the binary operator := represents assignment of its right operand to its left operand. The annotation n̂ returns the value (n + 16), which is the Poly1305 authentication tag length added to n. ε represents an empty zero-length bitstring, 0^n represents the all zero (0x0) bitstring of length n, and ρ^n represents a random bitstring of length n. Let τ be considered a temporary variable. All integer assignments are little-endian, unless otherwise noted. The following functions and constants are utilized:

DH(private key, public key): Curve25519 point multiplication of private key and public key, returning 32 bytes of output.
DH-Generate(): Generates a random Curve25519 private key and derives its corresponding public key, returning a pair of 32-byte values, (private, public).
Aead(key, counter, plain text, auth text): ChaCha20Poly1305 AEAD, as specified in [16], with its nonce being composed of 32 bits of zeros followed by the 64-bit little-endian value of counter.
Hash(input, length): Blake2s(input, length), returning length bytes of output.
Hmac(key, input): Hmac-Blake2s(key, input), the ordinary BLAKE2s hash function used in an HMAC construction, returning 32 bytes of output.
Kdf(key, input): Sets τ := Hmac(key, input), τ′ := Hmac(τ, 0x1), τ″ := Hmac(τ, τ′ ‖ 0x2), and returns a pair of 32-byte values, (τ′, τ″). This is the HKDF [15] function.
Mac(key, input, length): If key ≠ ε, Keyed-Blake2s(key, input, length), the keyed MAC variant of the BLAKE2s hash function, and otherwise Hash(input, length), either returning length bytes of output.
Timestamp(): Returns the TAI64N timestamp [4] of the current time, which is 12 bytes of output, the first 8 bytes being a big-endian integer of the number of seconds since 1970 TAI and the last 4 bytes being a big-endian integer of the number of nanoseconds from the beginning of that second.
Construction: If Q ≠ ε, the value Hash(“NoisePSK_IK_25519_ChaChaPoly_BLAKE2s”, 32), and otherwise the value Hash(“Noise_IK_25519_ChaChaPoly_BLAKE2s”, 32), 32 bytes of output.
Identifier: The string literal “WireGuard v0 zx2c4 [email protected]”, 34 bytes of output.

1) Protocol Overview: In the majority of cases, the handshake will complete in 1-RTT, after which transport data follows:

[Figure: message sequence between Initiator (i) and Responder (r), in order: Handshake Initiation, Handshake Response, Transport Data, Transport Data.]

If one peer is under load, then a cookie reply message is added to the handshake, to protect against denial-of-service attacks:

[Figure: message sequence between Initiator (i) and Responder (r), in order: Handshake Initiation, Cookie Reply, Handshake Initiation, Handshake Response, Transport Data, Transport Data.]

2) First Message: Initiator to Responder: The initiator sends this message, msg:
    type := 0x1 (1 byte) | reserved := $0^3$ (3 bytes)
    sender := $I_i$ (4 bytes)
    ephemeral (32 bytes)
    static ($\hat{32}$ bytes)
    timestamp ($\hat{12}$ bytes)
    mac1 (16 bytes) | mac2 (16 bytes)

The timestamp field is explained in section V-A, and mac1 and mac2 are explained further in section V-D4. $I_i$ is generated randomly ($\rho^4$) when this message is sent, and is used to tie subsequent replies to the session begun by this message. The above remaining fields are calculated [22] as follows:

$$
\begin{aligned}
& C_i := \text{Construction} \\
& H_i := \text{Hash}(C_i \,\|\, \text{Identifier}, 32) \\
& \text{if } Q \neq \epsilon: \\
& \qquad (C_i, \tau) := \text{Kdf}(C_i, Q) \\
& \qquad H_i := \text{Hash}(H_i \,\|\, \tau, 32) \\
& H_i := \text{Hash}(H_i \,\|\, S_r^{pub}, 32) \\
& (E_i^{priv}, E_i^{pub}) := \text{DH-Generate}() \\
& \text{if } Q \neq \epsilon: \\
& \qquad (C_i, K_i) := \text{Kdf}(C_i, E_i^{pub}) \\
& \text{msg.ephemeral} := E_i^{pub} \\
& H_i := \text{Hash}(H_i \,\|\, \text{msg.ephemeral}, 32) \\
& (C_i, K_i) := \text{Kdf}(C_i, \text{DH}(E_i^{priv}, S_r^{pub})) \\
& \text{msg.static} := \text{Aead}(K_i, 0, S_i^{pub}, H_i) \\
& H_i := \text{Hash}(H_i \,\|\, \text{msg.static}, 32) \\
& (C_i, K_i) := \text{Kdf}(C_i, \text{DH}(S_i^{priv}, S_r^{pub})) \\
& \text{msg.timestamp} := \text{Aead}(K_i, 0, \text{Timestamp}(), H_i) \\
& H_i := \text{Hash}(H_i \,\|\, \text{msg.timestamp}, 32)
\end{aligned}
$$

When the responder receives this message, it does the same operations so that its final state variables are identical, replacing the operands of the DH function to produce equivalent values.

3) Second Message: Responder to Initiator: The responder sends this message, after processing the first message above from the initiator and applying the same operations to arrive at an identical state. $I_r$ is generated randomly ($\rho^4$) when this message is sent, and is used to tie subsequent replies to the session begun by this message, just as above. The responder sends this message, msg:

    type := 0x2 (1 byte) | reserved := $0^3$ (3 bytes)
    sender := $I_r$ (4 bytes) | receiver := $I_i$ (4 bytes)
    ephemeral (32 bytes)
    empty ($\hat{0}$ bytes)
    mac1 (16 bytes) | mac2 (16 bytes)

The fields mac1 and mac2 are explained further in section V-D4. The above remaining fields are calculated [22] as follows:

$$
\begin{aligned}
& (E_r^{priv}, E_r^{pub}) := \text{DH-Generate}() \\
& \text{if } Q \neq \epsilon: \\
& \qquad (C_r, K_r) := \text{Kdf}(C_r, E_r^{pub}) \\
& \text{msg.ephemeral} := E_r^{pub} \\
& H_r := \text{Hash}(H_r \,\|\, \text{msg.ephemeral}, 32) \\
& (C_r, K_r) := \text{Kdf}(C_r, \text{DH}(E_r^{priv}, E_i^{pub})) \\
& (C_r, K_r) := \text{Kdf}(C_r, \text{DH}(E_r^{priv}, S_i^{pub})) \\
& \text{msg.empty} := \text{Aead}(K_r, 0, \epsilon, H_r) \\
& H_r := \text{Hash}(H_r \,\|\, \text{msg.empty}, 32)
\end{aligned}
$$

When the initiator receives this message, it does the same operations so that its final state variables are identical, replacing the operands of the DH function to produce equivalent values. Note that this handshake response message is smaller than the handshake initiation message, preventing amplification attacks.

4) Cookie MACs: In sections V-D2 and V-D3, the two handshake messages have the msg.mac1 and msg.mac2 parameters. For a given handshake message, $\text{msg}_\alpha$ represents all bytes of msg prior to msg.mac1, and $\text{msg}_\beta$ represents all bytes of msg prior to msg.mac2. The latest cookie received $\widetilde{L_*}$ seconds ago is represented by $L_*$. The msg.mac1 and msg.mac2 fields are populated as follows:

$$
\begin{aligned}
& \text{msg.mac1} := \text{Mac}(Q, S_{m'}^{pub} \,\|\, \text{msg}_\alpha, 16) \\
& \text{if } L_m = \epsilon \text{ or } \widetilde{L_m} \geq 120: \\
& \qquad \text{msg.mac2} := 0^{16} \\
& \text{otherwise:} \\
& \qquad \text{msg.mac2} := \text{Mac}(L_m, \text{msg}_\beta, 16)
\end{aligned}
$$

5) Transport Data Key Derivation: After the above two messages have been exchanged, keys are calculated [22] by the initiator and responder for sending and receiving transport data messages (section V-D6):

$$
\begin{aligned}
& (T_i^{send} = T_r^{recv},\; T_i^{recv} = T_r^{send}) := \text{Kdf}(C_i = C_r, \epsilon) \\
& N_i^{send} = N_r^{recv} = N_i^{recv} = N_r^{send} := 0 \\
& E_i^{priv} = E_i^{pub} = E_r^{priv} = E_r^{pub} = C_i = C_r = K_i = K_r := \epsilon
\end{aligned}
$$

On the last line, most prior states of the handshake are zeroed from memory (described in section VII-D), but the value $H_i = H_r$ is not necessarily zeroed, as it could potentially be useful in future revisions of Noise [22].

6) Subsequent Messages: Transport Data Messages: The initiator and the responder exchange transport data messages for exchanging encrypted encapsulated packets. The inner plaintext packet that is encapsulated is represented as $P$, of length $\|P\|$. Both peers send this message, msg:

    type := 0x4 (1 byte) | reserved := $0^3$ (3 bytes)
    receiver := $I_{m'}$ (4 bytes)
    counter (8 bytes)
    packet ($\widehat{\|P\|}$ bytes)
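As a rough illustration of the transport data message layout above, the following Python sketch packs the 16 header bytes that precede msg.packet and builds the AEAD nonce from the counter, following the Aead definition in this section (32 bits of zeros followed by the 64-bit little-endian counter). The function names are hypothetical and chosen only for this example, and the encryption step itself is omitted.

```python
import struct

TYPE_TRANSPORT_DATA = 0x4   # value of the type field for transport data
TAG_LEN = 16                # Poly1305 authentication tag length


def hat(n: int) -> int:
    """The hat annotation from the text: n plus the 16-byte tag."""
    return n + TAG_LEN


def pack_transport_header(receiver_index: int, counter: int) -> bytes:
    """Pack type (1 byte), reserved zeros (3 bytes), receiver (4 bytes)
    and counter (8 bytes), all little-endian, into 16 bytes."""
    # "<" disables native alignment; "3x" emits three zero pad bytes.
    return struct.pack("<B3xIQ", TYPE_TRANSPORT_DATA, receiver_index, counter)


def aead_nonce(counter: int) -> bytes:
    """Build the 96-bit ChaCha20Poly1305 nonce: 32 zero bits followed by
    the 64-bit little-endian counter."""
    return b"\x00" * 4 + struct.pack("<Q", counter)


if __name__ == "__main__":
    header = pack_transport_header(receiver_index=0x11223344, counter=7)
    print(header.hex(), aead_nonce(7).hex(), hat(0))
```

Because the header is exactly 16 bytes, the packet field starts on a 16-byte boundary, and an empty plaintext still yields a 16-byte packet field (`hat(0)`).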
The remaining fields are populated as follows:

$$
\begin{aligned}
& P := P \,\|\, 0^{16 \cdot \lceil \|P\| / 16 \rceil - \|P\|} \\
& \text{msg.counter} := N_m^{send} \\
& \text{msg.packet} := \text{Aead}(T_m^{send}, N_m^{send}, P, \epsilon) \\
& N_m^{send} := N_m^{send} + 1
\end{aligned}
$$

The recipient of this message uses $T_{m'}^{recv}$ to read the message. Note that no length value is stored in this header, since the authentication tag serves to determine whether the message is legitimate, and the inner IP packet already has a length field in its header. The encapsulated packet itself is zero padded (without modifying the IP packet’s length field) before encryption to complicate traffic analysis, though that zero padding should never increase the UDP packet size beyond the maximum transmission unit length. Prior to msg.packet, there are exactly 16 bytes of header fields, which means that decryption may be done in-place and still achieve natural memory address alignment, allowing for easier implementation in hardware and a significant performance improvement on many common CPU architectures. This is in part the result of the 3 bytes of reserved zero fields, making the first four bytes readable together as a little-endian integer.

The msg.counter value is a nonce for the ChaCha20Poly1305 AEAD and is kept track of by the recipient using $N_{m'}^{recv}$. It also functions to avoid replay attacks. Since WireGuard operates over UDP, messages can sometimes arrive out of order. For that reason we use a sliding window to keep track of received message counters, in which we keep track of the greatest counter received, as well as a window of prior messages received, using the algorithm detailed by appendix C of RFC2401 [13] or by RFC6479 [25], which uses a larger bitmap while avoiding bitshifts, enabling more extreme packet reordering that may occur on multi-core systems.

7) Under Load: Cookie Reply Message: As mentioned in section V-C, when a message with a valid msg.mac1 is received, but msg.mac2 is invalid or expired, and the peer is under load, the peer may send a cookie reply message. $I_{m'}$ is determined from the msg.sender field of the message that prompted this cookie reply message, msg:

    type := 0x3 (1 byte) | reserved := $0^3$ (3 bytes)
    receiver := $I_{m'}$ (4 bytes)
    salt := $\rho^{32}$ (32 bytes)
    cookie ($\hat{16}$ bytes)

The secret variable, $R_m$, changes every two minutes to a random value, $A_{m'}$ represents the subscript’s external IP address, and $M$ represents the msg.mac1 value of the message to which this is in reply. The remaining encrypted cookie reply field is populated as such:

$$
\begin{aligned}
& \tau := \text{Mac}(R_m, A_{m'}, 16) \\
& \text{msg.cookie} := \text{Aead}(\text{Mac}(Q, S_m^{pub} \,\|\, \text{msg.salt}, 32), 0, \tau, M)
\end{aligned}
$$

By using $M$ as the additional authenticated data field, we bind the cookie reply to the relevant message, in order to prevent peers from being attacked by sending them fraudulent cookie reply messages. A random salt is added to the message to avoid key reuse. Also note that this message is smaller than either the handshake initiation message or the handshake response message, avoiding amplification attacks.

Upon receiving this message, if it is valid, the only thing the recipient of this message should do is store the cookie along with the time at which it was received. The mechanism described in section VI will be used for retransmitting handshake messages with these received cookies; this cookie reply message should not, by itself, cause a retransmission.

VI. Timers & Stateless UX

From the perspective of the user, WireGuard appears stateless. The private key of the interface is configured, followed by the public key of each of its peers, and then a user may simply send packets normally. The maintenance of session states, perfect forward secrecy, handshakes, and so forth is completely behind the scenes, invisible to the user. While similar automatic mechanisms historically have been buggy and disastrous, WireGuard employs an extremely simple timer state machine, in which each state and transitions to all adjacent states are clearly defined, resulting in total reliability. There are no anomalous states or sequences of states; everything is accounted for. It has been tested with success on 10 gigabit intranets as well as on low-bandwidth high-latency transatlantic commercial airline Internet. The simplicity of the timer state machine is owed to the fact that only a 1-RTT handshake is required, that the initiator and responder can transparently switch roles, and that WireGuard breaks down traditional layering, as discussed in section I, and can therefore use intra-layer characteristics.

A. Preliminaries

The following constants are used for the timer state system:

    Symbol                   Value
    Rekey-After-Messages     $2^{64} - 2^{16} - 1$ messages
    Reject-After-Messages    $2^{64} - 2^{4} - 1$ messages
    Rekey-After-Time         120 seconds
    Reject-After-Time        180 seconds
    Rekey-Attempt-Time       90 seconds
    Rekey-Timeout            5 seconds
    Keepalive-Timeout        10 seconds

Under no circumstances will WireGuard send an initiation message more than once every Rekey-Timeout. A secure session is created after the successful receipt of a handshake response message (section V-D3), and the age of a secure session is measured from the time of processing this message and the immediately following derivation of transport data keys (section V-D5).

B. Transport Message Limits

After a secure session has first been established, WireGuard will try to create a new session, by sending a handshake initiation message (section V-D2), after it has sent Rekey-After-Messages transport data messages or after the current secure session is Rekey-After-Time seconds old, whichever comes first. If this secure session was created by a responder rather than an
initiator, the reinitiation is prompted instead after (Rekey-After-Time + Rekey-Timeout × 2) seconds, in order to prevent the “thundering herd” problem in which both parties repeatedly try to initiate new sessions at the same time. After Reject-After-Messages transport data messages or after the current secure session is Reject-After-Time seconds old, whichever comes first, WireGuard will refuse to send any more transport data messages using the current secure session, until a new secure session is created through the 1-RTT handshake.

C. Key Rotation

New secure sessions are created approximately every Rekey-After-Time seconds (which is far more likely to occur before Rekey-After-Messages transport data messages have been sent), due to the transport message limits described above in section VI-B. This means that the secure session is constantly rotating, creating a new ephemeral symmetric session key each time, for perfect forward secrecy. But, keep in mind that after an initiator receives a handshake response message (section V-D3), the responder cannot send transport data messages (section V-D6) until it has received the first transport data message from the initiator. And, further, transport data messages encrypted using the previous secure session might be in transit after a new secure session has been created. For these reasons, WireGuard keeps in memory the current secure session and the previous secure session. Every time a new secure session is created, the existing one rotates into the “previous” slot, and the new one occupies the “current” slot. The “previous-previous” one is then discarded and its memory is zeroed (see section VII-D for a discussion of memory zeroing). If no new secure session is created after (Reject-After-Time × 3) seconds, both the current secure session and the previous secure session are discarded and zeroed out.

D. Handshake Initiation Retransmission

The first time the user sends a packet over a WireGuard interface, the packet cannot immediately be sent, because no current session exists. So, after queuing the packet, WireGuard sends a handshake initiation message (section V-D2).

After sending a handshake initiation message, because of a first-packet condition, or because of the limit conditions of section VI-B, if a handshake response message (section V-D3) is not subsequently received after Rekey-Timeout seconds, a new handshake initiation message is constructed (with new random ephemeral keys) and sent. This reinitiation is attempted for Rekey-Attempt-Time seconds before giving up. Critically important future work includes adjusting the Rekey-Timeout value to use exponential backoff, instead of the current fixed value.

E. Passive Keepalive

Most importantly, and most elegantly, WireGuard implements a passive keepalive mechanism to ensure that sessions stay active and allow both peers to passively determine if a connection has failed or been disconnected. If a peer has received a validly-authenticated transport data message (section V-D6), but does not have any packets itself to send back for Keepalive-Timeout seconds, it sends a keepalive message. A keepalive message is simply a transport data message with a zero-length encapsulated encrypted inner-packet. Since all other transport data messages contain IP packets, which have a minimum length of min(‖IPv4 header‖, ‖IPv6 header‖), this keepalive message can be easily distinguished by simple virtue of having a zero length encapsulated packet. (Note that the msg.packet field of the message will in fact be of length 16, the length of the Poly1305 [5] authentication tag, since a zero length plaintext still needs to be authenticated, even if there is nothing to encrypt.)

This passive keepalive is only sent when a peer has nothing to send, and is only sent in circumstances when another peer is sending authenticated transport data messages to it. This means that when neither side is exchanging transport data messages, the network link will be silent.

Because every transport data message sent warrants a reply of some kind (either an organic one generated by the nature of the encapsulated packets or this keepalive message), we can determine if the secure session is broken or disconnected if a transport data message has not been received for (Keepalive-Timeout + Rekey-Timeout) seconds, in which case a handshake initiation message is sent to the unresponsive peer, once every Rekey-Timeout seconds, as in section VI-D, until a secure session is recreated successfully or until Rekey-Attempt-Time seconds have passed.

F. Interaction with Cookie Reply System

As noted in sections V-C and V-D7, when a peer is under load, a handshake initiation message or a handshake response message may be discarded and a cookie reply message sent. On receipt of the cookie reply message, which will enable the peer to send a new initiation or response message with a valid msg.mac2 that will not be discarded, the peer is not supposed to immediately resend the now valid message. Instead, it should simply store the decrypted cookie value from the cookie reply message, and wait for the expiration of the Rekey-Timeout timer for retrying a handshake initiation message. This prevents potential bandwidth generation abuse, and helps to alleviate the load conditions that are requiring the cookie reply messages in the first place.

VII. Linux Kernel Implementation

The implementation of WireGuard inside the Linux kernel has a few goals. First, it should be short and simple, so that auditing and reviewing the code for security vulnerabilities is not only easy, but also enjoyable; WireGuard is implemented in less than 4,000 lines of code (excluding cryptographic primitives). Second, it must be extremely fast, so that it is competitive with IPsec on performance. Third, it must avoid allocations and other resource-intensive operations in response to incoming packets. Fourth, it must integrate as natively and smoothly as possible with existing kernel infrastructure and userland expectations, tools, and APIs. And fifth, it must be buildable as an external kernel module without requiring any changes to the core Linux kernel. WireGuard is not merely an academic project with never-released laboratory code, but rather a practical project aiming for production-ready implementations.
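Before turning to the implementation details, the timer rules of section VI can be condensed into a handful of predicates. The Python sketch below is only an illustration under stated assumptions: the function names and the way state is passed in (message counts and elapsed seconds as plain numbers) are hypothetical, and whether each limit is inclusive is not specified in the text, so the comparisons here assume inclusive bounds.

```python
# Constants from the table in section VI-A (times in seconds).
REKEY_AFTER_MESSAGES = 2**64 - 2**16 - 1
REJECT_AFTER_MESSAGES = 2**64 - 2**4 - 1
REKEY_AFTER_TIME = 120
REJECT_AFTER_TIME = 180
REKEY_ATTEMPT_TIME = 90
REKEY_TIMEOUT = 5
KEEPALIVE_TIMEOUT = 10


def should_rekey(messages_sent: int, session_age: float, is_initiator: bool) -> bool:
    """Section VI-B: start a new handshake after enough messages or time.
    A session created as responder waits 2 * Rekey-Timeout longer, to
    avoid both sides initiating at the same moment."""
    time_limit = REKEY_AFTER_TIME
    if not is_initiator:
        time_limit += 2 * REKEY_TIMEOUT
    return messages_sent >= REKEY_AFTER_MESSAGES or session_age >= time_limit


def must_reject(messages_sent: int, session_age: float) -> bool:
    """Section VI-B: stop sending on this session until a new one exists."""
    return messages_sent >= REJECT_AFTER_MESSAGES or session_age >= REJECT_AFTER_TIME


def keepalive_due(seconds_since_unanswered_receive: float) -> bool:
    """Section VI-E: data was received and nothing has been sent back."""
    return seconds_since_unanswered_receive >= KEEPALIVE_TIMEOUT


def peer_unresponsive(seconds_since_unanswered_send: float) -> bool:
    """Section VI-E: data was sent and nothing at all has come back."""
    return seconds_since_unanswered_send >= KEEPALIVE_TIMEOUT + REKEY_TIMEOUT


def may_retry_handshake(seconds_since_first_attempt: float) -> bool:
    """Section VI-D: retries occur every Rekey-Timeout seconds until
    Rekey-Attempt-Time seconds have passed."""
    return seconds_since_first_attempt < REKEY_ATTEMPT_TIME


if __name__ == "__main__":
    print(should_rekey(0, 125, is_initiator=True))   # initiator limit is 120 s
    print(should_rekey(0, 125, is_initiator=False))  # responder limit is 130 s
```

Keeping these as pure functions of counters and elapsed time is one possible design; it makes each rule testable in isolation, independent of how an implementation schedules its timers.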
A. Queuing System

The WireGuard device driver has flags indicating to the kernel that it supports generic segmentation offload (GSO), scatter gather I/O, and hardware checksum offloading, which in sum means that the kernel will hand “super packets” to WireGuard, packets that are well over the MTU size, having been previously queued up by the upper layers, such as TCP or the TCP and UDP corking systems. This allows WireGuard to operate on batch groups of outgoing packets. After splitting packets into ≤MTU-sized chunks, WireGuard attempts to encrypt, encapsulate, and send over UDP all of these at once, caching routing information, so that it only has to be computed once per cluster of packets. This has the very important effect of also reducing cache misses: by waiting until all individual packets of a super packet have been encrypted and encapsulated to pass them off to the network layer, the very complicated and CPU-intensive network layer keeps instructions, intermediate variables, and branch predictions in CPU cache, giving in many cases a 35% increase in sending performance.

As well, as mentioned in section VI-D, sometimes outgoing packets must be queued until a handshake completes successfully. When packets are finally able to be sent, the entire queue of existing queued packets is treated as a single super packet, in order to benefit from the same optimizations as above.

Finally, in order to avoid needless allocations, all packet transformations are done in-place, avoiding the need for copying. This applies not only to the encryption and decryption of data, which occur in-place, but also to certain user space data and files sent using sendfile(2); these are processed using this zero-copy super packet queuing system.

Future work on the queuing system could potentially involve integrating WireGuard with the FlowQueue [11]-CoDel [20] scheduling algorithm.

B. Softirq & Parallelism

The xfrm layer, in contrast to WireGuard, has the advantage that it does not need to do cryptography in softirq, which opens it up to a bit more flexibility. However, there is precedent for doing cryptographic processing in softirq on the interface level: the mac80211 subsystem used for wireless WPA encryption. WireGuard, being a virtual interface that does encryption, is not architecturally so much different from wireless interfaces doing encryption at the same layer. While in practice it does work very well, it is not parallel. For this reason, the kernel’s padata system is used for parallelizing into concurrent workers encryption and decryption operations for utilization of all CPUs and CPU cores. As well, packet checksums can be computed in parallel with this method. When sending packets, however, they must be sent in order, which means each packet cannot simply be sent immediately after it is encrypted. Fortunately, the padata API divides operations up into a parallel step, followed by an in-order serial step. This is also helpful for parallel decryption, in which the message counter must be checked and incremented in the order that packets arrive, lest they be rejected unnecessarily. In order to reduce latency, if there is only a single packet in a super packet and its length is less than 256 bytes, or if there is only one CPU core online, the packet is processed in softirq.

Likewise, handshake initiation and response messages and cookie reply messages are processed on a separate parallel low-priority worker thread. As mentioned in section V-C, ECDH operations are CPU intensive, so it is important that a flood of handshake work does not monopolize the CPU. Low priority background workqueues are employed for this asynchronous handshake message handling.

C. RTNL-based Virtual Interface & Containerization

In order to integrate with the existing ip(8) utilities and the netlink-based Linux user space, the kernel’s RTNL layer is used for registering a virtual interface, known inside the kernel as a “link”. This easily gives access to the kernel APIs accessed by ip-link(8) and ip-set(8). For configuring the interface private key and the public keys and endpoints of peers, initially the RTM_SETLINK RTNL message was used, but this proved to be too limited. It proved to be much cleaner to simply implement an ioctl(2)-based API, passing a series of structures back and forth, through two different functions: WG_GET_DEVICE and WG_SET_DEVICE. At the moment, a separate user space tool, wg(8), is used for this purpose, but future plans involve integrating this functionality directly into ip(8).

The RTNL subsystem allows for moving the WireGuard virtual interface between network namespaces. This enables the sending and receiving sockets (for the outer UDP packets) to be created in one namespace, while the interface itself remains in another namespace. For example, a docker(1) or rkt(1) container guest could have as its sole network interface a WireGuard interface, with the actual outer encrypted packets being sent out of the real network interface on the host, creating end-to-end authenticated encryption in and out of the container.

D. Data Structures and Primitives

While the Linux kernel already includes two elaborate routing table implementations (an LC-trie [21] for IPv4 and a radix trie for IPv6), they are intimately tied to the FIB routing layer, and not at all reusable for other uses. For this reason, a very minimal routing table was developed. The authors have had success implementing the cryptokey routing table as an allotment routing table [10], an LC-trie [21], and a standard radix trie, with each one giving adequate but slightly different performance characteristics. Ultimately the simplicity of the venerable radix trie was preferred, having good performance characteristics and the ability to implement it with lock-less lookups, using the RCU system [18]. Every time an outgoing packet goes through WireGuard, the destination peer is looked up using this table, and every time an incoming packet reaches WireGuard, its validity is checked by consulting this table, so performance is in fact important here.

For all handshake initiation messages (section V-D2), the responder must look up the decrypted static public key of the initiator. For this, WireGuard employs a hash table using the extremely fast SipHash2-4 [1] MAC function with a secret, so that upper layers, which may provide the WireGuard interface with public keys in a more complicated key distribution scheme, cannot mount a hash table collision denial of service attack.

While the Linux kernel’s crypto API has a large collection of primitives and is meant to be reused in several different systems, the API introduces needless complexity and allocations. Several
revisions of WireGuard used the crypto API with different integration techniques, but ultimately, using raw primitives with direct, non-abstracted APIs proved to be far cleaner and less resource intensive. Both stack and heap pressure were reduced by using crypto primitives directly, rather than going through the kernel’s crypto API. The crypto API also makes it exceedingly difficult to avoid allocations when using multiple keys in the multifaceted ways required by Noise. As of writing, WireGuard ships with optimized implementations of ChaCha20Poly1305 for the various Intel Architecture vector extensions, with implementations for ARM/NEON and MIPS on their way. The fastest implementation supported by the hardware is selected at runtime, with the floating-point unit being used opportunistically. All ephemeral keys and intermediate results of cryptographic operations are zeroed out of memory after use, in order to maintain perfect forward secrecy and protect against various potential leaks. The compiler must be specially informed about this explicit zeroing so that the “dead-store” is not optimized out, and for this the kernel provides the memzero_explicit function.

In contrast to crypto primitives, the existing kernel implementations of token bucket hash-based rate limiting, for rate limiting handshake initiation and response messages when under load after cookie IP attribution has occurred, have been very minimal and easy to reuse in WireGuard. WireGuard uses the Netfilter hashlimit matcher for this.

E. FIB Considerations

In order to avoid routing loops, one proposed change for the Linux kernel (currently posted by the authors to the Linux kernel mailing list [8]) is to allow for FIB route lookups that exclude an interface. This way, the kernel’s routing table could have 0.0.0.0/1 and 128.0.0.0/1, for a combined coverage of 0.0.0.0/0, while being more specific, sent to the wg0 interface. Then, the individual endpoints of WireGuard peers could be routed using the device that a FIB lookup would return if wg0 did not exist, namely one through the actual 0.0.0.0/0 route. Or more generally, when looking up the correct interface for routing packets to particular peer endpoints, a route for an interface would be returned that is guaranteed not to be wg0. This is preferable to the current situation of needing to add explicit routes for WireGuard peer endpoints to the kernel routing table when the WireGuard-bound route has precedence. This work is ongoing.

Another approach, alluded to above, is to use network namespaces to entirely isolate the WireGuard interface and routing table from the physical interfaces and routing tables. One namespace would contain the WireGuard interface and a routing table with a default route to send all packets over the WireGuard interface. The other namespace would contain the various physical interfaces (Ethernet devices, wireless radios, and so forth) along with its usual routing table. The incoming and outgoing UDP socket for the WireGuard interface would live in the second physical interface namespace, not the first WireGuard interface namespace. This way, packets sent in the WireGuard interface namespace are encrypted there, and then sent using a socket that lives in the physical interface namespace. This prevents all routing loops and also ensures total isolation. Processes living in the WireGuard interface namespace would have as their only networking means the WireGuard interface, preventing any potential clear-text packet leakage.

F. Potential Userspace Implementations

In order for WireGuard to have widespread adoption, more implementations than our current one for the Linux kernel must be written. As a next step, the authors plan to implement a cross-platform low-speed user space TUN-based implementation in a safe yet high-speed language like Rust, Go, or Haskell.

VIII. Performance

WireGuard was benchmarked alongside IPsec in two modes and OpenVPN, using iperf3(1) between an Intel Core i7-3820QM and an Intel Core i7-5200U with Intel 82579LM and Intel I218LM gigabit Ethernet cards respectively, with results averaged over thirty minutes. The results were quite promising:

    Protocol     Configuration
    WireGuard    256-bit ChaCha20, 128-bit Poly1305
    IPsec #1     256-bit ChaCha20, 128-bit Poly1305
    IPsec #2     256-bit AES, 128-bit GCM
    OpenVPN      256-bit AES, HMAC-SHA2-256, UDP mode

    Protocol     Throughput (megabits per second)
    OpenVPN      258
    IPsec #2     881
    IPsec #1     825
    WireGuard    1,011

    Protocol     Ping time (milliseconds)
    OpenVPN      1.541
    IPsec #2     0.508
    IPsec #1     0.501
    WireGuard    0.403

For both metrics, WireGuard outperformed OpenVPN and both modes of IPsec. The CPU was at 100% utilization during the throughput tests of OpenVPN and IPsec, but was not completely utilized for the test of WireGuard, suggesting that WireGuard was able to completely saturate the gigabit Ethernet link.

While the AES-NI-accelerated AES-GCM IPsec cipher suite appears to outperform the AVX2-accelerated ChaCha20Poly1305 IPsec cipher suite, as future chips increase the width of vector instructions (such as the upcoming AVX512), it is expected that over time ChaCha20Poly1305
will outperform AES-NI [3]. ChaCha20Poly1305 is especially well suited to be implemented in software, free from side-channel attacks, with great efficiency, in contrast to AES, so for embedded platforms with no dedicated AES instructions, ChaCha20Poly1305 will also be most performant.

Furthermore, WireGuard already outperforms both IPsec cipher suites, due to the simplicity of implementation and lack of overhead. The enormous gap between OpenVPN and WireGuard is to be expected, both in terms of ping time and throughput, because OpenVPN is a user space application, which means there is added latency and overhead of the scheduler and copying packets between user space and kernel space several times.

IX. Conclusion

In less than 4,000 lines, WireGuard demonstrates that it is possible to have secure network tunnels that are simply implemented, extremely performant, make use of state-of-the-art cryptography, and remain easy to administer. The simplicity allows it to be very easily independently verified and reimplemented on a wide diversity of platforms. The cryptographic constructions and primitives utilized ensure high speed across a wide diversity of devices, from data center servers to cellphones, as well as dependable security properties well into the future. The ease of deployment will also eliminate many of the common and disastrous pitfalls currently seen with many IPsec deployments. Around the time of its introduction, Ferguson and Schneier [9] wrote: “IPsec was a great disappointment to us. Given the quality of the people that [sic] worked on it and the time that was spent on it, we expected a much better result. [. . .] Our main criticism of IPsec is its complexity.” WireGuard, in contrast, focuses on simplicity and usability, while still delivering a scalable and highly secure system. By remaining silent to unauthenticated packets and by not making any allocations and generally keeping resource utilization to a minimum, it can be deployed on the outer edges of networks, as a trustworthy and reliable access point, which does not readily reveal itself to attackers nor provide a viable attack target. The cryptokey routing table paradigm is easy to learn and will promote safe network designs. The protocol is based on cryptographically sound and conservative principles, using well understood yet modern crypto primitives. WireGuard was designed from a practical perspective, meant to solve real world secure networking problems.

[3] D. J. Bernstein. CPUs are optimized for video games. [Online]. Available: https://moderncrypto.org/mail-archive/noise/2016/000699.html
[4] D. J. Bernstein. TAI64, TAI64N, and TAI64NA. [Online]. Available: https://cr.yp.to/libtai/tai64.html
[5] D. J. Bernstein, “The Poly1305-AES message-authentication code,” in Fast Software Encryption: 12th International Workshop, FSE 2005, Paris, France, February 21-23, 2005, Revised Selected Papers, ser. Lecture Notes in Computer Science, vol. 3557. Springer, 2005, pp. 32–49.
[6] D. J. Bernstein, “Curve25519: new Diffie-Hellman speed records,” in Public Key Cryptography – PKC 2006, ser. Lecture Notes in Computer Science, M. Yung, Y. Dodis, A. Kiayias, and T. Malkin, Eds., vol. 3958. Berlin, Heidelberg: Springer-Verlag Berlin Heidelberg, 2006, pp. 207–228.
[7] D. J. Bernstein, “ChaCha, a variant of Salsa20,” in SASC 2008, 2008.
[8] J. A. Donenfeld. Inverse of flowi{4,6}_oif: flowi{4,6}_not_oif. [Online]. Available: http://lists.openwall.net/netdev/2016/02/02/222
[9] N. Ferguson and B. Schneier, “A cryptographic evaluation of IPsec,” Counterpane Internet Security, Inc, Tech. Rep., 2000.
[10] Y. Hariguchi. (2002) Allotment routing table: A fast free multibit trie based routing table. [Online]. Available: https://github.com/hariguchi/art/blob/master/docs/art.pdf
[11] T. Hoeiland-Joergensen, P. McKenney, D. Taht, J. Gettys, and E. Dumazet, “The FlowQueue-CoDel packet scheduler and active queue management algorithm,” Internet Research Task Force, Internet Engineering Task Force, RFC, March 2016.
[12] C. Kaufman, P. Hoffman, Y. Nir, and P. Eronen, “Internet key exchange protocol version 2,” Internet Research Task Force, RFC Editor, RFC 5996, September 2010.
[13] S. Kent and R. Atkinson, “Security architecture for IP,” Internet Research Task Force, RFC Editor, RFC 2401, November 1998.
[14] H. Krawczyk, “SIGMA: The ‘sign-and-mac’ approach to authenticated Diffie-Hellman and its use in the IKE-protocols,” in Advances in Cryptology - CRYPTO 2003, 23rd Annual International Cryptology Conference, Santa Barbara, California, USA, August 17-21, 2003, Proceedings, ser. Lecture Notes in Computer Science, vol. 2729. Springer, 2003, pp. 400–425.
[15] H. Krawczyk, Advances in Cryptology – CRYPTO 2010: 30th Annual Cryptology Conference, Santa Barbara, CA, USA, August 15-19, 2010. Proceedings. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010, ch. Cryptographic Extraction and Key Derivation: The HKDF Scheme, pp. 631–648.
[16] A. Langley and Y. Nir, “ChaCha20 and Poly1305 for IETF protocols,” Internet Research Task Force, RFC Editor, RFC 7539, May 2015.
[17] K. Lauter and A. Mityagin, Public Key Cryptography - PKC 2006: 9th International Conference on Theory and Practice in Public-Key Cryptography, New York, NY, USA, April 24-26, 2006. Proceedings. Berlin, Heidelberg: Springer Berlin Heidelberg, 2006, ch. Security Analysis of KEA Authenticated Key Exchange Protocol, pp. 378–394.
[18] P. E. McKenney, D. Sarma, A. Arcangeli, A. Kleen, O. Krieger, and R. Russell, “Read-copy update,” in Ottawa Linux Symposium, Jun 2002, pp. 338–367.
[19] R. Moskowitz, T. Heer, P. Jokela, and T. Henderson, “Host identity protocol version 2,” Internet Research Task Force, RFC Editor, RFC 7401, April 2015.
[20] K. Nichols and V. Jacobson, “Controlling queue delay,” Commun. ACM,
Acknowledgments vol. 55, no. 7, pp. 42–50, July 2012.
WireGuard was made possible with the great advice and [21] S. Nilsson and G. Karlsson, “Ip-address lookup using lc-tries,” IEEE
Journal on Selected Areas in Communications, vol. 17, no. 6, pp. 1083–
guidance of many, in particular: Trevor Perrin, Jean-Philippe 1092, Jun 1999.
Aumasson, Steven M. Bellovin, and Greg Kroah-Hartman.
[22] T. Perrin. (2016) The noise protocol framework. [Online]. Available:
http://noiseprotocol.org/noise.pdf
References [23] E. Rescorla and N. Modadugu, “Datagram transport layer security version
1.2,” Internet Research Task Force, RFC Editor, RFC 6347, January
[1] J.-P. Aumasson and D. J. Bernstein, Progress in Cryptology - IN- 2012.
DOCRYPT 2012: 13th International Conference on Cryptology in India,
Kolkata, India, December 9-12, 2012. Proceedings. Berlin, Heidelberg: [24] K. Winstein and H. Balakrishnan, “Mosh: An interactive remote shell
Springer Berlin Heidelberg, 2012, ch. SipHash: A Fast Short-Input PRF, for mobile clients,” in USENIX Annual Technical Conference, Boston,
pp. 489–508. MA, June 2012.
[2] J.-P. Aumasson, S. Neves, Z. Wilcox-O’Hearn, and C. Winnerlein, [25] X. Zhang and T. Tsou, “Ipsec anti-replay algorithm without bit shifting,”
“Blake2: Simpler, smaller, fast as md5,” in Proceedings of the 11th Internet Research Task Force, RFC Editor, RFC 6479, January 2012.
International Conference on Applied Cryptography and Network Security,
ser. ACNS’13. Berlin, Heidelberg: Springer-Verlag, 2013, pp. 119–135.