Tor Protocol Specification 2
Roger Dingledine
Nick Mathewson
Table of Contents
0. Preliminaries
0.1. Notation and encoding
0.2. Security parameters
0.3. Ciphers
0.4. A bad hybrid encryption algorithm, for legacy purposes
1. System overview
1.1. Keys and names
2. Connections
2.1. Picking TLS ciphersuites
2.2. TLS security considerations
3. Cell Packet format
4. Negotiating and initializing connections
4.1. Negotiating versions with VERSIONS cells
4.2. CERTS cells
4.3. AUTH_CHALLENGE cells
4.4. AUTHENTICATE cells
4.4.1. Link authentication type 1: RSA-SHA256-TLSSecret
4.4.2. Link authentication type 3: Ed25519-SHA256-RFC5705
4.5. NETINFO cells
5. Circuit management
5.1. CREATE and CREATED cells
5.1.1. Choosing circuit IDs in create cells
5.1.2. EXTEND and EXTENDED cells
5.1.3. The "TAP" handshake
5.1.4. The "ntor" handshake
5.1.5. CREATE_FAST/CREATED_FAST cells
5.2. Setting circuit keys
5.2.1. KDF-TOR
5.2.2. KDF-RFC5869
5.3. Creating circuits
5.3.1. Canonical connections
5.4. Tearing down circuits
5.5. Routing relay cells
5.5.1. Circuit ID Checks
5.5.2. Forward Direction
5.5.2.1. Routing from the Origin
5.5.2.2. Relaying Forward at Onion Routers
5.5.3. Backward Direction
5.5.3.1. Relaying Backward at Onion Routers
5.5.4. Routing to the Origin
5.6. Handling relay_early cells
6. Application connections and stream management
6.1. Relay cells
6.2. Opening streams and transferring data
6.2.1. Opening a directory stream
6.3. Closing streams
6.4. Remote hostname lookup
7. Flow control
7.1. Link throttling
7.2. Link padding
7.3. Circuit-level flow control
7.3.1. SENDME Cell Format
7.4. Stream-level flow control
8. Handling resource exhaustion
8.1. Memory exhaustion
9. Subprotocol versioning
9.1. "Link"
9.2. "LinkAuth"
9.3. "Relay"
9.4. "HSIntro"
9.5. "HSRend"
9.6. "HSDir"
9.7. "DirCache"
9.8. "Desc"
9.9. "Microdesc"
9.10. "Cons"
9.11. "Padding"
9.12. "FlowCtrl"
0. Preliminaries
PK -- a public key.
SK -- a private key.
K -- a key for a symmetric cipher.
0.3. Ciphers
We also use the Curve25519 group and the Ed25519 signature format in
several places.
"FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD129024E08"
"8A67CC74020BBEA63B139B22514A08798E3404DDEF9519B3CD3A431B"
"302B0A6DF25F14374FE1356D6D51C245E485B576625E7EC6F44C42E9"
"A637ED6B0BFF5CB6F406B7EDEE386BFB5A899FA5AE9F24117C4B1FE6"
"49286651ECE65381FFFFFFFFFFFFFFFF"
KEY_LEN=16.
DH_LEN=128; DH_SEC_LEN=40.
PK_ENC_LEN=128; PK_PAD_LEN=42.
HASH_LEN=20.
1. System overview
The RSA identity key and Ed25519 master identity key together identify a
router uniquely. Once a router has used an Ed25519 master identity key
together with a given RSA identity key, neither of those keys may ever be
used with a different key.
2. Connections
There are three ways to perform TLS handshakes with a Tor server. In
the first way, "certificates-up-front", both the initiator and
responder send a two-certificate chain as part of their initial
handshake. (This is supported in all Tor versions.) In the second
way, "renegotiation", the responder provides a single certificate,
and the initiator immediately performs a TLS renegotiation. (This is
supported in Tor 0.2.0.21 and later.) And in the third way,
"in-protocol", the initial TLS negotiation completes, and the
parties bootstrap themselves to mutual authentication via use of the
Tor protocol without further TLS handshaking. (This is supported in
0.2.3.6-alpha and later.)
TLS_DHE_RSA_WITH_AES_256_CBC_SHA
TLS_DHE_RSA_WITH_AES_128_CBC_SHA
SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA
The initiator then sends a VERSIONS cell to the responder, which then
replies with a VERSIONS cell; they have then negotiated a Tor
protocol version. Assuming that the version they negotiate is 3 or higher
(the only ones specified for use with this handshake right now), the
responder sends a CERTS cell, an AUTH_CHALLENGE cell, and a NETINFO
cell to the initiator, which may send either CERTS, AUTHENTICATE,
NETINFO if it wants to authenticate, or just NETINFO if it does not.
TLS connections are not permanent. Either side MAY close a connection
if there are no circuits running over it and an amount of time
(KeepalivePeriod, defaults to 5 minutes) has passed since the last time
any traffic was transmitted over the TLS connection. Clients SHOULD
also hold a TLS connection with no circuits open, if it is likely that a
circuit will be built soon using that connection.
TLS1_ECDHE_ECDSA_WITH_AES_256_CBC_SHA
TLS1_ECDHE_RSA_WITH_AES_256_CBC_SHA
TLS1_DHE_RSA_WITH_AES_256_SHA
TLS1_DHE_DSS_WITH_AES_256_SHA
TLS1_ECDH_RSA_WITH_AES_256_CBC_SHA
TLS1_ECDH_ECDSA_WITH_AES_256_CBC_SHA
TLS1_RSA_WITH_AES_256_SHA
TLS1_ECDHE_ECDSA_WITH_RC4_128_SHA
TLS1_ECDHE_ECDSA_WITH_AES_128_CBC_SHA
TLS1_ECDHE_RSA_WITH_RC4_128_SHA
TLS1_ECDHE_RSA_WITH_AES_128_CBC_SHA
TLS1_DHE_RSA_WITH_AES_128_SHA
TLS1_DHE_DSS_WITH_AES_128_SHA
TLS1_ECDH_RSA_WITH_RC4_128_SHA
TLS1_ECDH_RSA_WITH_AES_128_CBC_SHA
TLS1_ECDH_ECDSA_WITH_RC4_128_SHA
TLS1_ECDH_ECDSA_WITH_AES_128_CBC_SHA
SSL3_RSA_RC4_128_MD5
SSL3_RSA_RC4_128_SHA
TLS1_RSA_WITH_AES_128_SHA
TLS1_ECDHE_ECDSA_WITH_DES_192_CBC3_SHA
TLS1_ECDHE_RSA_WITH_DES_192_CBC3_SHA
SSL3_EDH_RSA_DES_192_CBC3_SHA
SSL3_EDH_DSS_DES_192_CBC3_SHA
TLS1_ECDH_RSA_WITH_DES_192_CBC3_SHA
TLS1_ECDH_ECDSA_WITH_DES_192_CBC3_SHA
SSL3_RSA_FIPS_WITH_3DES_EDE_CBC_SHA
SSL3_RSA_DES_192_CBC3_SHA
[*] The "extended renegotiation is supported" ciphersuite, 0x00ff, is
not counted when checking the list of ciphersuites.
If the client sends the Fixed Ciphersuite List, the responder MUST NOT
select any ciphersuite besides TLS_DHE_RSA_WITH_AES_256_CBC_SHA,
TLS_DHE_RSA_WITH_AES_128_CBC_SHA, and SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA:
such ciphers might not actually be supported by the client.
If the client sends a v2+ ClientHello with a list of ciphers other than
the Fixed Ciphersuite List, the responder can trust that the client
supports every cipher advertised in that list, so long as that ciphersuite
is also supported by OpenSSL 1.0.1.
Responders MUST NOT select any TLS ciphersuite that lacks ephemeral keys,
or whose symmetric keys are less than KEY_LEN bits, or whose digests are
less than HASH_LEN bits. Responders SHOULD NOT select any SSLv3
ciphersuite other than the DHE+3DES suites listed above.
VPADDING/PADDING:
Payload contains padding bytes.
CREATE/CREATE2: Payload contains the handshake challenge.
CREATED/CREATED2: Payload contains the handshake response.
RELAY/RELAY_EARLY: Payload contains the relay header and relay body.
DESTROY: Payload contains a reason for closing the circuit.
(see 5.4)
Upon receiving any other value for the command field, an OR must
drop the cell. Since more cell types may be added in the future, ORs
should generally not warn when encountering unrecognized commands.
RELAY cells are used to send commands and data along a circuit; see
section 6 below.
There are multiple instances of the Tor link connection protocol. Any
connection negotiated using the "certificates up front" handshake (see
section 2 above) is "version 1". In any connection where both parties
have behaved as in the "renegotiation" handshake, the link protocol
version must be 2. In any connection where both parties have behaved
as in the "in-protocol" handshake, the link protocol must be 3 or
higher.
To determine the version, in any connection where the "renegotiation"
or "in-protocol" handshake was used (that is, where the responder
sent only one certificate at first and where the initiator did not
send any certificates in the first negotiation), both parties MUST
send a VERSIONS cell. In "renegotiation", they send a VERSIONS cell
right after the renegotiation is finished, before any other cells are
sent. In "in-protocol", the initiator sends a VERSIONS cell
immediately after the initial TLS handshake, and the responder
replies immediately with a VERSIONS cell. (As an exception to this rule,
if both sides support the "in-protocol" handshake, either side may send
VPADDING cells at any time.)
Any VERSIONS cells sent after the first VERSIONS cell MUST be ignored.
(To be interpreted correctly, later VERSIONS cells MUST have a CIRCID_LEN
matching the version negotiated with the first VERSIONS cell.)
Since the version 1 link protocol does not use the "renegotiation"
handshake, implementations MUST NOT list version 1 in their VERSIONS
cell. When the "renegotiation" handshake is used, implementations
MUST list only the version 2. When the "in-protocol" handshake is
used, implementations MUST NOT list any version before 3, and SHOULD
list at least version 3.
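The listing rules above can be sketched as a small helper. This is an
illustration only, not the reference implementation: the function names
and handshake labels are hypothetical, and the "highest version listed
by both sides" selection rule is an assumption that this text does not
restate.

```python
def versions_to_advertise(handshake, supported):
    """Return the link versions to list in a VERSIONS cell.

    handshake is "renegotiation" or "in-protocol" (hypothetical labels).
    """
    if handshake == "renegotiation":
        # Only version 2 may be listed after a renegotiation handshake.
        return [2] if 2 in supported else []
    if handshake == "in-protocol":
        # Never list anything before version 3.
        return sorted(v for v in supported if v >= 3)
    raise ValueError("VERSIONS cells are not used with this handshake")


def negotiate(ours, theirs):
    """Pick the highest version both sides listed (assumed rule), or None."""
    common = set(ours) & set(theirs)
    return max(common) if common else None
```

Version 1 never appears in the output, since the "certificates up
front" handshake does not use VERSIONS cells at all.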
The CERTS cell describes the keys that a Tor instance is claiming
to have. It is a variable-length cell. Its payload format is:
A CERTS cell may have no more than one certificate of each CertType.
AuthType [2 octets]
AuthLen [2 octets]
Authentication [AuthLen octets]
Modified values and new fields below are marked with asterisks.
[04] IPv4.
[06] IPv6.
ALEN MUST be 4 when ATYPE is 0x04 (IPv4) and 16 when ATYPE is 0x06
(IPv6). If the ALEN value is wrong for the given ATYPE value, then
the provided address should be ignored.
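A receiver might apply the ALEN check roughly as follows. This is a
sketch under stated assumptions: the function name is hypothetical, and
treating an unrecognized ATYPE the same way as a bad length (ignoring
the address) is an assumption, not something this text requires.

```python
import ipaddress

EXPECTED_ALEN = {0x04: 4, 0x06: 16}  # ATYPE -> required ALEN


def parse_netinfo_address(atype, alen, value):
    """Return an ipaddress object, or None if the address should be ignored."""
    if EXPECTED_ALEN.get(atype) != alen or len(value) != alen:
        return None
    # ip_address() accepts a packed 4-byte or 16-byte value.
    return ipaddress.ip_address(value)
```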
5. Circuit management
There are two kinds of CREATE and CREATED cells: The older
"CREATE/CREATED" format, and the newer "CREATE2/CREATED2" format. The
newer format is extensible by design; the older one is not.
or
ntor -- 'ntorNTORntorNTOR'
In general, clients SHOULD use CREATE whenever they are using the TAP
handshake, and CREATE2 otherwise. Clients SHOULD NOT send the
second format of CREATE cells (the one with the handshake type tag)
to a server directly.
Link specifiers describe the next node in the circuit and how to
connect to it. Recognized specifiers are:
Address [4 bytes]
Port [2 bytes]
Onion skin [TAP_C_HANDSHAKE_LEN bytes]
Identity fingerprint [HASH_LEN bytes]
Extending ORs MUST check _all_ provided identity keys (if they
recognize the format), and MUST NOT extend the circuit if the
target OR did not prove its ownership of any such identity key.
If only one identity key is provided, but the extending OR knows
the other (from directory information), then the OR SHOULD also
enforce that key.
PK-encrypted:
Padding [PK_PAD_LEN bytes]
Symmetric key [KEY_LEN bytes]
First part of g^x [PK_ENC_LEN-PK_PAD_LEN-KEY_LEN bytes]
Symmetrically encrypted:
Second part of g^x [DH_LEN-(PK_ENC_LEN-PK_PAD_LEN-KEY_LEN) bytes]
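With the constants from section 0.3, the two length expressions above
work out to 70 and 58 bytes. The sketch below only does this arithmetic
and splits g^x accordingly; it performs no encryption, and the helper
name is hypothetical.

```python
KEY_LEN = 16
DH_LEN = 128
PK_ENC_LEN = 128
PK_PAD_LEN = 42

# Bytes of g^x that fit inside the PK-encrypted portion.
FIRST_PART_LEN = PK_ENC_LEN - PK_PAD_LEN - KEY_LEN   # 70
# Remaining bytes of g^x, carried in the symmetrically encrypted portion.
SECOND_PART_LEN = DH_LEN - FIRST_PART_LEN            # 58


def split_gx(gx):
    """Split a DH_LEN-byte g^x into its two parts."""
    if len(gx) != DH_LEN:
        raise ValueError("g^x must be DH_LEN bytes")
    return gx[:FIRST_PART_LEN], gx[FIRST_PART_LEN:]
```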
Once both parties have g^xy, they derive their shared circuit keys
and 'derivative key data' value via the KDF-TOR function in 5.2.1.
x,X = KEYGEN()
The server generates a keypair of y,Y = KEYGEN(), and uses its ntor
private key 'b' to compute:
The client then checks Y is in G^* [see NOTE below], and computes
Both parties check that none of the EXP() operations produced the
point at infinity. [NOTE: This is an adequate replacement for
checking Y for group membership, if the group is curve25519.]
Both parties now have a shared value for KEY_SEED. They expand this
into the keys needed for the Tor relay protocol, using the KDF
described in 5.2.2 and the tag m_expand.
Once both parties have X and Y, they derive their shared circuit keys
and 'derivative key data' value via the KDF-TOR function in 5.2.1.
5.2.1. KDF-TOR
From the base key material K0, they compute KEY_LEN*2+HASH_LEN*3 bytes
of derivative key data as
The first HASH_LEN bytes of K form KH; the next HASH_LEN form the forward
digest Df; the next HASH_LEN (bytes 41-60) form the backward digest Db;
the next KEY_LEN (bytes 61-76) form Kf, and the final KEY_LEN form Kb.
Excess bytes from K are discarded.
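The key split can be sketched as follows. Two things are assumptions
here, because the formula is not reproduced in this text: that H is
SHA-1 (consistent with HASH_LEN=20), and that K is built with the
counter construction K = H(K0 | [00]) | H(K0 | [01]) | ... The function
name is hypothetical.

```python
import hashlib

KEY_LEN = 16
HASH_LEN = 20


def kdf_tor(k0):
    """Derive (KH, Df, Db, Kf, Kb) from base key material k0."""
    needed = KEY_LEN * 2 + HASH_LEN * 3  # 92 bytes
    k = b""
    counter = 0
    while len(k) < needed:
        # Assumed construction: K = H(K0 | [00]) | H(K0 | [01]) | ...
        k += hashlib.sha1(k0 + bytes([counter])).digest()
        counter += 1
    kh, df, db = k[0:20], k[20:40], k[40:60]
    kf, kb = k[60:76], k[76:92]
    return kh, df, db, kf, kb  # bytes past offset 92 are discarded
```

Five SHA-1 outputs give 100 bytes, so the last 8 bytes are the excess
that gets discarded.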
5.2.2. KDF-RFC5869
For newer KDF needs, Tor uses the key derivation function HKDF from
RFC5869, instantiated with SHA256. (This is due to a construction
from Krawczyk.) The generated key material is:
When used in the ntor handshake, the first HASH_LEN bytes form the
forward digest Df; the next HASH_LEN form the backward digest Db; the
next KEY_LEN form Kf, the next KEY_LEN form Kb, and the final
DIGEST_LEN bytes are taken as a nonce to use in the place of KH in the
hidden service protocol. Excess bytes from K are discarded.
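A minimal sketch of this expansion and split, using only the Python
standard library. It assumes that KEY_SEED is used directly as the HKDF
pseudorandom key and m_expand as the HKDF "info" input, and that
DIGEST_LEN is 20 bytes; none of these is stated in this text. The
function names are hypothetical.

```python
import hashlib
import hmac

KEY_LEN = 16
HASH_LEN = 20
DIGEST_LEN = 20  # assumption: this text does not define DIGEST_LEN


def hkdf_expand(prk, info, length):
    """HKDF-Expand from RFC 5869, instantiated with SHA-256."""
    out, block, i = b"", b"", 1
    while len(out) < length:
        # T(i) = HMAC(PRK, T(i-1) | info | i)
        block = hmac.new(prk, block + info + bytes([i]), hashlib.sha256).digest()
        out += block
        i += 1
    return out[:length]


def ntor_circuit_keys(key_seed, m_expand):
    """Split the expanded key material into Df, Db, Kf, Kb and the nonce."""
    sizes = [HASH_LEN, HASH_LEN, KEY_LEN, KEY_LEN, DIGEST_LEN]
    k = hkdf_expand(key_seed, m_expand, sum(sizes))
    parts, pos = [], 0
    for n in sizes:
        parts.append(k[pos:pos + n])
        pos += n
    return tuple(parts)
```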
Ed25519 identity keys are not required in EXTEND2 cells, so all zero
keys SHOULD be accepted. If the extending relay knows the ed25519 key from
the consensus, it SHOULD also check that key. (See section 5.1.2.)
If an EXTEND2 cell contains the ed25519 key of the relay that sent the
extend cell, the circuit will fail and be torn down.
ORs SHOULD NOT check the IPs that are listed in the server descriptor.
Trusting server IPs makes it easier to covertly impersonate a relay, after
stealing its keys.
ORs MAY use multiple methods to check if they are the first hop:
When a relay cell is sent from an OP, the OP encrypts the payload
with the stream cipher as follows:
When speaking v2 of the link protocol or later, clients MUST only send
EXTEND/EXTEND2 cells inside RELAY_EARLY cells. Clients SHOULD send the
first ~8 RELAY cells that are not targeted at the first hop of any circuit
as RELAY_EARLY cells too, in order to partially conceal the circuit length.
Within a circuit, the OP and the end node use the contents of
RELAY packets to tunnel end-to-end commands and TCP connections
("Streams") across circuits. End-to-end commands can be initiated
by either edge; streams are initiated by the OP.
1 -- RELAY_BEGIN [forward]
2 -- RELAY_DATA [forward or backward]
3 -- RELAY_END [forward or backward]
4 -- RELAY_CONNECTED [backward]
5 -- RELAY_SENDME [forward or backward] [sometimes control]
6 -- RELAY_EXTEND [forward] [control]
7 -- RELAY_EXTENDED [backward] [control]
8 -- RELAY_TRUNCATE [forward] [control]
9 -- RELAY_TRUNCATED [backward] [control]
10 -- RELAY_DROP [forward or backward] [control]
11 -- RELAY_RESOLVE [forward]
12 -- RELAY_RESOLVED [backward]
13 -- RELAY_BEGIN_DIR [forward]
14 -- RELAY_EXTEND2 [forward] [control]
15 -- RELAY_EXTENDED2 [backward] [control]
(The digest does not include any bytes from relay cells that do
not start or end at this hop of the circuit. That is, it does not
include forwarded data. Therefore if 'recognized' is zero but the
digest does not match, the running digest at that node should
not be updated, and the cell should be forwarded on.)
All RELAY cells pertaining to the same tunneled stream have the same
stream ID. StreamIDs are chosen arbitrarily by the OP. No stream
may have a StreamID of zero. Rather, RELAY cells that affect the
entire circuit rather than a particular stream use a StreamID of zero
-- they are marked in the table above as "[control]" style
cells. (Sendme cells are marked as "sometimes control" because they
can include a StreamID or not depending on their purpose -- see
Section 7.)
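The command table and the StreamID rule can be captured in a few lines.
This is a sketch with hypothetical names; the numeric values and the
"[control]" markings are taken from the table above.

```python
from enum import IntEnum


class RelayCommand(IntEnum):
    BEGIN = 1
    DATA = 2
    END = 3
    CONNECTED = 4
    SENDME = 5
    EXTEND = 6
    EXTENDED = 7
    TRUNCATE = 8
    TRUNCATED = 9
    DROP = 10
    RESOLVE = 11
    RESOLVED = 12
    BEGIN_DIR = 13
    EXTEND2 = 14
    EXTENDED2 = 15


# Commands marked "[control]" in the table: they always use StreamID 0.
CONTROL = {RelayCommand.EXTEND, RelayCommand.EXTENDED, RelayCommand.TRUNCATE,
           RelayCommand.TRUNCATED, RelayCommand.DROP, RelayCommand.EXTEND2,
           RelayCommand.EXTENDED2}


def stream_id_is_valid(command, stream_id):
    """Check the StreamID rule for a relay command."""
    if command == RelayCommand.SENDME:
        return True            # "sometimes control": zero or non-zero
    if command in CONTROL:
        return stream_id == 0
    return stream_id != 0      # stream-level cells need a non-zero StreamID
```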
bit meaning
1 -- IPv6 okay. We support learning about IPv6 addresses and
connecting to IPv6 addresses.
2 -- IPv4 not okay. We don't want to learn about IPv4 addresses
or connect to them.
3 -- IPv6 preferred. If there are both IPv4 and IPv6 addresses,
we want to connect to the IPv6 one. (By default, we connect
to the IPv4 address.)
4..32 -- Reserved. Current clients MUST NOT set these. Servers
MUST ignore them.
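The flag bits can be modeled as below. The sketch assumes that "bit 1"
is the least significant bit of a 32-bit value sent in network byte
order; the class and function names are hypothetical.

```python
from enum import IntFlag


class BeginFlags(IntFlag):
    IPV6_OKAY = 1 << 0       # bit 1
    IPV4_NOT_OKAY = 1 << 1   # bit 2
    IPV6_PREFERRED = 1 << 2  # bit 3


RESERVED_MASK = 0xFFFFFFFF & ~0b111  # bits 4..32


def encode_flags(flags):
    """Client side: produce the 4-byte FLAGS value, refusing reserved bits."""
    value = int(flags)
    if value & RESERVED_MASK:
        raise ValueError("reserved bits must not be set")
    return value.to_bytes(4, "big")


def decode_flags(raw):
    """Server side: read FLAGS, silently ignoring reserved bits."""
    return BeginFlags(int.from_bytes(raw, "big") & 0b111)
```

The asymmetry is deliberate: clients refuse to set reserved bits, while
servers mask them off so that future flags do not break older exits.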
Upon receiving this cell, the exit node resolves the address as
necessary, and opens a new TCP connection to the target port. If the
address cannot be resolved, or a connection can't be established, the
exit node replies with a RELAY_END cell. (See 6.3 below.)
Otherwise, the exit node replies with a RELAY_CONNECTED cell, whose
payload is in one of the following formats:
or
[Tor exit nodes before 0.1.2.0 set the TTL field to a fixed value. Later
versions set the TTL to the last value seen from a DNS server, and expire
their own cached entries after a fixed interval. This prevents certain
attacks.]
If the exit node does not support optimistic data (i.e. its
version number is before 0.2.3.1-alpha), then the OP MUST wait
for a RELAY_CONNECTED cell before sending any data. If the exit
node supports optimistic data (i.e. its version number is
0.2.3.1-alpha or later), then the OP MAY send RELAY_DATA cells
immediately after sending the RELAY_BEGIN cell (and before
receiving either a RELAY_CONNECTED or RELAY_END cell).
[*] Older versions of Tor also send this reason when connections are
reset.
OPs and ORs MUST accept reasons not on the above list, since future
versions of Tor may provide more fine-grained reasons.
Reason [1 byte]
Reason [1 byte]
IPv4 or IPv6 address [4 bytes or 16 bytes]
TTL [4 bytes]
Tors SHOULD NOT send any reason except REASON_MISC for a stream that they
have originated.
After sending a RELAY_END cell, the sender needs to give the recipient
time to receive that cell. In the meantime, the sender SHOULD remember
how many cells of which types (CONNECTED, SENDME, DATA) it would have
accepted on that stream, and SHOULD kill the circuit if it receives more
than permitted.
An exit (or onion service) connection can have a TCP stream in one of
three states: 'OPEN', 'DONE_PACKAGING', and 'DONE_DELIVERING'. For the
purposes of modeling transitions, we treat 'CLOSED' as a fourth state,
although connections in this state are not, in fact, tracked by the
onion router.
Type (1 octet)
Length (1 octet)
Value (variable-width)
TTL (4 octets)
"Length" is the length of the Value field.
"Type" is one of:
0x00 -- Hostname
0x04 -- IPv4 address
0x06 -- IPv6 address
0xF0 -- Error, transient
0xF1 -- Error, nontransient
For backward compatibility, if there are any IPv4 answers, one of those
must be given as the first answer.
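A parser for the answer list might look like this. It is a sketch only:
it assumes the TTL is a big-endian unsigned integer and that the caller
passes exactly the relay body (no trailing cell padding). The function
name is hypothetical.

```python
import struct


def parse_resolved(payload):
    """Return a list of (type, value, ttl) answers from a RELAY_RESOLVED body."""
    answers, pos = [], 0
    while pos + 2 <= len(payload):
        atype, length = payload[pos], payload[pos + 1]
        end = pos + 2 + length + 4        # header + Value + TTL
        if end > len(payload):
            break                         # truncated answer: stop parsing
        value = payload[pos + 2:pos + 2 + length]
        (ttl,) = struct.unpack(">I", payload[end - 4:end])
        answers.append((atype, value, ttl))
        pos = end
    return answers
```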
7. Flow control
Communicants rely on TCP's default flow control to push back when they
stop reading.
The mainline Tor implementation uses token buckets (one for reads,
one for writes) for the rate limiting.
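For readers unfamiliar with the technique, a generic token bucket can be
sketched as follows. This is not the mainline Tor code; the class name,
parameter names, and the use of a caller-supplied clock are all
assumptions made for illustration.

```python
class TokenBucket:
    """A bucket refilled at `rate` bytes/second, holding at most `burst` bytes."""

    def __init__(self, rate, burst, now=0.0):
        self.rate, self.burst = rate, burst
        self.tokens = burst
        self.last = now

    def refill(self, now):
        # Add tokens for the elapsed time, capped at the burst size.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now

    def consume(self, nbytes, now):
        """Return how many of nbytes may be processed right now."""
        self.refill(now)
        allowed = int(min(nbytes, self.tokens))
        self.tokens -= allowed
        return allowed
```

A connection would hold two such buckets, one consulted before reading
and one before writing.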
Since 0.2.0.x, Tor has let the user specify an additional pair of
token buckets for "relayed" traffic, so people can deploy a Tor relay
with strict rate limiting, but also use the same Tor as a client. To
avoid partitioning concerns we combine both classes of traffic over a
given OR connection, and keep track of the last time we read or wrote
a high-priority (non-relayed) cell. If it's been less than N seconds
(currently N=30), we give the whole connection high priority, else we
give the whole connection low priority. We also give low priority
to reads and writes for connections that are serving directory
information. See proposal 111 for details.
Version [1 byte]
Command [1 byte]
ito_low_ms [2 bytes]
ito_high_ms [2 bytes]
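The four fields above pack into six bytes. The sketch below assumes
network byte order for the two-byte fields; the helper names are
hypothetical, and the meaning of particular Version and Command values
is not covered here.

```python
import struct

# Version, Command, ito_low_ms, ito_high_ms
PADDING_FIELDS = struct.Struct(">BBHH")


def pack_padding_fields(version, command, ito_low_ms, ito_high_ms):
    """Serialize the four fields into a 6-byte payload prefix."""
    return PADDING_FIELDS.pack(version, command, ito_low_ms, ito_high_ms)


def unpack_padding_fields(payload):
    """Read the four fields back from the start of a payload."""
    return PADDING_FIELDS.unpack(payload[:PADDING_FIELDS.size])
```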
These two windows are respectively named: the package window (packaged for
transmission) and the deliver window (delivered for local streams).
VERSION [1 byte]
DATA_LEN [2 bytes]
DATA [DATA_LEN bytes]
The DIGEST is the rolling digest value from the RELAY_DATA cell that
immediately preceded (triggered) this RELAY_SENDME. The other side
matches this value against the digest it recorded for that cell, which
the OR/OP must remember.
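The SENDME body layout and the digest check can be sketched as follows.
It assumes that DATA_LEN is big-endian and that DATA carries the DIGEST
for the authenticated version; the function names are hypothetical.

```python
import hmac
import struct


def pack_sendme(version, data):
    """Build a SENDME body: VERSION, DATA_LEN, DATA."""
    return struct.pack(">BH", version, len(data)) + data


def parse_sendme(body):
    """Return (version, data); raise ValueError on a truncated body."""
    if len(body) < 3:
        raise ValueError("SENDME body too short")
    version, data_len = struct.unpack(">BH", body[:3])
    data = body[3:3 + data_len]
    if len(data) != data_len:
        raise ValueError("DATA shorter than DATA_LEN")
    return version, data


def sendme_matches(expected_digest, body):
    """Compare the DIGEST carried in DATA with the one we remembered."""
    _, data = parse_sendme(body)
    return hmac.compare_digest(expected_digest, data)
```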
* Free all memory held by the most stale circuit, and send DESTROY
cells in both directions on that circuit. Count the amount of
memory we recovered towards the total.
9. Subprotocol versioning
This section specifies the Tor subprotocol versioning. Subprotocols are
broken down into different types with their current version numbers. Any
new version number should be added to this section.
The dir-spec.txt details how those versions are encoded. See the
"proto"/"pr" line in a descriptor and the "recommended-relay-protocols",
"required-relay-protocols", "recommended-client-protocols" and
"required-client-protocols" lines in the vote/consensus format.
Here are the rules a relay and client should follow when encountering a
protocol list in the consensus:
9.1. "Link"
The "link" protocols are those used by clients and relays to initiate and
receive OR connections and to handle cells on OR connections. The "link"
protocol versions correspond 1:1 to the link protocol versions negotiated
in VERSIONS cells (see section 4.1).
Two Tor instances can make a connection to each other only if they have at
least one link protocol in common.
The current "link" versions are: "1" through "5". See section 4.1 for more
information. All current Tor versions support "1-3"; versions from
0.2.4.11-alpha and on support "1-4"; versions from 0.3.1.1-alpha and on
support "1-5". Eventually we will drop "1" and "2".
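The "at least one link protocol in common" rule can be checked with a
small range parser. This is a sketch: the comma-separated form is an
assumption about the encoding (dir-spec.txt is the authority on that),
and the function names are hypothetical.

```python
def parse_versions(spec):
    """Expand a version list such as "1-3" or "1,3-5" into a set of ints."""
    versions = set()
    for part in spec.split(","):
        low, _, high = part.partition("-")
        versions.update(range(int(low), int(high or low) + 1))
    return versions


def can_connect(ours, theirs):
    """Two instances can connect only if they share a link protocol."""
    return bool(parse_versions(ours) & parse_versions(theirs))
```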
9.2. "LinkAuth"
9.3. "Relay"
"1" -- supports the TAP key exchange, with all features in Tor 0.2.3.
Support for CREATE and CREATED and CREATE_FAST and CREATED_FAST
and EXTEND and EXTENDED.
"2" -- supports the ntor key exchange, and all features in Tor
0.2.4.19. Includes support for CREATE2 and CREATED2 and
EXTEND2 and EXTENDED2.
Bridges might not extend over IPv6, because they try to imitate
client behaviour.
In particular:
* relays without an IPv6 ORPort, and
* tor instances that are not relays,
have the following behaviour, regardless of their configuration:
* advertise support for "Relay=3" in their descriptor
(if they are a relay, bridge, or directory authority), and
* react to consensuses recommending or requiring support for
"Relay=3".
9.4. "HSIntro"
9.5. "HSRend"
9.6. "HSDir"
The "HSDir" protocols are the set of hidden service document types that can
be uploaded to, understood by, and downloaded from a tor relay, and the set
of URLs available to fetch them.
9.7. "DirCache"
The "DirCache" protocols are the set of documents available for download
from a directory cache via BEGIN_DIR, and the set of URLs available to
fetch them. (This excludes URLs for hidden service objects.)
9.8. "Desc"
9.9. "Microdesc"
9.10. "Cons"
9.11. "Padding"
9.12. "FlowCtrl"
Describes the flow control protocol at the circuit and stream level. If
there is no FlowCtrl advertised, tor supports the unauthenticated flow
control features (version 0).