Bit torrent
ÁNGEL CUEVAS RUMÍN ([email protected])
 Introduction
 BT System Functionality
 BT content & .torrent files & Peer-ID
 Content Publishing
 Peer connecting to a swarm
 Peer-Tracker communication
 DHTs (Chord)
 Peer-Wire protocol
 Piece Selection
 Peer Selection
 BT Applications
 Debate

 This course takes a large part of its content
from:
◦ http://cel.archives-ouvertes.fr/cel-00544132/en/
 That source is a complete course on P2P technology
that goes beyond BitTorrent

 BitTorrent is a P2P file-sharing system
◦ Protocol
◦ Company
 BitTorrent Inc. was created by Bram Cohen
and Ashwin Navin in 2004
 In 2009 it was estimated to account for between 43% and 70% of
all Internet traffic
 Nowadays it is still one of the applications
generating a significant share of Internet traffic
(in particular upstream).

 BT belongs to the category of unstructured
P2P networks
 BT can present some central infrastructure
◦ Torrent portals (e.g. The Pirate Bay)
◦ Trackers
 BT can be fully distributed
◦ DHTs
◦ Magnet links

[Figure: BitTorrent system overview. Leechers and seeders form the swarm; the tracker and the BitTorrent portal sit outside it; the content publisher is the initial seeder.]

 Content is divided into pieces that in turn are
divided into blocks
 Pieces
◦ Size: 256/512/1024/2048 Kbytes (larger pieces keep the
.torrent file small)
◦ Each piece is hashed (SHA-1 function) and the
resulting hash is stored in the .torrent file
◦ The smallest unit of retransmission
 Blocks
◦ The unit of data exchanged in the BT protocol
◦ 16 Kbytes
◦ There are between 16 and 128 blocks in a piece
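
The piece/block split above can be sketched in a few lines of Python. This is only an illustration: the function names and the 600 KB of dummy content are assumptions made for the example, not part of any BitTorrent client.

```python
import hashlib

BLOCK_SIZE = 16 * 1024  # 16 KB blocks, the value given above


def piece_hashes(content: bytes, piece_length: int) -> bytes:
    """Return the concatenated 20-byte SHA-1 digests of all pieces."""
    digests = [
        hashlib.sha1(content[offset:offset + piece_length]).digest()
        for offset in range(0, len(content), piece_length)
    ]
    return b"".join(digests)


def blocks_in_piece(piece_length: int) -> int:
    """Number of 16 KB blocks needed to transfer one full piece."""
    return -(-piece_length // BLOCK_SIZE)  # ceiling division


demo_content = bytes(600 * 1024)  # 600 KB of zero bytes (made-up content)
demo_hashes = piece_hashes(demo_content, 256 * 1024)
print(len(demo_hashes) // 20, "pieces")  # the last piece is shorter than the others
print(blocks_in_piece(256 * 1024), blocks_in_piece(2048 * 1024))
```

With 256 KB pieces the dummy content yields three pieces, and the two block counts printed correspond to the smallest and largest piece sizes listed above.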

 Info key (infohash = SHA-1 of the value of the info key)
◦ Content length in bytes
◦ MD5 hash of the content (optional)
◦ File name
◦ Piece length
◦ Concatenation of all piece hashes (20 bytes each)
 URL of the tracker (plus an optional list of backup
trackers)
 Some optional fields
◦ Creation date
◦ Comment
◦ Created by
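
As a rough sketch of how the infohash is obtained, the snippet below bencodes a small info dictionary and hashes the result with SHA-1. The `bencode` helper and the demo values (`demo.bin`, a single all-zero piece hash) are assumptions written for this example; real clients use their own bencoding libraries.

```python
import hashlib


def bencode(value) -> bytes:
    """Encode ints, strings/bytes, lists and dicts using bencoding."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and must appear in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v)
             for k, v in value.items()),
            key=lambda pair: pair[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def info_hash(info: dict) -> bytes:
    """SHA-1 of the bencoded value of the info key (20 bytes)."""
    return hashlib.sha1(bencode(info)).digest()


demo_info = {
    "length": 1024,            # content length in bytes
    "name": "demo.bin",        # hypothetical file name
    "piece length": 262144,
    "pieces": bytes(20),       # placeholder for the concatenated piece hashes
}
print(info_hash(demo_info).hex())
```

Because the hash covers only the info dictionary, changing the tracker URL or the comment does not change the infohash.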

Example .torrent file (line breaks added between bencoded elements for readability):

d8:announce40:http://tracker.thepiratebay.org/announce13:announce-listl
l40:http://tracker.thepiratebay.org/announcee
l35:udp://tracker.openbittorrent.com:80e
l23:udp://tracker.ccc.de:80e
l26:udp://tracker.istole.it:80e
l29:udp://tracker.publicbt.com:80e
l32:http://exodus.1337x.org/announcee
l37:http://tracker.bittorrent.am/announcee
l28:http://10.rarbg.com/announcee
l35:http://bt.firebit.org:2710/announcee
l38:http://genesis.1337x.org:1337/announcee
l33:http://nemesis.1337x.org/announcee
l40:http://inferno.demonoid.me:3411/announcee
e
7:comment47:Torrent downloaded from http://thepiratebay.org
10:created by13:uTorrent/3000
13:creation datei1321927810e
8:durationi6128e
12:encoded ratei293228e
8:encoding5:UTF-8
6:heighti688e
4:infod5:filesl
d6:lengthi7346e4:pathl16:Info_by_Cody.txtee
d6:lengthi1796901688e4:pathl49:Margin Call [2011] 720p BRRip H264 AAC - CODY.mp4ee
d6:lengthi92519e4:pathl49:Margin Call [2011] 720p BRRip H264 AAC - CODY.srtee
d6:lengthi46e4:pathl39:Torrent downloaded from Demonoid.me.txtee
e
4:name45:Margin Call [2011] 720p BRRip H264 AAC - CODY
12:piece lengthi2097152e
6:pieces17140:[binary SHA-1 piece hashes, truncated]

FILE SIZE      350 MB   350 MB   350 MB   700 MB   1400 MB
PIECE SIZE     64 KB    256 KB   512 KB   64 KB    1 MB
#PIECES        5600     1400     700      11200    1400
TORRENT SIZE   120 KB   30 KB    15 KB    240 KB   30 KB

 Peer ID = client ID + random string
◦ Client ID: name of BT client + its version
◦ Random string: it changes each time the client
starts
 e.g., seeded with the system clock + the IP address of the peer
◦ Examples
 AZ2306-LwkWkRU95L9s
 AZ2402-YbqhPheosA4a
 BC0062->*\xb1\xfdMm\xb9\x96\x96\xf0\xb8\xd9
 UT1500-\xb5\x81\xf1+\xa3\xd3\xc7\xf3\x7f|\x1a\xb0
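
A peer ID of this shape could be generated as sketched below. The layout is an assumption based on the common Azureus-style convention (a dash, two client letters, four version digits, a dash, then random characters up to 20 bytes); the leading dash is not visible in the examples above. The sketch also swaps the clock/IP seed mentioned above for Python's `secrets` module.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def make_peer_id(client_code: str = "AZ", version: str = "2306") -> bytes:
    """Build a 20-byte peer ID: client prefix followed by random characters."""
    prefix = f"-{client_code}{version}-"
    random_part = "".join(
        secrets.choice(ALPHABET) for _ in range(20 - len(prefix))
    )
    return (prefix + random_part).encode("ascii")


demo_peer_id = make_peer_id()
print(demo_peer_id)
```

A new random part is drawn on every call, which mirrors the peer ID changing each time the client starts.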

 Create a .torrent file using
◦ The content for this torrent, which is used to compute the
SHA-1 of each piece and the infohash
◦ The URL of the tracker (you can run your
own tracker or use a public tracker)
 Upload the .torrent file to a torrent discovery
site (e.g., The Pirate Bay)
 Seed the content
◦ Simply start a client using the .torrent file and the
content to be seeded

1. A peer downloads a .torrent file from a web
server
2. The peer obtains the tracker’s URL and
connects to it (HTTP GET)
3. The peer sends
◦ Info_hash (SHA-1 of the info key)
◦ Peer ID
◦ Port the client listens on
◦ Number of peers wanted (default 50)
◦ Various statistics
 Uploaded, downloaded, left, event (started, stopped,
completed)

4. The tracker returns
◦ Failure reason
◦ Interval between statistics reports
◦ A random list of peers already in the torrent
 Peer ID, peer IP, peer port
 Typically 50 peers
◦ Statistics
 Complete/incomplete (seeds/leechers)

5. Peer connects to each peer returned by
the tracker
◦ At most 40 outgoing connections
◦ Remaining peers kept as a pool of peers to
connect to
 Peer set (neighbor set) size 80
◦ Maximum 40 outgoing connections
◦ Maximum 80 connections in total
◦ Results in a well connected graph (random graph)
 Recent peers get a chance to be connected to old peers

 Magnet link
◦ They are mainly used to reference resources available for
download via p2p networks.
◦ They typically identify a file not by location or name, but
by content
◦ Magnet links allow the .torrent file to be downloaded from
another peer (avoiding centralized websites)
 DHT
 Peer exchange (PEX)
◦ PEX greatly reduces the reliance of peers on a tracker by
allowing each peer to exchange their list of neighbors
◦ By reducing dependency on a centralized tracker, PEX
increases the speed, efficiency, and robustness of the
BitTorrent protocol.

 The tracker is an HTTP/HTTPS service which
responds to HTTP GET requests
 The requests include metrics from clients that
help the tracker keep overall statistics about
the torrent
 The response includes a peer list that helps
the client participate in the torrent.

 info_hash: 20-byte SHA-1 hash of the value of the info key from the
metainfo file.
 peer_id
 port: The port number that the client is listening on. Ports reserved for
BitTorrent are typically 6881-6889.
 uploaded: The total amount (in bytes) uploaded since the client sent the
'started' event to the tracker.
 downloaded: The total amount (in bytes) downloaded since the client
sent the 'started' event to the tracker.
 left: The number of bytes this client still has to download.
 compact: Setting this to 1 indicates that the client accepts a compact
response. The peers list is replaced by a peers string with 6 bytes
per peer. The first four bytes are the host, the last two bytes are the
port.
◦ Note that some trackers only support compact responses (to save
bandwidth) and either refuse requests without "compact=1" or simply send a
compact response unless the request contains "compact=0" (in which case they
refuse the request).
 no_peer_id: Indicates that the tracker can omit the peer id field in the
peers dictionary. This option is ignored if compact is enabled.

 event: If specified, must be one of started, completed, stopped, (or empty
which is the same as not being specified). If not specified, then this request
is one performed at regular intervals.
◦ started: The first request to the tracker must include the event key with this value.
◦ stopped: Must be sent to the tracker if the client is shutting down gracefully.
◦ completed: Must be sent to the tracker when the download completes. However, must
not be sent if the download was already 100% complete when the client started.
Presumably, this is to allow the tracker to increment the "completed downloads"
metric based solely on this event.
 ip: Optional. The true IP address of the client machine
 numwant: Optional. Number of peers that the client would like to receive
from the tracker. This value is permitted to be zero. If omitted, typically
defaults to 50 peers.
◦ Random Selection of peers
◦ Smarter techniques can be used (e.g., do not return seeders to other seeders).
 key: Optional. An additional client identification mechanism that is not
shared with any peers. It is intended to allow a client to prove their identity
even after IP address change.
 trackerid: Optional. If a previous announce contained a tracker id, it should
be set here.
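
The request parameters above can be assembled into an announce URL as in the following sketch. The tracker URL `http://tracker.example/announce`, the function name and the demo values are hypothetical, and the sketch assumes the announce URL carries no query string of its own.

```python
from urllib.parse import urlencode


def build_announce_url(announce: str, info_hash: bytes, peer_id: bytes,
                       port: int, uploaded: int, downloaded: int, left: int,
                       event: str = "", numwant: int = 50) -> str:
    """Return the HTTP GET URL for one tracker announce."""
    params = {
        "info_hash": info_hash,   # raw 20 bytes, percent-encoded by urlencode
        "peer_id": peer_id,
        "port": port,
        "uploaded": uploaded,
        "downloaded": downloaded,
        "left": left,
        "compact": 1,
        "numwant": numwant,
    }
    if event:                     # started, completed or stopped
        params["event"] = event
    return announce + "?" + urlencode(params)


demo_url = build_announce_url(
    "http://tracker.example/announce",   # hypothetical tracker
    info_hash=bytes(range(20)),
    peer_id=b"-AZ2306-abcdefghijkl",
    port=6881, uploaded=0, downloaded=0, left=1024, event="started",
)
print(demo_url)
```

Regular announces simply omit `event`, which matches the rule that an unspecified event means a periodic request.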

 failure reason: If present, then no other keys may be present. The value is a human-
readable error message as to why the request failed (string).
 warning message: (optional) Similar to failure reason, but the response still gets
processed normally. The warning message is shown just like an error.
 interval: Interval in seconds that the client should wait between sending regular requests
to the tracker
 min interval: (optional) Minimum announce interval. If present, clients must not
reannounce more frequently than this.
 tracker id: A string that the client should send back on its next announcements. If
absent and a previous announce sent a tracker id, do not discard the old value; keep
using it.
 complete: number of peers with the entire file, i.e. seeders (integer)
 incomplete: number of non-seeder peers, aka "leechers" (integer)
 peers: (dictionary model) The value is a list of dictionaries, each with the following keys:
◦ peer id: peer's self-selected ID, as described above for the tracker request (string)
◦ ip: peer's IP address either IPv6 (hexed) or IPv4 (dotted quad) or DNS name (string)
◦ port: peer's port number (integer)
 peers: (binary model) Instead of using the dictionary model described above, the peers
value may be a string consisting of multiples of 6 bytes. First 4 bytes are the IP address
and last 2 bytes are the port number. All in network (big endian) notation.
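
The binary model of the peers value can be decoded as sketched here, assuming IPv4 addresses only. The function name and the sample bytes are made up for the example.

```python
import socket
import struct


def parse_compact_peers(blob: bytes):
    """Split a compact peers string into (ip, port) tuples."""
    if len(blob) % 6 != 0:
        raise ValueError("compact peers string must be a multiple of 6 bytes")
    peers = []
    for offset in range(0, len(blob), 6):
        ip = socket.inet_ntoa(blob[offset:offset + 4])          # 4-byte IPv4 address
        (port,) = struct.unpack("!H", blob[offset + 4:offset + 6])  # big-endian port
        peers.append((ip, port))
    return peers


demo_peers = parse_compact_peers(bytes([10, 0, 0, 1, 0x1A, 0xE1,
                                        192, 168, 1, 20, 0x1A, 0xE2]))
print(demo_peers)
```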

 Reconnect to the tracker
◦ If the peer set size falls below 20
 Ask for new peers
 Small enough to avoid frequent tracker requests
◦ Every 30 minutes
 For statistics: amount of bytes uploaded/downloaded,
number of bytes left
◦ When the peer leaves the torrent
 For statistics
 To update the list of peers

 A different URL is used to contact the tracker
◦ http://example.com/announce -> http://example.com/scrape
 Used to get statistics on torrents
◦ complete: # of seeds
◦ incomplete: # of leechers
◦ downloaded: # of peers who completed a download
◦ name: (optional) name of the torrent in the .torrent file
(not used in most popular tracker)
 If the scrape request contains an infohash
◦ Returns statistics for this torrent
 If no infohash is given
◦ Returns statistics for all torrents hosted by the tracker
◦ Might be a huge amount of data
 Some trackers return a compressed list (using, e.g., gzip)
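
The announce-to-scrape mapping shown above can be sketched as follows. It assumes the usual convention that the last path component must start with `announce` for a tracker to support scraping; the function name is hypothetical.

```python
from typing import Optional


def scrape_url(announce_url: str) -> Optional[str]:
    """Derive the scrape URL from an announce URL, or None if unsupported."""
    head, slash, tail = announce_url.rpartition("/")
    if not slash or not tail.startswith("announce"):
        return None
    # Replace only the leading "announce" of the last path component.
    return head + "/scrape" + tail[len("announce"):]


print(scrape_url("http://example.com/announce"))
print(scrape_url("http://example.com/a"))
```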

 files
◦ a dictionary containing one key/value pair for each torrent
for which there are stats.
◦ If info_hash was supplied and was valid, this dictionary will
contain a single key/value.
◦ Each key consists of a 20-byte binary info_hash.
◦ The value of each entry is another dictionary containing the
following:
 complete: number of peers with the entire file, i.e. seeders
(integer)
 downloaded: total number of times the tracker has registered a
completion ("event=completed", i.e. a client finished
downloading the torrent)
 incomplete: number of non-seeder peers, aka "leechers"
(integer)
 name: (optional) the torrent's internal name, as specified by the
"name" key in the info section of the .torrent file

Regular Hash Table vs. Distributed Hash Table (both map each key to hash(key))

Key             hash(key)
Harry Potter    3
Avatar          12
Avengers        9
Austin Powers   5

Distributed Hash Table
 Objects and node IDs use the same numbering space
 Algorithm to assign objects to nodes
 Algorithm to look for objects
 Ring structure
 ID space [0 … 2^m - 1], of size 2^m (2, 4, 8, 16, 32, 64, 128, ...)
◦ Consistent hashing => a new node implies few changes
◦ ID = hash(key) mod 2^m
◦ Hash operation: SHA-1
 Each node knows its successor
 Each resource is stored on the first available node whose ID is
equal to or follows the resource ID
 Operations
◦ Resource storage
◦ Resource Lookup
◦ Node insertion
◦ Node Leaving/Failure
◦ Lookup improvement
◦ Resource replication

*I. Stoica et al., "Chord: a scalable peer-to-peer lookup protocol for Internet applications,"
in IEEE/ACM Transactions on Networking, vol. 11, no. 1, pp. 17-32, Feb. 2003, doi:
10.1109/TNET.2002.808407.

[Figure sequence: building the Chord ring and its successor table]
m=4; ID space = [0…15]
Node ID = hash(IP address)
Nodes on the ring: N1, N4, N8, N10, N12, N15

Node   Successor
N1     N4
N4     N8
N8     N10
N10    N12
N12    N15
N15    N1
[Figure: resource IDs each node is responsible for]
m=4; ID space = [0…15]; Node ID = hash(IP address); Resource = BitTorrent file

Node   Resource IDs
N1     0, 1
N4     2, 3, 4
N8     5, 6, 7, 8
N10    9, 10
N12    11, 12
N15    13, 14, 15
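
The assignment of resource IDs to nodes can be reproduced with a short sketch. The ring `[1, 4, 8, 10, 12, 15]` and m=4 come from the example ring in the slides; the function names are assumptions. Note that the key IDs used in the slides (3, 12, 9, 5) are illustrative values, so `chord_id` will generally not return them for the same movie titles.

```python
import hashlib
from bisect import bisect_left

M = 4                          # ID space [0 … 2^M - 1]
RING = [1, 4, 8, 10, 12, 15]   # node IDs of the example ring


def chord_id(key: str, m: int = M) -> int:
    """ID = SHA-1(key) mod 2^m."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def successor(ident: int, ring=RING, m: int = M) -> int:
    """First node whose ID is equal to or follows ident on the ring."""
    nodes = sorted(ring)
    index = bisect_left(nodes, ident % (2 ** m))
    return nodes[index % len(nodes)]   # wrap around past the highest node


responsible = {
    node: [i for i in range(2 ** M) if successor(i) == node] for node in RING
}
print(responsible)
print(chord_id("Avatar"))  # some ID in [0, 15]; not necessarily the slide's 12
```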
[Figure: resource storage on the ring]
m=4; ID space = [0…15]; Node ID = hash(IP address); Resource = BitTorrent file

Key             hash(key)   Node
Harry Potter    3           N4
Avatar          12          N12
Avengers        9           N10
Austin Powers   5           N8
[Figure sequence: resource lookup, query forwarded along the ring]
 N1 issues Query (N1, Avengers, ID=9)
 The query is forwarded from successor to successor (N1, N4, N8) until it reaches N10, the node storing Avengers
 N10 returns Avengers to N1

[Figure sequence: resource lookup, each node replies with the next node to contact]
 N1 issues Query (N1, Avengers, ID=9)
 Reply (N8, IPN8): N1 then sends the query to N8
 Reply (N10, IPN10): N1 then sends the query to N10
 N10 returns Avengers to N1
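
The successor-only lookup can be sketched as below, using the successor table of the example ring. The function names are assumptions; the interval test follows the Chord rule that a node answers when the ID lies between itself (exclusive) and its successor (inclusive).

```python
SUCCESSOR = {1: 4, 4: 8, 8: 10, 10: 12, 12: 15, 15: 1}  # example ring


def in_interval(x: int, a: int, b: int) -> bool:
    """True if x lies in the ring interval (a, b], wrapping past 0 if needed."""
    if a < b:
        return a < x <= b
    return x > a or x <= b


def simple_lookup(start: int, ident: int):
    """Follow successor pointers until the node responsible for ident is found."""
    node, visited = start, [start]
    while not in_interval(ident, node, SUCCESSOR[node]):
        node = SUCCESSOR[node]
        visited.append(node)
    return visited, SUCCESSOR[node]


visited, owner = simple_lookup(1, 9)
print(visited, owner)  # [1, 4, 8] 10
```

The number of nodes visited grows linearly with the ring size, which is what finger tables are meant to improve.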
[Figure sequence: node insertion, N3 joins the ring]
 N3 sets its successor to N4
 N1 updates its successor from N4 to N3
 Harry Potter (ID 3) moves from N4 to N3

Node   Successor
N1     N3
N3     N4
N4     N8
N8     N10
N10    N12
N12    N15
N15    N1

Key             hash(key)   Node
Harry Potter    3           N3
Avatar          12          N12
Avengers        9           N10
Austin Powers   5           N8
[Figure sequence: node leaving/failure, N12 leaves the ring]
 N10 updates its successor from N12 to N15
 Avatar (ID 12) moves from N12 to N15

Node   Successor
N1     N3
N3     N4
N4     N8
N8     N10
N10    N15
N15    N1

Key             hash(key)   Node
Harry Potter    3           N3
Avatar          12          N15
Avengers        9           N10
Austin Powers   5           N8
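
The effect of a node joining or leaving can be checked with a small sketch that recomputes where each key lives. The key IDs are the illustrative values from the slides; the helper names are assumptions.

```python
from bisect import bisect_left

KEYS = {"Harry Potter": 3, "Avatar": 12, "Avengers": 9, "Austin Powers": 5}


def successor(ident: int, ring) -> int:
    """First node whose ID is equal to or follows ident (wrapping around)."""
    nodes = sorted(ring)
    return nodes[bisect_left(nodes, ident) % len(nodes)]


def placement(ring) -> dict:
    """Map every key to the node that stores it."""
    return {name: successor(ident, ring) for name, ident in KEYS.items()}


def moved(before: dict, after: dict) -> list:
    """Keys whose storing node changed between two placements."""
    return sorted(k for k in before if before[k] != after[k])


ring = [1, 4, 8, 10, 12, 15]
initial = placement(ring)
after_join = placement(ring + [3])                            # N3 joins
after_leave = placement([n for n in ring + [3] if n != 12])   # N12 leaves
print(moved(initial, after_join), moved(after_join, after_leave))
```

Only one key moves in each case, which illustrates the "new node implies few changes" property of consistent hashing.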
Finger i = successor((N + 2^(i-1)) mod 2^m), where 1 <= i <= m
Ring: N1, N4, N8, N10, N12, N15 (m=4)

Finger table of N1
Finger                         Node
N1 + 2^0 = 2                   N4
N1 + 2^1 = 3                   N4
N1 + 2^2 = 5                   N8
N1 + 2^3 = 9                   N10

Finger table of N4
Finger                         Node
N4 + 2^0 = 5                   N8
N4 + 2^1 = 6                   N8
N4 + 2^2 = 8                   N8
N4 + 2^3 = 12                  N12

Finger table of N8
Finger                         Node
N8 + 2^0 = 9                   N10
N8 + 2^1 = 10                  N10
N8 + 2^2 = 12                  N12
N8 + 2^3 = 16 (mod 16) = 0     N1

Finger table of N10
Finger                         Node
N10 + 2^0 = 11                 N12
N10 + 2^1 = 12                 N12
N10 + 2^2 = 14                 N15
N10 + 2^3 = 18 (mod 16) = 2    N4
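
The finger formula can be evaluated mechanically, as in this sketch for the example ring with m=4. The function names are assumptions.

```python
from bisect import bisect_left


def successor(ident: int, ring, m: int) -> int:
    """First node whose ID is equal to or follows ident mod 2^m."""
    nodes = sorted(ring)
    return nodes[bisect_left(nodes, ident % (2 ** m)) % len(nodes)]


def finger_table(node: int, ring, m: int) -> list:
    """Finger i = successor((node + 2^(i-1)) mod 2^m) for i = 1..m."""
    return [successor(node + 2 ** (i - 1), ring, m) for i in range(1, m + 1)]


RING = [1, 4, 8, 10, 12, 15]
for node in RING:
    print(f"N{node}:", finger_table(node, RING, 4))
```

For N1 this yields N4, N4, N8, N10: the first finger is always the node's successor, and later fingers reach exponentially farther around the ring.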
[Figure sequence: lookup using finger tables]
 N1 issues Query (N1, Avatar, ID=12)
 Using its finger table, N1 sends the query directly to N10, skipping N4 and N8
 The query then reaches N12, the node storing Avatar (ID 12), and Avatar is returned

Key             hash(key)   Node
Harry Potter    3           N4
Avatar          12          N12
Avengers        9           N10
Austin Powers   5           N8
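
A finger-based lookup can be sketched as follows. This is a simplified, single-process model of Chord's closest-preceding-finger rule, not the protocol's actual remote procedure calls; the function names are assumptions and finger tables are recomputed on the fly rather than maintained.

```python
from bisect import bisect_left


def successor(ident, ring, m):
    nodes = sorted(ring)
    return nodes[bisect_left(nodes, ident % (2 ** m)) % len(nodes)]


def fingers(node, ring, m):
    return [successor(node + 2 ** (i - 1), ring, m) for i in range(1, m + 1)]


def between(x, a, b, closed_right=False):
    """Ring interval test: (a, b) or, with closed_right, (a, b]."""
    if closed_right and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b


def finger_lookup(start, ident, ring, m):
    """Return (nodes visited, node responsible for ident)."""
    node, visited = start, [start]
    while True:
        table = fingers(node, ring, m)
        if between(ident, node, table[0], closed_right=True):
            return visited, table[0]
        # Closest preceding finger: the farthest finger that does not overshoot ident.
        node = next((f for f in reversed(table) if between(f, node, ident)), table[0])
        visited.append(node)


print(finger_lookup(1, 12, [1, 4, 8, 10, 12, 15], 4))  # ([1, 10], 12)
```

Compared with walking the successors one by one, the query from N1 for ID 12 visits N10 only before N12 is identified as the responsible node.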
// ask node n to find the successor of id
n.find_successor(id)
    if (id ∈ (n, successor])
        return successor;
    else
        // forward the query around the circle
        return successor.find_successor(id);

Fig. 3. (a) Simple (but slow) pseudocode to find the successor node of an identifier id. Remote procedure calls and variable lookups are preceded by the remote node. (b) The path taken by a query from node 8 for key 54, using the pseudocode in Figure 3(a).

Ring shown in the figures: N1, N8, N14, N21, N32, N38, N42, N48, N51, N56; key K54 is stored at N56.

Finger table of N8
N8 + 1     N14
N8 + 2     N14
N8 + 4     N14
N8 + 8     N21
N8 + 16    N32
N8 + 32    N42

Fig. 4. (a) The finger table entries for node 8. (b) The path of a query for key 54 starting at node 8, using the algorithm in Figure 5.
Resource replication

Key             hash(key)   Replica 1: hash(key+R1)   Replica n: hash(key+Rn)
Harry Potter    3           7                         1
Avatar          12          15                        9
Avengers        9           3                         5
Austin Powers   5           11                        1
[Figure: placement of the original resources and their replicas on the ring]

Key             hash(key)    Nodes
Harry Potter    3, 7, 1      N4, N8, N1
Avatar          12, 15, 9    N12, N15, N10
Avengers        9, 3, 5      N10, N4, N8
Austin Powers   5, 11, 1     N8, N12, N1
 All improvements have an associated cost
 Fingers
◦ Ring maintenance is more complex
◦ More memory used for finger tables
◦ More overhead to keep finger tables up to date
◦ More messages exchanged when nodes join/leave
◦ Etc.
 Replication
◦ More memory used for storing replicas
◦ More overhead to maintain replica information
◦ More messages exchanged (replicas) when nodes join/leave
◦ Etc.

 It uses TCP
 Facilitates the exchange of pieces as
described in the metainfo file.
 The peers exchange blocks until they
complete the pieces (so they can then
announce they have the piece)

 choked:
◦ Whether or not the remote peer has choked the local
peer (or client)
◦ When a peer chokes the local peer, it is a notification
that no requests will be answered until the client is
unchoked
◦ The local peer should not attempt to send requests for
blocks, and it should consider all pending (unanswered)
requests to be discarded by the remote peer
 interested:
◦ Whether or not the remote peer is interested in
something the local peer has to offer.
◦ This is a notification that the remote peer will begin
requesting blocks when the local peer unchokes them.

 For each remote peer it is connected to, a peer
maintains the following state info:
◦ am_choking: the local peer is choking this remote
peer
◦ am_interested: the local peer is interested in at least
one piece on this remote peer
◦ peer_choking: this remote peer is choking the local
peer
◦ peer_interested: this remote peer is interested in at
least one piece of this local peer
 Initial state for the local peer: am_choking=1, am_interested=0,
peer_choking=1, peer_interested=0
◦ Both sides choked
◦ No initial interest

 A block is downloaded by the local peer when the
local peer is interested in a remote peer, and that
remote peer is not choking the client
 A block is uploaded by a local peer when the
local peer is not choking a remote peer, and that
remote peer is interested in the local peer.
 Unlike in client-server architectures, it is not the
client who decides when to receive data
◦ A peer can always refuse to serve another peer
 The decision to unchoke a peer is taken by the
choke algorithm
 The choice of the piece to request is taken using
the rarest first algorithm
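
The four per-connection flags and the two transfer conditions can be captured in a small data class. This is a sketch with hypothetical method names, not code from any client.

```python
from dataclasses import dataclass


@dataclass
class ConnectionState:
    """State kept by the local peer for one remote peer (initially 1 0 1 0)."""
    am_choking: bool = True
    am_interested: bool = False
    peer_choking: bool = True
    peer_interested: bool = False

    def can_download(self) -> bool:
        # Local peer is interested and the remote peer is not choking it.
        return self.am_interested and not self.peer_choking

    def can_upload(self) -> bool:
        # Remote peer is interested and the local peer is not choking it.
        return self.peer_interested and not self.am_choking


state = ConnectionState()
print(state.can_download(), state.can_upload())  # False False
state.am_interested = True   # we sent INTERESTED
state.peer_choking = False   # we received UNCHOKE
print(state.can_download())  # True
```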

 All the connections among the peers are over
TCP
◦ TCP/IP header overhead: 40 bytes
 To initiate connections and maintain the
state, there are 11 messages in the P2P
protocol

 Two way handshake
 To initiate a connection between two peers
 Once initiated, the connection is symmetric
 Contains (49 bytes + len(pstr))
◦ Pstrlen (1 byte)
 protocol string identifier length
 Version 1.0 pstrlen=19
◦ Pstr=“BitTorrent protocol” (version 1.0, 19 bytes)
◦ Reserved (8 bytes)
◦ Info_hash (20 bytes)
◦ Peer ID (20 bytes)
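
The handshake layout listed above can be built and parsed as in this sketch. The function names and the demo values are assumptions; the reserved bytes are simply set to zero.

```python
import struct

PSTR = b"BitTorrent protocol"  # version 1.0 protocol string, 19 bytes


def build_handshake(info_hash: bytes, peer_id: bytes,
                    reserved: bytes = bytes(8)) -> bytes:
    """<pstrlen><pstr><reserved><info_hash><peer_id>"""
    if len(info_hash) != 20 or len(peer_id) != 20 or len(reserved) != 8:
        raise ValueError("info_hash/peer_id must be 20 bytes, reserved 8 bytes")
    return struct.pack("!B", len(PSTR)) + PSTR + reserved + info_hash + peer_id


def parse_handshake(data: bytes):
    """Return (pstr, reserved, info_hash, peer_id)."""
    pstrlen = data[0]
    pstr, rest = data[1:1 + pstrlen], data[1 + pstrlen:]
    if len(rest) != 48:
        raise ValueError("truncated or oversized handshake")
    return pstr, rest[:8], rest[8:28], rest[28:48]


demo_handshake = build_handshake(bytes(range(20)), b"-AZ2306-abcdefghijkl")
print(len(demo_handshake))  # 49 + len(pstr) = 68
```

A receiving peer would typically compare the parsed info_hash with the torrents it serves before continuing.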

 Once a connection is initiated, all the
messages on this connection are of the form
◦ <length prefix><message ID><payload>
 <length prefix>: 4 bytes (max length: 2^32)
 <message ID>: 1 byte (decimal value)
 <payload>: message dependent

 KEEP ALIVE (4 bytes)
◦ <len=0000>
◦ Sent every 2 minutes unless another message is
sent
◦ Because there is no way to find that a TCP
connection is dead unless sending a message
 CHOKE (5 bytes)
◦ <len=0001><ID=0>
◦ Sent from A to B when A chokes B

 UNCHOKE (5 bytes)
◦ <len=0001><ID=1>
◦ Sent from A to B when A unchokes B
 INTERESTED (5 bytes)
◦ <len=0001><ID=2>
◦ Sent from A to B when A is interested in B
 NOT_INTERESTED (5 bytes)
◦ <len=0001><ID=3>
◦ Sent from A to B when A is not interested in B

 HAVE (9 bytes)
◦ <len=0005><ID=4><piece index>
◦ Sent from a peer to all the peers in its peer set when it
has just received the piece with ID <piece index> and
the SHA-1 of this piece has been verified
◦ HAVE sent to each peer in the peer set is not required for the
correct protocol operation
 A peer may choose not to advertise having a piece to a peer
that already has that piece. At a minimum "HAVE
suppression" will result in a 50% reduction in the number of
HAVE messages, this translates to around a 25-35%
reduction in protocol overhead
◦ However, this information is useful for torrent monitoring
 it may be worthwhile to send a HAVE message to a peer that
has that piece already since it will be useful in determining
which piece is rare.

 BITFIELD (Ceil[(# of pieces)/8] + 5 bytes)
◦ <len=0001+X><ID=5><bitfield>
◦ First message sent after the handshake
 Never sent again afterwards
 Sent by both peers once the connection is initialized
◦ The payload is a bitfield representing the pieces
that have been successfully downloaded
 The high bit in the first byte corresponds to piece
index 0
◦ Bit i in the bitfield is set to 1 if the peer has piece i,
0 otherwise
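
The bit layout of the bitfield payload can be sketched as follows; the helper names are made up for the example.

```python
def make_bitfield(have, num_pieces: int) -> bytes:
    """Build the bitfield payload; piece 0 is the high bit of the first byte."""
    field = bytearray((num_pieces + 7) // 8)  # Ceil[(# of pieces)/8] bytes
    for index in have:
        field[index // 8] |= 0x80 >> (index % 8)
    return bytes(field)


def has_piece(field: bytes, index: int) -> bool:
    """True if bit number index is set in the bitfield."""
    return bool(field[index // 8] & (0x80 >> (index % 8)))


demo_field = make_bitfield({0, 9}, num_pieces=10)
print(demo_field.hex())          # 8040
print(has_piece(demo_field, 9))  # True
```

The spare bits at the end of the last byte stay at 0 when the number of pieces is not a multiple of 8.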

 REQUEST (17 bytes)
◦ <len=0013><ID=6><index><begin><length>
◦ Sent from peer A to peer B to request from peer B the
block:
 Of the piece with index <index>
 Starting with an offset <begin> within the piece
 Of length <length>

 REQUEST
◦ The message allows the block length to be specified, but
the value is hard-coded in the client
◦ Changing the block size may be useful to improve
pipelining under certain conditions
 No study on the block size impact
◦ A block size larger than 2^17 bytes is forbidden

 PIECE (2^14 + 13 bytes for a standard block
size)
◦ <len=0009+X><ID=7><index><begin><block>
◦ Only one message used to send blocks
◦ Sent from peer A to peer B to send a block of data
to peer B
 Of the piece with index <index>
 Starting with an offset <begin> within the piece
 Payload is <block>

 CANCEL (17 bytes)
◦ <len=0013><ID=8><index><begin><length>
◦ Used in end game mode only
◦ Sent from peer A to peer B to cancel a request
already sent to peer B for the block
 Of the piece with index <index>
 Starting with an offset <begin> within the piece
 Of a length <length>
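
The `<length prefix><message ID><payload>` framing of the messages above can be sketched with `struct`. The helper names are assumptions, and only a few message types are shown.

```python
import struct


def message(msg_id=None, payload: bytes = b"") -> bytes:
    """Frame one peer-wire message; msg_id=None produces a KEEP ALIVE."""
    if msg_id is None:
        return struct.pack("!I", 0)
    return struct.pack("!IB", 1 + len(payload), msg_id) + payload


def have(index: int) -> bytes:
    return message(4, struct.pack("!I", index))


def request(index: int, begin: int, length: int = 2 ** 14) -> bytes:
    return message(6, struct.pack("!III", index, begin, length))


def cancel(index: int, begin: int, length: int = 2 ** 14) -> bytes:
    return message(8, struct.pack("!III", index, begin, length))


def decode(data: bytes):
    """Return (message ID or None for KEEP ALIVE, payload)."""
    (length,) = struct.unpack("!I", data[:4])
    if length == 0:
        return None, b""
    return data[4], data[5:4 + length]


print(message(0).hex())             # CHOKE: 0000000100
print(len(have(7)), len(request(1, 16384)))
```

REQUEST and CANCEL share the same payload layout, which is why both are 17 bytes on the wire.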

 The piece selection strategy is a decision of the
implementer
 Different BT clients with different piece selection
strategies can exchange blocks
 The selection of a piece is a local client decision
based on the information it has
 We study 4 policies applied in the reference BT
client implementation (mainline)
◦ Strict priority
◦ Random first piece
◦ Local rarest first
◦ Endgame mode

 Once a block of a piece has been requested,
request all the other blocks of the same piece
before a block of any other piece
 Rationale
◦ Pieces are the unit of replication
 It is important to download a piece as fast as possible,
only complete pieces can be retransmitted

 Strict priority improves piece download speed
 Never blocking
◦ If a peer is choked during a piece download and
this piece is not available on any other peer, a new
piece will be requested.
◦ As soon as the partially downloaded piece is
available, request the remaining blocks with highest
priority

 For the first 4 downloaded pieces, the pieces
are selected at random
 Rationale
◦ Rare pieces may be slower to download
 In particular if present on a single peer
◦ A peer without a piece cannot reciprocate
 Must wait for an optimistic unchoke (OU)
 The first piece is most of the time received from
several other peers that perform an OU

 When a peer starts, it receives its first piece
with an OU
◦ With a typical upload speed of 20kB/s, each
unchoked peer receives at 5kB/s (4 in parallel)
◦ For a piece of 256kB, needs 51 seconds at 5kB/s
to receive a piece
◦ But, an OU lasts for 30s only
 An OU never becomes a regular unchoke (RU) when the peer has no piece to
reciprocate
◦ Faster to complete the last blocks of the piece
from another peer that makes an OU if the piece
is not the rarest one

 Random first piece makes it more likely that
the first piece completes faster
 Not optimal, but a good tradeoff between
simplicity and efficiency (the random piece
may be a rarest one)
 Only impacts the startup phase of a peer
 Then switches to local rarest first

 Download first the pieces that are rarest in
the peer set of the peer
 Rationale
◦ Cannot maintain the state for all peers
 Require a connection to all peers or a centralized
component
◦ Peer selection should not be constrained by piece
availability
◦ The initial seed should send a first copy of the
content as fast as possible
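
Local rarest first can be sketched as counting, over the peer set only, how many neighbors hold each missing piece and picking among the least replicated ones. The function name and the demo peer set are assumptions; ties are broken at random here, which is one possible choice.

```python
import random
from collections import Counter


def rarest_first(missing: set, peer_pieces: dict, rng=random):
    """Pick a missing piece that is rarest among the pieces our neighbors have."""
    counts = Counter()
    for pieces in peer_pieces.values():
        counts.update(pieces & missing)   # only count pieces we still need
    available = [p for p in missing if counts[p] > 0]
    if not available:
        return None                       # no neighbor has anything we need
    lowest = min(counts[p] for p in available)
    return rng.choice(sorted(p for p in available if counts[p] == lowest))


demo_peer_set = {          # hypothetical neighbors and the pieces they hold
    "A": {0, 1, 2},
    "B": {0, 1},
    "C": {0, 3},
}
print(rarest_first({0, 1, 2, 3}, demo_peer_set))  # 2 or 3 (one copy each)
print(rarest_first({0, 1}, demo_peer_set))        # 1 (two copies vs. three)
```

The counts are local to the peer set, which reflects the rationale above: no global view of the torrent is required.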

 Improves the entropy of the pieces
◦ Peer selection is not biased
◦ Better survivability of the torrent
 Even without a seed the torrent is not dead
 Increases the speed at which the initial seed
delivers a first copy of the content
◦ The seed can leave early without killing the torrent

 When all blocks are either received or have
pending requests, request all not-yet-received
blocks from all peers. Cancel the pending requests
for a block as soon as it is received.
 Rationale
◦ Prevent the termination idle time
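
The rule above can be sketched as two small helpers. This is a simplified illustration: the names are hypothetical, `pending` maps each block to the set of peers already asked for it, and every peer is assumed to hold every missing block.

```python
def end_game_requests(missing_blocks, pending, peers):
    """Return the extra (peer, block) requests to send once end game starts."""
    # End game starts only when every missing block already has a pending request.
    if any(block not in pending for block in missing_blocks):
        return []
    return [(peer, block)
            for block in sorted(missing_blocks)
            for peer in sorted(peers)
            if peer not in pending[block]]

def on_block_received(block, sender, pending):
    """Return the peers that should get a cancel for this block."""
    return sorted(pending.pop(block, set()) - {sender})
```

Duplicated requests cost some redundant traffic, which the cancel messages keep small.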

 Reduces the termination idle time
 No major impact at the scale of a download
 Does not solve the last pieces problem
◦ An overloaded peer remains overloaded


 Choke algorithm
◦ Leecher state
◦ Seed state

 The algorithm is called every 10 seconds
 In addition, the algorithm is called
◦ Each time an unchoked and interested peer leaves the
peer set
◦ Each time an unchoked peer becomes interested or not
interested
 This shortens the reaction time
 Every 3 rounds (30 seconds)
◦ An interested and choked peer is selected at random
 Planned optimistic unchoke
◦ It will be unchoked later in the algorithm
◦ The peer selected by the optimistic unchoke stays
unchoked for 3 rounds, increasing its chances of
getting a complete piece to upload

 Each time the algorithm is called
◦ Order the peers that are interested and not snubbed
according to the rate at which they upload to the local
peer (this maximizes the local peer's download rate)
 Snubbed
 Did not send a block in the last 30 seconds
 Favor peers that have contributed recently
◦ Unchoke the 3 fastest peers
◦ If the planned optimistic unchoke is not part of the
3 fastest it is unchoked

 Each time the algorithm is called
◦ If the planned optimistic unchoke is one of the 3
fastest
 Choose another peer at random
 New planned optimistic unchoke
 If this new planned optimistic unchoke is interested,
unchoke it, DONE
 If this new planned optimistic unchoke is not
interested, unchoke it and choose another planned
optimistic unchoke at random, loop again

 At most 4 peers can be interested and
unchoked
 But, more than 4 peers can be unchoked
◦ In case an unchoked peer becomes interested, the
choke algorithm is called immediately
◦ Improve the reactivity in case there are few
interested peers
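
One round of the leecher-state choke algorithm described above can be sketched as follows. This is a simplified illustration, not a client implementation: the `Peer` class, its field names and the function name are hypothetical, and corner cases (for example a planned optimistic unchoke that stops being interested) are ignored.

```python
import random
from dataclasses import dataclass

@dataclass(eq=False)
class Peer:
    name: str
    rate: float            # rate at which this peer uploads to the local peer
    interested: bool
    snubbed: bool = False
    choked: bool = True

def leecher_choke_round(peers, planned_ou, round_no, rng=random):
    """Run one round; return (names of unchoked peers, planned OU)."""
    if round_no % 3 == 0:                      # every 3 rounds (30 seconds)
        pool = [p for p in peers if p.interested and p.choked]
        if pool:
            planned_ou = rng.choice(pool)      # new planned optimistic unchoke
    ranked = sorted((p for p in peers if p.interested and not p.snubbed),
                    key=lambda p: p.rate, reverse=True)
    unchoked = ranked[:3]                      # the 3 fastest peers
    if planned_ou is not None and planned_ou not in unchoked:
        unchoked.append(planned_ou)
    else:
        # The planned OU is already among the 3 fastest: draw other peers
        # at random until an interested one has been unchoked.
        others = [p for p in peers if p not in unchoked]
        rng.shuffle(others)
        for p in others:
            unchoked.append(p)
            planned_ou = p
            if p.interested:
                break
    for p in peers:
        p.choked = p not in unchoked
    return {p.name for p in unchoked}, planned_ou
```

In this sketch at most 4 interested peers end up unchoked, while peers that are not interested may be unchoked in addition, which matches the two statements above.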

 Peers unchoked and interested less than 20
seconds ago or that have pending requests for
blocks are ordered according to the time they were
last unchoked, most recently unchoked peers first
◦ Peers should be active or recent
 Upload rate discriminates among peers with the
same unchoke time
[Figure: peers grouped by last unchoked time and ordered by upload rate within each group (T1: U1 U2 U3; T2: U1 U2; T3: U1 U2 U3). Last unchoked time: T1<T2<T3. Upload rates: U1>U2>U3.]
 All other unchoked and interested peers are
ordered according to their upload rate and get the
lowest priority (they come after the peers above)

[Figure: peers grouped by last unchoked time (T1: U1 U2 U3; T2: U1 U2; T3: U1 U2 U3), followed by a group ordered by upload rate only (U1 U2 U3 U4). Last unchoked time: T1<T2<T3. Upload rates: U1>U2>U3>U4.]
 For two rounds out of three, the first three peers
are kept unchoked, and an additional choked and
interested peer is unchoked at random
 For the third round, the first four peers are kept
unchoked
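
The seed-state ordering and the three-round cycle can be sketched as follows. The class, field and function names are hypothetical, timestamps are plain seconds, and the sketch follows the text ("most recently unchoked peers first") when ordering the recent group.

```python
import random
from dataclasses import dataclass

@dataclass(eq=False)
class SeedPeer:
    name: str
    upload_rate: float        # rate at which the seed uploads to this peer
    last_unchoked: float      # time (seconds) at which it was last unchoked
    interested: bool = True
    choked: bool = False
    pending_requests: bool = False

def seed_choke_round(peers, now, round_no, rng=random):
    """Run one seed-state round; return the names of the peers kept unchoked."""
    active = [p for p in peers if not p.choked and p.interested]
    recent = [p for p in active
              if now - p.last_unchoked < 20 or p.pending_requests]
    rest = [p for p in active if p not in recent]
    # Recent or active peers: most recently unchoked first, upload rate breaks ties.
    recent.sort(key=lambda p: (-p.last_unchoked, -p.upload_rate))
    # All the others: upload rate only, lowest priority.
    rest.sort(key=lambda p: -p.upload_rate)
    ordered = recent + rest
    if round_no % 3 == 2:                 # third round: keep the first four
        keep = ordered[:4]
    else:                                 # two rounds out of three
        keep = ordered[:3]
        pool = [p for p in peers if p.choked and p.interested]
        if pool:
            newcomer = rng.choice(pool)   # random choked and interested peer
            newcomer.last_unchoked = now
            keep.append(newcomer)
    for p in peers:
        p.choked = p not in keep
    return [p.name for p in keep]
```

Because a freshly unchoked peer moves to the front of the order, every interested peer gets served in turn regardless of its own upload contribution.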

 Default behavior
◦ A maximum of 4 interested peers to unchoke in
parallel
 Depending on the implementations
◦ Increase this number according to the upload
capacity
 The rationale is that the higher your upload capacity, the
higher the number of parallel uploads
◦ Increase this number with a configuration
parameter

 No clear evaluation of the benefit of
increasing the number of parallel uploads
when the upload capacity is high
◦ May be beneficial if your upload capacity is larger
than 4 times the mean download capacity of the
peers
 For a mean maximum download speed of 1 Mbit/s, your
upload speed must be higher than 4 Mbit/s
 No study on the mean maximum download speed
◦ Studies on the number of parallel uploads do not
take into account this asymmetry


 File sharing (illegal?)
◦ Mostly linked to sharing of copyrighted content
◦ Laws against P2P file sharing
 Patch distribution (legal)
◦ The IT department at Holland University used the BitTorrent
protocol to push 22TB of patches (update to
Windows Vista) to 6500 PCs in four hours.
 Data replication/distribution (legal)
◦ Twitter uses BitTorrent to deploy files across its many
servers in a more efficient way.
◦ Facebook pushes hundreds of megabytes of new code to
all servers worldwide in just a minute.
◦ New releases of Mandrake Linux


 Is BT a fair protocol? Can free riders take
advantage of BT?
 Why do ISPs dislike BT traffic?
 Why can we find so much copyrighted content
on BT? Who uploads it? What are the reasons?
 Is BT an altruistic network?
