
Csc523: Analysis of The P2P Bittorrent Protocol: Abram Hindle 0020755 April 16, 2004

Analysis of the P2P BitTorrent protocol by abram hindle. Problems examined consist of iterated prisoners dilemma, various P2P questions, byzantine generals problem, and hashing.


CSC523: Analysis of the P2P BitTorrent Protocol

Abram Hindle 0020755

April 16, 2004


1 Introduction

In this paper we look closely at the BitTorrent P2P protocol. We extract problems that have already been studied from the protocol and discuss those problems. Some problems we subject to further analysis while creating new ones to solve.
BitTorrent's range of problems crosses many domains, from computer networks to sociology, economics and computer security. BitTorrent covers a wide gamut of problems and thus many will be discussed. There is also a literature review of the general state of P2P in relation to BitTorrent.
Problems examined consist of the Tit For Tat strategy in the Iterated Prisoner's Dilemma, various P2P questions, the Byzantine Generals Problem, and hashing.

1.1 What is BitTorrent

BitTorrent is a P2P protocol meant for distributing files. The purpose behind BitTorrent is to reduce the bandwidth load for the peer (the seeder) initially sharing the file. Peers who download from the seeding peer join the network and share their blocks of the file with other clients. Each file is split into blocks so a seeder can distribute blocks among the downloading peers such that peers can download the blocks off other peers. Thus a downloader effectively becomes an uploader. As more peers join they connect to the downloading peers and trade file blocks with them.
To promote sharing of bandwidth a Tit For Tat algorithm is implemented on each peer. This means that a peer must send data to another peer if it expects the other peer to send data back. Thus to successfully download from a BitTorrent network one has to allocate some upstream bandwidth to the network or otherwise suffer very slow transfers.
BitTorrent is interesting as it is currently used by many users to distribute large files. Originally used to distribute legal high quality bootleg recordings of live concerts, BitTorrent is now very popular with those who trade television, movies, arcade games, comic books and music illegally. BitTorrent is also used heavily in the Linux community to distribute large files such as CD images. The popularity of BitTorrent is likely due to the control the seed has over the network as well as the ability for a seed to distribute a file's whole size only once. A seeder can post a torrent file on their webserver and say "If you want this file jump on to this torrent". A user downloads the torrent and BitTorrent downloads the file. The focus of effort and the lack of "always-on" P2P sharing software make it especially useful in small communities like those based on messageboards. There are many communities which trade television shows and comics through semi-private webboards using BitTorrent. Due to BitTorrent's tit for tat bandwidth sharing policies, often the seeder can send out a file once and have it distributed to many due to the P2P network made around that file. Thus, effectively, once the file has been uploaded once, its persistence within the network depends on how long other hosts stay on the network.

2 Terms

Torrent - A file which provides a URL to the tracker and also contains a list of SHA1 hashes for the data being transferred. The hashes in the Torrent are used to verify whether the blocks received are valid or not.
Tracker - A middleman who informs the peers of all the other peers in the network.
Peer - A client to the network dedicated to a torrent.
Seeder - A Peer who has all the blocks in a torrent.
Choked - A connection is choked if no file data is passed through it. Control data may flow but the transmission of actual blocks will not.
Interest - Indicates whether a peer has blocks which other peers want.
Snubbed - A peer acting poorly (not uploading, or sending bad control messages), usually disconnected or ignored.

3 Literature Review

I read various papers on issues relating to P2P. Some related to BitTorrent, while others related slightly to analysis using techniques such as Chernoff bounds or the iterated prisoner's dilemma. Here are reviews of some of them.
"Analyzing peer-to-peer traffic across large networks" by Sen and Wang [10] discussed how P2P traffic actually looked on a large network. They analyzed the network traffic of an ISP (probably an AT&T owned ISP) and drew conclusions from the results. They took statistics about packet sources and destinations and reasoned that many of the P2P networks available today can be taken down by the removal of a few important nodes. They were also able to notice the difference between P2P users on networks that used supernodes and those on networks which didn't. They observed the phenomenon of less than 10% of the hosts being responsible for more than 90% of the traffic and content on the file based P2P networks.
Some of my research went into how agents act. "Emotional Pathfinding" [11] by Donaldson, Park and Lin described agents who prioritize goals using emotions. Emotions were abstracted away to the idea of dominant priorities that can increase or decrease in importance and override other goals. This is to avoid certain erratic and fast acting

behaviors which can be detrimental to the agent or the system. This paper provided a good basis for what a computer scientist means by the term "emotional". It's almost like smoothing the transitions between priorities. "A Framework for the simulation of Agents with Emotions" [3] by Bazzan and Bordini shared the same idea of emotions from the previous paper and applied it to the prisoner's dilemma. Unfortunately they only simulated the results. In their trials of the iterated prisoner's dilemma, the emotional agents usually defeated the always defect and always cooperate agents. The emotional agents were not tested against tit for tat agents because the researchers thought that defeated the purpose and the results would have been predictable. Tit for tat likely would have been dominant against an emotional agent. The emotional agents are relevant to BitTorrent as BitTorrent acts similarly to an optimistic tit for tat algorithm.
"Notions of reputation in multi-agents systems: a review" by Mui, Mohtashemi, and Halberstadt [8] discussed the state of reputation in agent based systems. Discussed were different ways of measuring, observing and inferring reputation. Examples of real world systems were given, such as eBay buyer/seller feedback. They then tested many of these reputation systems in the Iterated Prisoner's Dilemma. The conclusion seemed to be that reputation based on what a trusted group of peers had already concluded about another peer resulted in better performing agents in the Iterated Prisoner's Dilemma.
"Towards a Pareto-optimal solution in general-sum games" by Sen, Airiau and Mukherjee [9] discusses techniques by which agents can learn not necessarily to cooperate but to optimize their strategy in a Markov game (like the iterated prisoner's dilemma) in order to achieve Pareto-optimal solutions that beat Nash Equilibrium solutions. Concepts such as desired states versus greedy states are brought up. A desired state is one where both agents made a choice which results in the greatest payoff for both of them together. A greedy state is one which both agents choose because it gives them safe but high payoffs. Learning was also used to determine whether, given the opponent, it was feasible to use a greedy or desirable strategy. This is related to BitTorrent because BitTorrent often takes optimistic approaches to game theory in the hopes of gaining faster, more cooperative peers.
"The Byzantine Generals Problem" [5] by Lamport, Shostak and Pease introduced the Byzantine Generals Problem to computer science. The problem was proposed and analyzed in the paper. It is about how a group of traitors can confuse messages and cause miscommunication, disagreement, or agreement based on their bias. These traitors could be working together to subvert the integrity of the final decision. The paper covers many different aspects of the problem, from communication network disruption to bad commanders.
"Scalable Byzantine Agreement" [6] by Lewis and Saia discussed an algorithm to solve Byzantine Agreement using a randomized algorithm. It applies randomness and probability to the Byzantine Agreement form of the Byzantine Generals Problem such that, using a limited number of messages sent for each of the peers (over a limited number of rounds), the problem of reaching a trustable agreement is solved. They make an interesting but valid claim that "In fact, for a p2p system to be scalable we generally require all resource costs per peer to be polylogarithmic in the number of peers".
"FARSITE: Federated, Available and Reliable Storage for an Incompletely Trusted Environment", by many authors at Microsoft Research [2], describes a distributed file system much like NFS except based around the idea that not all the peers who will store and keep information can be trusted. This decentralized storage technique is meant to take advantage of a large company's unused storage space found on its employees' computers. An office might have 40 computers each with 20 gigabytes of free space. Effectively there are 800 gigabytes of unused space that could be used in a FARSITE file system. FARSITE is interesting because it supposedly protects itself against byzantine faults. Unfortunately the paper doesn't really go into what the authors do to protect against them; it just makes reference to the name of the fault. This is what one would expect from a paper from an industry research group.

4 Problems

There are many problems surrounding P2P protocols and the BitTorrent protocol itself.

4.1 Block Distribution Problem

Block distribution is a concern for BitTorrent. Peers join and exit the network all the time. If a seeder uploads to a peer that leaves, many of those blocks are lost. Thus a host is often obligated to distribute blocks amongst peers such that when a peer leaves a large subset of blocks is not missing.
BitTorrent peers usually request random blocks initially, then request the rarest blocks from that point on.

4.1.1 Random Block

The connecting peer doesn't know much about the network it connects to. If it chooses the rarest block first, the peer could be slowed since this block could be rare due to a very slow host. If the peer requests a block the seeder has already sent (possibly the rarest block), the seeder is less likely to send that block to the peer as it has already been sent once.
Thus by requesting a random block first the peer has a better chance at receiving a full block from a good source


than asking for a rare block.
Random block is most useful when a client doesn't know the layout of the network well yet, although it seems somewhat strange. Wouldn't it be better to download the first block from the network, as it would be guaranteed to be there? A potential issue with using one static block is that no one on the network may want that block, so you rely on the optimistic parts of the tit for tat algorithm. By requesting a totally random block you slowly get a block which is likely to be rare to other peers, such that you can participate in the network effectively.

4.1.2 Rarest Block

Rarest blocks are requested by the peers. By rarest we mean the block that the fewest peers have. By requesting rarest blocks, peers try to keep all the blocks available to the network. This reduces the reliance on one peer to host one block and provides redundancy as the weakest parts are backed up first.
I will demonstrate how, using the rarest block first algorithm, we will still get all the blocks in the network. Let there be $n$ peers. Let there be a file of $k$ chunks. For each chunk assume at least 1 peer has that chunk. $c_m$ indicates how many peers have chunk $m$. The rarest block is the chunk $m$ that minimizes $c_m$. Everyone in the network has all blocks when $\min_m c_m = n$. Thus if a peer doesn't have a block and the block is available in the network, then that block will be the rarest or eventually become the rarest. Then the peer will get that block, and if $\min_m c_m = n$ then everyone has downloaded all the necessary blocks.
Given $n$ peers and 1 seeder, how many times does the seeder have to upload to send the whole file of $k$ parts to the peers, assuming each peer drops off once they have all the pieces?
So our upper bound is naively $n \times k$ parts uploaded.
Given or Assumed:
• Each peer is connected to all other peers.
• Each peer is connected to the seeder.
• Assume no preferences or priority to uploaders.
• Assume each turn every peer can upload 1 chunk to 1 peer and receive 0 or more chunks from another peer.
Let's assume each peer has 1 chunk and the seeder has all the chunks.
Let's assume a peer randomly chooses another peer to send to.
With chance $\left(1 - \frac{1}{n-1}\right)^{n-1}$, a peer is not chosen in a round.
What is the expected number of peers that chose any given peer?
The probability of being chosen by a given peer is $\frac{1}{n-1}$. $E[X_i] = (n-1) \times \frac{1}{n-1}$, thus it is expected that $E[X_i] = 1$, where $X_i$ is a random variable indicating how many peers chose peer $i$.
We want to know how many rounds pass before a peer has all the blocks.
Based on the coupon collector problem [7], if we are offered one random block per turn, the expected number of turns $E[X]$ needed to collect $k$ distinct blocks is on the order of $k \ln k$, up to a constant factor $c$. This would only be true if the distribution of the block being offered each turn was uniform.
Let's skip to turn $k$, where the seeder has uploaded all the blocks to the network.
Assuming a uniform distribution, $X_i$ is a r.v. which indicates how many blocks peer $i$ has. $E[X_i] = k/n$. Thus each peer has $k - k/n$ blocks left.
If we assume a system where everyone uploads randomly to everyone else, this is basically the coupon collector problem. Each round we get a block randomly from someone. Let's assume that our current collection of blocks is unique to us, thus the likelihood of getting one of the initial $E[X_i]$ blocks back is initially zero and slowly starts to grow toward a bound. So we'll ignore the duplicates. Taking the previous result from the coupon collector problem, we now apply it to our current situation, where $X$ is a r.v. indicating how many rounds pass before we have all the blocks if we're sent a block randomly each round: $E[X] = 2\,(k - k/n)\ln(k - k/n)$.
So the expected number of turns until everyone has their blocks should be $k + 2\,(k - k/n)\ln(k - k/n)$.
Let's compare our model's results with the real world. These results are from the empirical section 6.2; see there for any constraints on the experimental results. See figure 1 for the data and figure 2 for the graph of the data. The comparison between the experimental results and the model isn't quite valid for 2 reasons: in the experiment the peers drop out when they are finished, and the network had shared but limited bandwidth.

4.2 Game Theory

Game theory applies directly to BitTorrent, as aspects of game theory help determine fair strategies that promote bandwidth sharing and the distribution of load among peers.

4.2.1 Pareto Efficiency

A game is Pareto efficient if there is no way to make someone better off without making someone worse off [1]. In BitTorrent this is used to spur peers to look for better peers or at least be fair and communicate with many peers.

file chunks  experimental  n=4 log(10)  n=4 log(2)  n=4 log(exp(1))  n=5 log(10)  n=5 log(2)  n=5 log(exp(1))
       1.00         17.86         0.81        0.38             0.57         0.84        0.48             0.64
       2.00         22.05         2.53        3.75             3.22         2.65        4.17             3.50
       4.00         30.74         6.86       13.51            10.59         7.23       14.74            11.44
       8.00         48.17        17.34       39.02            29.50        18.32       42.28            31.76
      16.00         81.60        41.90      102.04            75.64        44.34      110.16            81.27
      32.00        154.20        98.25      252.08           184.55       104.10      271.52           198.02
      64.00        303.65       225.40      600.16           435.64       239.03      645.43           467.02
     128.00        639.06       508.60     1392.31          1004.35       539.71     1495.67          1076.00
     256.00       1309.33      1132.79     3168.63          2274.88      1202.72     3400.94          2435.91

Figure 1. Experimental Results Versus Model Results: File Chunks Versus Turns/Time
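
The model columns of Figure 1 appear to follow the curve named in the legend of Figure 2, x + 2(x − x/n) log(x − x/n), evaluated with logarithms of base 10, 2 and e. The following is a small Python sketch, under that assumption, which recomputes those columns. The function name model_turns and its parameters are illustrative only and do not come from the BitTorrent source or the test framework.

```python
import math

def model_turns(chunks, peers, base=math.e):
    """Model estimate of turns: x + 2*(x - x/n)*log_base(x - x/n)."""
    remaining = chunks - chunks / peers  # blocks a peer still needs on average
    return chunks + 2 * remaining * math.log(remaining, base)

if __name__ == "__main__":
    # One row per file size, columns ordered as in Figure 1:
    # n=4 with base 10, 2, e, then n=5 with base 10, 2, e.
    for chunks in (1, 2, 4, 8, 16, 32, 64, 128, 256):
        row = [model_turns(chunks, n, b) for n in (4, 5) for b in (10, 2, math.e)]
        print(f"{chunks:4d}", " ".join(f"{v:9.2f}" for v in row))
```

Running the sketch should print rows that can be compared directly against the model columns of Figure 1; the experimental column is measured data and is not reproduced by it.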

Figure 2. Experimental Results Versus Model Results: File Chunks Versus Turns/Time. Plot title: Number Of Chunks Vs Turns/Time. Plotted series: UL Turns (experimental) and the model curves x+2*(x-x/4)*log(x-x/4), x+2*(x-x/4)*log(x-x/4)/log(2) and x+2*(x-x/4)*log(x-x/4)/log(10).
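
The model curves rest on the coupon collector argument of section 4.1.2. As a rough check of that argument alone, here is a minimal Python sketch that simulates collecting k distinct blocks when one uniformly random block arrives per round, and compares the simulated mean with the exact expectation k(1 + 1/2 + ... + 1/k). The uniform arrival and the one-block-per-round rule are the simplifying assumptions of the model, not properties of a real swarm, and the function names are illustrative.

```python
import random

def rounds_to_collect(k, rng):
    """Draw uniformly random blocks until all k distinct blocks are held."""
    have = set()
    rounds = 0
    while len(have) < k:
        have.add(rng.randrange(k))
        rounds += 1
    return rounds

def expected_rounds(k):
    """Exact coupon collector expectation: k * (1 + 1/2 + ... + 1/k)."""
    return k * sum(1.0 / i for i in range(1, k + 1))

if __name__ == "__main__":
    rng = random.Random(0)   # fixed seed so runs are repeatable
    k, trials = 48, 2000     # 48 = 64 - 64/4, an illustrative choice
    mean = sum(rounds_to_collect(k, rng) for _ in range(trials)) / trials
    print(f"simulated mean {mean:.1f}, exact expectation {expected_rounds(k):.1f}")
```

The exact expectation grows like k ln k, which is why the model curves carry a term of that shape.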

    In computer science terms, seeking Pareto efficiency is a local optimization algorithm in which pairs of counterparties see if they can improve their lot together, and such algorithms tend to lead to global optima. Specifically, if two peers are both getting poor reciprocation for some of the upload they are providing, they can often start uploading to each other instead and both get a better download rate than they had before. [4]

Effectively BitTorrent is designed to promote the sharing of bandwidth in order to improve transfer rates between peers.

4.2.2 Tit For Tat

Tit For Tat is a strategy for uploading and downloading between peers. There are pessimistic and optimistic tit for tat algorithms.
Tit For Tat is a strategy used in game theory problems such as the prisoner's dilemma, where you take the strategy of your opponent. If they cooperate, you cooperate next turn. If they don't cooperate, you don't cooperate.
How would a random strategy fare against tit for tat? By fare we mean minimize the time it takes to download while minimizing the amount of data uploaded. A strategy that doesn't upload much but downloads a lot is a good strategy.
Let's create a small game.
Let's define the tit for tat strategy as follows: for the first round we always upload. Given a round $i$ where $i > 1$, let $u_{i-1} = 1$ if the opponent uploaded during round $i - 1$, or $0$ if it didn't. The tit for tat strategy uploads if $u_{i-1}$ was 1. Otherwise it doesn't upload.
How does tit for tat fare against a random opponent? We'll define a random opponent as an opponent who, given a probability $p$, uploads to us with probability $p$ each round.
Thus the expected number of times a random opponent uploads during an $n$ round game is $n \times p$. An optimistic tit for tat would result in an expected number of uploads of either $np + 1$ or $np$ during the game of $n$ rounds, depending on which of two events occurs:
• Event 1: the opponent doesn't upload at the start, resulting in +1 more uploads. Event 1 occurs with a probability of $1 - p$.
• Event 2: the opponent does upload at the start, resulting in 0 more uploads. Event 2 occurs with a probability of $p$.
Thus, where $X$ is the number of uploads tit for tat makes and $Y$ is the number of uploads the random opponent makes, $E[Y] = np$ and $E[X] = E[Y] + 1 \times (1 - p) + 0 \times p$, which is $np + 1 - p$. Thus $E[X] = 1 + (n - 1)p$.
This is a naive approach to game theory with BitTorrent.

4.3 Related Problems

4.3.1 Byzantine Generals Problem

The Byzantine Generals Problem is related to BitTorrent mostly as a warning against sabotage on the BitTorrent network. Sabotage could come from anyone from copyright holders to Internet vigilantes to hackers.
How does BitTorrent defend against colluding peers that seek to subvert the network? One area in BitTorrent where this could be used is in detecting whether a peer or a group of peers is lying about their upload / download statistics to the tracker. If everyone voted and agreed on what one client uploaded, that might work out quite well, especially if it only takes a polylogarithmic number of messages per peer, as suggested by Lewis and Saia [6].
In BitTorrent, if a peer detects invalid data from another peer, such as damaged data structures or improper field lengths, it automatically disconnects that peer. If a peer sends invalid data to another peer, this will be noticed as the SHA1 hash of that chunk will not match.

5 Hashing

SHA1 hashes are used by BitTorrent on chunks of the file. The size of the chunks ranges from 64KB to 1024KB. The chunks are always of size $2^n$ bytes, where $n$ is greater than or equal to 16.
SHA1 hashes are 160-bit, thus naively the likelihood of any two strings having a matching checksum is $2^{-160}$. As far as I can tell no one has found a SHA1 collision yet.
Thus if a peer's packets get modified or garbled, or its original file is not complete or is corrupted, the other peer will know. If a peer keeps sending bad data it will be snubbed and ignored.

6 Further Analysis

6.1 BitTorrent Code

Bram Cohen's code for BitTorrent is well written. It lacks commenting (each file has 2 lines of comments at the top). The language BitTorrent was written in is Python, a clean cut scripting language which has some nice array manipulation operators. Python is comparable to Perl as a scripting language.
The Choker is a rather interesting module. Every 30 seconds a round robin scheduler runs and checks for connections which are choked but are interested in participating. This scheduler rotates the connections array until the first choked but interested connection appears.
At least every 10 seconds the connections are re-prioritized using the rechoke method.
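
The round robin rotation described above can be sketched in a few lines of Python. This is only an illustration of the idea of rotating the connection list until a choked but interested connection comes first; the Connection class and the function name are hypothetical stand-ins and are not taken from the BitTorrent source.

```python
from dataclasses import dataclass

@dataclass
class Connection:
    """Hypothetical stand-in for the upload state of one peer connection."""
    name: str
    choked: bool
    interested: bool

def rotate_to_choked_interested(connections):
    """Rotate the list so the first choked-but-interested connection leads.

    Returns a new list; the order is unchanged if no connection qualifies.
    """
    for i, conn in enumerate(connections):
        if conn.choked and conn.interested:
            return connections[i:] + connections[:i]
    return list(connections)

if __name__ == "__main__":
    conns = [
        Connection("a", choked=False, interested=True),
        Connection("b", choked=True, interested=False),
        Connection("c", choked=True, interested=True),
        Connection("d", choked=False, interested=False),
    ]
    print([c.name for c in rotate_to_choked_interested(conns)])
```

Because later prioritization walks the list from the front, rotating it in this way gives a waiting connection an earlier chance at being unchoked.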

                         B doesn't upload to A    B uploads to A
A doesn't upload to B    (0,0)                    (1,0)
A uploads to B           (0,1)                    (1,1)

Table 1. In relation to A's actions (A receives, A sends)
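
The small game of section 4.2.2 can be simulated directly. The following Python sketch plays tit for tat (A) against a random opponent (B) who uploads with probability p each round, and compares the simulated number of uploads by A with 1 + (n − 1)p, the value that follows from "always upload first, then mirror the opponent's previous move". The function names and the chosen values of n and p are assumptions made for illustration.

```python
import random

def play(rounds, p, rng):
    """Tit for tat (A) against a random opponent (B) uploading with probability p.

    Returns (uploads by A, uploads by B).
    """
    a_uploads = b_uploads = 0
    b_prev = True  # makes A upload in the first round
    for _ in range(rounds):
        a_move = b_prev            # A copies B's previous move
        b_move = rng.random() < p  # B uploads at random
        a_uploads += a_move
        b_uploads += b_move
        b_prev = b_move
    return a_uploads, b_uploads

def expected_tft_uploads(rounds, p):
    """A always uploads first, then mirrors B: 1 + (rounds - 1) * p."""
    return 1 + (rounds - 1) * p

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    rounds, p, trials = 100, 0.3, 2000
    mean_a = sum(play(rounds, p, rng)[0] for _ in range(trials)) / trials
    print(f"simulated {mean_a:.2f}, expected {expected_tft_uploads(rounds, p):.2f}")
```

In terms of Table 1, each round contributes the pair (b_move, a_move) to A, so summing the two counters gives A's total received and sent chunks.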

The algorithm for rechoking is as follows:
• create a list called 'preferred' of connections that are not snubbed but are interested
• reverse sort 'preferred' by the upload rate (so the largest uploading connections appear first)
• cut the tail off the list to reduce its size to max_uploads
• for all connections
  – unchoke the connection if it is in the list 'preferred'
  – otherwise unchoke the connection if we have fewer unchoked and interested connections than min_uploads or if we haven't found an interested connection yet. If we find an interested connection we increment our upload count. Thus we will keep unchoking until a min_uploads number of connections have been unchoked.
  – otherwise choke the connection.
Lots of the BitTorrent code relies on randomization to provide fair and optimistic strategies. In the choker, when one adds a connection it is added randomly to the connection list (with a slight bias for the front of the list).
Let's take a look at a tiny sample of the code in figure 3.
This method is called when a connection is made and the connection is added to the Choker (which chokes and unchokes connections, as well as prioritizing them). It takes 3 arguments: the object itself, the connection and the priority $p$. If $p$ is not set, the connection is randomly placed inside the connection list associated with the object. Randrange chooses a value from the range $[-2, n + 1)$, where $n$ is the length of the connection list, thus the possible values are $-2, \ldots, n$. Using that value the connection is inserted into the list. In the fourth line we see that if the value for $p$ is less than 0 we use 0 instead. Thus $\frac{3}{n+3}$ is the probability that an element will be put at the head of the list, where $n$ was the size of the list originally, whereas for each other position it is $\frac{1}{n+3}$. Thus the head of the list is three times as likely as any other position.

Figure 5. File Size Vs Turns/Bytes Uploaded (series: Total Uploaded, UL Turns). Given files of size 1 to 256 megabytes, we see how much the seeder uploads to 4 other peers on a shared network.

6.2 Empirical

A test framework was created to enable the easy creation and execution of tests using BitTorrent. A network of 5 computers was set up and linked together using a 10/100 switch. The 10/100 switch caused difficulties as it allowed some hosts to access 10 times the bandwidth of other hosts. Instead a 10 MBit hub was substituted.
Given a little more time I would be able to do more effective testing, such as tests which don't require the peers to exit on finishing.
In figure 4 an example run of the network is given. The seeder is the first peer. The figure describes the upload and download rates in bytes per second of each peer in the network. Each peer is set to disconnect once it has the full file. These experiments take place on one Ethernet network, thus the bandwidth is shared. As more peers leave, they stop using the network bandwidth, thus the seeder uploads faster. A problem with the experiment is that the machines running the BitTorrent software are heterogeneous. They differ in just about every way other than software. The software is exactly the same except for the drivers that the kernel loads.
In figure 5 we graph the linear relation between how much the seeder uploads and how big the files are.
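
Returning to the rechoking steps listed in section 6.1, the following is a minimal Python sketch of one rechoke pass. It follows the steps as described in this paper rather than the BitTorrent source itself: the Conn class, its field names, the single rate field used for ranking, and the exact size the preferred list is cut to are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Conn:
    """Hypothetical connection record; field names are illustrative."""
    name: str
    rate: float            # measured transfer rate used for ranking
    interested: bool
    snubbed: bool = False
    choked: bool = True

def rechoke(connections, max_uploads, min_uploads):
    """Apply the rechoking steps to `connections` in place."""
    # Interested, non-snubbed connections, fastest first.
    preferred = [c for c in connections if c.interested and not c.snubbed]
    preferred.sort(key=lambda c: c.rate, reverse=True)
    # Cut the tail off the list (the cut size is an assumption).
    preferred_ids = {id(c) for c in preferred[:max_uploads]}
    count = len(preferred_ids)  # unchoked and interested so far
    hit = False                 # set once an extra interested connection is found
    for c in connections:
        if id(c) in preferred_ids:
            c.choked = False
        elif count < min_uploads or not hit:
            c.choked = False    # optimistic unchoke
            if c.interested:
                count += 1
                hit = True
        else:
            c.choked = True

if __name__ == "__main__":
    conns = [Conn("a", 10, True), Conn("b", 50, True), Conn("c", 30, True),
             Conn("d", 5, False), Conn("e", 1, True), Conn("f", 2, True)]
    rechoke(conns, max_uploads=2, min_uploads=3)
    print([c.name for c in conns if not c.choked])
```

In the example, the two fastest interested connections are preferred, and the connection at the front of the list receives the extra, optimistic unchoke. This is why the position a new connection is inserted at matters.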


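from random import randrange

def connection_made(self, connection, p=None):
    # With no explicit position, draw one from -2 .. len(self.connections).
    if p is None:
        p = randrange(-2, len(self.connections) + 1)
    # Negative draws clamp to 0, biasing new connections toward the head.
    self.connections.insert(max(p, 0), connection)
    self._rechoke()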

Figure 3. Python Code that assigns priority to new connections
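
The bias toward the head of the list can be checked by enumerating the possible draws of the expression in Figure 3. The sketch below is a minimal check, assuming only that randrange draws uniformly; the function name is illustrative. It computes the exact distribution of max(randrange(-2, n + 1), 0) for a list of size n and compares the head probability with a seeded empirical estimate.

```python
from collections import Counter
from fractions import Fraction
from random import Random

def insert_position_distribution(n):
    """Exact distribution of max(randrange(-2, n + 1), 0) for a list of size n."""
    draws = range(-2, n + 1)  # the n + 3 equally likely values
    counts = Counter(max(p, 0) for p in draws)
    return {pos: Fraction(c, len(draws)) for pos, c in counts.items()}

if __name__ == "__main__":
    n = 5
    dist = insert_position_distribution(n)
    print("head:", dist[0], "any other position:", dist[1])
    # Empirical estimate using the same expression as Figure 3.
    rng = Random(0)
    trials = 100_000
    head = sum(max(rng.randrange(-2, n + 1), 0) == 0 for _ in range(trials))
    print("empirical head frequency:", head / trials)
```

Three of the n + 3 draws (−2, −1 and 0) land on position 0, so the head receives probability 3/(n + 3) and each other position 1/(n + 3).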

Figure 4. Upload and download rates of peers from a seed (series: UL Rate 1; UL and DL Rates 2 through 5). In this case, the seed uploaded almost 4 times the file size in bytes to 4 peers.
Figure 6. File Size Vs Turns/Upload Ratio (series: UL Ratio To Size, UL Ratio Turns To Size). Given files of size 1 to 256 megabytes, we see how much the seeder uploads to 4 other peers on a shared network; this is the ratio of turns or uploaded megabytes to the original file size.

7 Conclusions

The models generated here are too naive to be really useful when modeling BitTorrent. More complex models would have greater difficulty in proving facts about the system. Although, as shown in section ??, even a naive model can successfully model the constraints of the real world. Perhaps a simple model is best as it leaves room for experimental error.
BitTorrent uses concepts from game theory and economics to promote fairness. The use of optimistic strategies enables connections to attempt to renegotiate and counterbalance the downward spiraling effects of the tit for tat algorithm.
Optimistic techniques used by BitTorrent seem to be useful strategies to promote file downloading. Bram Cohen, the author of BitTorrent, has suggested that BitTorrent was never meant to promote 1 to 1 upload download ratios; it was created to reduce the load of sharing large files.
In relation to randomized algorithms and the analysis of such systems, BitTorrent is quite interesting albeit quite complex. The major parts of BitTorrent related to randomized algorithms are piece picking, upload / download strategy (tit for tat), and hashing. The game theory aspects of BitTorrent relate closely to economics and statistics, which are quite related to probabilistic analysis.

8 Future Work

There needs to be further investigation into how a group of peers can collude to create unfair network conditions, such as downloading more than they upload, or even simply disrupting the network enough to disable the distribution of a file. As well, there needs to be research into how to detect, prevent and protect against this collusion.
An interesting problem to investigate is whether or not multiple peers on the same computer can download faster than one peer on one computer. Judging by the tit for tat algorithm I have reason to believe that this might be true, since there is a tendency to be optimistic and grant a client a bandwidth reprieve to see if they will share more. Thus, will this compounded optimism work in favor of the host with multiple peers versus the host with a single peer? From my personal study of this phenomenon I found that the peers on the same host will share amongst themselves very rapidly.
Given that the tit for tat choking algorithms use a 30 second window, is there a way to use this timing information to improve the upload download ratio in one's favor?
Often BitTorrent is used across the Internet, over many networks. BitTorrent tests should probably be done across a peer to peer network with explicit paths rather than a shared medium network like 10Mbit ethernet. This should simulate being on different networks and should avoid the problem of limiting 5 peers to a 1 MBit pipe.

9 What Did I Actually Do

• Comment and extract algorithms from BitTorrent. Attempt to understand parts of the program from its source, specifically the piecepicker and choker.
• Attempt to read a lot of papers in the area.
• Attempt to run empirical tests on BitTorrent using a network of computers and a modified BitTorrent client. Generate a testing framework. Generate the code necessary to collect, retrieve and analyze the data.
• Attempt at proving facts about naive models of BitTorrent. Attempt to verify those models against experimental data.
• Write this report.

References

[1] Pareto efficiency. 2004.
http://en.wikipedia.org/wiki/Pareto_efficiency.
[2] A. Adya, W. Bolosky, M. Castro, R. Chaiken, G. Cermak, J. Douceur, J. Howell, J. Lorch, M. Theimer, and
R. Wattenhofer. Farsite: Federated, available, and reliable storage for an incompletely trusted environment,
2002.
[3] A. L. C. Bazzan and R. H. Bordini. A framework
for the simulation of agents with emotions. In Proceedings of the fifth international conference on Autonomous agents, pages 292–299. ACM Press, 2001.

[4] B. Cohen. Incentives build robustness in bittorrent.
May 2003.
[5] L. Lamport, R. Shostak, and M. Pease. The byzantine
generals problem. ACM Trans. Program. Lang. Syst.,
4(3):382–401, 1982.
[6] C. S. Lewis and J. Saia. Scalable byzantine agreement,
2004.
[7] M. Mitzenmacher and E. Upfal. Probabilistic Analysis
and Randomized Algorithms: A First Course. Brown
University, 2003.
[8] L. Mui, M. Mohtashemi, and A. Halberstadt. Notions
of reputation in multi-agents systems: a review. In
Proceedings of the first international joint conference
on Autonomous agents and multiagent systems, pages
280–287. ACM Press, 2002.

[9] S. Sen, S. Airiau, and R. Mukherjee. Towards a
pareto-optimal solution in general-sum games. In Proceedings of the second international joint conference
on Autonomous agents and multiagent systems, pages
153–160. ACM Press, 2003.
[10] S. Sen and J. Wang. Analyzing peer-to-peer traffic
across large networks. In Proceedings of the second
ACM SIGCOMM Workshop on Internet measurment,
pages 137–150. ACM Press, 2002.
[11] A. P. Toby Donaldson and I.-L. Lin. Emotional
pathfinding, 2004.

10 Appendix

Figures 7, 8, 9, 10, 11, 12, 13, 14 and 15 depict BitTorrent networks working on different file sizes. Due to a bug in the clock, the seeder (the red line) starts earlier but on the graph it appears later. I need to use a synchronized clock in future experiments.

Each plot shows the upload rate of peer 1 (UL Rate 1) and the upload and download rates of peers 2 through 5 (UL Rate 2 to 5, DL Rate 2 to 5).

Figure 7. BitTorrent Network of File of 1MB
Figure 8. BitTorrent Network of File of 2MB
Figure 9. BitTorrent Network of File of 4MB
Figure 10. BitTorrent Network of File of 8MB
Figure 11. BitTorrent Network of File of 16MB
Figure 12. BitTorrent Network of File of 32MB
Figure 13. BitTorrent Network of File of 64MB
Figure 14. BitTorrent Network of File of 128MB
Figure 15. BitTorrent Network of File of 256MB
