Online Course- Introduction to Computer Networking
March 5, 2015
Dear Student,
Before you start the course, we ask that you complete a short pre-course survey, so we can get to
know a bit about your background.
This course is intended for students who have some understanding of how a computer lays out its
memory (bytes, words, hexadecimal notation), know some basic probability (conditional probabilities),
and know what an integral of a function is (the ones we'll deal with are trivial and can be
considered geometrically). The most mathematically heavy part of the course is in unit 3, when
we explain and explore how a switch behaves as packets flow through it.
This class is almost identical to the course we teach at Stanford, with one exception: it does not
have any programming assignments. Our plan is to release separate courses with programming
assignments in the future.
The course consists of 8 units. Each unit includes a few quiz questions to test that you understand
the material. There are two exams, a mid-term and a final. We expect each unit will take you 8 to
10 hours of work. To obtain a Statement of Accomplishment for the course, you need to earn a
grade of 60% or higher. 30% of your grade will be from quiz questions that are part of the videos
and the other 70% will be from two exams (35% each).
We encourage you to review our Terms of Service and Privacy Policy before beginning the
course.
2. Phil: Welcome to the first unit! This is the unit where you get the “big picture,” and a first
few details.
3. Nick: You’re going to learn the basics of how the Internet works. You might even figure
out which one of us is Phil and which one is Nick. We hope to help you to
understand *why* the Internet is designed the way it is. What are some of its strengths
and weaknesses? We’ll also teach you some of the commonly accepted network design
principles, such as layering, encapsulation, and packet switching. At the end of this unit,
you should be able to answer questions such as “What is the Internet? What is an Internet
Address?” and “How do applications such as the web, Skype, and BitTorrent work?”
These principles will help you design better networks in the future.
4. Phil: At the end of the first unit, you should be familiar with something called the “4-
layer model” of the Internet. It describes how the Internet is broken down into four
distinct layers. You’ll learn what layers are and why they’re a basic principle of good
network design. You’ll learn what the Internet’s four layers are and how they work
together. You’ll learn that most applications use a transport layer called the Transmission
Control Protocol, or TCP, and how some applications use it. You’ll also learn that the
Internet works by breaking data up into small units called packets. For example, when
you request a web page, your computer sends some packets to the web server. The
Internet decides how these packets of data get to the right destination.
5. Nick: This unit also examines one layer, called the network layer, in a bit of detail. You
might have heard of IP, the Internet Protocol. It’s the protocol named after the Internet
because it’s the glue that lets the whole thing work. You can change all of the other
layers, but to be using the Internet you need to be using the Internet Protocol at the
network layer. You’ll learn about what the Internet Protocol does and how it does it.
You’ll learn about Internet Protocol addresses and how they’re assigned. You’ll start to
learn how the Internet decides the path a packet should take based on Internet addresses.
6. Phil: Finally, we’ll show you a few software tools you can use to inspect how your
computer is using the Internet. That way, you can apply what you’ve learned in this unit the
next time you browse the web!
What ultimately makes networks interesting are the applications that use them. Dave Clark,
one of the key contributors to the Internet’s design, once wrote “The current exponential
growth of the network seems to show that connectivity is its own reward, and it is more valuable
than any individual application such as mail or the World-Wide Web.” Connectivity is
the idea that two computers in different parts of the world can connect to one another and
exchange data. If you connect your computer to the Internet, you suddenly can talk with
all of the other computers connected on the Internet. Well, at least the ones that want
to talk with you too. Let’s look at what exactly that means and how some modern applications
use it.
The tremendous power of networked applications is that you can have multiple computers, each
with their own private data, each perhaps owned and controlled by different people,
exchange information. Unlike your local applications, which can only access data that resides on
your local system, networked applications can exchange data across the world. For example,
think of using a web browser to read a magazine. The server run by the publisher has all of
the magazine articles, and might also have all of the articles from past issues. As articles
are corrected or added, you can immediately see the newer versions and newer content.
The entire back catalog of articles might be too much for you to download, so you can
load them on demand. If you didn’t have a network, then you’d need someone to send
So the basic model is that you have two computers, each running a program locally, and these
two programs communicate over the network. The most common communication model used is
a reliable, bidirectional byte stream: program A running on computer A can write
data, which goes over the network, so that program B running on computer B can read
it. Similarly, program B can write data that program A can read. There are other modes
of communication, which we’ll talk about later in the course, but a reliable, bidirectional
byte stream is the most common.
Let’s walk through what this looks like. Computer B, on the right, is waiting for other
computers to connect to it. It might be, for example, a web server. Computer A, on the
left, wants to communicate with B. Following this example, it’s a mobile phone running
a web browser. A and B set up a connection. Now, when A writes data to the connection,
it travels over the network and B can read it. Similarly, if B writes data to the connection,
that data travels over the network and A can read it. Either side can close the connection.
For example, when the web browser is done requesting data from the web server, it can
close the connection. Similarly, if the server wants to, it can close the connection as well.
If you’ve ever seen an error message in a web browser saying “connection reset by
peer,” that’s what this means: the web server closed the connection when the web
browser wasn’t expecting it. Of course the server can refuse the connection as well:
you’ve probably seen connection refused messages, or had a browser wait for a long time.
Later in this course, you’ll learn all of the details of how this works under the covers;
for now, let’s just think about it from the application perspective, which is the
ability to reliably read and write data between two programs over a network.
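To make the read/write model concrete, here is a minimal sketch in Python. It is not from the lecture: both "computers" are really just two threads on one machine talking over the loopback address, and the names run_server, read_all, and exchange are made up for this illustration. Computer B waits for a connection, computer A connects and writes some data, B reads it and writes a reply, and then each side closes the connection.

```python
import socket
import threading


def read_all(conn: socket.socket) -> bytes:
    """Read from the connection until the other side stops writing."""
    chunks = []
    while True:
        chunk = conn.recv(1024)
        if not chunk:  # an empty read means the peer closed its side
            return b"".join(chunks)
        chunks.append(chunk)


def run_server(listener: socket.socket) -> None:
    """Computer B: wait for one connection, read data, write a reply."""
    conn, _addr = listener.accept()
    with conn:  # leaving this block closes B's side of the connection
        request = read_all(conn)
        conn.sendall(b"B got: " + request)


def exchange(message: bytes) -> bytes:
    """Computer A: open a connection to B, write a message, read the reply."""
    # Assumption: port 0 asks the operating system for any free local port.
    with socket.create_server(("127.0.0.1", 0)) as listener:
        port = listener.getsockname()[1]
        server = threading.Thread(target=run_server, args=(listener,), daemon=True)
        server.start()
        with socket.create_connection(("127.0.0.1", port)) as conn:
            conn.sendall(message)          # A writes, B can read
            conn.shutdown(socket.SHUT_WR)  # A is done writing
            reply = read_all(conn)         # B writes, A can read
        server.join()
    return reply


print(exchange(b"hello"))
```

Notice that neither function knows anything about how the bytes get from one side to the other. That is the application's view of the network.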
Now that we’ve seen the basic way networked applications communicate, let’s look at
our first example: the world wide web. The world wide web works using something called
HTTP, which stands for the HyperText Transfer Protocol. When you see http:// in your browser,
that means it’s communicating using HTTP. We’ll dig much deeper into the details of
HTTP later in the course, when we cover applications. For now I’m just going to give a very
high-level overview.
In HTTP, a client opens a connection to a server and sends commands to it. The most
common command is GET, which requests a page. HTTP was designed to be a document-centric
protocol. For example, to load the main Stanford page,
the browser opens a connection to the server www.stanford.edu and sends a GET request for
the root page of the site. The server receives the request, checks if it’s valid and the
user can access that page, and sends a response. The response has a numeric code associated
with it. For example, if the server sends a 200 OK response to a GET, this means that
the request was accepted and the rest of the response has the document data. In the example
of the www.stanford.edu web page, a 200 OK response would include the HyperText that
describes the main Stanford page. There are other kinds of requests, such as PUT, DELETE,
Because HTTP is document-centric, client requests name a file. HTTP is all in ASCII
text: it’s human readable. For example, the beginning of a GET request for Stanford
looks like this: GET / HTTP/1.1. The beginning of a response to a successful request looks
like this: HTTP/1.1 200 OK.
But the basic model is simple: client sends a request by writing to the connection, the
server reads the request, processes it, and writes a response to the connection, which
the client then reads.
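Because HTTP is plain ASCII text, you can build a request and pick apart a response with ordinary string handling. The sketch below is only an illustration: the function names build_get_request and parse_status_line are made up, and the Host and Connection header lines are details of HTTP/1.1 that go beyond what we have covered so far. It does not open a connection; it just shows the bytes a client would write and how the first line of a response breaks down.

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Return the bytes a client would write to the connection for a GET."""
    lines = [
        f"GET {path} HTTP/1.1",  # command, document name, protocol version
        f"Host: {host}",         # HTTP/1.1 requires the server name here
        "Connection: close",     # ask the server to close when it is done
        "",                      # a blank line ends the request
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(response: bytes) -> tuple[str, int, str]:
    """Split the first line of a response into version, code, and reason."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason


print(build_get_request("www.stanford.edu").decode("ascii"))
print(parse_status_line(b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n"))
```

The second call returns the numeric code as an integer, so a client can check for 200 before reading the document data.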
Let’s look at a second application, BitTorrent. BitTorrent is a program that allows people
to share and exchange large files. Unlike the web, where a client requests documents
from a server, in BitTorrent a client requests documents from other clients. So that a single
client can request from many others in parallel, BitTorrent breaks files up into chunks of
data called pieces. When a client downloads a complete piece from another client, it then
tells other clients it has that piece so they can download it too. These collections of
collaborating clients are called swarms. So we talk about a client joining or leaving
the swarm.
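Here is a rough sketch of the idea of pieces, written in Python. It is a toy model and not the real BitTorrent protocol: the piece size, the peer names, and the functions split_into_pieces and download are all invented for this example, and the "peers" are just dictionaries mapping a piece number to its bytes. The point is that a client can rebuild the whole file as long as some member of the swarm has each piece.

```python
def split_into_pieces(data: bytes, piece_size: int) -> list[bytes]:
    """Break a file into fixed-size pieces (the last one may be shorter)."""
    return [data[i:i + piece_size] for i in range(0, len(data), piece_size)]


def download(num_pieces: int, peers: dict[str, dict[int, bytes]]) -> bytes:
    """Rebuild a file by taking each piece from any peer that has it."""
    assembled = []
    for index in range(num_pieces):
        for have in peers.values():
            if index in have:
                assembled.append(have[index])
                break
        else:
            raise LookupError(f"no peer in the swarm has piece {index}")
    return b"".join(assembled)


data = b"abcdefghij"
pieces = split_into_pieces(data, piece_size=4)
# Hypothetical swarm: each peer holds only some of the pieces.
swarm = {
    "peer1": {0: pieces[0], 2: pieces[2]},
    "peer2": {1: pieces[1]},
}
print(pieces)
print(download(len(pieces), swarm) == data)
```

A real client would request pieces from many peers in parallel over separate connections; this sketch fetches them one at a time to keep the logic visible.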
BitTorrent uses the exact same mechanism as the world wide web: a reliable, bidirectional
data stream. But it uses it in a slightly more complex way. When a client wants to download
a file, it first has to find something called a torrent file. Usually, you find this using
the world wide web and download it using, you guessed it, HTTP. This torrent file describes
some information about the data file you want to download. It also tells BitTorrent about
who the tracker is for that torrent. A tracker is a node that keeps track (hence the name)
of which clients are members of the swarm. To join a torrent, your client contacts the
tracker, again, over HTTP, to request a list of other clients. Your client opens connections
to some of these clients and starts requesting pieces of the file. Those clients, in turn,
can request pieces. Furthermore, when a new client joins the swarm, the tracker might tell this
new client to connect to your client. So rather than a single connection between a client
and one server, you have a dense graph of connections between clients, dynamically exchanging
data.
For our third and final application, let’s look at Skype, the popular voice, chat, and
video service. Skype is a proprietary system. It doesn’t have any official documentation
on how it works internally. In 2008 some researchers at Columbia figured out most of how it works
by looking at where and when Skype clients send messages. The messages were encrypted,
though, so they couldn’t look inside. In 2011, however, Efim Bushmanov reverse engineered
the protocol and published open source code. So now we have a better sense of how the protocol
works.
In its simplest mode, when you want to call someone on Skype, it’s a simple client-server
exchange, sort of like HTTP. You, the caller, open a connection to the recipient. If the
recipient accepts your call, you start exchanging voice, video, or chat data.
In some ways this looks like the world wide web example: one side opens a connection to
the other and they exchange data. But unlike the web, where there’s a client and a server,
in the Skype case you have two clients. So rather than having a personal computer request
something from a dedicated server, you have two personal computers requesting data from
each other. This difference turns out to have a really big implication for how Skype works.
The complication comes from something called a NAT, or Network Address Translator. NATs
are everywhere today. A small home wireless router is a NAT. When a mobile phone connects
We’ll cover them in greater detail later in the course, but for now all you need to
know is that if you’re behind a NAT then you can open connections out to the Internet,
but other nodes on the Internet can’t easily open connections to you. In this example,
that means that client B can open connections to other nodes freely, but it’s very hard
for other nodes to open connections to it. That’s what this red-green gradient is showing; connections
coming from the green side work fine, but connections coming from the red side don’t.
So the complication here is that if client A wants to call client B, it can’t open
It does so using something called a rendezvous server. When you log into Skype, your client
opens connections to a network of control servers. In this case, client B opens a connection
to the rendezvous server. This works fine because the server isn’t behind a NAT and
When client A calls client B, it sends a message to the rendezvous server. Since the server
has an open connection to client B, it tells B that there’s a call request from A. The
call dialog pops up on client B. If client B accepts the call, then it opens a connection
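to client A. Client A was trying to open a connection to client B, but since B was behind
a NAT, it couldn’t. So instead it sends a message to the rendezvous server that client B is already
connected to, which then asks client B to open a connection back to client A. Since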
client A isn’t behind a NAT, this connection can open normally. This is called a reverse
connection because it reverses the expected direction for initiating the connection. Client
This happens in Skype because Skype clients are typically personal machines. It’s rare
for publicly accessible web servers to be behind NATs. Since you want the server to
be accessed by everyone on the Internet, putting it behind a NAT is a bad idea. Therefore,
opening connections to web servers is easy. Personal computers, however, are often behind
NATs, for security and other reasons. Therefore Skype has to incorporate some new communication
So what does Skype do if both clients are behind NATs? We can’t reverse the connection.
Client A can’t open a connection to client B and client B can’t open a connection to
client A.
To handle this case, Skype introduces a second kind of server, called a relay. Relays can’t
be behind NATs. If both client A and client B are behind NATs, then they communicate through
a relay. They both open connections to the relay. When client A sends data, the relay
forwards it to client B through the connection that B opened. Similarly, when client B sends
data, the relay forwards it to client A through the connection client A opened.
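The three cases we just walked through (direct connection, reverse connection, relay) boil down to a small decision. The Python sketch below is my own summary of that logic, not Skype's actual code: the Strategy enum and the choose_strategy function are made-up names, and it assumes the only thing that matters is whether each client is behind a NAT.

```python
from enum import Enum


class Strategy(Enum):
    DIRECT = "caller opens a connection to the recipient"
    REVERSE = "recipient opens a connection back to the caller"
    RELAY = "both clients open connections to a relay"


def choose_strategy(caller_behind_nat: bool, recipient_behind_nat: bool) -> Strategy:
    """Pick how two clients connect, given which of them sit behind a NAT."""
    if not recipient_behind_nat:
        # Anyone can open a connection to a node that is not behind a NAT.
        return Strategy.DIRECT
    if not caller_behind_nat:
        # The recipient can still open connections out, so reverse the direction.
        return Strategy.REVERSE
    # Neither side can accept connections, so both connect out to a relay.
    return Strategy.RELAY


for caller, recipient in [(False, False), (False, True), (True, True)]:
    print(caller, recipient, choose_strategy(caller, recipient).name)
```

In every case the clients only ever open connections outward, which is exactly what a NAT allows.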
In summary, we’ve seen the most common communication model of networked applications: a reliable,
bidirectional byte stream. This allows two programs running on different computers to
exchange data. It abstracts away the entire network to a simple read/write relationship.
Although it’s a very simple communication model, it can be used in very inventive and
complex ways. We looked at three examples: the world wide web, BitTorrent, and Skype. The
world wide web is a client-server model. A client opens a connection to a server and
requests documents. The server responds with the documents. BitTorrent is a peer-to-peer
model, where swarms of clients open connections to each other to exchange pieces of data,
creating a dense network of connections. Skype is a mix of the two. When Skype clients can
communicate directly, they do so in a peer-to-peer fashion. But sometimes the clients can’t
So you can see how what looks like a very simple abstraction, a bidirectional, reliable
data stream, can be used in many interesting ways. By changing how programs open connections
and what different programs do, we can create complex applications ranging from document
very different data and a very different role than the clients, just as Skype has relays
I’ve presented a very simple abstraction: “A and B set up a connection.” In order
must be able to somehow find another computer, for example, through a name such as
www.google.com.
An application must also be able to name the service it wants, such as a web page, a BitTorrent
chunk, or multimedia over Skype. The network somehow figures out the best path messages
should take across the world: messages sent to New York from Chicago should not pass through
London. Our applications can deliver data reliably, despite the fact that nodes in the
By providing all of these services and more, the network allows applications to just think
about this high-level abstraction and not worry about how it’s achieved. I find it
exciting and fascinating how a network like the Internet is really just a collection of
very small abstractions and services, which, somehow, when you combine them, become something
so much greater than the sum of its parts. If we take all of these little pieces and
Problem 1-1A
You own a large gaming company and want to distribute an update to all of your players. The
size of the update is 1GB and your server can send data at up to 1GB/s. Your engineers have
found that assuming every player can download at 1MB/s leads to very accurate estimates of
network performance. Furthermore, they've found you can assume that the server splits its
capacity evenly across clients. You have 100,000 players.
One of your engineers recommends that you distribute the update by having all users download
the full file from your company's server. You walk through this calculation with the engineer and
determine it will take 100,000 seconds (~28 hours) for all of your players to download the
update. The server can support 1,000 players downloading the update at once by splitting its
1GB/s across 1,000 1MB/s clients. It will take 100 rounds of 1,000 players for everyone (all
100,000) to receive the update. As it takes 1,000 seconds for a 1MB/s connection to download
1GB, each round will take 1,000 seconds. 100 rounds of 1,000 seconds is 100,000 seconds, or 28
hours.
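The first engineer's calculation can be checked with a few lines of Python. This is only a sketch of the arithmetic above: the function name client_server_seconds is made up, and it assumes decimal units (1GB = 1,000MB), which is what makes a 1GB download at 1MB/s take 1,000 seconds.

```python
import math

GB = 10**9  # assumption: decimal units, so 1GB is 1,000MB
MB = 10**6


def client_server_seconds(players, update_size=1 * GB,
                          server_rate=1 * GB, player_rate=1 * MB):
    """Time for every player to download the full update from one server."""
    concurrent = server_rate // player_rate         # players served at once
    rounds = math.ceil(players / concurrent)        # batches of players
    seconds_per_round = update_size // player_rate  # one full download
    return rounds * seconds_per_round


total = client_server_seconds(100_000)
print(total, "seconds, about", round(total / 3600), "hours")
```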
Another engineer recommends a different, peer-to-peer, strategy. In this strategy, players who
have downloaded the patch allow other players to download it from them. So in the first round,
1,000 players download the patch from the server. In the second round, 1,000 new players
download the patch from the server, and 1,000 new players pair with the first round
downloaders. In the third round, 1,000 new players download the patch from the server, and
3,000 new players download the patch from players who have already downloaded it.
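The round-by-round growth in this strategy can be modeled in Python as a small loop. This sketch rests on an assumption read off the three rounds described above: in each round the server serves 1,000 new players, and every player who already has the patch serves exactly one new player. The function name players_with_update is made up for illustration.

```python
def players_with_update(rounds: int, server_slots: int = 1_000) -> list[int]:
    """Cumulative number of players holding the patch after each round."""
    have = 0
    totals = []
    for _ in range(rounds):
        # New downloads this round: server_slots from the server,
        # plus one from every player who already has the patch.
        have += server_slots + have
        totals.append(have)
    return totals


print(players_with_update(3))
```

The three totals match the rounds spelled out in the problem; extending the number of rounds is left to you.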
Calculate how long it will take for the last player to receive the update using the second
strategy. How much faster is it than the first strategy?
It will be slower.
Twice as fast (~50000 seconds)
Four times as fast (~25000 seconds)
Eight times as fast (~12000 seconds)
Ten times as fast (~10000 seconds)
Fourteen times as fast (~7000 seconds)
Problem 1-1B
Problem 1-1C