Chapter 4 Network Layer
Silvia Giordano
ICA, EPFL
The transport layer relies on the services of the network layer, which provides
a communication service between hosts. In particular, the network layer moves
transport-layer segments from one host to another. At the sending host, the
transport-layer segment is passed to the network layer. It is then the job of the
network layer to get the segment to the destination host and pass the segment
up the protocol stack to the transport layer.
Network Layer
Chapter goals:
r understand principles behind network layer services:
m routing
m how a router works
m advanced topics: IPv6, multicast
r instantiation and implementation in the Internet
Overview:
r network layer services
r routing principles
r IP addresses
r Internet routing protocols
m intra-domain
m inter-domain
r ICMP
r Routers, bridges and switches
r IPv6
r multicast routing
Network layer functions
The network layer involves each and every host and router in the network. The
role of the network layer in a sending host is to begin the packet on its journey
to the receiving host. The role of the network layer is thus deceptively simple:
to transport packets from a sending host to a receiving host. To do so, three
important network-layer functions can be identified:
•Path determination. The network layer must determine the route or path taken
by packets as they flow from a sender to a receiver. The algorithms that
calculate these paths are referred to as routing algorithms.
•Switching. When a packet arrives at the input to a router, the router must
move it to the appropriate output link.
•Call setup. With TCP, a three-way handshake is required before data actually
flows from sender to receiver. This allows the sender and receiver to set up the
needed state information (for example, sequence number and initial flow-control
window size). In an analogous manner, some network-layer
architectures (for example, ATM) require that the routers along the chosen
path from source to destination handshake with each other in order to set up
state before data actually begins to flow. In the network layer, this process is
referred to as call setup. The network layer of the Internet architecture does
not perform any such call setup.
Network service model
Virtual circuits
“source-to-dest path behaves much like telephone
circuit”
m performance-wise
m network actions along source-to-dest path
r call setup, teardown for each call before data can flow
r each packet carries VC identifier (not destination host ID)
r every router on source-dest path maintains “state” for
each passing connection
m transport-layer connection only involved two end systems
r link, router resources (bandwidth, buffers) may be
allocated to VC
m to get circuit-like performance
Virtual circuits: signaling protocols
[Figure: signaling between two end systems (each with application, transport, network, data link and physical layers) across the network: 1. Initiate call, 2. Incoming call, 3. Accept call, 4. Call connected, 5. Data flow begins, 6. Receive data]
The messages that the end systems send to the network to indicate the
initiation or termination of a VC, and the messages passed between the
switches to set up the VC (that is, to modify switch tables), are known as
signaling messages, and the protocols used to exchange these messages are
often referred to as signaling protocols. The Internet does not use virtual
circuits at the network layer, whereas ATM, frame relay and X.25 are three
networking technologies that do.
Datagram networks: the Internet model
r no call setup at network layer
r routers: no state about end-to-end connections
m no network-level concept of “connection”
r packets typically routed using destination host ID
m packets between same source-dest pair may take
different paths
[Figure: two end systems (each with application, transport, network, data link and physical layers) exchanging datagrams without call setup: 1. Send data, 2. Receive data]
The Internet Network layer
Host, router network layer functions:
Link layer
physical layer
The pieces of the network layer of the Internet are often collectively referred to
as the IP layer (named after the Internet's IP protocol). We'll see, though, that
the IP protocol itself is just one piece (albeit a very important piece) of the
Internet's network layer. The Internet's network layer provides connectionless
datagram service rather than virtual-circuit service. When the network layer at
the sending host receives a segment from the transport layer, it encapsulates
the segment within an IP datagram, writes the destination host address as well
as other fields in the datagram, and sends the datagram to the first router on the
path toward the destination host. The Internet’s network layer has three major
components:
•The Internet Protocol, or more commonly, the IP Protocol, which defines
network-layer addressing, the fields in the datagram (that is, the network-layer
PDU), and the actions taken by routers and end systems on a datagram based
on the values in these fields. There are two versions of the IP protocol in use
today: IPv4 [RFC 791] and IPv6 [RFC 2373; RFC 2460], which has been
proposed to replace IPv4 in upcoming years.
•The path determination component, which determines the route a datagram
follows from source to destination. Examples of such components used in the
Internet are RIP, OSPF and BGP.
•The Internet's network-layer error and information reporting protocol, ICMP,
which is a facility to report errors in datagrams and respond to requests for
certain network-layer information.
Internet and intranet
r an intranet
a collection of end and intermediate systems
interconnected using the TCP/IP architecture
normally inside one organization
r the Internet
the global collection of all hosts and routers
interconnected using the TCP/IP architecture
coordinated allocation of addresses and implementation
requirements by the Internet Society
r intranets are often connected to the Internet by
firewalls
m hosts that act as application level relays
IP Addressing: introduction
r IP address: 32-bit 128.178.1.1
IP Addressing
r IP address: 128.178.1.1
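The dotted-decimal form is just a readable spelling of one 32-bit number. A minimal Python sketch of the conversion for the slide's address 128.178.1.1 follows; the helper name `dotted_to_int` is made up for this illustration, and the result is cross-checked against the standard `ipaddress` module.

```python
import ipaddress


def dotted_to_int(dotted: str) -> int:
    """Combine the four decimal bytes of a dotted-decimal address into one 32-bit integer."""
    value = 0
    for part in dotted.split("."):
        value = (value << 8) | int(part)
    return value


addr = "128.178.1.1"
as_int = dotted_to_int(addr)
as_bits = format(as_int, "032b")  # the same address written as 32 binary digits

# The standard library performs the same conversion.
same = as_int == int(ipaddress.IPv4Address(addr))
print(as_int, as_bits, same)
```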
Network Example with IP Addresses
[Figure: hosts and routers on the ETHZ backbone (129.132.x.x), Switch (130.59.x.x) and the EPFL backbone (128.178.x.x), including a modem + PPP access link and the DI, LEMA and LRC subnets, each labelled with IP addresses (for example lrcsuns 128.178.156.24, lrcpc3 128.178.156.7, lrcmac4 128.178.156.23) and MAC addresses]
IP Address Classes
Address layout (bit positions 0 to 31; field boundaries at bits 8, 16 and 24):
Class   Leading bits   Remaining fields
A       0              Net Id, Subnet Id, Host Id
B       10             Net Id, Subnet Id, Host Id
C       110            Net Id, Host Id
D       1110           Multicast address
E       11110          Reserved

Class   Range
A       0.0.0.0 to 127.255.255.255
B       128.0.0.0 to 191.255.255.255
C       192.0.0.0 to 223.255.255.255
D       224.0.0.0 to 239.255.255.255
E       240.0.0.0 to 247.255.255.255
Originally, the prefix of an IP address was defined in a very rigid way. For class A addresses,
the prefix was 8 bits. For class B, 16 bits. For class C, 24 bits. The advantage of that scheme was
that by simply analyzing the address you could find out what the prefix was.
The requirement that the network portion of an IP address be exactly one, two, or three bytes
long turned out to be problematic for supporting the rapidly growing number of organizations
with small and medium-sized networks. A class C (/24) network could only accommodate up to
2^8 - 2 = 254 hosts (two of the 2^8 = 256 addresses are reserved for special use), too small for many
organizations. However, a class B (/16) network, which supports up to 65,534 hosts, was too large.
Under classful addressing, an organization with, say, 2,000 hosts was typically allocated a class
B (/16) network address. This led to a rapid depletion of the class B address space and poor
utilization of the assigned address space. It was soon recognized that this form was too rigid.
Then subnets were added. It was no longer possible to recognize from the address alone where
the subnet prefix ends and where the host identifier starts. For example, the host part at EPFL is 8
bits; it is 6 bits at ETHZ. Therefore, an additional piece of information, the subnet mask, is
necessary.
Class C addresses were meant to be allocated one per network. Today they are allocated in
contiguous blocks.
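The classful rules and the host counts quoted above can be checked with a few lines of Python. This is only an illustrative sketch: the function names are invented here, and every first byte from 240 upward is lumped into class E for simplicity.

```python
def address_class(dotted: str) -> str:
    """Return the classful class (A to E) implied by the first byte of the address."""
    first = int(dotted.split(".")[0])
    if first < 128:   # leading bit 0
        return "A"
    if first < 192:   # leading bits 10
        return "B"
    if first < 224:   # leading bits 110
        return "C"
    if first < 240:   # leading bits 1110
        return "D"
    return "E"        # simplification: everything from 240 upward


def usable_hosts(prefix_len: int) -> int:
    """Host addresses left once the all-zeros and all-ones host ids are set aside."""
    return 2 ** (32 - prefix_len) - 2


print(address_class("128.178.1.1"), usable_hosts(24), usable_hosts(16))
```

For a class C (/24) network this gives 254 usable hosts and for a class B (/16) network 65,534, which is the gap that made classful allocation wasteful for an organization with about 2,000 hosts.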
CIDR: IP Address Hierarchies
r The prefix of an IP address is itself structured in order
to support aggregation
m For example: 128.178.x.y represents an EPFL host
128.178.156 / 24 represents the LRC subnet at EPFL
128.178 / 16 represents EPFL
m Used between routers by routing algorithms
m This approach is called classless and was first introduced in
inter domain routing under the name of CIDR (classless
interdomain routing)
r Notation: 128.178.0.0/16 means : the prefix made of the
16 first bits of the string
r It is equivalent to: 128.178.0.0 with netmask=255.255.0.0
r In the past, class-based addresses, with networks of
class A, B or C, were used; now only the distinction between
class D and non-class D is relevant.
With so-called CIDRized (CIDR: Classless Interdomain Routing) network addresses, the
network part of an IP address can be any number of bits long, rather than being constrained to 8,
16, or 24 bits. A CIDRized network address has the dotted-decimal form a.b.c.d/x, where x
indicates the number of leading bits in the 32-bit quantity that constitutes the network portion of
the address. In our example above, the organization needing to support 2,000 hosts could be
allocated a block of only 2,048 host addresses of the form a.b.c.d/21, allowing the approximately
63,000 addresses that would have been allocated and unused under classful addressing to be
allocated to a different organization. In this case, the first 21 bits specify the organization's
network address and are common in the IP addresses of all hosts in the organization. The
remaining 11 bits then identify the specific hosts in the organization. In practice, the organization
could further divide these 11 rightmost bits using a procedure known as subnetting to create its
own internal networks within the a.b.c.d/21 network.
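The arithmetic of the /21 example can be reproduced with Python's standard `ipaddress` module. The notes leave the block as a.b.c.d/21, so the concrete prefix 200.23.16.0/21 below is a hypothetical stand-in chosen only for illustration.

```python
import ipaddress

# Hypothetical block standing in for a.b.c.d/21.
block = ipaddress.ip_network("200.23.16.0/21")

total = block.num_addresses                      # 2 ** (32 - 21) addresses
netmask = str(block.netmask)                     # the /21 prefix written as a mask
saved = 2 ** 16 - total                          # addresses not tied up, compared with a /16
subnets = list(block.subnets(new_prefix=24))     # one possible internal subnetting into /24s

print(total, netmask, saved, len(subnets))
```

Splitting into /24 subnets is just one choice; the organization is free to divide its 11 host bits differently.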
IP Addresses (examples -1)
r subnet mask at ETHZ = 255.255.255.192
(that is 11111111.11111111.11111111.11000000)
r CIDR 129.132.0.0/26: net part = 26 bits, host part = 6 bits
r question: net:subnet and host parts of
spr13.tik.ee.ethz.ch = 129.132.119.77 ?
answer: (77=01001101)
net:subnet = 129.132.119.64 (64=01000000)
host = 13=001101 (6 bits)
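The same split can be computed with a bitwise AND against the subnet mask. The sketch below uses only the address and mask from the slide; the variable names are arbitrary.

```python
import ipaddress

addr = int(ipaddress.IPv4Address("129.132.119.77"))
mask = int(ipaddress.IPv4Address("255.255.255.192"))

prefix_len = bin(mask).count("1")                   # number of 1 bits in the mask
net_subnet = ipaddress.IPv4Address(addr & mask)     # keep the bits covered by the mask
host = addr & ~mask & 0xFFFFFFFF                    # keep the remaining (host) bits

print(prefix_len, net_subnet, host, format(host, "06b"))
```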
IP Addresses (examples -2)
[Figure: an Internet Service Provider connecting the customers Tango SA and SovKom, with the address blocks 194.167.41.0/24 and 194.167.42.0/23]
Special Case IP Addresses
1. 0.0.0.0 this host
2. 0.hostId specified host on this net
3. 255.255.255.255 limited broadcast (not forwarded by routers)
4. netId.all 1’s broadcast on this net
5. netId.subnetId.all 1’s broadcast on this subnet
6. 127.x.x.x loopback
7. 10/8, 172.16/12, 192.168/16 reserved networks for internal use
r Example:
128.178.255.255: broadcast to EPFL
128.178.156.255: broadcast to all LRC net
128.178.156.0: LRC net
129.132.119.64: tik-sprach
hostId = 0 designates the network
The following address blocks are reserved and cannot be used in the Internet. They are typically
used in experimental or closed environments:
10.0.0.0 - 10.255.255.255 (10/8)
172.16.0.0 - 172.31.255.255 (172.16/12)
192.168.0.0 - 192.168.255.255 (192.168/16)
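A short Python sketch, using the standard `ipaddress` module, shows how these special cases can be recognized. The helper `is_reserved_private` is a name invented for this example; it tests membership in the three blocks listed above and nothing else.

```python
import ipaddress

PRIVATE_BLOCKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_reserved_private(dotted: str) -> bool:
    """True if the address falls in one of the three blocks reserved for internal use."""
    address = ipaddress.ip_address(dotted)
    return any(address in block for block in PRIVATE_BLOCKS)


# hostId all 0s designates the network, hostId all 1s is the broadcast address.
lrc = ipaddress.ip_network("128.178.156.0/24")
epfl = ipaddress.ip_network("128.178.0.0/16")

print(is_reserved_private("172.31.255.255"), is_reserved_private("128.178.1.1"))
print(lrc.network_address, lrc.broadcast_address, epfl.broadcast_address)
print(ipaddress.ip_address("127.0.0.1").is_loopback)
```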
IP Principles
Homogeneous addressing
r an IP address is unique across the whole network (
= the world in general)
r IP address is the address of the interface
r communication between IP hosts requires
knowledge of IP addresses
Routing:
r inside a subnetwork: hosts communicate directly
without routers
r between subnetworks: one or several routers are
used
r a subnetwork = a collection of systems with a
common prefix 4: Network Layer 4-18
The IP Packet Forwarding Algorithm
r Rule for sending packets (hosts, routers)
m if the destination IP address has the same prefix as one of
self’s interfaces, send directly to that interface
m otherwise send to a router as given by the IP routing table
128.178.79.0 255.255.255.0
128.178.182.3 255.255.255.0
128.178.156.0 255.255.255.0 128.178.182.5
128.178.79.1 255.255.255.0
DEFAULT 128.178.182.1
The IP packet forwarding algorithm is the core of the TCP/IP architecture. It defines what a
system should do with a packet it has to send or to forward. The rule is simple:
- if the destination IP address has the same prefix as one of self’s interfaces, send directly to that
interface
- otherwise send to a router as given by the table
It uses the IP routing table; the table can be checked with a command such as “netstat” with Unix
or “Route” with Windows NT
IP Unicast Packet Forwarding Algorithm
Read destAddr = destination IP address /* assume it is unicast */
Case 1: a host route exists for destAddr
  for every entry in routing table
    if (destinationAddr = destAddr)
      then send to nextHop IP addr; leave
Case 2: destAddr is on a directly connected network (= on-link):
  for every physical interface IP address A and subnet mask sm
    if (A & sm = destAddr & sm)
      then send directly to destAddr; leave
Case 3: a network route exists for destAddr
  for every entry in routing table
    if (destinationAddr & subnetMask = destAddr & subnetMask)
      then send to nextHop IP addr; leave
Case 4: use default route
  for every entry in routing table
    if (destinationAddr = DEFAULT)
      then send to nextHop IP addr; leave
In reality there are exceptions to the rule. The complete algorithm is as above; the cases should
be tested in that order (it is a nested if then else statement).
Remember that the above is the packet forwarding algorithm. The tables are written by the
control method (the routing algorithms).
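The four cases can be sketched in Python as follows. This is an illustration under stated assumptions rather than a real router implementation: the function and parameter names are invented, the routing table is split into three hypothetical structures (host routes, network routes, default gateway), and network routes are matched in table order exactly as in the pseudocode, not by longest prefix. The sample values are those of host A in the example network (interface 128.178.1.1, router 128.178.1.4).

```python
import ipaddress


def ip(dotted: str) -> int:
    """Dotted-decimal string to 32-bit integer."""
    return int(ipaddress.IPv4Address(dotted))


def forward(dest, interfaces, host_routes, net_routes, default_gw):
    """Return ("direct", dest) or ("via", next_hop), testing the cases in order."""
    d = ip(dest)
    # Case 1: a host route exists for the destination.
    if dest in host_routes:
        return ("via", host_routes[dest])
    # Case 2: the destination is on a directly connected network (on-link).
    for addr, mask in interfaces:
        if (ip(addr) & ip(mask)) == (d & ip(mask)):
            return ("direct", dest)
    # Case 3: a network route exists (first match in table order).
    for net, mask, next_hop in net_routes:
        if (ip(net) & ip(mask)) == (d & ip(mask)):
            return ("via", next_hop)
    # Case 4: fall back to the default route, if any.
    if default_gw is not None:
        return ("via", default_gw)
    return ("unreachable", None)


# Host A of the example network.
a_interfaces = [("128.178.1.1", "255.255.255.0")]
a_net_routes = [
    ("128.178.2.0", "255.255.255.0", "128.178.1.4"),
    ("128.178.3.0", "255.255.255.0", "128.178.1.4"),
]

print(forward("128.178.1.3", a_interfaces, {}, a_net_routes, "128.178.1.4"))
print(forward("128.178.2.2", a_interfaces, {}, a_net_routes, "128.178.1.4"))
```

With these tables a datagram for 128.178.1.3 is delivered directly, while one for 128.178.2.2 is handed to the router 128.178.1.4.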
Getting a datagram from source to dest.
IP datagram: misc fields | source IP addr | dest IP addr | data
r datagram remains unchanged, as it travels source to destination
r addr fields of interest here
routing table in A
Dest. Net.    next router    Nhops
128.178.1                    1
128.178.2     128.178.1.4    2
128.178.3     128.178.1.4    2
default       128.178.1.4
[Figure: example network. Subnet 128.178.1: A (128.178.1.1), 128.178.1.2, B (128.178.1.3). Subnet 128.178.2: 128.178.2.1, E (128.178.2.2). Subnet 128.178.3: 128.178.3.1, 128.178.3.2. One router with interfaces 128.178.1.4, 128.178.2.9 and 128.178.3.27, plus a link to the Internet]
Every IP datagram has a source address field and a destination address field.
The source host fills a datagram's source address field with its own 32-bit IP
address. It fills the destination address field with the 32-bit IP address of the
final destination host to which the datagram is being sent. The data field of the
datagram is typically filled with a TCP or UDP segment. The addresses and data
of the IP datagram remain unchanged as it travels inside the network. For
routing purposes, the fields of main interest (i.e. the fields that are read and
used) are the two addresses: source and destination. The way the network
transports the datagram from the source to the destination depends on whether
the source and destination reside on the same subnetwork.
SOME INFO
Last week
Transport Layer: Logical communication between processes
transferred the apps data from S to D!
Reliable data transfer
r data received ordered & error-free
r Elements of procedure usually means the set of following functions
m Error detection and correction (e.g. ARQ)
m Flow Control
m Connection Management
Internet:
m connectionless transport: UDP
m checksum
m pkt transmission
m connection-oriented transport: TCP
m reliable service
m flow and congestion
Last Week
r the network service model defines edge-to-edge channel
r transport pkt from sending to receiving hosts
r network layer protocols in every host, router
r the most important abstraction provided by network layer:
m network-layer connection-oriented service: virtual circuit
m network-layer connectionless service: datagram
r the network layer functions:
m path determination
m switching
m call setup
r IP addressing:
m network & host part
m classes and CIDR
r IP principles:
m homogeneous addressing
m routing
m routing to the same subnet
m routing to another subnet
Getting a datagram from source to dest.:
same subnetwork
IP datagram: misc fields | 128.178.1.1 | 128.178.1.3 | data
Starting at A, given IP datagram addressed to B:
r look up net. address of B
r find B is on same net. as A
r link layer will send datagram directly to B inside link-layer frame
m B and A are directly connected
routing table in A
Dest. Net.    next router    Nhops
128.178.1                    1
128.178.2     128.178.1.4    2
128.178.3     128.178.1.4    2
default       128.178.1.4
[Figure: the example network, with the datagram P travelling directly from A (128.178.1.1) to B (128.178.1.3)]
Getting a datagram from source to dest.:
different subnetworks
IP datagram: misc fields | 128.178.1.1 | 128.178.2.2 | data
Starting at A, dest. E:
r look up network address of E
r E on different network
m A, E not directly attached
r routing table: next hop router to E is 128.178.1.4
r link layer sends datagram to router 128.178.1.4 inside link-layer frame
r datagram arrives at 128.178.1.4
r continued…..
routing table in A
Dest. Net.    next router    Nhops
128.178.1                    1
128.178.2     128.178.1.4    2
128.178.3     128.178.1.4    2
default       128.178.1.4
[Figure: the example network, with the datagram P travelling from A (128.178.1.1) to the router interface 128.178.1.4]
Getting a datagram from source to
dest.: different subnetworks
IP datagram: misc fields | 128.178.1.1 | 128.178.2.2 | data
Arriving at 128.178.1.4, destined for 128.178.2.2
r look up network address of E
r E on same network as router’s interface 128.178.2.9
m router, E directly attached
r link layer sends datagram to 128.178.2.2 inside link-layer frame via interface 128.178.2.9
r datagram arrives at 128.178.2.2!!! (hooray!)
routing table in the router
Dest. network   next router   Nhops   interface
128.178.1       -             1       128.178.1.4
128.178.2       -             1       128.178.2.9
128.178.3       -             1       128.178.3.27
default         xx            xx
[Figure: the example network, with the datagram P travelling from the router interface 128.178.2.9 to E (128.178.2.2)]
The datagram is now in the router, and it is the job of the router to move the
datagram toward its ultimate destination. The router consults its own routing
table and finds an entry, 128.178.2.0/24, whose network address matches the
leading bits in the IP address of host E. The routing table indicates that the
datagram should be forwarded on router interface 128.178.2.9. Since the
number of hops to the destination is 1, the router knows that destination host E
is on the same network as its own interface, 128.178.2.9. The router thus
moves the datagram to this interface, which then transmits the datagram to host
E.
IP datagram format
(each row of the header is 32 bits wide)
ver: IP protocol version number
head. len: header length (bytes)
type of service: “type” of data
length: total datagram length (bytes)
16-bit identifier, flgs, fragment offset: for fragmentation/reassembly
time to live: max number remaining hops (decremented at each router)
upper layer: upper layer protocol to deliver payload to
Internet checksum
32 bit source IP address
32 bit destination IP address
Options (if any): e.g. timestamp, record route taken, specify list of routers to visit
data (variable length, typically a TCP or UDP segment)
The maximum amount of data that a link-layer packet can carry is called the MTU (maximum
transfer unit). Because each IP datagram is encapsulated within the link-layer packet for
transport from one router to the next router, the MTU of the link-layer protocol places a hard
limit on the length of an IP datagram. Having a hard limit on the size of an IP datagram is not
much of a problem. What is a problem is that each of the links along the route between sender
and destination can use different link-layer protocols, and each of these protocols can have
different MTUs.
•Modem link: short MTU. 1000 B at 9600 b/s takes about 830 ms, too large for interactive traffic
•Large MTU = higher throughput, less overhead (TCP + IP = 40 bytes header overhead), no
fragmentation loss avalanche effect
IP Fragmentation & Reassembly
r network links have MTU (max. transfer size) - largest possible link-level frame.
m different link types, different MTUs
r large IP datagram divided (“fragmented”) within net
m one datagram becomes several datagrams
m “reassembled” only at final destination
m IP header bits used to identify, order related fragments
r fragmentation is in principle avoided with TCP and UDP using small segments
[Figure: fragmentation (in: one large datagram, out: 3 smaller datagrams) and reassembly]
Suppose you receive an IP datagram from one link, you check your routing
table to determine the outgoing link, and this outgoing link has an MTU that is
smaller than the length of the IP datagram. Time to panic: how are you going
to squeeze this oversized IP packet into the payload field of the link-layer
packet? The solution to this problem is to "fragment" the data in the IP
datagram among two or more smaller IP datagrams, and then send these
smaller datagrams over the outgoing link. Each of these smaller datagrams is
referred to as a fragment.
Fragments need to be reassembled before they reach the transport layer at the
destination. Indeed, both TCP and UDP expect to receive complete,
unfragmented segments from the network layer. However, fragmentation and
reassembly put an additional burden on Internet routers and on the destination
hosts. For this reason it is desirable to keep fragmentation to a minimum. This
is often done by limiting the TCP and UDP segments to a relatively small size,
so that fragmentation of the corresponding datagrams is unlikely.
IP Fragmentation and Reassembly
Original datagram: length=4000, ID=x, fragflag=0, offset=0
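How the 4000-byte datagram above could be split is sketched below in Python. Two values are assumptions not stated on the slide: an outgoing MTU of 1500 bytes and a 20-byte IP header without options. The function name `fragment` is made up for the example. The offset field counts 8-byte units, so every fragment except the last must carry a multiple of 8 data bytes.

```python
def fragment(total_length: int, mtu: int, header_len: int = 20):
    """Split one datagram into fragments that each fit within the MTU."""
    payload = total_length - header_len
    max_data = (mtu - header_len) // 8 * 8   # data per fragment, rounded down to 8 bytes
    fragments = []
    sent = 0
    while sent < payload:
        size = min(max_data, payload - sent)
        more = sent + size < payload         # fragflag=1 means more fragments follow
        fragments.append(
            {"length": size + header_len, "fragflag": int(more), "offset": sent // 8}
        )
        sent += size
    return fragments


for frag in fragment(4000, 1500):
    print(frag)
```

Under these assumptions the result is three fragments, which matches the "3 smaller datagrams" of the fragmentation slide; all of them keep the original ID=x so the destination can reassemble them.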
Routing Table maintenance
at host
r configuration
r ICMP redirect
r ICMP router discovery messages
at routers
r configuration
r all routers participate in routing protocols: distribute
addresses and routes
r autonomous systems (ASs)
m stub or multihomed: ex: EPFL
m transit: ex: Switch
r between ASs: EGP and BGP
inside AS: RIP, OSPF(standard), IGRP (Cisco)
r example. OSPF
m routers exchange topology and addressing information ->
topology database
m routes computed with Dijkstra’s SPF algorithm
ICMP: Internet Control Message Protocol
The most typical use of ICMP is for error reporting. ICMP is often considered
part of IP, but architecturally lies just above IP, as ICMP messages are carried
inside IP packets. That is, ICMP messages are carried as IP payload, just as
TCP or UDP segments are carried as IP payload. ICMP messages have a type
and a code field, and also contain the IP header and the first eight bytes of
data of the IP datagram that caused the ICMP message to be generated in the
first place (so that the sender can determine the packet that caused the error).
The well-known ping program sends an ICMP type 8 code 0 message (echo
request) to the specified host. The destination host, seeing the echo request,
sends back a type 0 code 0 ICMP echo reply. Traceroute also uses ICMP messages.
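As an illustration of the type and code fields, the following Python sketch builds the bytes of an echo request such as ping sends. It only constructs the message and does not send it (that would need a raw socket and usually administrator rights). The function names are invented for the example; the identifier and sequence number values are arbitrary.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                       # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """ICMP type 8, code 0 message with a valid checksum."""
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)  # checksum field zeroed
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload


message = echo_request(identifier=1, sequence=1, payload=b"ping")
print(message.hex())
```

A receiver can verify the message by running the same checksum over all of its bytes, checksum field included, and checking that the result is zero.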
ICMP Redirect
r Sent by router to source host to inform source
that destination is directly connected
m host updates routing table
ICMP Redirect Format
/
| IP datagram header (prot = ICMP)
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type=5 | code | checksum
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Router IP address that should be preferred
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| IP header plus 8 bytes of original datagram data
/
The ICMP redirect is very useful when source and destination are directly connected. In this
case, the source host receives, from the router it contacted to reach the destination, an ICMP
Redirect message that indicates that the destination is directly connected. In the TCP/IP
architecture, hosts only transfer packets to connected hosts/routers. They do not have knowledge
of the network and learn the minimal view of the network they need via ICMP. Routers, which
perform *real* routing, need more extensive information.
ICMP Redirect Example
ICMP Redirect Example (cont’d)
BEFORE
Destination          Gateway              Flags Ref   Use    Interface
-------------------- -------------------- ----- ----- ------ ---------
127.0.0.1            127.0.0.1            UH    0     11239  lo0
128.178.156.0        128.178.156.24       U     3     38896  le0
224.0.0.0            128.178.156.24       U     3     0      le0
default              128.178.156.1        UG    0     85883
AFTER ICMP REDIRECT
Destination          Gateway              Flags Ref   Use    Interface
-------------------- -------------------- ----- ----- ------ ---------
127.0.0.1            127.0.0.1            UH    0     11239  lo0
128.178.156.0        128.178.156.24       U     3     38896  le0
128.178.29.9         128.178.156.100      UGHD  0     19     le0
224.0.0.0            128.178.156.24       U     3     0      le0
default              128.178.156.1        UG    0     85883
Note that ICMP adds route for a single host, not for a net!!!
Routing and Packet forwarding
r Routing
m computation of routing tables or data
structures for unicast and multicast
m normally only between routers
m non-real time: latency up to 2 minutes
m uses protocols such as RIP, OSPF, EIGRP
(Cisco) for unicast
and DVMRP, M-OSPF, PIM for multicast
r Packet Forwarding
m for every packet
m real time
To transfer packets from a sending host to the destination host, the network layer
performs two functions: routing, which, roughly speaking, determines the path or route that the
packets are to follow, and packet forwarding, the transmission of the packets to an address that
can be reached directly. The former is generally performed between routers (by means of routing
tables and data structures) and not in real time, while packet forwarding, being a simpler
action, is performed in real time.
Routing
Routing protocol
Goal: determine “good” path (sequence of routers) thru network from source to dest.
Graph abstraction for routing algorithms:
r graph nodes are routers
r graph edges are physical links
m link cost: delay, $ cost, or congestion level
r “good” path:
m typically means minimum cost path
m other def’s possible
[Figure: example graph with six routers A to F connected by links with costs 1, 2, 3 and 5]
Given a set of routers, with links connecting the routers, a routing algorithm
finds a "good" path from source to destination. Typically, a "good" path is one
that has "least cost." With the graph abstraction for routing algorithms, nodes
in the graph represent routers--the points at which packet routing decisions are
made--and the lines ("edges" in graph theory terminology) connecting these
nodes represent the physical links between these routers. A link also has a
value representing the "cost" of sending a packet across the link. The cost may
reflect the level of congestion on that link or the physical distance traversed by
that link.
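A least-cost path over such a graph can be computed with Dijkstra's shortest path first algorithm, the one named on the routing table maintenance slide for OSPF. The Python sketch below is illustrative only. The six nodes A to F come from the slide, but the assignment of costs to individual links cannot be read from these notes, so the `GRAPH` dictionary is an assumed topology that merely uses the same set of cost values.

```python
import heapq

# Assumed link costs (node -> {neighbor: cost}); not taken from the slide itself.
GRAPH = {
    "A": {"B": 2, "C": 5, "D": 1},
    "B": {"A": 2, "C": 3, "D": 2},
    "C": {"A": 5, "B": 3, "D": 3, "E": 1, "F": 5},
    "D": {"A": 1, "B": 2, "C": 3, "E": 1},
    "E": {"C": 1, "D": 1, "F": 2},
    "F": {"C": 5, "E": 2},
}


def least_cost_paths(graph, source):
    """Dijkstra: least cost from source to every node, plus each node's predecessor."""
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist.get(node, float("inf")):
            continue                         # stale heap entry, a cheaper path was found
        for neighbor, weight in graph[node].items():
            new_cost = cost + weight
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                prev[neighbor] = node
                heapq.heappush(heap, (new_cost, neighbor))
    return dist, prev


def path_to(prev, source, dest):
    """Rebuild the path by walking predecessors back from dest to source."""
    path = [dest]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


dist, prev = least_cost_paths(GRAPH, "A")
print(dist)
print(path_to(prev, "A", "F"))
```

With the assumed costs, the cheapest route from A to F goes through D and E rather than over the direct but expensive links via C.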
Routing Algorithm classification
Global or decentralized information?
Global:
r all routers have complete topology, link cost info
r “link state” algorithms
Decentralized:
r router knows physically-connected neighbors, link costs to neighbors
r iterative process of computation, exchange of info with neighbors
r “distance vector” algorithms
Static or dynamic?
Static:
r routes change slowly over time
Dynamic:
r routes change more quickly
m periodic update
m in response to link cost changes
Routing in the Internet
r Two-level routing:
m Intra-AS: administrator is responsible for choice
m Inter-AS: unique standard
Internet AS Hierarchy
Inter-AS border (exterior gateway) routers
Intra-AS interior
(gateway) routers
Intra-AS and inter-AS Routing
Intra-AS routing:
r Also known as Interior Gateway Protocols (IGP)
r Most common IGPs:
m RIP: Routing Information Protocol
m OSPF: Open Shortest Path First
m EIGRP: Enhanced Interior Gateway Routing Protocol
(Cisco propr.)
Inter-AS routing:
r Also known as Exterior Gateway Protocols (EGP)
r BGP (Border Gateway Protocol): the de facto standard
Why are there Different Inter-AS and Intra-AS Routing
Protocols?
r Policy
r Scale
r Performance
Historically, three routing protocols have been used extensively for routing
within an autonomous system in the Internet: RIP (the Routing Information
Protocol), OSPF (Open Shortest Path First), and EIGRP (Cisco's proprietary
Enhanced Interior Gateway Routing Protocol).
The Border Gateway Protocol version 4, specified in RFC 1771 (see also RFC
1772; RFC 1773), is the de facto standard interdomain routing protocol in
today's Internet. It is commonly referred to as BGP4 or simply as BGP. As an
inter-autonomous system routing protocol, it provides for routing between
autonomous systems (that is, administrative domains).
There are several reasons for having different Inter-AS and Intra-AS routing
protocols:
•Policy. Among ASs, policy issues dominate. It may well be important that
traffic originating in a given AS specifically not be able to pass through
another specific AS. Similarly, a given AS may well want to control what
transit traffic it carries between other ASs.
• Scale. The ability of a routing algorithm and its data structures to scale to
handle routing to/among large numbers of networks is a critical issue in inter-
AS routing. Within an AS, scalability is less of a concern.
•Performance. Because inter-AS routing is so policy-oriented, the quality (for
example, performance) of the routes used is often of secondary concern.
Indeed, among ASs, there is not even the notion of preference or costs
associated with routes.
Router Definitions
r Definition: IP router
m a system that forwards packets based on IP addresses
m performs packet forwarding + routing + control method
• routing, configuration management. DHCP relay, IPv6 router
advertisements…
r Implementation:
m any UNIX, NT machine can be configured as IP router
m normally, dedicated packet forwarder called router
r Multiprotocol router
m a system that forwards packets based on layer 3 addresses
for various protocol architectures (ex: IP, Appletalk)
m CISCO, IBM, etc…
m most multiprotocol routers work at both layer 2 and 3
• architecture: forward at layer 2 + forward at layer 3
• implementation: one CISCO
m IP router boxes also perform other functions: port filtering,
DHCP relay, …
If you ever read commercial literature, you have to be aware of the difference between
architecture names and implementation names. The word router (like most words) is
unfortunately used in both contexts.
- From an architecture viewpoint, a router is any system which forwards packets based on layer 3
information. Router in that context is a function.
- The router function can be implemented by a piece of software on Unix or Windows NT, or by
a complex dedicated machine (a Cisco, IBM, Bay Networks or Flextel box for example).
Most boxes called “routers” perform a set of additional functions that have nothing to do with
packet forwarding using layer 3 addresses. For example, they can be used as bridges or
application level relays.
Router Architecture Overview
Two key router functions:
r run routing algorithms/protocol (RIP, OSPF, BGP)
r switching datagrams from incoming to outgoing link
The routing algorithms control the routes taken by packets through the
network. In the network layer, however, the real work is the forwarding of
datagrams between an incoming link and an outgoing link. The switching function
of a router is the transfer of datagrams from a router's incoming links to the
appropriate outgoing links. The input port performs, among other functions,
the physical layer functionality of terminating an incoming physical link to a
router, and the data-link layer functionality needed to interoperate with the
data-link layer functionality on the other side of the incoming link. The
switching fabric connects the router's input ports to its output ports. This
switching fabric is completely contained within the router. The output port
stores the packets that have been forwarded to it through the switching fabric,
and then transmits the packets on the outgoing link. The output port thus
performs the reverse data-link and physical layer functionality of the input
port. The routing processor executes the routing protocols, maintains the
routing tables, and performs network management functions within the router.
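The switching function depends on a table lookup that maps a destination address to an output port. The following is a minimal sketch in Java, assuming a longest-prefix-match table. The class ForwardingTable, its methods, the port numbers and the sample routes are hypothetical and are not taken from the slides.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical forwarding table: maps destination prefixes to output ports.
class ForwardingTable {
    private static final class Route {
        final int prefix;
        final int prefixLength;
        final int outputPort;

        Route(int prefix, int prefixLength, int outputPort) {
            this.prefix = prefix;
            this.prefixLength = prefixLength;
            this.outputPort = outputPort;
        }
    }

    private final List<Route> routes = new ArrayList<>();

    // Convert dotted-decimal IPv4 text to a 32-bit integer.
    static int toInt(String dotted) {
        int value = 0;
        for (String part : dotted.split("\\.")) {
            value = (value << 8) | Integer.parseInt(part);
        }
        return value;
    }

    // Mask with the top prefixLength bits set. Java shifts by (count mod 32),
    // so a prefix length of 0 is handled explicitly.
    private static int mask(int prefixLength) {
        return prefixLength == 0 ? 0 : -1 << (32 - prefixLength);
    }

    void addRoute(String prefix, int prefixLength, int outputPort) {
        routes.add(new Route(toInt(prefix) & mask(prefixLength), prefixLength, outputPort));
    }

    // Return the output port of the longest matching prefix, or -1 if none matches.
    int lookup(String destination) {
        int dest = toInt(destination);
        int bestLength = -1;
        int bestPort = -1;
        for (Route r : routes) {
            boolean matches = (dest & mask(r.prefixLength)) == r.prefix;
            if (matches && r.prefixLength > bestLength) {
                bestLength = r.prefixLength;
                bestPort = r.outputPort;
            }
        }
        return bestPort;
    }

    public static void main(String[] args) {
        ForwardingTable table = new ForwardingTable();
        table.addRoute("0.0.0.0", 0, 0);        // default route
        table.addRoute("128.178.0.0", 16, 1);
        table.addRoute("128.178.156.0", 24, 2);
        System.out.println(table.lookup("128.178.156.24")); // most specific entry wins
        System.out.println(table.lookup("128.178.1.1"));
        System.out.println(table.lookup("9.141.49.67"));
    }
}
```

A linear scan keeps the sketch short; real routers use specialised lookup structures, often in hardware on the input port.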
Protocols Other than TCP/IP
r Some other protocol families (ex: Appletalk,
IPX) are not compatible with TCP/IP
r routers must be multiprotocol
r MAC interface (layer 2) is standard
[Figure: protocol stacks of A (Appletalk and TCP/IP over LLC, MAC, PHY), B (Appletalk) and C (TCP/IP), interconnected by a multiprotocol layer 2 bridge]
B (an old Macintosh file server) runs only Appletalk. Only applications using the Appletalk
protocols can be used (MacOS file sharing, printing). TCP/IP applications such as the web cannot
be used on B.
C (a modern PC) runs only TCP/IP. All TCP/IP applications can be used, but not MacOS file
sharing.
A (a Windows NT server) runs both in parallel. It can talk to both C and B.
A bridge can be used to interconnect A, B and C; there is nothing special to do. If a router is used
instead, it must run in parallel Appletalk and IP.
The protocol stacks shown are all implemented in software. They use the standard Ethernet
adapters.
NetBIOS
r NetBIOS was originally developed to work only at
layer 2
m uses broadcast that is blocked by routers: LLC-2 similar to
TCP but located in layer 2 (also called NETBEUI)
m in that form, it is not “routable”: can only go at layer 2
[Figure: two hosts, each with the stack App / NetBIOS / LLC2 / MAC, connected through a layer 2 bridge]
NetBIOS is an interface for distributed applications which is commonly used with IBM and
Microsoft systems. Originally, NetBIOS used the LLC-2 protocol, a link layer protocol which
does packet retransmissions, much as TCP does. Only MAC addresses are used. In addition,
NetBIOS offers a naming service. This version of NetBIOS works only in a bridged
environment.
IPv6
r The current IP is IPv4
r IPv4 address space is too small (32 bits)
m will be exhausted some day
IPv6 is primarily IP with a larger address space. However, a number of details are different, in
particular the IPv6 header is easier to process (but is also longer). An excellent online source of
information about IPv6 is The IP Next Generation Homepage [Hinden 1999].
Many features which were originally designed for IPv6 are now part of IPv4 (security and
mobility).
The most important changes introduced in IPv6 are evident in the packet format:
•Expanded addressing capabilities. IPv6 increases the size of the IP address from 32 to 128 bits.
This ensures that the world won't run out of IP addresses. In addition to unicast and multicast
addresses, a new type of address, called an anycast address, has also been introduced, which
allows a packet addressed to an anycast address to be delivered to any one of a group of hosts.
•A streamlined 40-byte header. A number of IPv4 fields have been dropped or made optional.
The resulting 40-byte fixed- length header allows for faster processing of the IP datagram. A new
encoding of options allows for more flexible options processing.
•Flow labeling and priority. IPv6 has an elusive definition of a "flow." RFC 1752 and RFC 2460
state that this allows "labeling of packets belonging to particular flows for which the sender
requests special handling, such as a non-default quality of service or real-time service."
IPv6 is incompatible with IPv4; this is to avoid IBM’s SNA syndrome (a monster of
complexity, because the latest version is compatible with all details of all previous versions).
Interworking between the two will use the dual stack approach, as shown for interworking
between Appletalk and IP.
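The change in address size can be checked with the standard java.net API. The following is a small sketch; the class name AddressSizes is hypothetical, and 2001:db8::1 is a sample address chosen for illustration from the range reserved for documentation.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class AddressSizes {
    // Number of bits in the given literal IP address.
    static int bits(String literal) {
        try {
            // For a literal address, getByName only parses the text; no DNS lookup is made.
            return InetAddress.getByName(literal).getAddress().length * 8;
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not a literal IP address: " + literal, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(bits("128.178.156.24")); // IPv4: prints 32
        System.out.println(bits("2001:db8::1"));    // IPv6: prints 128
    }
}
```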
Transition From IPv4 To IPv6
r Not all routers can be upgraded simultaneously: no “flag days”
m How will the network operate with mixed IPv4 and IPv6 routers?
r Two proposed approaches:
m Dual Stack: routers with both v6 and v4 stacks “translate” between formats
[Figure: dual-stack nodes labeled IPv6/IPv4 and IPv4/IPv6]
Plug and Play and DHCP
r IPv6 address is allocated automatically by
negotiation with routers
m “stateless allocation”
r alternatively, Dynamic Host Configuration Protocol
(DHCP) can be used
r DHCP can be used with IPv4 also
m DHCP server on LAN has a list of IP addresses that can
be allocated dynamically
m MAC address used to identify a host to DHCP server
m renumbering is possible
m more complex to use than IPv6 stateless allocation
With IPv6, a host can negotiate and obtain its IP address directly from the router
to which it is attached. Alternatively, the Dynamic Host Configuration
Protocol (DHCP) [RFC 2131], which is also available for IPv4 and is used for
Mobile IP, can be used.
DHCP is sometimes referred to as Plug and Play. With DHCP, a DHCP server
in a network (for example, in a LAN) receives DHCP requests from a client
and, in the case of dynamic address allocation, allocates an IP address to
the requesting client. DHCP is used extensively in LANs and in residential
Internet access.
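Dynamic allocation as described above (a server holding a list of addresses, with clients identified by their MAC address) can be modelled in a few lines. This is only a sketch of the bookkeeping, not of the DHCP message exchange. The class AddressPool, its methods, and the sample addresses are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a server-side address pool keyed by client MAC address.
class AddressPool {
    private final Deque<String> free = new ArrayDeque<>();
    private final Map<String, String> leases = new HashMap<>(); // MAC -> IP

    AddressPool(List<String> addresses) {
        free.addAll(addresses);
    }

    // Return the address leased to this MAC, allocating one if needed.
    // Returns null when the pool is exhausted.
    String request(String mac) {
        String existing = leases.get(mac);
        if (existing != null) {
            return existing;
        }
        String ip = free.poll();
        if (ip != null) {
            leases.put(mac, ip);
        }
        return ip;
    }

    // Give the client's address back to the pool so it can be reallocated.
    void release(String mac) {
        String ip = leases.remove(mac);
        if (ip != null) {
            free.add(ip);
        }
    }

    public static void main(String[] args) {
        AddressPool pool = new AddressPool(Arrays.asList("128.178.156.10", "128.178.156.11"));
        System.out.println(pool.request("00:11:22:33:44:55"));
        System.out.println(pool.request("00:11:22:33:44:55")); // same client, same address
        System.out.println(pool.request("66:77:88:99:aa:bb"));
    }
}
```

Because released addresses return to the pool, the same address can later be given to a different host.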
Broadcasting, Multicasting
r Broadcast = send to all:
m sent to all hosts on one net/subnet; used by NetBIOS
for discovery
r Anycast = send to one in a group
m used in IPv6
A number of emerging network applications require the delivery of packets from one or more senders to
a group of receivers. For each of these applications, an extremely useful abstraction is the notion of a
multicast: the sending of a packet from one sender to multiple receivers with a single send operation.
Compared with sending a separate unicast copy to each receiver, network-layer multicast makes more
efficient use of network bandwidth, in that only a single copy of a datagram will ever traverse a link.
On the other hand, considerable network layer support is needed to implement a multicast-aware
network layer. Internet multicast is not a connectionless service: state information for a multicast
connection must be established and maintained in routers that handle multicast packets sent among
hosts in a so-called multicast group. This, in turn, requires a combination of signaling and routing
protocols in order to set up, maintain, and tear down connection state in the routers.
In the Internet architecture (and the ATM architecture as well), a multicast datagram is addressed
using address indirection. That is, a single identifier is used for the group of receivers, and a copy
of the datagram that is addressed to the group using this single identifier is delivered to all of the
multicast receivers associated with that group. In the Internet, the single identifier that represents a
group of receivers is a Class D multicast address. The group of receivers associated with a Class D
address is referred to as a multicast group.
Multicast addresses are not allocated on a geographical basis. A global allocation scheme is under
discussion at the IETF. Today, global scope addresses are allocated using the sd tool on Unix. Note
that the unique IP unicast address of a host is completely independent of the address of the multicast
group in which it is participating.
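A Class D address is one whose first four bits are 1110, that is, the range 224.0.0.0 to 239.255.255.255. The following sketch tests this directly on the first byte; the class name ClassDCheck is hypothetical.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class ClassDCheck {
    // True if the literal IPv4 address starts with the bits 1110 (Class D).
    static boolean isClassD(String dotted) {
        try {
            byte[] b = InetAddress.getByName(dotted).getAddress();
            return b.length == 4 && (b[0] & 0xF0) == 0xE0;
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("not a literal IP address: " + dotted, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isClassD("224.2.166.207"));  // multicast group address
        System.out.println(isClassD("128.178.156.24")); // ordinary unicast address
    }
}
```

The standard library offers the same test as InetAddress.isMulticastAddress().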
IP Multicast Principles
Multicast Routing
[Figure: source S, routers R1 to R5, hosts A and B; A sends IGMP "join m"; packets P are sent to m; the numbers 1 to 5 mark the steps listed below]
1 S sends data to multicast address m; there is no member, the data is simply lost at the router
2 A joins the multicast address m
3 R1 informs the rest of the network that m has a member at R1; the multicast routing protocol
builds a tree. Data sent by S now reach A
4 B joins the multicast address m
5 R4 informs the rest of the network that m has a member at R4; the multicast routing protocol
adds branches to the tree. Data sent by S now reach both A and B
IP Multicast Forwarding Algorithm
Packet Forwarding (host, router)
Read address MA = destination IP@
/* assume it is multicast */
for every physical interface PI
    if MA is enabled on PI
    then send directly to PI
At lrcsuns: Physical Interface Tables
IP                 subnetMask
128.178.156.24     255.255.255.0
224.2.166.207
224.2.127.255
The mapping IP to MAC for multicast addresses is not unique. Ethernet hosts must filter up to 32
IP addresses for one MAC multicast address
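The factor of 32 follows from the standard Ethernet mapping: only the low 23 bits of the 28-bit group identifier are copied into the MAC address, after the fixed prefix 01:00:5e, so 5 bits are lost and 2^5 = 32 groups share each MAC address. The following sketch assumes its input is a valid Class D address in dotted-decimal form; the class name MulticastMac is hypothetical.

```java
class MulticastMac {
    // Map a Class D IPv4 address to its Ethernet multicast MAC address.
    static String toMac(String group) {
        String[] p = group.split("\\.");
        int b1 = Integer.parseInt(p[1]) & 0x7F; // top bit of the second byte is dropped
        int b2 = Integer.parseInt(p[2]);
        int b3 = Integer.parseInt(p[3]);
        // The first byte of the IP address (apart from marking Class D) is ignored entirely.
        return String.format("01:00:5e:%02x:%02x:%02x", b1, b2, b3);
    }

    public static void main(String[] args) {
        System.out.println(toMac("224.2.166.207"));   // prints 01:00:5e:02:a6:cf
        System.out.println(toMac("225.130.166.207")); // a different group, same MAC address
    }
}
```

Since both groups in the example yield the same MAC address, the Ethernet adapter cannot tell them apart and the IP layer has to do the final filtering.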
IGMP: Internet Group Management Protocol
Purpose: manage group membership inside one subnet
r routers: know if group is present on an interface
m know whether to forward locally or not
r hosts: know if a multicast address is already in use
locally
[Figure: exchange in three numbered steps on subnet 128.178.156.0]
IGMP Host Implementation
Host Implementation
r goal: avoid avalanche effects: one router
originated query might cause a burst of reports
r solution = the synchronization avoidance protocol
m 1. hosts delay responses randomly
m 2. hosts listen to responses, only first one answers
Host IGMP Finite State Machine
(1): a first response is sent spontaneously, a short timer (10s) set, then another response sent after
expiration (because of possible loss)
(2): a random timer is chosen
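The effect of the two rules (random delay, then suppression once a response is heard) can be illustrated with a small simulation. This is a simplified model, not the IGMP state machine: it assumes every host hears the first response instantly, and the class name, the maximum delay and the host count are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class ReportSuppression {
    // Each host draws a random delay in [0, maxDelayMs). Only the hosts whose
    // timer expires first send a report; all others hear it and stay silent.
    // Returns the indices of the hosts that actually send.
    static List<Integer> reporters(int hostCount, int maxDelayMs, Random rng) {
        int[] delay = new int[hostCount];
        int earliest = Integer.MAX_VALUE;
        for (int i = 0; i < hostCount; i++) {
            delay[i] = rng.nextInt(maxDelayMs);
            earliest = Math.min(earliest, delay[i]);
        }
        List<Integer> senders = new ArrayList<>();
        for (int i = 0; i < hostCount; i++) {
            if (delay[i] == earliest) { // ties: several hosts may fire at the same instant
                senders.add(i);
            }
        }
        return senders;
    }

    public static void main(String[] args) {
        List<Integer> senders = reporters(50, 10000, new Random());
        System.out.println("hosts on subnet: 50, reports sent: " + senders.size());
    }
}
```

Without the random delay, all 50 hosts would answer the query at once, which is the avalanche the slide refers to.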
MBone
Mbone (1)
r Global Multicast not available
m no stable routing protocol implemented in all routers of the Internet
r MBone = a network of “routers” supporting multicast
r Tunneling used to build virtual links
r protocol = 4 in IP header
r example of use of a network layer as a layer 2 by another
network
m other examples: IPv6 over IPv4, IP over Frame Relay, over ATM,
AppleTalk over IP, etc.
r MBone “hacks”
m limitation of multicast enforced by Mbone routers on TTL field
m multicast routing with Distance Vector Multicast Routing Protocol
(DVMRP)
• each router computes SPT from each source using distance vector
algorithm
• reverse path forwarding (RPF)
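The reverse path forwarding check itself is simple: a router accepts a multicast packet only if it arrived on the interface that the router would itself use to reach the packet's source. The following sketch shows the check only; how the table is filled (by a distance vector algorithm in DVMRP) is not modelled, and the class name, method names and interface numbers are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class RpfRouter {
    // Hypothetical table: source address -> interface on the shortest path back to it.
    private final Map<String, Integer> interfaceToward = new HashMap<>();

    void setRoute(String source, int iface) {
        interfaceToward.put(source, iface);
    }

    // RPF check: accept the packet only if it arrived on the interface
    // this router would use to send unicast traffic to the source.
    boolean accept(String source, int arrivalInterface) {
        Integer expected = interfaceToward.get(source);
        return expected != null && expected == arrivalInterface;
    }

    public static void main(String[] args) {
        RpfRouter r = new RpfRouter();
        r.setRoute("128.178.156.24", 1);
        System.out.println(r.accept("128.178.156.24", 1)); // on the reverse path: accepted
        System.out.println(r.accept("128.178.156.24", 2)); // off the reverse path: dropped
    }
}
```

Packets that fail the check are dropped, which prevents copies that arrive over other paths from looping.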
MBone routers
[Figure: MBone routers; surviving labels: B, 9.141.49.67, 9.141.1.3]
MulticastSocket
r DatagramSocket with joining group capabilities
r Two public constructors
r A socket outside the group can send to the group
java.lang.Object
  |
  +--java.net.DatagramSocket
        |
        +--java.net.MulticastSocket
public class MulticastSocket
extends DatagramSocket
The multicast datagram socket class is useful for sending and receiving IP
multicast packets. A MulticastSocket is a (UDP) DatagramSocket, with
additional capabilities for joining "groups" of other multicast hosts on the
internet.
A multicast group is specified by a class D IP address, in the range 224.0.0.1
to 239.255.255.255 inclusive, and by a standard UDP port number. One would
join a multicast group by first creating a MulticastSocket with the desired
port, then invoking the joinGroup(InetAddress groupAddr) method.
When one sends a message to a multicast group, all recipients subscribed to
that group and port receive the message (within the time-to-live range of the
packet). The socket need not be a member of the multicast group to send
messages to it.
When a socket subscribes to a multicast group/port, it receives datagrams sent
by other hosts to the group/port, as do all other members of the group and port.
A socket relinquishes membership in a group by the leaveGroup(InetAddress
addr) method. Multiple MulticastSockets may subscribe to a multicast group
and port concurrently, and they will all receive group datagrams.
Constructors and main methods
r public MulticastSocket() throws IOException
CONSTRUCTORS
MulticastSocket() throws IOException
Create a multicast socket.
MulticastSocket(int port) throws IOException
Create a multicast socket and bind it to a specific port.
MAIN METHODS
public void joinGroup(InetAddress mcastaddr) throws IOException
Joins a multicast group. Its behavior may be affected by setInterface. If there is
a security manager, this method first calls its checkMulticast method with the
mcastaddr argument as its argument.
public void leaveGroup(InetAddress mcastaddr) throws IOException
Leave a multicast group. Its behavior may be affected by setInterface. If there
is a security manager, this method first calls its checkMulticast method with
the mcastaddr argument as its argument.
public void send(DatagramPacket p, byte ttl) throws IOException
Sends a datagram packet to the destination, with a TTL (time-to-live) other
than the default for the socket. This method need only be used in instances
where a particular TTL is desired; otherwise it is preferable to set a TTL once
on the socket, and use that default TTL for all packets.
public void setTimeToLive(int ttl) throws IOException
Set the default time-to-live for multicast packets sent out on this socket. The
TTL sets the IP time-to-live for DatagramPackets sent to a MulticastGroup,
which specifies how many "hops" the packet will be forwarded on the
network before it expires.
Example
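import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.MulticastSocket;

public class MulticastHello {
    public static void main(String[] args) throws Exception {
        byte[] msg = {'H', 'e', 'l', 'l', 'o'};
        // 228.5.6.7 is a class D (multicast) address; 6789 is the UDP port
        InetAddress group = InetAddress.getByName("228.5.6.7");
        MulticastSocket s = new MulticastSocket(6789);
        // join the group so that this socket also receives group datagrams
        s.joinGroup(group);
        DatagramPacket hi = new DatagramPacket(msg, msg.length, group, 6789);
        s.send(hi);
        // get their responses! (receive() blocks until a datagram arrives)
        byte[] buf = new byte[1000];
        DatagramPacket recv = new DatagramPacket(buf, buf.length);
        s.receive(recv);
        // OK, I'm done talking - leave the group...
        s.leaveGroup(group);
        s.close();
    }
}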
The example shows how to join a multicast group and send the group
salutations. The MulticastSocket s, which is a DatagramSocket, uses a
DatagramPacket to build the datagram, which is then sent with the send()
method.
Summary
r The network layer transports packets from a sending host to
a receiving host.
r Main components:
m addressing
m routing
m routers (and how a router works)
r advanced topics: IPv6, multicast
r the Internet network layer
m Connectionless
m Best-effort