Inter-Networking Session I
2nd September 2008
Presented by Michuki Mwangi
Topics
IP and Networking Basics
DNS Fundamentals
Contention Ratio
Monitoring and Measurement Tools
IP and Networking Basics
Outline
Origins of TCP/IP
OSI & TCP/IP Architecture
IPv4 Addressing
IPv6
Routing
Types of Links
Address Resolution Protocol
Origins of TCP/IP
RAND Corporation (a “think tank”) & DoD
formed ARPA (Advanced Research Projects
Agency)
1968 – ARPA engineers proposed Distributed
network design for ARPANET
A small internetwork or (small “i”)
“internet”
The (capital “I”) Internet
The world-wide network of TCP/IP networks
Different people or organisations own different
parts
Different parts use different technologies
Interconnections between the parts
Interconnections require agreements
sale/purchase of service “transit” agreements
Contracts and SLA’s
“peering” agreements
No central control or management
The principle of “Internetworking”
We have lots of little networks
Many different owners/operators
Many different types
Ethernet, dedicated leased lines, dialup, ATM, Frame Relay,
FDDI
Each type has its own idea of addressing and
protocols
We want to connect them all together and provide a
unified view of the whole lot (treat the collection of
networks as a single large internetwork)
What is TCP/IP?
In simple terms, it is a language that enables
communication between computers
A set of rules (protocols) that defines how two
computers address each other and send data
to each other
It is a suite of protocols named after its two
most important protocols, TCP and IP, but it
also includes other protocols such as UDP, RTP,
etc.
Protocol Layers:
The TCP/IP Hourglass Model
Application layer: SMTP, HTTP, FTP, Telnet, DNS, Audio, Video
Transport layer: TCP, UDP, RTP
Network layer: IP
Data link layer: Ethernet, Token Ring, ATM, X.25, PPP, Frame Relay, HDLC
Corresponding layers in the
OSI and TCP/IP models
OSI                TCP/IP        Role
7 Application      Application   Mail, Web, etc.
6 Presentation     Application
5 Session          Application
4 Transport        Transport     TCP/UDP - end to end reliability
3 Network          Network       IP - Forwarding (best-effort)
2 Data Link        Data Link     Framing, delivery
1 Physical         Physical      Raw signal
IP Addressing
Purpose of an IPv4 address
Unique Identification of:
Source
So the recipient knows where the message is from
Sometimes used for security or policy-based filtering of
data
Destination
So the networks know where to send the data
Network Independent Format
IP over anything
Purpose of an IPv4 Address
Identifies a machine’s connection to a network
Physically moving a machine from one
network to another requires changing the IP
address
Unique; assigned in a hierarchical fashion
IANA to RIRs (AfriNIC, ARIN, RIPE, APNIC,
LACNIC)
RIR to ISPs and large organisations
ISP or company IT department to end users
IPv4 uses unique 32-bit addresses
Basic Structure of an IPv4 Address
32 bit number (4 octet number):
(e.g. 133.27.162.125)
Decimal Representation:
133 27 162 125
Binary Representation:
10000101 00011011 10100010 01111101
Hexadecimal Representation:
85 1B A2 7D
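The three representations above are the same 32-bit number written in different bases. A minimal Python sketch using the standard `ipaddress` module shows the conversion (the variable names are illustrative and not part of the slides):

```python
import ipaddress

addr = ipaddress.IPv4Address("133.27.162.125")
octets = addr.packed  # the four octets of the address as bytes

decimal = ".".join(str(o) for o in octets)
binary = " ".join(f"{o:08b}" for o in octets)
hexadecimal = " ".join(f"{o:02X}" for o in octets)

print(decimal)      # 133.27.162.125
print(binary)       # 10000101 00011011 10100010 01111101
print(hexadecimal)  # 85 1B A2 7D
print(int(addr))    # the same address as one 32-bit integer
```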
Addressing in Internetworks
The problem we have
More than one physical network
Different Locations
Larger number of computers
Need structure in IP addresses
“network part” of the address identifies which
network in the internetwork (e.g. the Internet)
“host part” identifies host on that network
Hosts or routers connected to the same link-layer
network will have IP addresses with the same
network part, but different host part.
Address Structure Revisited
Hierarchical Division in IP Address:
Network Part (Prefix) – high order bits (left)
describes which physical network
Host Part (Host Address) – low order bits (right)
describes which host on that network
205 . 154 . 8 . 1
11001101 10011010 00001000 00000001
Network part (left bits) | Host part (right bits)
Boundary can be anywhere
very often NOT at a multiple of 8 bits
choose the boundary according to number of hosts
Network Masks
“Network Masks” help define which bits are
used to describe the Network Part and which
for the Host Part
Different Representations:
decimal dot notation: 255.255.224.0
binary: 11111111 11111111 11100000 00000000
hexadecimal: 0xFFFFE000
number of network bits: /19
count the 1's in the binary representation
Binary AND of 32 bit IP address with 32 bit
netmask yields network part of address
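As a rough illustration of the binary AND described above, the sketch below applies the 255.255.224.0 mask to a host address. The host address 137.158.200.17 is a made-up example, not taken from the slides.

```python
import ipaddress

ip = int(ipaddress.IPv4Address("137.158.200.17"))   # hypothetical host address
mask = int(ipaddress.IPv4Address("255.255.224.0"))  # the mask shown above

prefix_len = bin(mask).count("1")            # count the 1 bits in the mask
network = ipaddress.IPv4Address(ip & mask)   # binary AND keeps the network part
host_bits = ip & ~mask & 0xFFFFFFFF          # what is left is the host part

print(prefix_len)  # 19
print(network)     # 137.158.192.0
print(host_bits)   # 2065
```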
Example Prefixes
137.158.128.0/17 (netmask 255.255.128.0)
1111 1111 1111 1111 1 000 0000 0000 0000
1000 1001 1001 1110 1 000 0000 0000 0000
198.134.0.0/16 (netmask 255.255.0.0)
1111 1111 1111 1111 0000 0000 0000 0000
1100 0110 1000 0110 0000 0000 0000 0000
205.37.193.128/26 (netmask 255.255.255.192)
1111 1111 1111 1111 1111 1111 11 00 0000
1100 1101 0010 0101 1100 0001 10 00 0000
Special Addresses
All 0’s in host part: Represents Network
e.g. 193.0.0.0/24
e.g. 138.37.128.0/17
All 1’s in host part: Broadcast
e.g. 137.156.255.255 (137.156.0.0/16)
e.g. 134.132.100.255 (134.132.100.0/24)
e.g. 190.0.127.255 (190.0.0.0/17)
127.0.0.0/8: Loopback address (127.0.0.1)
0.0.0.0: Various special purposes
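The network and broadcast addresses listed above can be checked with Python's standard `ipaddress` module. This is only a sketch to confirm the examples; the dictionary name is an assumption made for illustration.

```python
import ipaddress

# Broadcast address (all 1s in the host part) for each example prefix.
broadcasts = {}
for prefix in ("137.156.0.0/16", "134.132.100.0/24", "190.0.0.0/17"):
    net = ipaddress.ip_network(prefix)
    broadcasts[prefix] = str(net.broadcast_address)
    print(net.network_address, net.broadcast_address)

loopback = ipaddress.ip_address("127.0.0.1")
print(loopback.is_loopback)  # True
```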
Allocating IP Addresses
The subnet mask is used to define size of a
network
E.g. a subnet mask of 255.255.255.0 or /24
means 24 network bits, 8 host bits
(24+8=32)
2^8 minus 2 = 254 possible hosts
Similarly a subnet mask of 255.255.255.224
or /27 means 27 network bits, 5 host bits
(27+5=32)
2^5 minus 2 = 30 possible hosts
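The host counts above follow from raising 2 to the number of host bits and subtracting the network and broadcast addresses. A small sketch of that arithmetic, assuming prefix lengths of /30 or shorter (the function name is illustrative):

```python
def usable_hosts(prefix_len: int) -> int:
    """Usable host addresses in an IPv4 subnet with the given prefix length."""
    host_bits = 32 - prefix_len
    # Subtract the all-0s (network) and all-1s (broadcast) host values.
    return 2 ** host_bits - 2

for length in (24, 27):
    print(f"/{length}: {usable_hosts(length)} possible hosts")
```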
More levels of address hierarchy
We can also group several networks into a
larger block, or divide a large block into
several smaller blocks
arbitrary number of levels of hierarchy
blocks don’t all need to be the same size
but each block size must be a power of 2
Old systems used restrictive rules (obsolete)
Called “Class A”, “Class B”, “Class C” networks
These days (since 1994), no restriction
Called “classless”
A little History:
Classes of IP addresses
Different classes were used to represent different
sizes of network (small, medium, large)
Class A networks (large):
8 bits network, 24 bits host (/8, 255.0.0.0)
First byte of IP address in range 0-127
Class B networks (medium):
16 bits network, 16 bits host (/16, 255.255.0.0)
First byte of IP address in range 128-191
Class C networks (small):
24 bits network, 8 bits host (/24, 255.255.255.0)
First byte of IP address in range 192-223
Netmasks of classful addresses
A classful network had a “natural” or “implied”
prefix length or netmask:
Class A: prefix length /8 (netmask 255.0.0.0)
Class B: prefix length /16 (netmask 255.255.0.0)
Class C: prefix length /24 (netmask 255.255.255.0)
Modern (classless) routing systems have
explicit prefix lengths or netmasks
You can't just look at an IP address to tell what the
prefix length or netmask should be. It needs explicit
configuration.
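To see why the implied prefix length can no longer be trusted, the sketch below reproduces the old classful guess from the first byte of an address. The function name and sample addresses are assumptions for illustration; a modern network such as 137.158.128.0/17 would be misread as a /16 by this rule.

```python
def classful_prefix_length(address: str) -> int:
    """Return the prefix length an old classful system would have implied."""
    first_octet = int(address.split(".")[0])
    if first_octet <= 127:
        return 8    # Class A
    if first_octet <= 191:
        return 16   # Class B
    if first_octet <= 223:
        return 24   # Class C
    raise ValueError("not a Class A, B or C address")

print(classful_prefix_length("10.1.2.3"))        # 8
print(classful_prefix_length("137.158.128.1"))   # 16, although the real prefix may be /17
print(classful_prefix_length("205.37.193.129"))  # 24, although the real prefix may be /26
```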
Traditional subnetting of classful
networks
Old routing systems allowed a classful
network to be divided into subnets
All subnets (of the same classful net) had to be the
same size and have the same netmask
Subnets could not be divided into sub-sub-nets
None of these restrictions apply in modern
systems
You should never use old routing systems that
have these restrictions (e.g. RIP version 1)
Classless Addressing
Class A, Class B, Class C terminology and
restrictions are now of historical interest only
Obsolete in 1994
Internet routing and address management
today is classless
CIDR = Classless Inter-Domain Routing
routing does not assume that class A, B, C implies
prefix length /8, /16, /24
VLSM = Variable-Length Subnet Masks
routing does not assume that all subnets are the
same size
Classless addressing example
A large ISP gets a large block of addresses
e.g., a /16 prefix, or 65536 separate addresses
Allocate smaller blocks to customers
e.g., a /22 prefix (1024 addresses) to one
customer, and a /28 prefix (16 addresses) to
another customer
An organisation that gets a /22 prefix from
their ISP divides it into smaller blocks
e.g. a /26 prefix (64 addresses) for one
department, and a /27 prefix (32 addresses) for
another department
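The allocation example above can be sketched with the standard `ipaddress` module. The 10.20.0.0/16 block and the specific sub-blocks chosen are hypothetical; only the prefix lengths come from the slide.

```python
import ipaddress

isp_block = ipaddress.ip_network("10.20.0.0/16")   # hypothetical ISP block
print(isp_block.num_addresses)                     # 65536

# One customer gets the first /22, another gets a /28 elsewhere in the block.
customer = next(isp_block.subnets(new_prefix=22))
small_customer = ipaddress.ip_network("10.20.4.0/28")
print(customer, customer.num_addresses)              # 10.20.0.0/22 1024
print(small_customer, small_customer.num_addresses)  # 10.20.4.0/28 16

# The /22 customer divides its block between two departments.
dept_a = next(customer.subnets(new_prefix=26))
dept_b = ipaddress.ip_network("10.20.0.64/27")     # next free space after dept_a
print(dept_a, dept_a.num_addresses)                # 10.20.0.0/26 64
print(dept_b, dept_b.num_addresses)                # 10.20.0.64/27 32
print(dept_a.overlaps(dept_b))                     # False
```

Every block size here is a power of 2, and the blocks do not need to be the same size.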
IPv6
IP version 6
IPv6 designed as successor to IPv4
Expanded address space
Address length quadrupled to 16 bytes (128 bits)
Header Format Simplification
Fixed length, optional headers are daisy-chained
No checksum at the IP network layer
No hop-by-hop fragmentation
Path MTU discovery
64 bits aligned fields in the header
Authentication and Privacy Capabilities
IPsec is mandated
No more broadcast
IPv4 and IPv6 Header Comparison
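IPv4 Header fields:
Version, IHL, Type of Service, Total Length
Identification, Flags, Fragment Offset
Time to Live, Protocol, Header Checksum
Source Address
Destination Address
Options, Padding
IPv6 Header fields:
Version, Traffic Class, Flow Label
Payload Length, Next Header, Hop Limit
Source Address
Destination Address
Legend
Field’s name kept from IPv4 to IPv6: Version, Source Address, Destination Address
Fields not kept in IPv6: IHL, Identification, Flags, Fragment Offset, Header Checksum, Options, Padding
Name and position changed in IPv6: Type of Service (Traffic Class), Total Length (Payload Length), Time to Live (Hop Limit), Protocol (Next Header)
New field in IPv6: Flow Label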
Larger Address Space
IPv4 = 32 bits
IPv6 = 128 bits
IPv4
32 bits
= 4,294,967,296 possible addressable devices
IPv6
128 bits: 4 times the size in bits
= 3.4 x 10^38 possible addressable devices
= 340,282,366,920,938,463,463,374,607,431,768,211,456
∼ 5 x 10^28 addresses per person on the planet
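These figures are easy to reproduce, since Python integers have arbitrary precision. The world population used below (6.7 billion) is an assumed round figure for illustration and does not come from the slides.

```python
ipv4_total = 2 ** 32
ipv6_total = 2 ** 128
world_population = 6_700_000_000  # assumed figure, not from the slides

print(f"{ipv4_total:,}")          # 4,294,967,296
print(f"{ipv6_total:,}")          # 340,282,366,920,938,463,463,374,607,431,768,211,456
print(f"{ipv6_total:.1e}")        # 3.4e+38
print(f"{ipv6_total / world_population:.0e}")  # 5e+28
```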
IPv6 Address Representation
16 bit fields in case insensitive colon hexadecimal
representation
2031:0000:130F:0000:0000:09C0:876A:130B
Leading zeros in a field are optional:
2031:0:130F:0:0:9C0:876A:130B
Successive fields of 0 represented as ::, but only once
in an address:
2031:0:130F::9C0:876A:130B is ok
2031::130F::9C0:876A:130B is NOT ok (two “::”)
0:0:0:0:0:0:0:1 → ::1 (loopback address)
0:0:0:0:0:0:0:0 → :: (unspecified address)
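The representation rules above can be checked with Python's standard `ipaddress` module, which prints addresses in lower case. This is a sketch for experimenting; the variable names are illustrative.

```python
import ipaddress

full = ipaddress.IPv6Address("2031:0000:130F:0000:0000:09C0:876A:130B")
print(full.compressed)  # 2031:0:130f::9c0:876a:130b
print(full.exploded)    # 2031:0000:130f:0000:0000:09c0:876a:130b

# Case and leading zeros do not matter: both spellings are the same address.
same = full == ipaddress.IPv6Address("2031:0:130F:0:0:9C0:876A:130B")

try:
    ipaddress.IPv6Address("2031::130F::9C0:876A:130B")
    double_colon_ok = True
except ipaddress.AddressValueError:
    double_colon_ok = False  # "::" may appear only once in an address

print(same, double_colon_ok)                       # True False
print(ipaddress.IPv6Address("::1").is_loopback)    # True
print(ipaddress.IPv6Address("::").is_unspecified)  # True
```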
IPv6 Address Representation
In a URL, it is enclosed in brackets (RFC3986)
http://[2001:db8:4f3a::206:ae14]:8080/index.html
Cumbersome for users
Mostly for diagnostic purposes
Use fully qualified domain names (FQDN)
Prefix Representation
Representation of prefix is same as for IPv4 CIDR
Address and then prefix length, with slash separator
IPv4 address:
198.10.0.0/16
IPv6 address:
2001:db8:12::/40
IPv6 Addressing
Type                           Binary            Hex
Unspecified                    0000…0000         ::/128
Loopback                       0000…0001         ::1/128
Global Unicast Address         0010 ...          2000::/3
Link Local Unicast Address     1111 1110 10...   FE80::/10
Unique Local Unicast Address   1111 1100 ...     FC00::/7
                               1111 1101 ...
Multicast Address              1111 1111 ...     FF00::/8
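The address types above are distinguished purely by their leading bits, so an address can be classified by testing which prefix it falls in. A minimal sketch, with a hypothetical helper `address_type` and sample addresses chosen for illustration:

```python
import ipaddress

# Prefixes for each address type, as listed above.
ADDRESS_TYPES = [
    ("Unspecified", ipaddress.ip_network("::/128")),
    ("Loopback", ipaddress.ip_network("::1/128")),
    ("Global Unicast", ipaddress.ip_network("2000::/3")),
    ("Link Local Unicast", ipaddress.ip_network("fe80::/10")),
    ("Unique Local Unicast", ipaddress.ip_network("fc00::/7")),
    ("Multicast", ipaddress.ip_network("ff00::/8")),
]

def address_type(text: str) -> str:
    """Return the name of the first listed prefix that contains the address."""
    addr = ipaddress.IPv6Address(text)
    for name, prefix in ADDRESS_TYPES:
        if addr in prefix:
            return name
    return "Other"

for sample in ("::1", "2001:db8::1", "fe80::1", "fd00::1", "ff02::1"):
    print(sample, "->", address_type(sample))
```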
IPv6 Global Unicast Addresses
Field:    Global Routing Prefix | Subnet-id | Interface ID
Length:   48 bits               | 16 bits   | 64 bits
Role:     Provider              | Site      | Host
The Global Routing Prefix begins with the bits 001
IPv6 Global Unicast addresses are:
Addresses for generic use of IPv6
Hierarchical structure intended to simplify
aggregation
IPv6 Address Allocation
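Prefix boundaries in the address:
/12 Registry
/32 ISP prefix
/48 Site prefix
/64 LAN prefix
The remaining 64 bits are the Interface ID
Example fields shown in the figure: 2000, 0db8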
The allocation process is:
The IANA is allocating out of 2000::/3 for initial IPv6
unicast use
Each registry gets a /12 prefix from the IANA
Registry allocates a /32 prefix (or larger) to an IPv6 ISP
Policy is that an ISP allocates a /48 prefix to each end
customer
IPv6 Addressing Scope
64 bits reserved for the interface ID
Possibility of 2^64 hosts on one network LAN
Arrangement to accommodate MAC addresses
within the IPv6 address
16 bits reserved for the end site
Possibility of 2^16 networks at each end-site
65536 subnets equivalent to a /12 in IPv4
(assuming 16 hosts per IPv4 subnet)
IPv6 Addressing Scope
16 bits reserved for the service provider
Possibility of 2^16 end-sites per service provider
65536 possible customers: equivalent to each
service provider receiving a /8 in IPv4 (assuming a
/24 address block per customer)
32 bits reserved for service providers
Possibility of 2^32 service providers
i.e. 4 billion discrete service provider networks
Although some service providers already are justifying
more than a /32
Equivalent to the size of the entire IPv4 address
space
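The counts on these two slides come from the differences between prefix lengths. The sketch below works them out for one ISP block; 2001:db8::/32 is the documentation prefix, used here only as a stand-in for a real ISP allocation.

```python
import ipaddress

isp = ipaddress.ip_network("2001:db8::/32")   # stand-in ISP allocation

sites_per_isp = 2 ** (48 - isp.prefixlen)     # /48 end-sites inside a /32
site = next(isp.subnets(new_prefix=48))       # first end-site prefix
lans_per_site = 2 ** (64 - site.prefixlen)    # /64 LANs inside a /48
lan = next(site.subnets(new_prefix=64))       # first LAN prefix
hosts_per_lan = lan.num_addresses             # interface IDs in a /64

print(site, lan)          # 2001:db8::/48 2001:db8::/64
print(sites_per_isp)      # 65536
print(lans_per_site)      # 65536
print(hosts_per_lan == 2 ** 64)  # True
```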
Summary
Vast address space
Hexadecimal addressing
Distinct addressing hierarchy between ISPs,
end-sites, and LANs
ISPs have /32s
End-sites have /48s
LANs have /64s
Other IPv6 features discussed later
Routing
Routing and Forwarding
Routing is not the same as Forwarding
Routing is the building of maps
Each routing protocol usually has its own routing
database
Routing protocols populate the forwarding table
Forwarding is passing the packet to the next
hop device
Forwarding table contains the best path to the next
hop for each prefix
There is only ONE forwarding table
IP Routing
Each router or host makes its own routing decisions
Sending machine does not have to determine the entire path to
the destination
Sending machine just determines the next-hop along the path
(based on destination IP address)
This process is repeated until the destination is reached, or there’s
an error
Forwarding table is consulted (at each hop) to determine the
next-hop
IP Routing
Classless routing
route entries include
destination
next-hop
mask (prefix-length) indicating size of address space described by the
entry
Longest match
for a given destination, find longest prefix match in the routing table
example: destination is 35.35.66.42
routing table entries are 35.0.0.0/8, 35.35.64.0/19 and 0.0.0.0/0
All these routes match, but the /19 is the longest match
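The longest-match rule can be sketched in a few lines of Python. The table below uses the three prefixes from the example; the next-hop labels are invented for illustration, and a real router uses far more efficient data structures than a linear scan.

```python
import ipaddress

# Prefixes from the example above; next-hop labels are hypothetical.
ROUTES = {
    "35.0.0.0/8": "next-hop A",
    "35.35.64.0/19": "next-hop B",
    "0.0.0.0/0": "default gateway",
}

def lookup(destination: str):
    """Return (prefix, next_hop) for the longest prefix matching destination."""
    dest = ipaddress.ip_address(destination)
    matches = [ipaddress.ip_network(p) for p in ROUTES
               if dest in ipaddress.ip_network(p)]
    best = max(matches, key=lambda net: net.prefixlen)
    return str(best), ROUTES[str(best)]

print(lookup("35.35.66.42"))  # ('35.35.64.0/19', 'next-hop B')
print(lookup("35.1.2.3"))     # ('35.0.0.0/8', 'next-hop A')
print(lookup("8.8.8.8"))      # ('0.0.0.0/0', 'default gateway')
```

Because 0.0.0.0/0 matches every destination, the list of matches is never empty in this table.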
IP routing
Default route
where to send packets if there is no entry for the
destination in the routing table
most machines have a single default route
often referred to as a default gateway
0.0.0.0/0
matches all possible destinations, but is usually not the
longest match
Static vs. Dynamic routing
• Static routes
– Set up by administrator
– Changes need to be made by administrator
– Only good for small sites and star topologies
– Bad for every other topology type
• Dynamic routes
– Provided by routing protocols
– Changes are made automatically
– Good for network topologies which have redundant links (most!)
Dynamic Routing
Routers compute routing tables dynamically
based on information provided by other
routers in the network
Routers communicate topology to each other
via different protocols
Routers then compute one or more next hops
for each destination – trying to calculate the
most optimal path
Automatically repairs damage by choosing an
alternative route (if there is one)
A Large ISP with more than one
upstream provider
[Diagram: a Large ISP in Africa connected to Upstream ISPs in the USA and Europe]
Why does an ISP need BGP?
Multi-homing – connecting to multiple
providers
upstream providers
local networks – regional peering to get local traffic
Policy discrimination
controlling how traffic flows
do not accidentally provide transit to non-
customers
Aggregation
Defining BGP
BGP = Border Gateway Protocol
BGP is an exterior routing protocol
Focus on routing policy, not topology
BGP can make ‘groups’ of networks
(Autonomous Systems)
Good route filtering capabilities
Ability to isolate from other’s problems
BGP Protocol Basics
Routing Protocol used between ASes
If you aren’t connected to multiple ASes you don’t need BGP
Runs over TCP
[Diagram: peering between routers A and B in AS 100, C and D in AS 101, and E in AS 102]
BGP Protocol Basics
Uses Incremental updates
sends one copy of the RIB at the beginning, then
sends changes as they happen
Path Vector protocol
keeps track of the AS path of routing information
Many options for policy enforcement
Terminology
Transit – carrying network traffic across a network, usually for a
fee
Peering – exchanging routing information and traffic
covers only your customers' and your peers’ customers' network information,
not your peers’ peers and not your peers’ providers.
Peering also has another meaning:
BGP neighbour, whether or not transit is provided
Default – where to send traffic when there is no explicit route in
the routing table
What is an Exchange Point
Network Access Points (NAPs) established at
end of NSFnet
The original “exchange points”
Major providers connect their networks and
exchange traffic
High-speed network or ethernet switch
Simple concept – any place where providers
come together to exchange traffic
Internet Exchange Points
[Diagram: ISP A and ISP B each connected to both IXP 1 and IXP 2]
ISPs connect at Exchange Points or Network
Access Points to exchange traffic
Conceptual Diagram of an IXP
[Diagram: several ISP Routers attached to a shared Exchange Point Medium]
Why use an IXP?
KEEP LOCAL TRAFFIC LOCAL!!!
ISPs within a region peer with each other at the
local exchange
No need to have traffic go overseas only to come
back
Much reduced latency and increased performance
Why use an IXP?
VASTLY IMPROVES PERFORMANCE!!!
Network RTTs between organisations in the local
economy are measured in milliseconds, not seconds
Packet loss becomes virtually non-existent
Customers use the Internet for more products,
services, and activities
Why use an IXP?
Countries or regions with a successful IXP
have a successful Internet economy
Local traffic stays local
Money spent on local ‘net infrastructure
Service Quality not an issue
All this attracts businesses, customers, and
content
Domain Name System
(DNS) Fundamentals
Computers use IP addresses.
Why do we need names?
Names are easier for people to remember
Computers may be moved between
networks, in which case their IP address
will change.
The old solution: HOSTS.TXT
A centrally-maintained file, distributed to all
hosts on the Internet
•SPARKY 128.4.13.9
•UCB-MAILGATE 4.98.133.7
•FTPHOST 200.10.194.33
•... etc
This feature still exists:
/etc/hosts (UNIX)
c:\windows\hosts
hosts.txt does not scale
✗ Huge file (traffic and load)
✗ Name collisions (name uniqueness)
✗ Consistency
✗ Always out of date
✗ Single point of Administration
✗ Did not scale well
The Domain Name System was born
DNS is a distributed database for holding
name to IP address (and other) information
Distributed:
Shares the Administration
Shares the Load
Robustness and improved performance
achieved through
replication
and caching
Employs a client-server architecture
A critical piece of the Internet's
infrastructure
DNS is Hierarchical
DNS Database:
. (root)
  ke
    ac.ke
      kcct.ac.ke
  org
    isoc.org
      www.isoc.org
    afnog.org
  com
    google.com
Unix Filesystem:
/ (root)
  etc
    /etc/rc.d
  bin
  usr
    usr/local
      usr/local/src
    usr/sbin
Forms a tree structure
DNS is Hierarchical (contd.)
Globally unique names
Administered in zones (parts of the tree)
You can give away ("delegate") control of
part of the tree underneath you
Example:
isoc.org on one set of nameservers
wiki.tools.isoc.org on a different set
elists.isoc.org on another set
Domain Names are (almost) unlimited
Max 255 characters total length
Max 63 characters in each part
RFC 1034, RFC 1035
If a domain name is being used as a host name,
you should abide by some restrictions
RFC 952 (old!)
a-z 0-9 and minus (-) only
No underscores ( _ )
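The limits and character restrictions listed above can be turned into a simple check. This is a sketch that tests only the rules named on this slide; the function name is an assumption, and the RFCs contain further rules (for example about where a hyphen may appear) that are not checked here.

```python
import re

# One part (label): 1 to 63 characters from a-z, 0-9 and minus.
LABEL = re.compile(r"[a-z0-9-]{1,63}", re.IGNORECASE)

def is_valid_hostname(name: str) -> bool:
    """Check total length, per-part length and the allowed character set."""
    if name.endswith("."):
        name = name[:-1]          # ignore the trailing dot of an absolute name
    if not name or len(name) > 255:
        return False
    return all(LABEL.fullmatch(label) for label in name.split("."))

print(is_valid_hostname("www.isoc.org"))         # True
print(is_valid_hostname("my_host.example.org"))  # False (underscore)
print(is_valid_hostname("a" * 64 + ".org"))      # False (part longer than 63)
```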
Using the DNS
A Domain Name (like www.isoc.org) is the
KEY to look up information
The result is one or more RESOURCE
RECORDS (RRs)
There are different RRs for different types of
information
You can ask for the specific type you want,
or ask for "any" RRs associated with the
domain name
Commonly seen Resource Records
(RRs)
A (address): map hostname to IPv4 address
AAAA (quad A): map a hostname to IPv6
address
PTR (pointer): map IP address to hostname
MX (mail exchanger): where to deliver mail for
user@domain
CNAME (canonical name): map alternative
hostname to real hostname
TXT (text): any descriptive text
NS (name server), SOA (start of authority):
used for delegation and management of the
DNS itself
A Simple Example
Query: www.isoc.org.
Query type: A
Result:
www.isoc.org. 86400 IN A 206.131.241.137
In this case a single RR is found, but in
general, multiple RRs may be returned.
(IN is the "class" for INTERNET use of the
DNS)
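The result line above has five fields: the name, a time-to-live in seconds (86400 is 24 hours), the class, the record type and the data. A small sketch that splits such a line into named fields; the `ResourceRecord` type and `parse_rr` function are illustrative names, not part of any DNS library.

```python
from typing import NamedTuple

class ResourceRecord(NamedTuple):
    name: str
    ttl: int        # time-to-live in seconds
    rr_class: str   # "IN" for Internet
    rr_type: str    # A, AAAA, MX, ...
    data: str

def parse_rr(line: str) -> ResourceRecord:
    """Split a one-line RR in the format shown above into its five fields."""
    name, ttl, rr_class, rr_type, data = line.split(maxsplit=4)
    return ResourceRecord(name, int(ttl), rr_class, rr_type, data)

rr = parse_rr("www.isoc.org. 86400 IN A 206.131.241.137")
print(rr.rr_type, rr.data)   # A 206.131.241.137
print(rr.ttl // 3600)        # 24
```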
Possible results from a Query
POSITIVE
one or more RRs found
NEGATIVE
definitely no RRs match the query
SERVER FAIL
cannot find the answer
REFUSED
not allowed to query the server
How DNS resolution works (2)
[Diagram: the Resolver sends a Query to the Caching NS, which contacts several Auth NS in turn (steps 1 to 5) and returns the Response to the Resolver]
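DNS Resolving and Caching
Resolver asks the Caching Forwarder (Recursive): www.my.co.ke A?
Caching Forwarder asks the Root Server: www.my.co.ke A?
Root Server replies: Ask .ke server @ mzizi.kenic.or.ke (+glue)
Caching Forwarder asks the ccTLD Server: www.my.co.ke A?
ccTLD Server replies: Ask kenic server @ ole.kenic.or.ke (+glue)
Caching Forwarder asks the KENIC Server: www.my.co.ke A?
KENIC Server replies: 62.8.88.72
Caching Forwarder adds the answer to its cache and returns 62.8.88.72 to the Resolver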
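The referral chain in this diagram can be imitated with a toy model. The sketch below does no real DNS traffic: the `SERVERS` table is a hand-written stand-in for the root, ccTLD and KENIC servers, and the function name is an assumption. It only shows how a caching forwarder follows referrals and then answers later queries from its cache.

```python
# Toy stand-in for the three servers in the diagram (no real DNS queries).
SERVERS = {
    "root": {"refer_to": "mzizi.kenic.or.ke"},
    "mzizi.kenic.or.ke": {"refer_to": "ole.kenic.or.ke"},
    "ole.kenic.or.ke": {"answer": {"www.my.co.ke": "62.8.88.72"}},
}

cache = {}

def resolve(name):
    """Return (address, servers_queried), answering from the cache if possible."""
    if name in cache:
        return cache[name], []
    server, queried = "root", []
    while True:
        queried.append(server)
        reply = SERVERS[server]
        if "answer" in reply:
            address = reply["answer"][name]
            cache[name] = address        # add to cache
            return address, queried
        server = reply["refer_to"]       # follow the referral

first = resolve("www.my.co.ke")
second = resolve("www.my.co.ke")
print(first)   # ('62.8.88.72', ['root', 'mzizi.kenic.or.ke', 'ole.kenic.or.ke'])
print(second)  # ('62.8.88.72', [])
```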
Contention Ratio
Definition
The ratio of the potential maximum demand (usage)
to the actual bandwidth available - ref wikipedia
Also referred to as Overbooking ratio
Call it the bandwidth sharing ratio
Most service providers do not disclose this ratio
In the UK it is 50:1 on BT home ADSL and 20:1 for
business subscribers
The ratio is higher in the US (see the Comcast case)
No Data on Kenyan ISPs contention ratio
Argument for contention ratio is that 10% of
subscribers utilize over 80% of bandwidth available
What does it mean?
In the ratio of 50:1 it means if you have a
1Mbps link you are most likely sharing such
(transit) with 49 other subscribers.
Therefore if all users were online
simultaneously you would get a speed of
20Kbps
The easiest way to observe this is to compare downloads
at peak hours and off-peak hours
Also, locally hosted content would be subject
to lower contention ratios
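The worst-case figure above is a single division. A minimal sketch, assuming 1 Mbps is taken as 1000 Kbps and that every subscriber is active at the same time (the function name is illustrative):

```python
def worst_case_kbps(link_kbps: float, contention_ratio: int) -> float:
    """Per-subscriber share if every subscriber is active at once."""
    return link_kbps / contention_ratio

print(worst_case_kbps(1000, 50))  # 20.0 (home ADSL at 50:1)
print(worst_case_kbps(1000, 20))  # 50.0 (business at 20:1)
```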
Monitoring and Measurement
Tools
Why Monitoring is important
To check network health status
Identify network bottlenecks
Plan for growth and expansion
Address security issues
Open Source Monitoring tools
MRTG & Cacti for bandwidth utilization
Nagios for service monitoring
Smokeping for RTT and availability
Flowd, NfSen for protocol analysis and utilization
Webalizer - Web-server log monitoring
Rancid - router management
Snort - intrusion detection
Wireshark - tcpdump capture and log file analysis
Open Source Measurement Tools
Ping - one-way RTT and reachability
Traceroute - one-way reachability
Mtr - one-way path attributes, packet loss
Netperf - client/server bandwidth, throughput
Iperf - client/server bandwidth, throughput
Pathchar - one-way bandwidth, throughput
Iperf sample
michuki:~ michuki$ iperf -c wavu.kixp.or.ke
------------------------------------------------------------
Client connecting to wavu.kixp.or.ke, TCP port 5001
TCP window size: 65.0 KByte (default)
------------------------------------------------------------
[ 3] local 10.0.1.2 port 62134 connected with 80.240.194.142 port 5001
[ ID] Interval Transfer Bandwidth
[ 3] 0.0-13.6 sec 232 KBytes 139 Kbits/sec
Netperf
michuki:~ michuki$ sudo netperf -H wavu.kixp.or.ke -f k
Password:
TCP STREAM TEST to wavu.kixp.or.ke
Recv Send Send
Socket Socket Message Elapsed
Size Size Size Time Throughput
bytes bytes bytes secs. 10^3bits/sec
65536 65535 65535 27.35 54.64
Thank you for your attention!
Most of the slides used are lifted from the AfNOG training material
available at:
www.ws.afnog.org