Advanced Computer Networks - CS716 Power Point Slides Lecture 25
Lecture No. 25
Review Lecture
Switched Networks
A network can be defined recursively as...
Two or more nodes connected by a link
Circular nodes (switches) implement the network
Square nodes (hosts) use the network
Switched Networks
A network can be defined recursively as...
Two or more networks connected by one or more nodes: internetworks
Circular nodes (routers or gateways) interconnect the networks
A cloud denotes any type of independent network
Switching Strategies
Circuit switching: Carry bit streams
a. establishes a dedicated circuit
b. links reserved for use by communication channel
c. send/receive bit stream at constant rate
d. example: original telephone network
Multiplexing
Physical links/switches must be shared among users
(Synchronous) Time-Division Multiplexing (TDM)
Frequency-Division Multiplexing (FDM)
[Figure: multiple flows (L1, L2, L3 to R1, R2, R3) on a single link between Switch 1 and Switch 2]
Statistical Multiplexing
On-demand time-division, possibly synchronous (ATM)
Schedule link on a per-packet basis
Buffer packets in switches that are contending for the link
Packets from different sources interleaved on link
Inter-Process Communication
Turn host-to-host connectivity into process-to-process communication, making the communication meaningful. Fill gap between what applications expect and what the underlying technology provides.
[Figure: applications on several hosts communicating over a channel]
Performance Metrics
Bandwidth (throughput)
Data transmitted per unit time, e.g. 10 Mbps
Link bandwidth versus end-to-end bandwidth
Notation: KB = 2^10 bytes; Kbps = 10^3 bits per second
Performance Metrics
Latency / Delay
Time to send message from point A to point B
One-way versus Round-Trip Time (RTT)
Components
Latency = Propagation + Transmit + Queue
Propagation = Distance / c
Transmit = Size / Bandwidth
Note:
No queuing delay in direct (point-to-point) link
Bandwidth irrelevant if size = 1 bit
Process-to-process latency includes software processing overhead (dominates over shorter distances)
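The latency components above can be tried out with a short calculation. This is a minimal sketch: the function name, the 4000 km distance, the 1 KB message and the assumed propagation speed of 2 x 10^8 m/s are illustrative values, not taken from the slides.

```python
# Assumed propagation speed in the medium (roughly 2/3 of c); illustrative only.
PROPAGATION_SPEED_M_PER_S = 2.0e8

def latency_s(distance_m, size_bits, bandwidth_bps, queue_s=0.0,
              speed=PROPAGATION_SPEED_M_PER_S):
    """Latency = Propagation + Transmit + Queue, all in seconds."""
    propagation = distance_m / speed          # Distance / c
    transmit = size_bits / bandwidth_bps      # Size / Bandwidth
    return propagation + transmit + queue_s

# 1 KB (2^10 bytes) message over a 4000 km, 10 Mbps point-to-point link.
one_way = latency_s(distance_m=4_000_000, size_bits=8 * 1024, bandwidth_bps=10e6)

# A 1-bit message: the transmit term is negligible, so bandwidth hardly matters.
one_bit = latency_s(distance_m=4_000_000, size_bits=1, bandwidth_bps=10e6)
```

With these numbers propagation (20 ms) dominates transmit time (about 0.8 ms), which is the situation the note about 1-bit messages describes.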
Network Architecture
The challenge is to fill the gap between hardware capabilities and application expectations, and to do so while delivering good performance
Designers cope with this complex task by developing a network architecture as a guideline
Layering, protocols, standards
Layering
Alternative abstractions at each layer
Manageable network components
Modify layers independently
[Figure: layered system with application programs above a request/reply channel and a message stream channel]
Protocols
Building blocks of a network architecture
Each protocol object has two different interfaces
service interface: operations on this protocol
peer-to-peer interface: messages exchanged with peer
Protocol Interfaces
[Figure: on Host 1 and Host 2, a high-level object uses a protocol through the service interface; the two protocol instances talk through the peer-to-peer interface]
RRP: Request Reply Protocol
MSP: Message Stream Protocol
HHP: Host-to-Host Protocol
[Figure: protocol graph with RRP and MSP above HHP on each host]
Protocol Machinery
Multiplexing and Demultiplexing (demux key)
Encapsulation (header/body) in peer-to-peer interfaces
Indirect communication (except at hardware level)
Each protocol adds a header
Part of header includes demultiplexing field (e.g., pass up to request/reply or to message stream?)
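A minimal sketch of encapsulation and demultiplexing for the RRP/MSP/HHP example. The header layouts (a 3-byte tag for RRP and MSP, a 1-byte demux key for HHP) and the key values are hypothetical; the slides do not define them.

```python
# Hypothetical demux key values; the slides do not define any.
RRP_KEY = 1
MSP_KEY = 2

def rrp_send(data: bytes) -> bytes:
    """RRP adds its own header (a made-up 3-byte tag) in front of the data."""
    return b"RRP" + data

def hhp_send(payload: bytes, demux_key: int) -> bytes:
    """HHP adds a 1-byte header that carries the demux key."""
    return bytes([demux_key]) + payload

def rrp_receive(message: bytes):
    """Strip the RRP header and hand the data to the application."""
    return ("RRP", message[3:])

def msp_receive(message: bytes):
    """Strip the MSP header and hand the data to the application."""
    return ("MSP", message[3:])

def hhp_receive(frame: bytes):
    """Strip the HHP header and use the demux key to pick the upper protocol."""
    key, payload = frame[0], frame[1:]
    handler = {RRP_KEY: rrp_receive, MSP_KEY: msp_receive}[key]
    return handler(payload)

frame = hhp_send(rrp_send(b"hello"), RRP_KEY)   # HHP header | RRP header | data
delivered = hhp_receive(frame)
```

Each layer only looks at its own header, which is why the receiver peels the headers off in the reverse order they were added.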
Encapsulation
[Figure: on Host 1 the application's Data gets an RRP header and then an HHP header on the way down; Host 2 removes the HHP and RRP headers on the way up]
Standard Architectures
Open System Interconnect (OSI) Architecture
International Standards Organization (ISO)
International Telecommunications Union (ITU), formerly CCITT
X dot series: X.25, X.400, X.500
Primarily a reference model
OSI Architecture
[Figure: OSI stacks of two end hosts (all seven layers) joined by intermediate nodes (network, data link and physical layers only); a user level / OS kernel boundary is marked]
Application
Presentation: data formatting
Session: connection management
Transport: process-to-process communication channel
Network: host-to-host packet delivery
Data link: framing of data bits
Physical
Internet Architecture
TCP/IP Architecture
Developed with ARPANET and NSFNET
Internet Engineering Task Force (IETF)
Culture: implement, then standardize
OSI culture: standardize, then implement
Became popular with release of Berkeley Software Distribution (BSD) Unix; i.e. free software
Standard suggestions traditionally debated publicly through Request For Comments (RFCs)
Internet Architecture
Implementation and design done together
Hourglass Design (bottleneck is IP)
Application vs Application Protocol (FTP, HTTP)
[Figure: Internet protocol graph; FTP, HTTP, NV and TFTP over TCP and UDP, over IP, over NET1, NET2, ..., NETn]
Internet Architecture
Layering is not very strict
[Figure: application above TCP and UDP, above IP, above the network; the application may bypass layers]
Socket API
Use sockets as abstract endpoints of communication
Issues
Creating & identifying sockets
Sending & receiving data
Mechanisms
UNIX system calls and library routines
[Figure: a process attached to a socket]
Protocol-to-Protocol Interface
A protocol interacts with a lower level protocol like an application interacts with underlying network
Why not use available network APIs for PPI?
Inefficiencies built into the socket interface
Application programmers tolerate them to simplify their task
inefficiency at one level
Process Model
Avoid context switches
Buffer Model
Avoid data copies
Process Model
[Figure: (a) Process-per-Protocol, (b) Process-per-Message; layers interact by inter-process communication or by procedure call]
Buffer Model
[Figure: buffer copies between the application process and the topmost protocol at send() and deliver()]
Network Programming
Things to Learn
Internet protocols (IP, TCP, UDP, ...)
Sockets API (Application Programming Interface)
Socket Programming
Reading: Stevens 2nd edition, Chapter 1-6
Sockets API: A transport layer service interface
Introduced in 1981 by BSD 4.1
Implemented as library and/or system calls
Similar interfaces to TCP and UDP
Can also serve as interface to IP (for super-user), known as raw sockets
Linux also provides interface to MAC layer (for super-user), known as data-link sockets
Client-Server Model
Asymmetric relationship
Server/Daemon
Well-known name
Waits for contact
Processes requests, sends replies
Client
Initiates contact
Waits for response
Client-Server Model
Bidirectional communication channel
Service models
Sequential: server processes only one client's requests at a time
Concurrent: server processes multiple clients' requests simultaneously
Hybrid: server maintains multiple connections, but processes requests sequentially
TCP Connections
TCP connection setup via 3-way handshake
J and K are sequence numbers for messages
Client -> Server: SYN J
Server -> Client: SYN K, ACK J+1
Client -> Server: ACK K+1
Hmmm ... RTT is important!
TCP Connections
TCP connection teardown (4 steps)
(either client or server can initiate connection teardown)
Client (active close) -> Server: FIN J
Server (passive close) -> Client: ACK J+1
Server (closes connection) -> Client: FIN K
Client -> Server: ACK K+1
Hmmm ... Latency matters!
Endianness
Machines on Internet have different endianness
Little-endian (Intel, DEC): least significant byte of word stored in lowest memory address
Big-endian (Sun, SGI, HP): most significant byte...
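The difference in byte order can be seen by packing the same 32-bit value both ways. This is a small illustration using Python's standard `struct` and `socket` modules; the value 0x01020304 is arbitrary.

```python
import socket
import struct

value = 0x01020304

big = struct.pack(">I", value)       # big-endian: most significant byte first
little = struct.pack("<I", value)    # little-endian: least significant byte first
network = struct.pack("!I", value)   # "!" means network byte order (big-endian)

# htonl/ntohl convert between host and network order; a round trip is a no-op.
round_trip = socket.ntohl(socket.htonl(value))
```

Network byte order is big-endian, so `network` equals `big` on any host, while the result of `socket.htonl(value)` on its own depends on the host's endianness.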
Address Conversion
All binary values used and returned by these functions are network byte ordered
struct hostent* gethostbyname (const char* hostname);
Address Conversion
in_addr_t inet_addr (const char* strptr);
translate dotted-decimal notation to IP address; returns -1 on failure, thus cannot handle broadcast value 255.255.255.255
int inet_aton (const char* strptr, struct in_addr* inaddr);
Socket API
Creating a socket
int socket(int domain, int type, int protocol)
domain (family) = AF_INET, PF_UNIX, AF_OSI
type = SOCK_STREAM, SOCK_DGRAM
protocol = TCP, UDP, UNSPEC
return value is a handle for the newly created socket
Sockets (cont)
Passive Open (on server)
int bind(int socket, struct sockaddr *addr, int addr_len)
int listen(int socket, int backlog)
int accept(int socket, struct sockaddr *addr, int *addr_len)
Sockets (cont)
Sending Messages
int send(int socket, char *msg, int mlen, int flags)
Receiving Messages
int recv(int socket, char *buf, int blen, int flags)
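The calls above map directly onto Python's `socket` module, which wraps the same BSD API. The sketch below is one possible arrangement, not the lecture's own code: a one-shot echo server on the loopback interface (passive open with bind/listen/accept) and a client that connects, sends and receives. The port is chosen by the OS, and the helper name `echo_once` is made up for this example.

```python
import socket
import threading

def echo_once(server_sock):
    """Accept one connection and echo back whatever arrives first."""
    conn, _peer = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)

# Passive open (server side): socket, bind, listen, then accept in a thread.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))        # port 0 lets the OS pick a free port
server.listen(1)
port = server.getsockname()[1]
worker = threading.Thread(target=echo_once, args=(server,))
worker.start()

# Active open (client side): socket, connect, send, recv.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))
client.sendall(b"hello")
reply = client.recv(1024)
client.close()

worker.join()
server.close()
```

The server here is sequential in the sense of the service models above: it handles exactly one client and then stops.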
Point-to-Point Links
Reading: Peterson and Davie, Ch. 2
Outline
Hardware building blocks
Encoding
Framing
Error Detection
Reliable transmission
Sliding Window Algorithm
[Figure: protocol stack (user-level software, transport, network, data link, physical) annotated with reliability]
Links
Copper wire with electronic signaling
Glass fiber with optical signaling
Wireless with electromagnetic (radio, infrared, microwave) signaling
Links
Physical Media
Twisted pair cable
Coaxial cable
Optical fiber
Space
Media is used to propagate signals
Signals are electromagnetic waves of certain frequency, traveling at speed of light
Encoding
Signals propagate over a physical medium
Modulate electromagnetic waves, e.g. vary voltage
[Figure: two nodes, each with an adaptor, connected by a link]
Encoding
Digital data (a string of symbols) -> modulator -> a string of signals -> demodulator
RS-232(-C)
Communication between computer and modem
Uses two voltage levels (+15V, -15V), a binary voltage encoding
Data rate limited to 19.2 kbps (RS-232-C), raised in later standards
NRZ
Problem: Consecutive 1s or 0s
Low signal (0) may be interpreted as no signal
High signal (1) leads to baseline wander
Unable to recover clock
Sender's and receiver's clocks have to be precisely synchronized
Receiver resynchronizes on each signal transition
Clock drift in long periods without transition
[Figure: sender's clock versus receiver's clock]
Alternative Encodings
Non-Return to Zero Inverted (NRZI)
Make a transition from current signal (switch voltage level) to encode/transmit a one
Stay at current signal (maintain voltage level) to encode/transmit a zero
Solves the problem of consecutive ones (shifts to 0s)
Alternative Encodings
Manchester (in IEEE 802.3 10 Mbps Ethernet)
Split cycle into two parts
Send high-low for 1, low-high for 0
Transmit XOR of NRZ encoded data and the clock
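The three encodings can be compared side by side. In this sketch signal levels are represented as 0 (low) and 1 (high); the NRZI starting level and the clock shape (low in the first half of each bit time, high in the second) are assumptions chosen so that the output matches the "high-low for 1, low-high for 0" rule.

```python
def nrz(bits):
    """NRZ: the signal level is simply the bit value."""
    return list(bits)

def nrzi(bits, start_level=0):
    """NRZI: a 1 toggles the current level, a 0 keeps it."""
    level, out = start_level, []
    for bit in bits:
        if bit == 1:
            level ^= 1
        out.append(level)
    return out

def manchester(bits):
    """Manchester: XOR of the NRZ data with the clock, two half-bits per bit."""
    clock = (0, 1)  # assumed clock: low first half, high second half
    out = []
    for bit in bits:
        out.extend(bit ^ half for half in clock)
    return out

data = [1, 1, 1, 1, 0, 0, 0, 0]
nrz_signal = nrz(data)
nrzi_signal = nrzi(data)
manchester_signal = manchester(data)
```

For this data NRZI produces transitions during the run of ones but stays flat during the run of zeros, whereas Manchester has a transition in every bit time at the cost of doubling the signalling rate.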
4B/5B Encoding
Every 4 consecutive bits of data encoded in a 5-bit code (symbol)
4-bit pattern is translated to a 5-bit pattern (not addition)
5-bit codes selected to have no more than one leading 0 and no more than two trailing 0s
00xxx (8 symbols) and xx000 (4 symbols) are illegal
5 free symbols (non-data)
Thus, never gets more than three consecutive 0s
Resulting 5-bit codes are transmitted using NRZI
Achieves 80% efficiency
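A table-driven sketch of 4B/5B encoding. The slides do not list the code table; the mapping below is the commonly published 4B/5B data-symbol table and is reproduced here as an assumption. The function name is made up.

```python
# Commonly published 4B/5B data symbols (assumed; not listed on the slide).
FOUR_B_FIVE_B = {
    "0000": "11110", "0001": "01001", "0010": "10100", "0011": "10101",
    "0100": "01010", "0101": "01011", "0110": "01110", "0111": "01111",
    "1000": "10010", "1001": "10011", "1010": "10110", "1011": "10111",
    "1100": "11010", "1101": "11011", "1110": "11100", "1111": "11101",
}

def encode_4b5b(bits: str) -> str:
    """Translate each 4-bit group into its 5-bit code."""
    if len(bits) % 4 != 0:
        raise ValueError("input length must be a multiple of 4")
    return "".join(FOUR_B_FIVE_B[bits[i:i + 4]] for i in range(0, len(bits), 4))

encoded = encode_4b5b("00000000")   # eight data zeros
codes = list(FOUR_B_FIVE_B.values())
```

Because every code has at most one leading 0 and at most two trailing 0s, any two adjacent codes contain at most three consecutive 0s, so NRZI always sees a transition soon enough. Five signal bits carry four data bits, hence the 80% efficiency.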
[Figure: 8-symbol example]
Summary of Encoding
Problems: attenuation, dispersion, noise
Digital transmission allows periodic regeneration
Variety of binary voltage encodings
High frequency components limit to short range
More voltage levels provide higher data rate
Framing
Breaks continuous stream/sequence of bits into a frame and demarcates units of transfer
Typically implemented by network adaptor
Adaptor fetches/deposits frames out of or into host memory
[Figure: bits flow between the adaptors of Node A and Node B; frames flow between each adaptor and its node]
Advantages of Framing
Synchronization recovery
Consider continuous stream of unframed bytes
Recall RS-232 start and stop bits
Multiplexing of link
Multiple hosts on shared medium
Simplifies multiplexing of logical channels
Approaches
Organized by end of frame detection method
Approaches to framing
Sentinel (marker, like C strings)
Length-based (like Pascal strings)
Clock-based
Approaches
Frame layout: Beginning sequence | Header | Body | CRC | Ending sequence
Length-based Framing
Include payload length in header
e.g., DDCMP (byte-oriented, variable-length)
e.g., RS-232 (bit-oriented, implicit fixed length)
DDCMP frame (field widths in bits): SYN (8) | SYN (8) | Class (8) | Length (14) | Header (42) | Body | CRC (16)
Clock-based Framing
Continuous stream of fixed-length frames
Each frame is 125 µs long (all STS formats) (why?)
[Figure: frame laid out as 9 rows by 90 columns]
Clock-based Framing
Problem: how to recover frame synchronization
2-byte synchronization pattern starts each frame (unlikely to occur in data)
Wait until pattern appears in same place repeatedly
Clock-based Framing
Problem: how to maintain clock synchronization
NRZ encoding, data scrambled (XORed) with 127-bit pattern
Creates transitions
Also reduces chance of finding false sync pattern
Error Detection
Validates correctness of each frame
Errors checked at many levels
Demodulation of signals into symbols (analog)
Bit error detection/correction (digital): our main focus
Within network adapter (CRC check)
Within IP layer (IP checksum)
Possibly within application as well
[Figure: possible binary voltage encoding symbol neighborhoods and erasure region (0, ? (erasure), 1 on the voltage axis); possible QAM symbol neighborhoods in green, all other space results in erasure]
k bits are derived from the original message
Both the sender and receiver know the algorithm
Two-Dimensional Parity
Adding one extra bit to a 7-bit code to balance 1s
Extra parity byte for the entire frame
Catches all 1, 2 and 3 bit errors and most 4 bit errors
14 redundant bits for a 42-bit message, in the example
Data      Parity bit
0101001   1
1101001   0
1011110   1
0001110   1
0110100   1
1011111   0
1111011   0   (parity byte)
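The parity bits and the parity byte in the example can be recomputed with a few lines. This is a sketch assuming even parity, which is what the example values on this slide are consistent with; the function names are made up.

```python
def parity_bit(bits: str) -> str:
    """Even parity: '1' if the number of 1s is odd, so the total becomes even."""
    return str(bits.count("1") % 2)

def two_d_parity(rows):
    """Append a parity bit to each row, then build a parity row over all columns."""
    with_row_parity = [row + parity_bit(row) for row in rows]
    parity_row = "".join(parity_bit("".join(col)) for col in zip(*with_row_parity))
    return with_row_parity, parity_row

data_rows = ["0101001", "1101001", "1011110", "0001110", "0110100", "1011111"]
rows_out, parity_row = two_d_parity(data_rows)
```

The six data rows hold 42 bits; six row-parity bits plus the 8-bit parity row give the 14 redundant bits mentioned above.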
16 redundant bits for a message of any length
Weak protection, accepted as a last line of defense
Practice
Bitwise XORs
Receiver divides (P(x) + E(x)) by C(x); remainder will be zero ONLY if:
E(x) was zero (no error), or
E(x) is exactly divisible by C(x)
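The divisibility condition can be checked with modulo-2 polynomial division, which is just shifting and bitwise XOR. In this sketch polynomials are stored as integers (bit i is the coefficient of x^i); the message 10011010 and the generator C(x) = x^3 + x^2 + 1 (1101) are illustrative choices, not values from the slide.

```python
def mod2_remainder(value: int, generator: int) -> int:
    """Remainder of value divided by generator using bitwise XOR (mod-2) division."""
    glen = generator.bit_length()
    while value.bit_length() >= glen:
        # Align the generator under the leading 1 and subtract (XOR).
        value ^= generator << (value.bit_length() - glen)
    return value

generator = 0b1101                    # C(x) = x^3 + x^2 + 1, degree k = 3
message = 0b10011010
k = generator.bit_length() - 1

crc = mod2_remainder(message << k, generator)   # remainder of M(x) * x^k
codeword = (message << k) | crc                 # P(x), divisible by C(x)

ok = mod2_remainder(codeword, generator)                      # no error
single_bit = mod2_remainder(codeword ^ 0b1, generator)        # E(x) = 1
undetected = mod2_remainder(codeword ^ (generator << 2), generator)  # E(x) = C(x) * x^2
```

The last case shows the weakness stated above: an error pattern that happens to be a multiple of C(x) leaves a zero remainder and goes unnoticed.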
Reliable Transmission
Error-correcting codes are not advanced enough to handle the range of bit and burst errors
Corrupt frames generally must be discarded
A reliable link-level protocol must recover from discarded frames
Reliable Transmission
Reliability accomplished using acknowledgments and timeouts
ACK is a small control frame confirming reception of an earlier frame
Having no ACK, sender retransmits after a timeout
Reliable Transmission
Automatic Repeat reQuest (ARQ) algorithms
Stop-and-wait
Concurrent logical channels
Sliding window
Go-back-n, or selective repeat
Stop-and-Wait
Send a single frame
Wait for ACK or timeout
If ACK received, continue with next frame
If timeout occurred, send again (and wait)
Frame lost in transit; or corrupted and discarded
[Figure: timeline of sender transmitting Frame 0 to receiver]
Stop-and-Wait
Frames delivered reliably and in order
Is that enough?
No, we need performance, too.
Sliding Window
Allow sender to transmit multiple frames before receiving an ACK, thereby keeping the pipe full
Upper bound on outstanding un-ACKed frames
Also used at the transport layer (by TCP)
[Figure: timeline of sender and receiver with several frames in flight]
[Figure: sender window over frames 11 to 20, with LAR = 13, LFS = 18 and window size SWS]
Advance LAR when ACK arrives
Buffer up to SWS frames and associate timeouts
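The sender-side bookkeeping (LAR, LFS, SWS) can be sketched as a small class. This is a simplified illustration: ACKs are assumed to be cumulative, sequence numbers are unbounded integers, and timeouts and retransmission are left out. The class and method names are made up.

```python
class SlidingWindowSender:
    """Tracks the sender window invariant LFS - LAR <= SWS."""

    def __init__(self, sws: int):
        self.sws = sws      # send window size
        self.lar = -1       # last acknowledgment received
        self.lfs = -1       # last frame sent
        self.buffer = {}    # un-ACKed frames kept for possible retransmission

    def can_send(self) -> bool:
        return self.lfs - self.lar < self.sws

    def send(self, payload) -> int:
        if not self.can_send():
            raise RuntimeError("window full: wait for an ACK")
        self.lfs += 1
        self.buffer[self.lfs] = payload
        return self.lfs

    def ack(self, seq: int) -> None:
        """Cumulative ACK (assumed): everything up to seq has been received."""
        while self.lar < seq and self.lar < self.lfs:
            self.lar += 1
            self.buffer.pop(self.lar, None)

sender = SlidingWindowSender(sws=3)
sent = [sender.send(f"frame {i}") for i in range(3)]
blocked = not sender.can_send()     # three frames outstanding, window is full
sender.ack(1)                       # frames 0 and 1 acknowledged
```

After the ACK the window slides forward by two, so two more frames may be sent while frame 2 is still buffered.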
[Figure: timelines comparing Stop-and-Wait, Go-back-N and Selective Repeat]
All possible packets must have unique SeqNum
SWS < (MaxSeqNum+1)/2, or SWS+RWS < MaxSeqNum+1, is the correct rule
Intuitively, SeqNum slides between two halves of sequence number space
History of Ethernet
Developed by Xerox PARC in mid-1970s
Roots in Aloha packet-radio network
Standardized by Xerox/DEC/Intel in 1978
Similar to IEEE 802.3 standard
IEEE 802.3u standard defines Fast Ethernet (100 Mbps)
New switched Ethernet now popular
[Figure: hosts connected through hubs and a repeater]
[Figure: Ethernet frame format, including Dest addr]
Carrier sense
Nodes can distinguish between an idle and busy link
Collision detection
A node listens as it transmits to detect collision
If collision detected
Stop sending data and send jam signal
Try again later
51.2 µs on 10 Mbps corresponds to 512 bits (64 bytes)
Therefore, the minimum frame length for Ethernet is 64 bytes (header + 46 bytes data)
Retry After the Collision
How long should a host wait to retry after a collision?
Binary exponential backoff
Maximum backoff doubles with each failure (exponential)
After N failures, pick an N-bit number
2^N discrete possibilities from 0 to maximum
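A minimal sketch of the backoff choice. Picking an N-bit number follows the description above; multiplying it by the 51.2 µs slot time to get a delay is an assumption added here for illustration, and the function names are made up.

```python
import random

SLOT_TIME_US = 51.2   # assumed unit of backoff, in microseconds

def backoff_slots(failures: int, rng=random) -> int:
    """After N failures, pick an N-bit number: uniform in 0 .. 2^N - 1."""
    return rng.randrange(2 ** failures)

def backoff_delay_us(failures: int, rng=random) -> float:
    """Delay before the next attempt, in microseconds."""
    return backoff_slots(failures, rng) * SLOT_TIME_US

rng = random.Random(716)            # seeded only to make the run repeatable
delays = [backoff_delay_us(n, rng) for n in range(1, 6)]
```

The maximum possible wait doubles with every failure, which spreads contending hosts out quickly when the link is busy.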
Ethernet Frame Reception
Sender handles all access control
Receiver simply pulls frames from network
Ethernet controller/card
Sees all frames
Selectively passes frames to host processor
Experience With Ethernet
Number of hosts limited to 200 in practice, standard allows 1024
Range much shorter than 2.5 km limit in standard
Round-trip time is typically 5 or 10 µs, not 50 µs
The term token indicates the way the access to shared channel is managed
Stations get round-robin service as the token circulates around the ring
Physical Properties
Data rate can be 4 Mbps or 16 Mbps
Encoding of bits uses differential Manchester
Ring may have up to 250 (802.5) or 260 (IBM) nodes
Physical medium is twisted pair (IBM Token Ring)
Frame Format
Illegal Manchester codes in the start and end delimiters
Frame priority and reservation bits in access control byte
Demux key in frame control byte
A and C bits for reliable delivery, in status byte
Fields (widths in bits): Start delimiter (8) | Access control (8) | Frame control (8) | Dest addr (48) | Src addr (48) | Body (variable) | CRC (32) | End delimiter (8) | Frame status (8)
Reliable Delivery
The A and C bits in the packet trailer are used for reliability
Both bits are initially set to 0
Destination sets A bit if it sees the frame and sets C bit if it copies the frame into its adaptor
Strict priority scheme: no lower-priority packets get sent when higher priority packets are waiting
Token Maintenance
Token rings have a designated monitor node
Any station can become the monitor according to a well defined procedure
Monitor is elected when the ring is first connected, or when the current monitor fails
Token Maintenance
Monitor periodically announces its presence
Claim token sent by a station seeing no monitor
If the sender receives back the claim token, it becomes monitor
If another station is also contending for monitor, some rule defines the monitor
FDDI Physical Properties
Variable size buffer (9 to 80 bits) between input and output interfaces (10 ns bit time)
Not required to fill buffer before starting transmission
Total 200 km fiber: dual nature implies 100 km cable connecting all stations
Physical media can be coax or twisted pair cable
Uses 4B/5B encoding
MAC Algorithm
Each node measures TRT between successive token arrivals
If measured-TRT > TTRT
Token is late
Cannot send data
Synchronous traffic
Latency sensitive
Gets higher priority
Can always send data
Therefore, total synchronous data during one token rotation is bounded by TTRT
Frame Format
4B/5B control symbols for start and end of frame
Control Field
1st bit: asynchronous (0) versus synchronous (1) data
2nd bit: 16-bit (0) versus 48-bit (1) addresses
Last 6 bits: demux key (includes reserved patterns for token and claim frame)
Status Field
From receiver back to sender; error in frame
Recognized address; accepted frame (flow control)
Fields (widths in bits): Start of frame (8) | Control (8) | Dest addr (48) | Src addr (48) | Body (variable) | CRC (32) | End of frame (8) | Status (24)
Wireless LANs
IEEE 802.11 standard
Designed for use in a small area (offices, campuses)
Bandwidth: 1, 2 or 11 Mbps
Up to 54 Mbps in newer 802.11a standard
Is it sufficient?
All nodes are not always within reach of (to hear) each other
Exposed nodes
Sender does not send when it's OK to send (false -ve)
B and C are exposed nodes in the figure below
Collision detection
No active collision detection
Known only if CTS or ACK is not received
Binary exponential back off (BEB) is used in case of collision, like in Ethernet
802.11 - Distribution System
Nodes roam freely but operate within a structure
Tethered by wired network infrastructure (Ethernet ?)
Each Access Point (AP) services nodes in some region
Each mobile node associates itself with an AP
Managing Connectivity/Roaming
How do wireless nodes select an Access Point?
Scanning (active search for an AP)
Node sends Probe frame
All APs within reach reply with Probe Response frame
Node selects one AP; sends it Associate Request frame
AP replies with Association Response
New AP informs old AP via wired backbone
Managing Connectivity
Active scanning: when a node joins or moves
Passive scanning: AP periodically sends Beacon frame, advertising its capabilities
[Figure: distribution system connecting AP-1, AP-2 and AP-3, with mobile nodes A to H]
Frame Format
Control field contains three subfields:
6-bit Type field (data, RTS, CTS, scanning); 1-bit ToDS; and 1-bit FromDS
[Figure: addressing cases ToDS=0, FromDS=0 and ToDS=1, FromDS=1, involving node A, AP-1 and AP-3]
Overview
Also called network interface card (NIC)
Components (high-level overview)
Options for use
Data motion
Event notification
[Figure: host with cache, memory and memory bus; the network adaptor sits on the I/O bus and reaches the network through its link interface]
Host Perspective
Adaptor is ultimately programmed by CPU
Adaptor exports a Control Status Register (CSR)
CSR is readable and writable from CPU at some memory address
Data Motion
[Figure: data motion between CPU, memory and the network adaptor over the I/O bus]
Hardware interrupts
Processor free to do other things
Events delivered immediately
State (register) save/restore expensive
Context switches more expensive
Event polling
Processor must periodically check
Events wait until next check
No extra state changes
Device Drivers
Operating system routines anchoring protocol stack to network hardware
Initialize device, transmit frames, field interrupts
Code contains device specific details
Difficult to read but simple in logic
Performance Bottlenecks
Packet Switches
A multi-input multi-output device
Local star topology
Performance independent of connectivity (e.g. adding new host) if switch is designed with enough aggregate capacity
Forwarding
Packets arrive at one of the several inputs and have to be forwarded/switched to one of the available outputs
Connectionless and connection-oriented approach to determine the correct output
Routing
Forwarding requires information
[Figure: protocol layers at a host and at one or more switching nodes; a switch may connect different physical layers]
Three approaches
Datagram or connectionless approach
Virtual circuit or connection-oriented approach
Source routing
Datagram Switching
Managing tables in large, complex networks with dynamically changing topologies is a real challenge for the routing protocol
[Figure: example datagram network; hosts (A, B, E, G, H, ...) attached to numbered ports of the switches]
Datagram Model
No round trip time delay waiting for connection setup
Host can send data anywhere, anytime as soon as it is ready
Source has no way of knowing if the network is capable of delivering a packet or if the destination host is even up
VC Tables in VC Switching
Setup message in signaling process (to create VC table) is forwarded like a datagram
Acknowledgment of connection setup to downstream neighbors to complete signaling
Data transfer phase can start after ACK is received
Signaling in VC Switching
Setup message is forwarded from Host A to Host B
On connection request, each switch creates an entry in VC table with a VCI for the connection
[Figure: setup message travelling from Host A through Switches 1, 2 and 3 to Host B; each VC table has columns I/F in, VCI in, I/F out, VCI out]
Typically wait full RTT for connection setup before sending first data packet
Cannot avoid failures dynamically, must re-establish connection (old one is torn down to free storage space)
Source Routing
[Figure: source routing from Host A to Host B through Switches 1 and 3; the packet header carries the list of output ports in front of the data]
Source host needs to know the correct and complete topology of the network
Changes must propagate to all hosts
Packet headers may be large and variable in size: the length is unpredictable
Packet arriving at interface 1 has to go on interface 2
Point of contention for packets: I/O and memory bus
Problem
Want to scale LAN concept
Larger geographic area (Greater than O(1 km))
More hosts (Greater than O(100))
But retain LAN-like functionality
Solution: bridges
Bridges
Connect two or more LANs with a bridge
Transparently extends a LAN over multiple networks
Accept & forward strategy (in promiscuous mode)
Level 2 connection (does not add packet header)
[Figure: hosts A, B, C on Port 1 of the bridge; hosts X, Y, Z on Port 2]
Learning Bridges
Learn table entries based on source address
Timeout entries to allow movement of hosts
Table is an optimization; need not be complete
Always forward broadcast frames
Uses datagram or connectionless forwarding
[Figure: hosts A, B, C on Port 1 of the bridge; hosts X, Y, Z on Port 2]
Host  Port
A     1
B     1
C     1
X     2
Y     2
Z     2
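A sketch of the learning and forwarding logic. The class name, the 300-second entry lifetime and the explicit `now` argument are assumptions made for the example; an unknown (or broadcast) destination is simply flooded to every other port.

```python
import time

class LearningBridge:
    """Learns host -> port from source addresses; floods when it does not know."""

    def __init__(self, ports, ttl=300.0):
        self.ports = set(ports)
        self.ttl = ttl        # assumed entry lifetime in seconds
        self.table = {}       # host -> (port, time last seen)

    def receive(self, src, dst, in_port, now=None):
        """Return the list of ports the frame should be forwarded on."""
        now = time.monotonic() if now is None else now
        self.table[src] = (in_port, now)            # learn from the source address
        entry = self.table.get(dst)
        if entry is not None and now - entry[1] <= self.ttl:
            out_port = entry[0]
            # Destination is on the arrival segment: no need to forward.
            return [] if out_port == in_port else [out_port]
        return sorted(self.ports - {in_port})       # unknown or expired: flood

bridge = LearningBridge(ports=[1, 2, 3])            # third port added for illustration
first = bridge.receive("A", "X", in_port=1, now=0)      # X unknown: flood
second = bridge.receive("X", "A", in_port=2, now=1)     # A was learned on port 1
third = bridge.receive("B", "A", in_port=1, now=2)      # same segment: filtered
later = bridge.receive("A", "X", in_port=1, now=1000)   # entry for X timed out
```

Flooding keeps the bridge correct even when the table is incomplete, which is why the table is only an optimization.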
Learning Bridges
[Figure: extended LAN with LANs A to J interconnected by bridges B1 to B7, containing loops]
Problem
Redundancy (desirable to handle failures, but ...)
Makes extended LAN structure cyclic
Frames may cycle forever
Spanning Tree
Subset of forwarding possibilities
All LANs reachable, but acyclic
Bridges run a distributed algorithm to calculate the spanning tree
Select which bridges actively forward
Developed by Radia Perlman of DEC
Now IEEE 802.1 specification
Reconfigurable algorithm
[Figure: spanning tree over the extended LAN, marking the designated bridge, designated ports and preferred ports]
Limitations of Bridges
Do not scale
Spanning tree algorithm does not scale
Broadcast does not scale
ATM Signaling
Connection setup called signaling (standard Q.2931)
Route discovery, resource resv, QoS, ...
Send through network
Request setup circuit
Send setup frame on setup circuit
Establish locally
No intermediate switch involvement
Requires pre-established virtual path
Cell Switching (ATM)
Fixed length (53 bytes) frames are called cells
5-byte header (including 1-byte CRC-8) + 48-byte payload
Cell fields, in order: GFC | VPI | VCI | Type | CLP | HEC (CRC-8) | payload
Host-to-switch format
GFC: Generic Flow Control (still being defined)
VCI/VPI: Virtual Circuit/Path Identifier
Type: management, congestion control, AAL5 (later)
CLP: Cell Loss Priority
HEC: Header Error Check (CRC-8)
[Figure: AAL layered above ATM at each end]
ATM Layers
ATM Adaptation Layer (AAL)
Convergence Sublayer (CS) supports different application service models
Segmentation and Reassembly (SAR) supports variable-length frames
ATM Layer
Handles virtual circuits, cell header generation, flow control
Physical layer
Transmission Convergence (TC) handles error detection, framing
Physical medium dependent (PMD) sublayer handles encoding
[Figure: layer stack; AAL (CS, SAR), ATM, PHY (TC, PMD)]
AAL 3/4
Provides information to allow variable size packets to be sent in fixed-size ATM cells
Convergence Sublayer Protocol Data Unit (CS-PDU)
Fields (widths in bits): CPI (8) | Btag (8) | BAsize (16) | payload (< 64 KB) | Pad (0-24) | 0 (8) | Etag (8) | Length (16)
CPI: Common Part Indicator (version field)
Btag/Etag: beginning and ending tags (same)
BAsize: hint on reassembly buffer space to allocate
Length: size of whole PDU
Cell fields, in order: ATM header | Type | SEQ | MID | payload | Length | CRC-10
SEQ: Sequence Number (for cell loss/reordering)
MID: multiplexing ID (mux onto virtual circuits)
Length: number of bytes of PDU in this cell
[Figure: CS-PDU (header, user data, trailer) cut into 44-byte pieces, the last one shorter than 44 bytes and padded; each piece travels in a cell made of ATM header, AAL header, cell payload and AAL trailer]
AAL 5 CS-PDU
CS-PDU Format
Fields: data (< 64 KB) | pad (0-47 bytes) | reserved (2 bytes) | length (2 bytes) | CRC-32 (32 bits)
Pad so trailer always falls at the end of ATM cell
Length: size of PDU (data only)
CRC-32 (detects missing or misordered cells)
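The padding rule is a small piece of modular arithmetic: data, pad and the 8-byte trailer (reserved, length and CRC-32) together must fill a whole number of 48-byte cell payloads. A sketch, with made-up function names:

```python
CELL_PAYLOAD_BYTES = 48
TRAILER_BYTES = 2 + 2 + 4     # reserved + length + CRC-32

def aal5_pad_len(data_len: int) -> int:
    """Pad bytes needed so the trailer ends exactly at the end of a cell."""
    return -(data_len + TRAILER_BYTES) % CELL_PAYLOAD_BYTES

def aal5_cell_count(data_len: int) -> int:
    """Number of ATM cells needed to carry the padded CS-PDU."""
    total = data_len + aal5_pad_len(data_len) + TRAILER_BYTES
    return total // CELL_PAYLOAD_BYTES

pad_40 = aal5_pad_len(40)        # 40 + 8 fills one cell exactly
pad_41 = aal5_pad_len(41)        # one byte too many: worst case padding
cells_1500 = aal5_cell_count(1500)
```

The worst case of 47 pad bytes occurs when the data plus trailer overshoot a cell boundary by a single byte, which matches the 0-47 range given for the pad field.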
Cell Format
End-of-PDU bit in Type field of ATM header
[Figure: cell made of ATM header and cell payload]
[Figure: hosts (H) attached to Ethernet switches that are interconnected through ATM switches]
[Figure: ATM network with hosts H1 and H2 and the LECS]
LANE Registration
1. Client contacts LECS on predefined VC, and sends ATM address to it
2. LECS returns LAN type, MTU and ATM address of LES
3. Client signals connection to LES, and registers MAC and ATM addresses with LES
4. LES returns ATM address of BUS
5. Client signals connection to BUS
6. BUS adds client to point-to-multipoint VC
[Figure: ATM network with hosts H1, H2, H3 and the LECS, LES and BUS]
Contention in Switches
Output Buffering
[Figure: check-in counter analogy; customers x and a pass through 1x6 switches to the standard check-in lines and customer service]
Backpressure
[Figure: Switch 2 tells Switch 1 "no more, please"]
Propagation delay requires that switch 2 exert backpressure at high-water mark rather than when buffer completely full
It is thus typically only used in networks with small propagation delays (e.g. switch fabrics)
Switching Fabric
Special-purpose (switching) hardware
General problem
Connect N inputs to M outputs (NxM switch)
Often N=M (bidirectional links)
Design goals
High throughput: want aggregate close to MIN (sum of inputs, sum of outputs)
Avoid contention (fabric faster than ports)
Good scalability: linear size/cost growth in N/M
[Figure: input ports connected through the fabric to output ports]
Design Goals - Throughput
An n x m switch can provide max ideal throughput of: S = S1 + S2 + ... + Sn
Only possible if traffic at inputs is evenly distributed across all outputs
Sustained throughput higher than link speed of output is not possible
Switch Performance
Avoid contention with buffering
Use output buffering when possible
Apply backpressure through fabric
Input buffering with peeking (non-FIFO semantics) to reduce head-of-line blocking problems
Drop packets if input buffer overflows
Good scalability
O(N) ports
Port design complexity O(N) gives O(N^2) for switch
Port design complexity O(1) gives O(N) for switch
8-to-4 Concentrator
[Figure: 8-to-4 concentrator built from 2x2 random selectors feeding the outputs]
[Figure: shared buffer memory between a mux and a demux, with write control and read control]
Self-Routing Fabrics
Use source routing on network within switch
Input port attaches output port number as header
Fabric routes packet based on output port
Types
Banyan Network
Batcher-Banyan Network
Sunshine Switch
Banyan Network
[Figure: Banyan network routing packets on the bits of the output port number, MSB to LSB (example ports 001, 011, 110, 111)]
[Figure: Batcher network sorting packets through sort and merge stages]
Batcher-Banyan Network
Attach the two back-to-back
Arbitrary unique permutations routed without contention
Sunshine Switch
[Figure: n inputs plus k recirculated (delayed) packets enter a Batcher network, followed by a trap (marks overflow packets), a selector and l Banyans leading to the n outputs]
Like a Knockout switch
Re-circulates overflow packets, i.e. when more than L arrive in one cycle
What we understand
Concepts of networking and network programming
Elements of networks: nodes and links
Building a packet abstraction on a link
We also understand
How switches may provide indirect connectivity
Different ways to move through a network (forwarding)
Bridge approach to extending LAN concept
Example of a real virtual circuit network (ATM)
How switches are built and contention within switches
Internetworking
Reading: Peterson and Davie, Ch. 4
Basics of Internetworking
Heterogeneity
The IP protocol, address resolution, control messages
Internetworking
Routing: moving forward with IP
Building forwarding information
[Figure: protocol graph; TCP and UDP over IP, over FDDI, Ethernet and ATM]
IP Service Model
Internetwork
Concatenation of networks
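[Figure: hosts H1 to H8 on Network 1 (Ethernet), Network 2 (point-to-point), Network 3 (FDDI) and Network 4 (Ethernet), interconnected by routers R1, R2 and R3]
Protocol stack
[Figure: protocol layers along the path from H1 to H8 through R1, R2 and R3; TCP and IP at the hosts, IP over ETH, PPP and FDDI at the routers]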
IP Addresses
Class A: 0 | Network: 7 bits (126 nets) | Host: 24 bits (16 million hosts)
Class B: 10 | Network: 14 bits (16K nets) | Host: 16 bits (64K hosts)
Class C: 110 | Network: 21 bits (2 million nets) | Host: 8 bits (256)
host in class A network (MIT)
host in class B network (UIUC)
host in class C network
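The class of an address follows from the leading bits of its first byte. A small sketch; the function name and the sample addresses are chosen for illustration and are not from the slide.

```python
def address_class(dotted: str) -> str:
    """Classify a dotted-decimal IPv4 address by its leading bits."""
    first = int(dotted.split(".")[0])
    if first >> 7 == 0b0:
        return "A"          # 0xxxxxxx
    if first >> 6 == 0b10:
        return "B"          # 10xxxxxx
    if first >> 5 == 0b110:
        return "C"          # 110xxxxx
    return "other"          # remaining prefixes are not covered on this slide

examples = {addr: address_class(addr)
            for addr in ("18.26.0.1", "128.174.5.6", "192.0.2.1", "224.0.0.1")}
```

The prefix also fixes where the network part ends: after 8 bits for class A, 16 for class B and 24 for class C.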
Datagram Format
[Figure: datagram header, bit positions 0 to 31; Version, HLen, TOS, Length / Ident, Flags, Offset / TTL, Protocol, Checksum]
4-bit version (4 for IPv4, 6 for IPv6)
4-bit header length (in words, minimum of 5)
8-bit type of service (TOS): more or less unused
16-bit datagram length (in bytes)
8-bit protocol (e.g. TCP=6 or UDP=17)
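The fields listed above can be pulled out of the first 12 bytes of a header with `struct`. This sketch follows the standard IPv4 layout (3-bit flags and 13-bit fragment offset sharing one 16-bit word); the sample header values are made up, and the address fields that follow in a real header are not parsed.

```python
import struct

def parse_ipv4_prefix(header: bytes) -> dict:
    """Decode the first 12 bytes of an IPv4 header (network byte order)."""
    ver_hlen, tos, length, ident, flags_off, ttl, proto, checksum = \
        struct.unpack("!BBHHHBBH", header[:12])
    return {
        "version": ver_hlen >> 4,        # high 4 bits
        "hlen": ver_hlen & 0x0F,         # low 4 bits, in 32-bit words
        "tos": tos,
        "length": length,                # whole datagram, in bytes
        "ident": ident,
        "flags": flags_off >> 13,        # top 3 bits
        "offset": flags_off & 0x1FFF,    # low 13 bits
        "ttl": ttl,
        "protocol": proto,               # 6 = TCP, 17 = UDP
        "checksum": checksum,
    }

# Made-up sample: IPv4, 5-word header, 1400-byte datagram, flags = 1, TCP.
sample = struct.pack("!BBHHHBBH", 0x45, 0, 1400, 7, (1 << 13) | 64, 64, 6, 0)
fields = parse_ipv4_prefix(sample)
```

Version and header length share a single byte, so they have to be separated with a shift and a mask rather than read as separate fields.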
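[Figure: fragmentation example; a 1400-byte IP datagram carried in ETH and FDDI frames, with fragment header fields Ident = x and Offset = 0 followed by the rest of the header]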
Datagram Forwarding
Network #     Mask
18.0.0.0      255.0.0.0
128.32.0.0    255.255.0.0
0.0.0.0       0.0.0.0
dest: 18.26.10.0
dest: 128.16.14.0
mask with 255.0.0.0: not matched
mask with 255.255.0.0: not matched
mask with 0.0.0.0: matched! send to port 3
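The lookup traced above can be written as a loop over the table: mask the destination with each entry's mask and compare with the network number. In this sketch only port 3 (the default route) comes from the slide; ports 1 and 2 for the first two entries are hypothetical, and the table is assumed to be ordered from most to least specific.

```python
import ipaddress

def to_int(dotted: str) -> int:
    return int(ipaddress.IPv4Address(dotted))

# (network number, mask, output port); ports 1 and 2 are hypothetical.
FORWARDING_TABLE = [
    ("18.0.0.0", "255.0.0.0", 1),
    ("128.32.0.0", "255.255.0.0", 2),
    ("0.0.0.0", "0.0.0.0", 3),       # default route: matches everything
]

def forward(dest: str) -> int:
    """Return the output port of the first entry whose masked value matches."""
    dest_int = to_int(dest)
    for network, mask, port in FORWARDING_TABLE:
        if dest_int & to_int(mask) == to_int(network):
            return port
    raise LookupError("no matching entry")

port_default = forward("128.16.14.0")
port_mit = forward("18.26.10.0")
```

Because 0.0.0.0 with mask 0.0.0.0 matches any destination, it must come last in the table or it would hide the more specific entries.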
Operation
SourceHardwareAddr (bytes 0-3)
SourceHardwareAddr (bytes 4-5)
SourceProtocolAddr (bytes 0-1)
SourceProtocolAddr (bytes 2-3)
TargetHardwareAddr (bytes 0-1)
[Figure: protocol graph; TCP and UDP over IP, with ICMP alongside IP, over FDDI, Ethernet and ATM]
ICMP Message
Sent to the source when a node is unable to process IP datagram successfully
Error messages
Destination unreachable (protocol, port, or host)
Reassembly failed
IP Checksum failed; or invalid header
TTL exceeded (so datagrams don't cycle forever)
Cannot fragment
Control messages
Echo (ping) request and reply
Redirect (from router to source host, to change route)
If a host is not configured with a DHCP server, it performs a DHCP server discovery
A broadcast discovery message is sent by the host and a unicast reply is sent by the server
Controlled capacity
Change router drop and priority policies
Provide guarantees on bandwidth, delay, etc.
Virtual net replaces leased line with shared net
Unwanted connectivity is prevented on this logical link using IP tunnel
IP Tunnel in VPNs
Virtual point-to-point link between a pair of nodes separated by many networks
[Figure: Network 1 and Network 2 connected by a tunnel between routers R1 and R2 across an internetwork]
What is Routing ?
Definition: task of constructing and maintaining forwarding information (in hosts or in switches)
Goals for routing
Capture notion of best routes
Propagate changes effectively
Require limited information exchange
Admit efficient implementation
Routing Overview
Hierarchical routing infrastructure defines routing domains
Where all routers are under same administrative control
Network as a Graph
Nodes are routers
Edges are links
Each link has a cost
[Figure: example graph with nodes A to F and link costs]
Routing Outline
Algorithms
Static shortest path algorithms
Bellman-Ford: all pairs shortest paths to destination
Dijkstra's algorithm: single source shortest path
Distributed, dynamic routing algorithms
Distance Vector routing (based on Bellman-Ford)
Link State routing (Dijkstra's algorithm at each node)
Bellman-Ford Algorithm
Static, centralized algorithm, (local iterations/destination)
Requires: directed graph with edge weights (cost)
Calculates: shortest paths for all directed pairs
Check use of each node as successor in all paths
For every node N
for each directed pair (B,C)
is the path B -> N -> C better than B -> C?
is cost B -> N -> destination smaller than previously known?
For N nodes
Uses an NxN matrix of (distance, successor) values
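The loop structure described above (try every node N as an intermediate hop for every directed pair) can be sketched directly. Written this way it is the form usually called Floyd-Warshall; the slide presents it under the Bellman-Ford heading. The example link costs (A-B 5, A-C 10, B-C 3, B-D 11, C-D 2) are an assumption for illustration, and the function name is made up.

```python
INF = float("inf")

def all_pairs_shortest_paths(nodes, edges):
    """Fill N x N tables of distance and successor (first hop) for every pair."""
    dist = {b: {c: (0 if b == c else INF) for c in nodes} for b in nodes}
    succ = {b: {c: None for c in nodes} for b in nodes}
    for (b, c), cost in edges.items():
        dist[b][c] = cost
        succ[b][c] = c
    for n in nodes:                      # for every node N
        for b in nodes:                  # for each directed pair (B, C)
            for c in nodes:
                if dist[b][n] + dist[n][c] < dist[b][c]:
                    dist[b][c] = dist[b][n] + dist[n][c]   # B -> N -> C is better
                    succ[b][c] = succ[b][n]                # first hop towards N
    return dist, succ

# Assumed example: undirected links entered as two directed edges each.
links = {("A", "B"): 5, ("A", "C"): 10, ("B", "C"): 3, ("B", "D"): 11, ("C", "D"): 2}
edges = {}
for (u, v), cost in links.items():
    edges[(u, v)] = cost
    edges[(v, u)] = cost

dist, succ = all_pairs_shortest_paths(["A", "B", "C", "D"], edges)
```

For this graph the best route from D to A costs 10 and leaves D towards C, even though there is a direct D-B link.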
Dijkstra's Algorithm
Static, centralized algorithm, build tree from source
Requires directed graph with edge weights (distance)
Calculates: shortest paths from 1 node to all other
Greedily grow set S of known minimum paths
From node N
Start with S = {N} and one-hop paths from N
Loop n-1 times
add closest outside node M to S
for each node P not in S
is the path N -> ... -> M -> P better than N -> P?
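A sketch that follows the steps above literally (a plain scan for the closest outside node rather than a priority queue). The example graph with link costs A-B 5, A-C 10, B-C 3, B-D 11, C-D 2 is an assumption for illustration, and the function name is made up.

```python
INF = float("inf")

def dijkstra(graph, source):
    """Shortest distance from source to every node; graph[u][v] is the link cost."""
    dist = {node: INF for node in graph}
    dist[source] = 0
    for neighbor, cost in graph[source].items():    # one-hop paths from N
        dist[neighbor] = cost
    in_s = {source}                                 # S = {N}
    for _ in range(len(graph) - 1):                 # loop n-1 times
        m = min((node for node in graph if node not in in_s),
                key=lambda node: dist[node])        # closest outside node M
        in_s.add(m)
        for p, cost in graph[m].items():            # for each node P not in S
            if p not in in_s and dist[m] + cost < dist[p]:
                dist[p] = dist[m] + cost            # N -> ... -> M -> P is better
    return dist

graph = {
    "A": {"B": 5, "C": 10},
    "B": {"A": 5, "C": 3, "D": 11},
    "C": {"A": 10, "B": 3, "D": 2},
    "D": {"B": 11, "C": 2},
}
from_d = dijkstra(graph, "D")
```

Starting from D, node C is added first (cost 2), which lowers B from 11 to 5; adding B then lowers A from 12 to 10.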
Exchange updates of distance vector (Destination, Cost) with directly connected neighbors (known as advertising the routes)
Periodically (on the order of several seconds to minutes)
Whenever vector changes (called triggered update)
[Figure: example network with nodes A, B, C, D and E]
Split Horizon
Avoid counting to infinity by solving mutual deception problem
When sending an update to node X, do not include destinations that you would route through X
If X thinks route is not through you, no effect
If X thinks route is through you, X will timeout route
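Split horizon only changes what goes into the update sent to each neighbor. A minimal sketch, assuming each routing table entry stores (cost, next hop); the function name and the three-node chain A - B - C with unit link costs are illustrative.

```python
def advertisement(routes, neighbor):
    """Distance vector to send to neighbor, leaving out routes that go through it."""
    return {dest: cost
            for dest, (cost, next_hop) in routes.items()
            if next_hop != neighbor}

# Routing tables on the chain A - B - C: destination -> (cost, next hop).
routes_at_a = {"B": (1, "B"), "C": (2, "B")}
routes_at_b = {"A": (1, "A"), "C": (1, "C")}

a_to_b = advertisement(routes_at_a, "B")   # A reaches everything via B
b_to_a = advertisement(routes_at_b, "A")
b_to_c = advertisement(routes_at_b, "C")
```

Since A never tells B about its route to C, B cannot mistake A for an alternative path to C if the B-C link fails.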
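[Figure: split horizon example with nodes A, B and C; routing entries are written destination:cost:next hop, e.g. C:2:B at A and C:1:C at B]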
Route Calculation
At node D
Step  Confirmed list                          Tentative list
1.    (D,0,-)
2.    (D,0,-)                                 (C,2,C), (B,11,B)
3.    (D,0,-), (C,2,C)                        (B,11,B)
4.    (D,0,-), (C,2,C)                        (B,5,C), (A,12,C)
5.    (D,0,-), (C,2,C), (B,5,C)               (A,12,C)
6.    (D,0,-), (C,2,C), (B,5,C)               (A,10,C)
7.    (D,0,-), (C,2,C), (B,5,C), (A,10,C)
[Figure: example network with link costs A-B 5, A-C 10, B-C 3, B-D 11, C-D 2]