Data Communication and Network
The internet is a global network of interconnected computers that communicate using a set of
standard protocols (rules). It allows billions of devices (such as computers, smartphones, and
servers) to exchange data.
Components of the Internet:
End systems (Hosts): These are devices like computers, smartphones, tablets, etc., which
are connected to the internet. They are the "users" of the network.
Routers: Specialized devices that forward data between different networks. Routers
direct data packets along the optimal path to their destination.
Communication links: These are the physical paths that carry data between devices and
routers. They can be made of:
o Copper wires (e.g., Ethernet cables),
o Fiber optic cables (for fast data transmission),
o Wireless signals (Wi-Fi, 5G).
The internet works by transferring data between devices in packets. Each packet contains parts
of the data you want to send, like a piece of a file or message. These packets are routed through
various networks until they reach their destination, where they are reassembled.
A protocol is a set of rules that defines how data should be transmitted over the network. The
most important protocols on the internet are:
Transmission Control Protocol (TCP): Ensures that data is sent and received reliably.
Internet Protocol (IP): Handles addressing and routing of packets so they can find their
destination.
The internet uses packet switching, where data is divided into small units called packets. Each
packet may take a different route to the destination, and once all packets arrive, they are
reassembled into the original data. This method is efficient because it allows multiple users to
share the same network resources.
1.2 The Network Edge: Access Networks, Physical Media
The network edge refers to the parts of the network where end devices (such as computers,
smartphones, and IoT devices) connect to the larger network. It's the point where data enters or
leaves the network.
An access network connects an end-user device to the internet service provider's (ISP)
network. There are various types of access networks based on where and how users connect to
the internet.
Physical Media:
Physical media refers to the materials that carry the network signals. There are two main
categories:
1. Guided Media:
o Twisted Pair Copper Wires: Often used for traditional telephone and DSL
connections.
o Coaxial Cable: Used in cable TV and cable internet.
o Fiber Optic Cable: Transmits data as light signals, offering very high speeds and
long-distance transmission capabilities.
2. Unguided Media:
o Wireless: Data is transmitted through the air via radio signals, such as Wi-Fi,
satellite, and mobile networks.
The network core refers to the central part of the internet that connects all the networks (like
access networks) together. It consists of high-speed routers and communication links that route
data from one end of the network to another.
Packet Switching:
The internet uses packet switching to send data. Here’s how it works:
Data is divided into packets: When you send data (like an email or file), it’s broken
down into small units called packets.
Packets are sent independently: Each packet may take a different path through the
network to reach its destination. The network uses routers to determine the best path for
each packet.
Packets are reassembled: At the destination, packets are reassembled to form the
original data.
Efficient use of resources: Multiple devices can share the same network links, as data is
sent in short bursts.
No need for dedicated connection: A constant, direct connection between the sender
and receiver is not required.
Delay: Packets may experience delays, especially during periods of high network traffic
(congestion).
Packet Loss: Some packets might get lost due to network congestion or errors, but
protocols like TCP can request missing packets.
Circuit Switching:
Circuit switching is a method traditionally used by telephone networks. Here's how it works:
Dedicated path: A dedicated communication path (or circuit) is established between the
sender and receiver for the entire duration of the communication.
Continuous connection: Once the connection is established, all data follows the same
path. The circuit remains open, even if there are periods of silence or no data being sent.
Inefficient for bursty traffic: If the connection is idle (e.g., during a pause in a
conversation), the reserved resources are wasted.
High setup time: Setting up a circuit can take time, especially in large networks.
Packet Switching: More flexible, efficient, and suited for modern internet traffic (where
data is sent in bursts). However, it can experience delays and packet loss.
Circuit Switching: Ideal for continuous, real-time communication (like voice calls), but
it is less efficient for the internet’s bursty traffic patterns.
In the network core, routers direct the flow of packets from source to destination. Routers make
forwarding decisions based on the packet’s destination IP address, sending the packet along the
best available path at that moment.
Delay refers to the time it takes for data (a packet) to travel from the sender to the receiver.
There are several types of delays in a network:
1. Processing Delay:
o The time taken by a router to examine a packet’s header, determine where to
forward the packet, and perform error checking.
o Usually a very small delay (microseconds).
2. Queuing Delay:
o When a packet arrives at a router or switch, it might have to wait in a queue if
there are other packets ahead of it.
o The queuing delay depends on the level of congestion in the network. More
congestion leads to longer queues and greater delay.
3. Transmission Delay:
o The time required to push all of a packet's bits onto the link.
o It depends on the packet size and the link’s transmission rate. Formula:
Transmission Delay = Packet Size (bits) / Link Bandwidth (bits per second)
4. Propagation Delay:
o The time it takes for a signal to travel from one end of the link to the other.
o It depends on the physical distance between the sender and receiver and the speed
at which the signal travels (usually close to the speed of light). Formula:
Propagation Delay = Distance / Propagation Speed
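To make the two formulas concrete, here is a small Python sketch (the numbers are illustrative assumptions: a 1,500-byte packet, a 10 Mbps link, and a 1,000 km fiber path):

packet_size_bits = 1500 * 8          # 1,500-byte packet
link_bandwidth_bps = 10_000_000      # 10 Mbps link
distance_m = 1_000_000               # 1,000 km
propagation_speed = 2 * 10**8        # roughly 2/3 the speed of light, in fiber

transmission_delay = packet_size_bits / link_bandwidth_bps   # 1.2 ms
propagation_delay = distance_m / propagation_speed           # 5 ms

print(f"Transmission delay: {transmission_delay * 1000:.1f} ms")
print(f"Propagation delay:  {propagation_delay * 1000:.1f} ms")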
2. Packet Loss
Packet loss occurs when one or more packets of data traveling across a network fail to reach
their destination. This happens most often when router queues (buffers) overflow during
congestion, and occasionally because of transmission errors.
TCP (Transmission Control Protocol) can detect lost packets and request
retransmission, ensuring reliable data transfer.
UDP (User Datagram Protocol) does not have built-in mechanisms for retransmitting
lost packets, so it is used for real-time applications (e.g., video streaming) where speed is
prioritized over reliability.
Throughput is the rate at which data is successfully delivered over a communication link. It
measures how much data can be transmitted from the sender to the receiver within a given
period.
Link bandwidth: The maximum capacity of the link. If a link has a higher bandwidth, it
can carry more data per second.
Network congestion: High congestion can reduce throughput as packets may be delayed
or dropped.
Round-trip time (RTT): The time it takes for a packet to go from the sender to the
receiver and back. Longer RTT can lower throughput, especially in TCP, which waits for
acknowledgments of packet delivery.
High delay can slow down data transfer, especially in protocols like TCP that depend on
acknowledgments.
Packet loss can reduce throughput since packets need to be retransmitted, adding more
delay.
Congestion in the network can increase both delay (due to queuing) and packet loss,
significantly reducing throughput.
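As a rough first approximation, the end-to-end throughput of a path is bounded by its slowest (bottleneck) link. A short Python sketch with assumed example link rates:

# Throughput is limited by the bottleneck link on the path (example rates).
link_rates_bps = [100_000_000, 10_000_000, 50_000_000]   # 100, 10, 50 Mbps
print(f"Bottleneck throughput: {min(link_rates_bps) / 1_000_000:.0f} Mbps")  # 10 Mbps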
1. Layered Architecture
In networking, the layered architecture is a way to divide the communication process into
smaller, manageable parts. Each layer performs a specific function and interacts with the layers
directly above and below it. This approach simplifies design, development, and troubleshooting.
The most common layered model is the OSI (Open Systems Interconnection) model, but the
TCP/IP model is more widely used in practice.
1. Application Layer: Provides services directly to users and manages the applications that
require network communication. Examples: HTTP (for web browsing), FTP (for file
transfer), SMTP (for email).
2. Transport Layer: Manages communication between processes on different devices. It
ensures that data is transferred reliably (in the case of TCP) or as quickly as possible
without guarantees (in the case of UDP). Examples: TCP, UDP.
3. Network Layer: Responsible for logical addressing (IP addressing) and routing packets
to their destination. The primary protocol here is the Internet Protocol (IP).
4. Link Layer: Deals with the physical transmission of data across a physical medium (e.g.,
Ethernet, Wi-Fi). It handles communication between directly connected devices and deals
with MAC addressing.
Benefits of the OSI Model
Modularity: Each layer has its own set of responsibilities, making it easier to manage and
modify.
Interoperability: Systems from different manufacturers can communicate as long as they
adhere to the same protocols.
Abstraction: Each layer only needs to know how to interact with the layers directly above and
below it.
2. Encapsulation
Encapsulation is the process of wrapping data with the necessary protocol information before
transmission across a network. As data moves down through the layers, each layer adds its own
header (or metadata) to the data.
At the Application Layer, the data is created (e.g., a web page or email message).
The Transport Layer (e.g., TCP) adds a transport header, including information like port
numbers and sequence numbers for reliable delivery.
The Network Layer (e.g., IP) adds an IP header, which includes the source and destination IP
addresses.
The Link Layer adds a frame header, which includes the source and destination MAC addresses
and other link-level control information.
When the data is transmitted, each layer's header provides the necessary information for that
layer. At the receiving end, the process is reversed (decapsulation), where each layer strips off
its respective header and passes the data to the layer above it.
Example of Encapsulation:
Suppose a web browser sends an HTTP request. The HTTP message is handed to TCP (which
adds a transport header), then to IP (which adds the source and destination IP addresses), and
finally to Ethernet (which adds a frame header with MAC addresses).
As the message travels across the network, the Ethernet, IP, and TCP headers are used by various
routers and switches to deliver the data to the correct destination. At the destination, the headers
are stripped off layer by layer, and the original HTTP request is passed to the application (the
web browser).
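To make the layering concrete, here is a toy Python sketch of encapsulation and decapsulation. The header fields and their values are simplified illustrations, not real wire formats:

# Each layer wraps the payload from the layer above with its own header.
app_data = b"GET /index.html HTTP/1.1\r\nHost: www.example.com\r\n\r\n"

tcp_segment = {"src_port": 51000, "dst_port": 80, "seq": 1, "payload": app_data}
ip_packet = {"src_ip": "192.0.2.10", "dst_ip": "198.51.100.7", "payload": tcp_segment}
frame = {"src_mac": "aa:bb:cc:00:11:22", "dst_mac": "ff:ee:dd:33:44:55", "payload": ip_packet}

# Decapsulation at the receiver: strip one header per layer, bottom up.
received = frame["payload"]["payload"]["payload"]
assert received == app_data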
The application layer is the topmost layer in the TCP/IP model, responsible for providing
services and protocols that support end-user applications such as email, file transfers, and web
browsing. The World Wide Web (WWW) and HTTP (HyperText Transfer Protocol) are
some of the most widely used services provided by the application layer.
Overview of HTTP
HTTP is the protocol used by web browsers and servers to communicate. It defines how
messages are formatted and transmitted, and how web servers respond to various commands
(like loading a web page).
Client-Server Model: In HTTP, a client (e.g., a web browser) sends a request to the
server (where the website is hosted). The server processes this request and sends back a
response (usually the requested web page).
Stateless Protocol: HTTP is stateless, meaning each request from a client to a server is
independent. The server does not retain any information (or state) about previous requests
from the same client. However, cookies are used to maintain sessions across requests.
HTTP messages are exchanged between the client and server. There are two types of HTTP
messages:
1. Request message (sent by the client). Example:
GET /index.html HTTP/1.1
Host: www.example.com
User-Agent: Mozilla/5.0
2. Response message (sent by the server). Example:
HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 1024
<html>
<body>...</body>
</html>
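An exchange like the one above can be reproduced with a few lines of Python over a raw TCP socket. This is a minimal sketch: www.example.com is an illustrative host, and real servers may require more headers:

import socket

with socket.create_connection(("www.example.com", 80)) as s:
    s.sendall(b"GET /index.html HTTP/1.1\r\n"
              b"Host: www.example.com\r\n"
              b"Connection: close\r\n\r\n")   # ask the server to close when done
    response = b""
    while chunk := s.recv(4096):              # read until the server closes
        response += chunk

print(response.split(b"\r\n\r\n", 1)[0].decode())   # print just the headers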
As HTTP is stateless, cookies are used to track user interactions across multiple requests. A
cookie is a small piece of data stored on the client’s browser, sent by the server to remember
stateful information (like login sessions).
Example of a Set-Cookie response header:
Set-Cookie: session_id=abc123; Expires=Wed, 19 Oct 2024 23:59:59 GMT; Path=/
Web Caching
Web caching improves the efficiency of HTTP by temporarily storing (or "caching") copies of
web content closer to the user. There are two types of web caches:
1. Browser cache: The browser stores recently accessed web pages, reducing the need to
request them again from the server.
2. Proxy cache: An intermediary server (proxy) stores copies of web pages for multiple
users in a network, reducing server load and speeding up content delivery.
Caches use headers like Cache-Control to determine how long to store a copy and whether the
content has changed since it was last accessed.
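For example, a cache can revalidate a stored copy with a conditional GET; if the content is unchanged, the server answers 304 Not Modified and the cached copy is served. An illustrative exchange:

GET /index.html HTTP/1.1
Host: www.example.com
If-Modified-Since: Wed, 09 Oct 2024 08:00:00 GMT

HTTP/1.1 304 Not Modified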
2.2 Electronic Mail in the Internet
Electronic mail (email) is one of the earliest and most
widely used applications on the Internet. Email allows
users to send and receive messages over the network
using specific protocols.
Components of Email
An email system consists of three main components: user agents (the mail
clients people use to compose and read messages), mail servers (which hold
user mailboxes and queue outgoing mail), and the protocols that move mail
between them.
Email Protocols
There are three primary protocols used in email communication: SMTP (Simple
Mail Transfer Protocol) for sending mail to and between mail servers, and
POP3 and IMAP for retrieving mail from a server.
POP3 Characteristics:
Download-and-Delete Model: The client typically downloads messages to a
single device and, by default, removes them from the server.
Simple Protocol: POP3 keeps little or no message state on the server, which
makes it lightweight but poorly suited to reading mail from several devices.
Ports: POP3 typically uses port 110 (unencrypted) or 995 (encrypted).
IMAP Characteristics:
Two-way Communication: Emails are stored on the
server, and the client synchronizes with the server,
allowing emails to be accessed from multiple devices.
Mail Folder Management: Users can create, rename, and
delete folders on the mail server.
Ideal for Multiple Devices: Users can check emails on
multiple devices (e.g., laptop, phone, tablet) and
maintain synchronization.
Ports: IMAP typically uses port 143 (unencrypted) or 993
(encrypted).
IMAP Process Example:
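A minimal illustrative session (the tags a1 to a4 are client-chosen identifiers; the account and mailbox names are made up):

a1 LOGIN alice@example.com secret
a1 OK LOGIN completed
a2 SELECT INBOX
a2 OK [READ-WRITE] SELECT completed
a3 FETCH 1 (FLAGS BODY[HEADER])
a3 OK FETCH completed
a4 LOGOUT

Because the messages stay on the server, opening the same mailbox from another device shows the same state, including any flags updated in this session.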
The Domain Name System (DNS) is often referred to as the “phonebook of the Internet.” It is a
crucial part of the application layer, responsible for translating human-readable domain names
(like www.example.com) into IP addresses (like 192.0.2.1), which computers use to identify
each other on the network.
Without DNS, users would have to remember the numerical IP addresses of every website they
want to visit, which is impractical. DNS makes the internet user-friendly by allowing the use of
simple domain names instead.
When you type a URL like www.google.com in your browser, a DNS query is made to resolve
(translate) the domain name into its corresponding IP address. The DNS operates using a
hierarchical and distributed system to manage domain name resolution across the globe.
1. DNS Query Initiation: When a user enters a URL, the client (usually the web browser)
sends a DNS query to resolve the domain name to an IP address.
2. Local DNS Cache Check: The system first checks the local cache (stored on the user’s
device or browser) to see if it has recently resolved the domain name. If the IP address is
found, it is used directly.
3. Recursive DNS Query: If the IP address is not cached locally, the query is sent to a
recursive DNS resolver (usually provided by the user's ISP). This resolver will initiate
the process of looking up the IP address from DNS servers if it doesn’t already know the
answer.
4. Root DNS Servers: If the resolver doesn't have the information, it sends a query to
one of the root DNS servers (13 named server identities, each replicated at many
sites worldwide). These servers don't hold the actual IP addresses but direct
the resolver to the appropriate Top-Level Domain (TLD) server (e.g., .com, .org,
.net).
5. TLD Servers: The TLD server then directs the query to the authoritative name server
that holds the IP address for the specific domain (e.g., for www.example.com, the .com
TLD server directs the resolver to the name server for example.com).
6. Authoritative DNS Server: The authoritative DNS server for the domain responds with
the IP address associated with the domain name. This IP address is then returned to the
client, completing the query.
7. Website Access: Once the IP address is known, the web browser uses it to communicate
with the web server and retrieve the requested webpage.
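From an application's point of view, all of these steps hide behind a single library call. A minimal Python sketch, using www.example.com as the illustrative name:

import socket

# The OS stub resolver and the configured recursive resolver carry out
# steps 2 to 6 above on our behalf; we simply receive the final answer.
ip_address = socket.gethostbyname("www.example.com")
print(ip_address)   # e.g. 192.0.2.1 (illustrative)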
DNS Hierarchy
The DNS is structured in a hierarchical manner:
1. Root Level: The highest level in the hierarchy. The root servers are responsible for
directing queries to the correct TLD servers.
2. Top-Level Domains (TLD): This level contains the major domain extensions such as
.com, .org, .edu, .net, etc., as well as country code TLDs like .uk, .us, and .jp.
3. Second-Level Domains: This is where the actual domain names are registered. For
example, in www.example.com, example is the second-level domain under the .com
TLD.
4. Subdomains: Domains can also have subdomains. For example, mail.example.com and
blog.example.com are subdomains of example.com.
1. Recursive Query: In a recursive query, the DNS resolver is responsible for finding the
IP address for the domain. If it doesn’t know the address, it queries other servers on
behalf of the client until the IP is found or an error is returned.
2. Iterative Query: In an iterative query, the DNS resolver responds with the best
information it has. If it doesn’t know the answer, it points the client to another DNS
server that may have the answer, but the client has to repeat the process.
3. Non-recursive Query: This occurs when the DNS resolver already knows the answer
(e.g., from its cache), and it returns the IP address without needing to query other servers.
DNS Records
DNS servers store various types of records that provide information about a domain:
A Record (Address Record): Maps a domain name to its corresponding IPv4 address.
Example:
www.example.com. IN A 192.0.2.1
AAAA Record (IPv6 Address Record): Maps a domain name to its corresponding IPv6
address. Example:
www.example.com. IN AAAA 2001:db8::1
CNAME (Canonical Name Record): Maps an alias or subdomain to the canonical
domain name.
Example:
www.example.com. IN CNAME example.com.
MX (Mail Exchange Record): Specifies the mail server responsible for receiving email
for the domain.
Example:
example.com. IN MX 10 mail.example.com.
NS (Name Server Record): Indicates the authoritative name servers for the domain.
Example:
example.com. IN NS ns1.example.com.
PTR (Pointer Record): Used for reverse DNS lookups, mapping an IP address back to a
domain name.
Example:
1.2.0.192.in-addr.arpa. IN PTR www.example.com.
DNS Caching
DNS caching improves efficiency and reduces the load on DNS servers by temporarily storing
the results of DNS queries. There are different levels of DNS caching:
1. Browser Cache: Web browsers cache DNS information for recently accessed domains to
reduce lookup times.
2. Operating System Cache: Operating systems often store DNS lookups in memory to
avoid redundant queries.
3. DNS Resolver Cache: Recursive DNS resolvers cache the results of DNS queries to
speed up future requests for the same domain.
Caching is controlled by the TTL (Time-to-Live) value, which specifies how long a DNS record
should be stored before it is considered outdated.
DNS is vital for the functioning of the internet, but it also has some security concerns:
1. DNS Spoofing (Cache Poisoning): Attackers can inject false DNS responses into the
cache, directing users to malicious websites.
2. DNSSEC (DNS Security Extensions): DNSSEC adds a layer of security to DNS by
providing authentication for DNS records using cryptographic signatures, preventing
DNS spoofing.
In traditional client-server architectures, a central server stores files and clients download them
from this server. However, in Peer-to-Peer (P2P) file distribution, there is no central server.
Instead, all the participating computers (peers) in the network act both as clients and servers.
They download files from other peers and also upload files to other peers. This significantly
reduces the load on any single server and makes file sharing more scalable.
In a P2P system, any device or computer (peer) can initiate both uploading and downloading.
The process generally involves the following steps:
1. Joining the Network: When a peer joins a P2P network, it connects to several other
peers and announces that it can share certain files.
2. File Discovery: If a peer wants to download a file, it first searches the network to locate
the file on other peers. This is usually done by contacting a tracker or a search
mechanism in decentralized networks.
3. File Segmentation: In many P2P systems, files are broken into smaller pieces called
chunks or blocks. This allows for faster downloads because different chunks can be
downloaded from multiple peers simultaneously.
4. Simultaneous Upload and Download: While a peer is downloading chunks from other
peers, it can also upload the chunks it already has to other peers, creating a distributed
load-sharing system.
5. Completion and Sharing: Once a peer has downloaded all the chunks of a file, it can
reassemble the file. It can then continue to share the file with others, even after it has
completed the download.
BitTorrent: One of the most well-known P2P file-sharing protocols. It divides files into
small pieces and enables users to download different parts of the file from multiple peers
simultaneously, speeding up the download process.
Napster: One of the first P2P file-sharing services, originally focused on music files. It
was eventually shut down due to legal issues.
Gnutella: A decentralized P2P network where every peer is equal, and there’s no need
for a central tracker to find peers.
1. Scalability: P2P systems scale well as the number of peers increases because each peer
contributes resources (bandwidth, storage) to the system.
2. Load Distribution: Instead of a central server handling all requests, the load is
distributed among peers. This reduces bottlenecks and allows large numbers of users to
share files.
3. Fault Tolerance: Since there is no single central server, the system is more resistant to
failures. If one peer goes offline, others can still provide the needed file chunks.
4. Efficient Use of Bandwidth: Files can be downloaded faster since multiple peers can
upload different parts of the file simultaneously.
1. Security Risks: P2P systems can be used to spread malware or pirated content. Since any
peer can contribute files, malicious actors can share infected files.
2. Quality Control: With no central authority overseeing file distribution, there’s a risk of
downloading corrupted or incorrect files.
3. Bandwidth Usage: P2P systems can consume large amounts of bandwidth, which may
be an issue for users with limited or metered internet connections.
4. Legal Issues: P2P networks have been associated with illegal file sharing (such as
copyrighted music or software), which has led to legal actions against some users or
services.
Tracker-based vs. Trackerless P2P Systems
Tracker-based P2P (e.g., BitTorrent): A tracker is a server that helps peers find other
peers that have the file they want. Once peers connect, they share data without the
involvement of the tracker.
Trackerless P2P (e.g., Gnutella, DHT in BitTorrent): In trackerless systems, peers
find each other without a centralized server. This often uses distributed hash tables
(DHT), where each peer maintains a portion of the overall directory, making the network
more decentralized and robust.
BitTorrent Protocol
In the BitTorrent protocol, the file-sharing process is highly optimized. Here's how it works:
1. Torrent Files: To download a file via BitTorrent, users first download a small .torrent
file. This file contains metadata about the files to be shared, including file names, sizes,
and the location of a tracker.
2. Swarm: The set of peers sharing the file is called a swarm. Each peer downloads chunks
of the file and shares them with other peers.
3. Seeders and Leechers:
o Seeders: Peers that have already downloaded the entire file and are only
uploading it to others.
o Leechers: Peers that are still downloading the file. They upload the chunks they
have, while downloading missing chunks from other peers.
4. Choking and Unchoking: BitTorrent implements a system where peers decide which
other peers to share with based on how much bandwidth they’re contributing. This
encourages fairness and prevents peers from just downloading without contributing
(known as “freeloading”).
5. Piece Selection Strategy: BitTorrent uses a rarest-first piece selection strategy. This
means that the peer will first download the rarest pieces of the file, helping to ensure that
all pieces remain available within the network.
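A minimal Python sketch of rarest-first selection; the peer names and bitfields are made-up examples:

from collections import Counter

def pick_rarest_piece(peer_bitfields, pieces_we_need):
    # Count how many peers hold each piece we still need.
    availability = Counter()
    for pieces in peer_bitfields.values():
        for p in pieces & pieces_we_need:
            availability[p] += 1
    # Pick the needed piece held by the fewest peers.
    return min(availability, key=availability.get) if availability else None

peers = {"peer_a": {0, 1, 2}, "peer_b": {1, 2}, "peer_c": {2}}
print(pick_rarest_piece(peers, {0, 1, 2}))   # 0 (held by only one peer)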
With the rise of video content on the Internet, video streaming has become one of the most
popular services. Streaming allows users to watch videos in real-time without having to
download the entire file. This section will explain how video streaming works and how Content
Distribution Networks (CDNs) optimize the delivery of video content.
Video streaming refers to the process of transmitting video files over the internet so that users
can start viewing the video almost immediately as it’s being delivered, without having to
download the entire file.
1. On-demand streaming: Viewers can select a video and start watching it whenever they want
(e.g., YouTube, Netflix).
2. Live streaming: Video content is broadcasted in real-time, allowing users to watch live events as
they happen (e.g., live sports, concerts).
Server-side encoding: Video content is first encoded using a codec (e.g., H.264, VP9) to
compress the file without losing quality. This reduces the size of the video file for
transmission.
Transmission: The video is then split into small chunks and transmitted over the internet
in real-time. These chunks are delivered using Transmission Control Protocol (TCP) or
User Datagram Protocol (UDP).
Client-side playback: The user's device (client) receives the video chunks and plays
them as soon as enough chunks have been buffered. This prevents the video from
freezing while it is being played.
Key Metrics in Video Streaming:
Buffering: The process of pre-loading video data in the client device to ensure smooth playback
without interruptions.
Latency: The delay between the time a video is requested and when playback starts. Minimizing
latency is critical for live streaming.
Bitrate: The amount of data processed per second of video. Higher bitrates offer better quality
but consume more bandwidth.
1. Bandwidth: Streaming high-quality video (e.g., HD, 4K) requires a lot of bandwidth. Limited
bandwidth can lead to buffering issues and reduced video quality.
2. Network Congestion: Heavy network traffic can slow down video delivery, especially when
multiple users are streaming videos simultaneously.
3. Latency: Long delays in video delivery can frustrate users, particularly for live streaming
applications.
4. Device Compatibility: Video must be encoded and delivered in formats compatible with a wide
range of devices (e.g., smartphones, tablets, TVs).
To address bandwidth issues, adaptive bitrate streaming (ABR) is commonly used. In ABR,
the video is encoded at multiple bitrates (e.g., low, medium, high), and the best bitrate is
dynamically chosen based on the viewer’s network conditions.
How it works: If the user’s internet speed is fast, the system will switch to a higher-quality
stream. If the connection is slow, the system will downgrade to a lower-quality stream to
prevent buffering.
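A sketch of that decision logic in Python; the bitrate ladder and the 0.8 safety factor are assumed values, not taken from any particular player:

AVAILABLE_BITRATES = [500_000, 1_500_000, 4_000_000, 8_000_000]   # bps

def choose_bitrate(measured_throughput_bps, safety_factor=0.8):
    # Keep some headroom below the measured throughput to avoid stalls.
    budget = measured_throughput_bps * safety_factor
    candidates = [b for b in AVAILABLE_BITRATES if b <= budget]
    return max(candidates) if candidates else min(AVAILABLE_BITRATES)

print(choose_bitrate(6_000_000))   # 4000000: best rendition under the 4.8 Mbps budget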
HTTP Live Streaming (HLS): Developed by Apple, widely used for streaming on mobile devices.
Dynamic Adaptive Streaming over HTTP (DASH): An open standard for adaptive streaming.
A Content Distribution Network (CDN) is a system of distributed servers (nodes) that deliver
web content, including video, to users based on their geographic location. CDNs play a critical
role in improving the quality and efficiency of video streaming by reducing latency and server
load.
Edge Servers: CDNs use a network of geographically distributed edge servers. These
servers cache content closer to end-users, reducing the distance the data needs to travel.
When a user requests a video, the CDN directs the request to the nearest edge server.
Caching: Frequently accessed content is cached (stored temporarily) on edge servers.
This ensures that future requests for the same content are delivered quickly without
having to fetch the content from the origin server.
1. Reduced Latency: By delivering content from a nearby edge server, CDNs reduce the
distance the data must travel, minimizing delays.
2. Scalability: CDNs can handle large volumes of traffic, allowing platforms like YouTube
and Netflix to serve millions of users simultaneously.
3. Improved Reliability: CDNs distribute traffic across multiple servers, reducing the risk
of overload on any single server.
4. Load Balancing: CDNs distribute incoming traffic evenly across servers, preventing any
one server from becoming overwhelmed.
5. Global Reach: CDNs provide fast and reliable access to video content for users across
different regions by having a global network of edge servers.
Major video streaming platforms rely heavily on CDNs for efficient content delivery:
Netflix: Uses its own CDN called Open Connect to deliver high-quality video to users across the
globe.
YouTube: Leverages CDNs to ensure fast video delivery, even during peak traffic times.
Amazon Prime Video: Also utilizes CDNs to serve video content efficiently to millions of users.
Real-Time Streaming Protocol (RTSP): A protocol used to control the streaming media servers.
It establishes and controls media sessions between endpoints (often used for live streaming).
Real-Time Transport Protocol (RTP): Works with RTSP and is used for delivering audio and video
over IP networks.
Hypertext Transfer Protocol (HTTP): Used for on-demand video streaming by delivering video
content over standard web protocols.
Summary
Video Streaming allows real-time viewing of content without downloading the entire file.
Adaptive Bitrate Streaming (ABR) improves the user experience by dynamically adjusting video
quality based on network conditions.
Content Distribution Networks (CDNs) play a key role in reducing latency, improving scalability,
and enhancing the reliability of video streaming by distributing content across geographically
dispersed servers.
3.1 Transport-Layer Services
The transport layer is essential in networking, providing end-to-end communication services for
applications. It functions as an intermediary between the application layer and the network layer.
Here are the key aspects:
Reliable Data Transfer: Guarantees that data is delivered accurately and in the correct
order. This is achieved through error detection and retransmission of lost packets.
Connection-Oriented Service: Establishes a connection before data transfer, allowing
for reliable communication. This service is exemplified by the Transmission Control
Protocol (TCP).
Connectionless Service: Sends data without establishing a connection first. This service
is faster but does not guarantee reliability or order, as seen in the User Datagram Protocol
(UDP).
Multiplexing and De-multiplexing: Allows multiple applications to use the same
network connection. Multiplexing combines data from various applications into a single
stream for transmission, while de-multiplexing separates this stream back into individual
application data at the receiving end.
The transport layer (Layer 4) and the network layer (Layer 3) work together to ensure effective
communication across networks. Here's how they relate:
The transport layer relies on the network layer to deliver packets to the correct
destination. When an application sends data, the transport layer divides it into
segments, adding the necessary headers (such as port numbers, sequence numbers, and
checksums).
The network layer then encapsulates these segments into datagrams (packets), adding
its own header (with source and destination IP addresses) for routing; the link layer, in
turn, places each datagram into a frame for transmission.
The network layer also provides feedback to the transport layer regarding network
conditions, such as congestion, which can affect data transmission.
The transport layer plays a vital role in the Internet's architecture, providing essential
services that support application communication. The two primary protocols that operate
at the transport layer in the Internet are TCP (connection-oriented and reliable) and UDP
(connectionless and best-effort).
3.2 Multiplexing and De-multiplexing
Multiplexing and de-multiplexing are crucial processes at the transport layer that facilitate
efficient data communication between multiple applications over a single network connection.
Multiplexing
At the sender, the transport layer gathers data from multiple application sockets, adds a
transport header (including source and destination port numbers) to each chunk, and
passes the resulting segments to the network layer.
De-multiplexing
At the receiver, the transport layer examines the port numbers in each arriving segment
and delivers the segment's data to the correct application socket.
Example
When a web browser (application) requests multiple resources (like images, scripts) from
a web server, each request is carried in segments identified by source and destination
port numbers. Upon receiving a segment, the server's transport layer uses these port
numbers to de-multiplex the data to the correct process, as the sketch below illustrates.
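A small runnable Python sketch of port-based de-multiplexing with UDP sockets; the port numbers 5300 and 6000 are arbitrary example values:

import socket

app_one = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
app_one.bind(("127.0.0.1", 5300))    # first application process

app_two = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
app_two.bind(("127.0.0.1", 6000))    # second application process

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"query", ("127.0.0.1", 5300))   # delivered to app_one
sender.sendto(b"move", ("127.0.0.1", 6000))    # delivered to app_two

print(app_one.recvfrom(1024)[0])   # b'query'
print(app_two.recvfrom(1024)[0])   # b'move'

The operating system uses the destination port in each datagram to decide which socket, and therefore which process, receives it.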
The User Datagram Protocol (UDP) is a connectionless transport layer protocol widely used for
applications that require fast data transmission with minimal overhead. Below are the key
features and characteristics of UDP:
Connectionless Protocol:
o UDP does not establish a connection before sending data. This means there is no
handshake process, which allows for faster data transmission.
No Reliability Guarantees:
o UDP does not guarantee delivery of packets. If packets are lost during
transmission, they are not retransmitted, making it suitable for applications that
can tolerate some data loss.
No Order Guarantee:
o Packets may arrive out of order. The application layer is responsible for handling
any necessary ordering of received data.
Low Overhead:
o Because UDP lacks the features that provide reliability (like sequence numbers
and acknowledgments), it has a smaller header size (8 bytes) compared to TCP
(20 bytes), resulting in lower overhead.
UDP Checksum
The checksum is optional in UDP over IPv4 (mandatory over IPv6, as noted later) and
provides a way to verify the integrity of the transmitted data. It is calculated over a
pseudo-header, the UDP header, and the data, allowing the receiver to detect errors
that may have occurred during transmission.
If the checksum indicates an error, the receiving application can choose to discard the
corrupted datagram.
Use Cases
UDP is commonly used in applications where speed is critical and some data loss is
acceptable, such as:
o Video streaming (e.g., live broadcasts)
o Online gaming
o Voice over IP (VoIP)
The structure of a UDP segment is simple and efficient, designed to facilitate quick transmission
without the overhead of connection management. Below are the key components of a UDP
segment:
The UDP header consists of four fields (source port, destination port, length, and
checksum), each serving a specific purpose; the layout is shown below.
Data Payload
Following the UDP header is the data payload, which contains the actual data being sent
by the application.
The size of the data payload can vary, depending on the application and the maximum
transmission unit (MTU) of the network.
+---------------------+
| Source Port | 16 bits
+---------------------+
| Destination Port | 16 bits
+---------------------+
| Length | 16 bits
+---------------------+
| Checksum | 16 bits
+---------------------+
| Data | Variable length
+---------------------+
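The same layout can be produced with Python's struct module; the port numbers and payload below are arbitrary examples:

import struct

src_port, dst_port = 5000, 53
payload = b"hello"
length = 8 + len(payload)      # 8-byte header plus data
checksum = 0                   # 0 means "no checksum computed" in IPv4

# Four unsigned 16-bit fields, in network (big-endian) byte order.
header = struct.pack("!HHHH", src_port, dst_port, length, checksum)
segment = header + payload
print(len(segment), segment[:8].hex())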
The UDP checksum is a crucial component of the UDP segment that helps ensure the integrity of
the data being transmitted. Below are the key aspects of the UDP checksum:
Error Detection:
o The checksum is used to detect errors that may occur during the transmission of
the UDP segment over the network. This includes detecting data corruption due to
noise, interference, or other factors affecting the transmission medium.
Integrity Assurance:
o By calculating and verifying the checksum, the receiving application can
determine whether the data has been altered or corrupted in transit.
Checksum Calculation
Pseudo-Header:
o The checksum calculation involves a pseudo-header that includes:
Source IP address (32 bits)
Destination IP address (32 bits)
Protocol number (8 bits) – for UDP, this is 17.
UDP length (16 bits) – the length of the UDP header and data.
Checksum Process:
1. The sender constructs the pseudo-header and appends the UDP header and data to
it.
2. The checksum is calculated by performing a one's complement sum of all 16-bit
words in this combined structure.
3. The result is then complemented to produce the final checksum value, which is
placed in the UDP header.
Upon receiving a UDP segment, the receiver performs the same checksum calculation
using the pseudo-header, UDP header, and data.
If the calculated checksum matches the one in the UDP header, the data is assumed to be
intact. If not, the data is considered corrupted, and the segment may be discarded.
Although the UDP checksum is mandatory in IPv6, it is optional in IPv4. If a sender does
not compute a checksum, the field is set to zero, and the receiver does not perform any
checksum verification.
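A sketch of the one's-complement computation in Python. For brevity it sums only the bytes it is given; a real UDP implementation would prepend the pseudo-header before calling it:

def internet_checksum(data: bytes) -> int:
    if len(data) % 2:                # pad odd-length input with a zero byte
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]      # next 16-bit word
        total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in
    return ~total & 0xFFFF           # one's complement of the sum

# Example: a UDP header (ports 5000 -> 53, length 13, zero checksum) + "hello".
print(hex(internet_checksum(b"\x13\x88\x00\x35\x00\x0d\x00\x00hello")))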
Use Cases
The checksum is particularly important in applications where data integrity is crucial, but
where the application can still function without certain lost packets, such as video
streaming and online gaming.
3.4 Principles of Reliable Data Transfer
Reliable data transfer is a critical function of transport layer protocols, ensuring that data is
delivered accurately and in the correct order. This section covers key mechanisms and protocols
used to achieve reliable data transfer, focusing on two common approaches: Go-Back-N (GBN)
and Selective Repeat (SR).
Acknowledgments (ACKs):
o The receiver sends an acknowledgment back to the sender for successfully
received packets. If the sender does not receive an acknowledgment within a
certain timeframe, it assumes the packet was lost and may retransmit it.
Sequence Numbers:
o Each packet is assigned a unique sequence number, allowing the receiver to detect
duplicate packets and maintain the correct order of received packets.
Timeouts:
o The sender waits for an acknowledgment for a specific duration (timeout period).
If no acknowledgment is received within this period, the sender assumes that the
packet has been lost and retransmits it.
Description:
o GBN is an automatic repeat request (ARQ) protocol where the sender can send
multiple frames before needing an acknowledgment for the first one. However,
the sender must keep a window of unacknowledged packets, which defines how
many packets can be sent without waiting for an acknowledgment.
Mechanism:
o If an error occurs (e.g., a packet is lost or corrupted), the receiver discards that
packet and every subsequent out-of-order packet, and the sender must retransmit the
entire window of packets starting from the lost one.
Advantages:
o Simple to implement: cumulative acknowledgments cover all packets up to a given
sequence number, and the receiver never has to buffer out-of-order packets.
Disadvantages:
o Inefficient in terms of bandwidth utilization if a packet is lost since it requires
retransmitting all subsequent packets.
Description:
o SR is another ARQ protocol that allows the sender to retransmit only the specific
lost or corrupted packets rather than the entire window of packets. This reduces
unnecessary retransmissions.
Mechanism:
o The sender keeps track of the packets that need to be acknowledged. The receiver
sends individual ACKs for correctly received packets and can buffer out-of-order
packets until missing packets are received.
Advantages:
o More efficient than GBN as it minimizes the amount of data retransmitted and can
achieve better bandwidth utilization.
Disadvantages:
o More complex to implement, requiring additional mechanisms for buffering and
maintaining state information about sent and received packets.
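A highly simplified sketch of a Go-Back-N sender's window logic in Python. There is no real network here: a scripted stream of cumulative ACKs stands in for the receiver, and None models a timeout:

WINDOW = 4

def gbn_send(num_packets, acks):
    base, next_seq = 0, 0
    while base < num_packets:
        # Send every packet that fits in the current window.
        while next_seq < base + WINDOW and next_seq < num_packets:
            print(f"send packet {next_seq}")
            next_seq += 1
        ack = next(acks)              # cumulative ACK, or None on timeout
        if ack is None:
            print(f"timeout: go back to {base}")
            next_seq = base           # retransmit the whole window
        else:
            base = ack + 1            # slide the window forward

gbn_send(6, iter([1, None, 3, 4, 5]))   # one scripted timeout after ACK 1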
Connection-Oriented:
o TCP establishes a connection between the sender and receiver before data transfer
begins. This connection is maintained throughout the communication session.
Reliable Data Transfer:
o TCP ensures that data is delivered accurately and in the correct order. It employs
acknowledgments, sequence numbers, and retransmissions to achieve this.
Flow Control:
o TCP uses flow control mechanisms to prevent overwhelming the receiver with too
much data at once. It adjusts the rate of data transmission based on the receiver’s
ability to process data.
Congestion Control:
o TCP includes algorithms to detect network congestion and adjust the sending rate
accordingly to minimize packet loss and ensure smooth data transfer.
Sequence Numbers:
o Each byte of data in a TCP segment is assigned a sequence number, allowing the
receiver to reorder segments and detect duplicates.
Acknowledgments:
o The receiver sends a cumulative ACK indicating the next byte it expects, which
acknowledges all in-order data received so far. If segments arrive out of order, the
receiver can buffer them until the missing segment arrives.
Retransmissions:
o If a segment is not acknowledged within the timeout period, it is retransmitted.
TCP uses various mechanisms (e.g., fast retransmit) to quickly identify lost
segments.
TCP congestion control mechanisms are essential for maintaining network performance and
stability. These mechanisms prevent network congestion, ensuring efficient data transmission
even under varying network conditions. This section covers the key concepts and algorithms
used in TCP congestion control.
Congestion:
o Occurs when the demand for network resources exceeds the available capacity,
leading to packet loss, delays, and reduced throughput.
Congestion Window (cwnd):
o A TCP state variable that limits the amount of data the sender can transmit before
receiving an acknowledgment. It adjusts dynamically based on network
conditions.
Congestion Control Algorithms
1. Slow Start:
o TCP begins transmission with a small congestion window (cwnd), typically one
segment. The cwnd grows by one segment for every acknowledgment received,
which roughly doubles it every round-trip time, until a threshold (ssthresh) is
reached. This phase allows TCP to quickly find the available bandwidth.
2. Congestion Avoidance:
o Once the cwnd reaches the ssthresh, TCP transitions to the congestion avoidance
phase, where the cwnd increases linearly (by one segment per round-trip time).
This gradual increase helps avoid congestion.
3. Fast Retransmit:
o When the sender receives three duplicate ACKs for the same segment, it assumes
packet loss and retransmits the missing segment immediately, without waiting for
the timeout.
4. Fast Recovery:
o After a fast retransmit, TCP enters the fast recovery phase: it roughly halves the
cwnd (rather than collapsing it to one segment) and continues in congestion
avoidance, probing for available bandwidth without causing further congestion.
5. Reaction to Packet Loss:
o Upon detecting packet loss (via timeout or duplicate ACKs), TCP reduces the
cwnd significantly (multiplicative decrease) and sets the ssthresh to half of the
current cwnd. This response helps alleviate congestion and stabilize the network.
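The interplay of these phases can be seen in a toy Reno-style trace in Python; the single loss at RTT 8 and the initial ssthresh of 16 are arbitrary assumptions:

cwnd, ssthresh = 1.0, 16.0    # congestion window and threshold, in segments

for rtt in range(1, 13):
    if rtt == 8:                      # assumed loss detected by duplicate ACKs
        ssthresh = cwnd / 2           # multiplicative decrease of the threshold
        cwnd = ssthresh               # fast-recovery style: halve, don't reset to 1
    elif cwnd < ssthresh:
        cwnd *= 2                     # slow start: exponential growth per RTT
    else:
        cwnd += 1                     # congestion avoidance: linear growth
    print(f"RTT {rtt:2d}: cwnd = {cwnd:5.1f}, ssthresh = {ssthresh:5.1f}")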
Summary
TCP's congestion control mechanisms are crucial for adapting to changing network conditions
and preventing congestion-related issues. By adjusting the transmission rate based on real-time
feedback from the network, TCP ensures reliable and efficient data transfer.