1-3-IO

The document covers the architecture and functioning of Input/Output (I/O) devices in computer systems, emphasizing the importance of I/O for usability. It discusses various types of I/O devices, their speeds, and different techniques for performing I/O operations, including Programmed I/O, Interrupt-driven I/O, and Direct Memory Access (DMA). Additionally, it touches on network communication, error detection methods like Cyclic Redundancy Check (CRC), and the significance of buffering in operating systems.

Uploaded by

work.manhcang

CSE 325 Computer Systems

Part 1-3 Input/Output

Related Reading: Stallings, 1.7, 11.1-11.2


Kurose, 1.5, 6.1-6.2

CSE 325 Architecture

Sources for CSE 325 Course Notes


• Most of the materials in these course notes were developed by
Prof. Philip McKinley at Michigan State University

• Other sources of materials/figures include:


• Operating Systems, Internals and Design Principles, Stallings, 9th ed. 2018
• Computer Networking -- A Top-Down Approach, Kurose/Ross, 7th ed. 2017
• Computer Organization and Architecture, Stallings, 11th ed. 2019
• Operating Systems: Three Easy Pieces, Arpaci-Dusseau and Arpaci-Dusseau (free text)
• Course notes of Prof. K. R. Joshi at Columbia University
• Course notes of Prof. Hugh C. Lauer at Worcester Polytechnic Institute
• Thanks to all these great writers and instructors!



I/O Devices
• Without I/O, computers are not very useful
• But... there are now thousands of commercial
peripheral devices, each slightly (or very) different
from others
• Human readable/interactive
– used to communicate with the user
– Examples:
• video displays
• keyboard
• mouse
• printer


Categories of I/O Devices


• Machine readable
– used to communicate with electronic equipment
– Examples:
• disk drives, solid state drives
• flash drives
• electronic controllers/actuators

• Communication with remote devices


– Examples:
• secondary display devices
• modems (mostly obsolete)
• network interfaces



Widely Different I/O Speeds
• Goals:
– Don’t waste time waiting
for slow devices
– Minimize overhead for
fast devices


Even Faster…
• PCI Express used for variety of devices:
– Most graphics cards
– external GPUs
– wifi interfaces
– video capture
– SSDs
– RAID controllers
– Parallel cluster
interconnects
– Etc…



I/O Aspects
• Byte vs. Block
– Character devices read/write arbitrary numbers of bytes
• Examples: keyboard, terminal, printer, joystick
– Block devices read/write only in entire blocks
• Examples: hard disk, SSD, CD/DVD
• Typical block size: 512, 1024, or 4096 bytes

• Polling vs. Interrupts


– Polling (periodically checking if device needs service) was
the norm for many years. CPU initiated.
– Some devices (e.g. USB) still use polling
– Most others generate interrupts when they need service
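
To make the byte vs. block distinction concrete, here is a minimal Python sketch. It assumes a 4096-byte block size (one of the typical sizes listed above); the helper name blocks_touched is hypothetical and not part of any real driver API. It shows that even a small byte-range request on a block device forces whole blocks to be transferred.

```python
def blocks_touched(offset: int, length: int, block_size: int = 4096) -> range:
    """Return the block numbers a block device must transfer to satisfy
    a request for `length` bytes starting at byte `offset`."""
    if length <= 0:
        return range(0)
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return range(first, last + 1)


# A 200-byte read that straddles a block boundary needs two full blocks.
print(list(blocks_touched(4000, 200)))  # [0, 1]
```

A character device, by contrast, could hand back exactly the 200 bytes requested.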


Techniques for Performing I/O


• Programmed I/O
– process is busy-waiting for the operation to complete
• Interrupt-driven I/O
– I/O command is issued
– processor continues executing instructions
– I/O module sends an interrupt when done
• Direct Memory Access (DMA)
– DMA module controls exchange of data between main
memory and the I/O device
– processor interrupted only after entire block has been
transferred



Programmed I/O (when I was your age…)
• I/O module performs the action, on behalf
of the processor

• But the I/O module does not interrupt the


CPU when I/O is done

• Processor capacity is wasted checking


status of I/O module
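
The wasted capacity can be illustrated with a toy simulation. This is a sketch only: SimulatedDevice and its methods are hypothetical stand-ins for an I/O module's status and data registers, and each status check stands in for processor time spent polling.

```python
class SimulatedDevice:
    """Toy I/O module: becomes ready after a fixed number of status checks."""

    def __init__(self, checks_until_ready: int, data: int) -> None:
        self.checks_until_ready = checks_until_ready
        self.data = data

    def is_ready(self) -> bool:
        # Each call models the processor reading the status register once.
        if self.checks_until_ready > 0:
            self.checks_until_ready -= 1
            return False
        return True

    def read(self) -> int:
        return self.data


def programmed_io_read(device: SimulatedDevice) -> tuple[int, int]:
    """Busy-wait on the status flag, then read; returns (data, wasted_polls)."""
    wasted_polls = 0
    while not device.is_ready():
        wasted_polls += 1  # the processor does nothing useful here
    return device.read(), wasted_polls


data, wasted = programmed_io_read(SimulatedDevice(checks_until_ready=5, data=0x41))
print(f"data={data:#x}, wasted polls={wasted}")  # data=0x41, wasted polls=5
```

The slower the device, the larger the count of wasted polls, which is exactly the cost the slide describes.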


Interrupt-Driven I/O
• Processor is interrupted when I/O
module is ready to exchange data
• Processor is free to do other work
• No needless waiting
• Consumes a lot of processor time
because every word read or written
passes through the processor
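
A tick-based toy model shows the difference from busy-waiting. This is a sketch under stated assumptions: InterruptingDevice and its handler callback are hypothetical, and one loop iteration stands in for one unit of other work done by the processor while the device is busy.

```python
class InterruptingDevice:
    """Toy I/O module that calls a handler (the 'interrupt') when done."""

    def __init__(self, ticks_until_done: int, data: int) -> None:
        if ticks_until_done < 1:
            raise ValueError("device needs at least one tick")
        self.ticks_left = ticks_until_done
        self.data = data
        self.handler = None

    def start_read(self, handler) -> None:
        # Issue the I/O command and register the interrupt handler.
        self.handler = handler

    def tick(self) -> None:
        # Advance simulated time; raise the interrupt when the data is ready.
        if self.handler is None or self.ticks_left <= 0:
            return
        self.ticks_left -= 1
        if self.ticks_left == 0:
            self.handler(self.data)


def interrupt_driven_read(device: InterruptingDevice) -> tuple[int, int]:
    """Returns (data, useful_work_units) for one read."""
    received = []
    useful_work = 0
    device.start_read(received.append)
    while not received:
        useful_work += 1  # processor executes other instructions
        device.tick()
    return received[0], useful_work


print(interrupt_driven_read(InterruptingDevice(5, 0x41)))  # (65, 5)
```

The same five time units that would be spent polling are available for other work here, although in this model the processor still handles each word itself.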



Direct Memory Access
• CPU issues request to a DMA module
(separate module or incorporated into
I/O module)
• DMA module transfers a block of data
directly to or from memory (without going
through CPU – “cycle stealing”)
• An interrupt is sent when the task is
complete
• The CPU is only involved at the
beginning and end of the transfer
• The CPU is free to perform other tasks
during data transfer
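
Counting interrupts makes the benefit concrete. The sketch below is a simplified model, not a real driver: memory is a Python list, and both function names are hypothetical. It contrasts one interrupt per word with a single interrupt per block.

```python
def interrupt_driven_transfer(words: list[int], memory: list[int], start: int) -> int:
    """CPU copies each word itself; returns the number of interrupts taken."""
    interrupts = 0
    for offset, word in enumerate(words):
        interrupts += 1                 # one interrupt per word
        memory[start + offset] = word   # every word passes through the CPU
    return interrupts


def dma_transfer(words: list[int], memory: list[int], start: int) -> int:
    """DMA module copies the whole block; returns the number of interrupts taken."""
    memory[start:start + len(words)] = words  # done without CPU involvement
    return 1                                  # single interrupt at completion


block = [10, 20, 30, 40]
mem_a = [0] * 8
mem_b = [0] * 8
print(interrupt_driven_transfer(block, mem_a, 2))  # 4
print(dma_transfer(block, mem_b, 2))               # 1
```

Both calls leave memory in the same state; only the number of times the CPU is interrupted differs.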

Example: DMA-Handled I/O Request



OS and DMA
• Nowadays, DMA is widely used by a variety of
peripheral devices
– disk drive controllers, graphics/network/sound cards
• Data for I/O usually copied into or out of buffers in
the OS kernel
• Operating systems generally invoke the scheduler
whenever a user process issues an I/O request.
– Typically, another process is selected to run and can work
while the I/O is being carried out
– When the device interrupts upon completion (either read
or write), kernel tends to the buffers. More later...

(Kernel) I/O Buffering


• Reasons for buffering in kernel (problems without it):
– Processes must wait for I/O to complete
– Relevant pages must remain in memory during I/O
• Kernel buffering necessary for both:
– Character devices
• transfer information as a stream of bytes
• used for monitors, printers, network cards, communication ports,
mouse, and many non-secondary-storage devices
– Block devices
• information stored in fixed sized blocks (e.g., 4096 bytes)
• transfers are made a block at a time
• used for SSD, hard drive, CDROM, DVD, (and formerly tapes)
[Illustration omitted: User space above, Kernel below]

Network Interface
• Of course, a major type of I/O in modern computers
(laptops, desktops, smart phones, etc.) is the
network interface.
• Perspective: for a few decades, networking was
NOT an integral part of a computer
(processor, memory, disk, keyboard, terminal)
• Now, the network enables access to the WWW,
streaming services, etc.
• But the (hidden) network also extends the basic
functionality of computing into the Internet.
• More on this later…
Computer Network Terminology
• As mentioned the first day of class,
– computer network functionality is organized as a set of
layers
– Each layer has specific duties with regard to delivering
data across the Internet.
– In the Internet, all data (files, video, images, form data) is
transmitted in “chunks” collectively called packets
• Interestingly, packets are referred to by different
names at different layers:
– frames, datagrams, segments, messages...


Internet: A network of networks


• An internetwork, or internet, is a unified,
cooperative interconnection of networks that
supports a universal communication service.
• Nowadays we simply refer to The Internet.
• The Internet employs the TCP/IP Protocol
Suite, developed in the late 1970s by BBN
and UC Berkeley with funding from DARPA.
• These protocols (rules for communication)
create a unified internet from many individual
networks.

CSE 325 Concurrency/Net


Illustration (figure omitted; layers from top to bottom)

Application Protocol (e.g., web client/server): What Network?
Transport Service (IP address, port #)
Internetwork (IP addresses)
Physical Networks (physical addresses)

Network Architecture
• A set of layers and protocols
• Layers correspond to layers in previous slides
• Layer interaction
– each layer offers primitive operations and services to higher layers
– interfaces must be clean and well-defined
• Layers (bottom to top)
1. Physical: how to transmit bits on a wired or wireless channel
2. Link: support communication between directly connected nodes
3. Network: enable communication across a network
(route packets, forward them from one node to the next)
4. Transport: enable process-to-process communication
5. Application: Everything above the transport layer



Corresponding Protocol Layers (top to bottom)

Application
Transport
Network
Link
Physical

Local Area Networking


• At this point in the course, we consider only
architectural support for communication between
devices connected directly by a single “link”
• In networking terms, this functionality is referred to
as the link layer
– Examples: Ethernet and WiFi networks
– At the link layer, packets are called frames.
– In the layer above, the network layer, packets are called
datagrams.
• We describe an algorithm, implemented in
hardware, that supports link-layer communication
by detecting errors in frames
Link Layer
• Both hosts and switching
elements such as routers are
collectively called nodes
• Communication channels that
connect adjacent nodes along
communication path are links

• The link layer has responsibility for transferring
frames from one node (either host or router) to an
adjacent node over a link (routing of packets among
nodes is the responsibility of the layer above)

Where is the link layer implemented?

• Link layer executes in each and every node on the Internet
– computers, routers, access points, etc.
• Implemented in an “adapter” (a.k.a. network interface card -- NIC)
and its associated driver (in the OS)
– Ethernet card, 802.11 card, Ethernet chipset, 802.11 chipset
• Hence, the link layer is implemented as a combination of
software and hardware

[Figure omitted: host with cpu, memory, and host bus (e.g., PCI) connected to a network adapter card/chipset containing a controller and physical transmission hardware]



Network Adapters Communicating

[Figure omitted: sending host and receiving host, each with a controller; the datagram travels inside a frame]

• sending side:
– encapsulate datagram in frame
– transmit bits on channel
– might retransmit if needed (enhanced reliability)
• receiving side:
– error detection (checksum)
– if in error, drop frame (usually), possibly triggering
retransmission
– extract datagram, pass to upper layer

Errors in communication
• Example sources of errors
– interference from electrical equipment
– impulse noise: e.g., lightning
– signal distortion
– sender, receiver losing synchronization
– “collisions” between frames on a shared channel
• Even a single bit error can cause major
damage. Why?



Frame damage
• Bits in a frame can be “flipped,” lost, or inserted
• Many errors are “burst” errors – part of the packet is
set to all 1s or all 0s
• Today in wired networks, noise often is not a
significant issue, in part due to complex signaling
• However, in wireless networks it is a HUGE issue
• How many transmissions are needed to get a frame
across the channel?

Example
But, how to know if a packet is damaged?
• A key function implemented in hardware/firmware
by the link layer:
• Error detection
– the network interface receiving a frame needs to detect if
any bits have been flipped
– sender computes and attaches a checksum to the frame
– receiver recomputes the checksum upon arrival
• match, ok. no match, drop frame and request retransmission.
• at link layer, retransmit a limited number of times, then give up.
• TCP can provide full reliability IF desired
• Note: some types of communication do not require
100% reliability
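
The attach-and-recompute steps above can be sketched in Python. This is a simplified illustration, not a real frame format: the layout (payload followed by a 4-byte big-endian trailer) and the function names are assumptions, and the standard-library zlib.crc32 is used in software as a stand-in for the checksum a network interface computes in hardware.

```python
import zlib
from typing import Optional


def encapsulate(datagram: bytes) -> bytes:
    """Sender: compute a checksum over the datagram and attach it as a trailer."""
    checksum = zlib.crc32(datagram)
    return datagram + checksum.to_bytes(4, "big")


def receive(frame: bytes) -> Optional[bytes]:
    """Receiver: recompute the checksum; return the datagram, or None to drop."""
    if len(frame) < 4:
        return None
    datagram, trailer = frame[:-4], frame[-4:]
    if zlib.crc32(datagram).to_bytes(4, "big") != trailer:
        return None  # mismatch: drop the frame
    return datagram


frame = encapsulate(b"hello")
damaged = bytes([frame[0] ^ 0x01]) + frame[1:]  # flip one bit in transit
print(receive(frame))    # b'hello'
print(receive(damaged))  # None
```

What happens after a drop (link-layer retransmission, or leaving recovery to TCP) is outside this sketch.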

Background: Galois Fields


• In mathematics, a field supports addition,
subtraction, multiplication and division
• Examples: the reals and the rationals, which have
infinitely many elements
• Some fields, called Galois fields, have a finite
number of elements
• When the number of elements N is prime, math
operations are carried out modulo N
(e.g., GF(2) uses arithmetic modulo 2)
Cyclic Redundancy Codes (CRC)
• Basic idea: treat string of bits as coefficients of
polynomials over GF(2).
• Ex. 101001 represents x^5 + x^3 + 1
• Addition and subtraction are both equivalent to
exclusive-or
• Example (both results are 01010001):
    10011011        10011011
  + 11001010      - 11001010
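
A short Python sketch of the same idea, treating integers as bit strings of GF(2) coefficients. The helper poly_to_string is a hypothetical name used only for display.

```python
def poly_to_string(bits: str) -> str:
    """Render a bit string as the polynomial over GF(2) it represents."""
    degree = len(bits) - 1
    terms = []
    for i, bit in enumerate(bits):
        if bit == "1":
            power = degree - i
            if power == 0:
                terms.append("1")
            elif power == 1:
                terms.append("x")
            else:
                terms.append(f"x^{power}")
    return " + ".join(terms) or "0"


a = 0b10011011
b = 0b11001010

# Over GF(2), adding and subtracting coefficients are both XOR.
total = a ^ b
difference = a ^ b

print(poly_to_string("101001"))  # x^5 + x^3 + 1
print(format(total, "08b"))      # 01010001
```

Because each coefficient is 0 or 1 and 1 + 1 = 0, there are no carries or borrows between bit positions.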


Example:
• Multiplication of polynomials over GF(2)
– written out vs shorthand (worked figure omitted)

Example:
• Division of polynomials over GF(2) (worked figure omitted)
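
Since the worked figures did not survive, here is a minimal Python sketch of both operations. It represents a polynomial over GF(2) as an integer whose bits are the coefficients; the function names gf2_mul and gf2_divmod are illustrative, not taken from the course notes.

```python
def gf2_mul(a: int, b: int) -> int:
    """Multiply two polynomials over GF(2): shift-and-XOR instead of shift-and-add."""
    product = 0
    while b:
        if b & 1:
            product ^= a  # "add" the shifted copy of a (XOR, no carries)
        a <<= 1
        b >>= 1
    return product


def gf2_divmod(dividend: int, divisor: int) -> tuple[int, int]:
    """Long division over GF(2); returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("division by the zero polynomial")
    quotient = 0
    divisor_len = divisor.bit_length()
    while dividend.bit_length() >= divisor_len:
        shift = dividend.bit_length() - divisor_len
        quotient |= 1 << shift
        dividend ^= divisor << shift  # "subtract" the aligned divisor (XOR)
    return quotient, dividend


# (x^5 + x^4 + x^2 + 1) * (x^2 + 1) = x^7 + x^6 + x^5 + 1
product = gf2_mul(0b110101, 0b101)
print(format(product, "b"))            # 11100001
print(gf2_divmod(product, 0b110101))   # (5, 0)
```

Dividing the product by one factor recovers the other factor with remainder 0, which is the property the CRC check relies on.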

Checksumming: Cyclic Redundancy Check

• view data bits, D(x), as a (long) binary number
• choose r+1 bit pattern (generator), G(x)
• goal: compute r CRC bits, R(x), such that
– <D,R> exactly divisible by G (modulo 2)
– receiver knows G, divides <D,R> by G. If non-zero remainder:
error detected!
– can detect all burst errors of length less than r+1 bits
• widely used in practice (Ethernet, 802.11 WiFi, ATM)



(Small) CRC Example
Frame: 1 0 1 0 0 0 1 1 0 1
G(x): 1 1 0 1 0 1
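
The small example can be checked with a short Python sketch. It assumes the usual construction: shift the frame left by r bits, divide by G modulo 2, and use the remainder as the r CRC bits. The function names gf2_remainder and crc_bits are illustrative.

```python
def gf2_remainder(value: int, generator: int) -> int:
    """Remainder of modulo-2 (XOR) long division of value by generator."""
    gen_len = generator.bit_length()
    while value.bit_length() >= gen_len:
        value ^= generator << (value.bit_length() - gen_len)
    return value


def crc_bits(frame: str, generator: str) -> str:
    """Compute the r CRC bits for a frame, both given as bit strings."""
    r = len(generator) - 1
    remainder = gf2_remainder(int(frame, 2) << r, int(generator, 2))
    return format(remainder, f"0{r}b")


frame = "1010001101"
generator = "110101"
crc = crc_bits(frame, generator)
transmitted = frame + crc

print(crc)          # 01110
print(transmitted)  # 101000110101110

# Receiver side: dividing the transmitted bits by G leaves remainder 0.
print(gf2_remainder(int(transmitted, 2), int(generator, 2)))  # 0
```

Appending the remainder is the same as subtracting it, because subtraction over GF(2) is XOR and the low r bits of the shifted frame are all zero.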


Example of CRC Failure


When will a CRC fail?
Example:
G(x) = x^5 + x^4 + x^2 + 1
Frame: 1010001101
T (transmitted): 1 0 1 0 0 0 1 1 0 1 0 1 1 1 0
R (received):    1 0 1 0 0 0 1 0 1 0 0 1 1 1 1
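
Why this particular corruption goes unnoticed can be checked directly. The sketch below (names are illustrative) computes the error pattern as the XOR of the two bit strings and shows that it is itself a multiple of G(x), so the received bits still divide evenly.

```python
def gf2_remainder(value: int, generator: int) -> int:
    """Remainder of modulo-2 (XOR) long division of value by generator."""
    gen_len = generator.bit_length()
    while value.bit_length() >= gen_len:
        value ^= generator << (value.bit_length() - gen_len)
    return value


G = 0b110101                          # x^5 + x^4 + x^2 + 1
sent = int("101000110101110", 2)      # T in the example
received = int("101000101001111", 2)  # R in the example

error = sent ^ received               # bits that were flipped in transit
print(format(error, "b"))             # 11100001
print(gf2_remainder(error, G))        # 0: the error pattern is a multiple of G
print(gf2_remainder(received, G))     # 0: so the receiver sees no error
```

Here the error pattern equals G(x) times (x^2 + 1). In general, a CRC misses exactly those error patterns that are multiples of the generator.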



Example Generators

• CRC-32
x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10 + x^8 + x^7 + x^5 +
x^4 + x^2 + x + 1
detects 99.99999997% of all burst errors of length 34
or more.
• CRC-64-ISO
x^64 + x^4 + x^3 + x + 1
• CRC-64-ECMA-182
x^64 + x^62 + x^57 + x^55 + x^54 + x^53 + x^52 + x^47 + x^46
+ x^45 + x^40 + x^39 + x^38 + x^37 + x^35 + x^33 + x^32 +
x^31 + x^29 + x^27 + x^24 + x^23 + x^22 + x^21 + x^19 + x^17
+ x^13 + x^12 + x^10 + x^9 + x^7 + x^4 + x + 1

How effective is a 64-bit CRC?


• 1/2^64 ≈ 5 × 10^-20
• Estimated Internet traffic for 2019:
– ~2 zettabytes (2 × 10^21 bytes)
• Assume mean packet length is 500 bytes
• Assume 5% (1/20) of packets experience errors
– Probably the percentage is far lower



How about 128-bit CRC?
• But, Internet traffic is increasing
• What if it increases by a factor of 1,000,000?

• Would a 128-bit CRC be sufficient?


• 1/2^128 ≈ 3 × 10^-39
• The same calculation indicates this code will fail to
detect an error every (how many?) years.
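
The back-of-the-envelope estimate for both CRC sizes can be worked through numerically. This is a rough sketch using only the figures stated on the slides (2 × 10^21 bytes per year, 500-byte packets, 5% of packets damaged), plus the assumption that a damaged packet slips past an r-bit CRC with probability about 1/2^r. The function name is hypothetical.

```python
def years_between_undetected_errors(
    crc_bits: int,
    bytes_per_year: float,
    mean_packet_bytes: float = 500,
    damaged_fraction: float = 1 / 20,
) -> float:
    """Expected years between errors that an r-bit CRC fails to detect."""
    packets_per_year = bytes_per_year / mean_packet_bytes
    damaged_per_year = packets_per_year * damaged_fraction
    undetected_per_year = damaged_per_year * 2.0 ** -crc_bits
    return 1 / undetected_per_year


traffic_2019 = 2e21  # bytes per year, from the slide

# 64-bit CRC at the 2019 traffic level
years_64 = years_between_undetected_errors(64, traffic_2019)

# 128-bit CRC with traffic increased by a factor of 1,000,000
years_128 = years_between_undetected_errors(128, traffic_2019 * 1_000_000)

print(f"64-bit CRC:  about {years_64:.0f} years")
print(f"128-bit CRC: about {years_128:.1e} years")
```

Under these assumptions the 64-bit code misses roughly one error per century across the whole Internet, and the 128-bit code gives an interval on the order of 10^15 years even with a million times more traffic.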


CRC Summary
• Basic idea:
– divide a relatively short number into a long frame
– subtract remainder
– divide as frame arrives, check for 0 remainder.
• Extremely effective in detecting errors!
• And due to implementation in hardware as data is
transmitted and received, overhead is essentially 0!
• Wow!



I/O Summary
• Character vs block devices
– character (byte oriented, arbitrary number of bytes)
– block (transfers only whole chunks, typically 4096 bytes)
• Direct Memory Access (DMA)
– enables CPU to work on other things during data transfers
• Need for kernel buffering of data
– peripheral device operates asynchronously
– kernel pages can be “pinned” so they will not page out
• Link layer of network protocol stack
– communication between directly connected nodes
– beauty, power and low cost of CRC error checking

Architecture Summary
• The system architecture enables the execution of
computer programs
• But in ways beyond simply
– executing instructions
– reading and writing memory
• Examples:
– Interrupts guide the entire execution of the system
• interaction with peripheral devices
• clock interrupt for time slicing
• exceptions
• system calls
• ....
Architecture Summary (cont.)
– Execution mode:
• user mode – limits access to hardware by applications
• kernel mode – provides OS direct access to the hardware
– Cache memory
• exploit locality to greatly improve performance
– Virtual memory - page tables set up by OS, but...
• architecture walks page tables (fast compared to OS!)
• takes actions (e.g., exception) based on control bits in PTE
• TLB cache of PTEs makes it even faster!
– High-speed connections to wide variety of chip sets and
peripherals, including high-speed graphics, networking
– DMA (CPU can do other work while transfer takes place)
– and finally, CRC, near zero-cost, powerful, error detection

Next Up.... The Operating System!


Part 2: Operating Systems
2-1 Processes
Process elements and time slicing
Role of the process control block
System calls
Signals
2-2 Kernel Operation
Process scheduling
Process creation
Managing virtual memory
2-3 File systems and Virtualization
File system structures
Role of the virtual file system
Virtual machines
