ES Notes

Chapter 1: Introduction

 Definition of ES.
 Block Diagram of ES
 Embedded System Architectures: Von-Neumann/Harvard, RISC/CISC, DSP
 Characteristics of Embedded Systems: size, performance, flexibility,
maintainability, latency, throughput, correctness, processor power, power
consumption, safety, NRE cost, cost
 Classification of Embedded Systems:
 Based on Performance of microcontroller: Small scale, Medium scale,
Sophisticated
 Based on performance and functional requirements: Real time,
Standalone, Networked, Mobile
 History of ES.
 Applications of ES
 Purpose of ES

Embedded System: Definition


 An Embedded System is a system that has software embedded into computer
hardware, which makes the system dedicated to a specific application, a specific
part of an application or product, or a part of a larger system.
 An embedded system is a dedicated computer system, designed to perform one or a
few specific functions, often within a larger system. Embedded systems, therefore, are
– Built to function with little or no human intervention
– Specially designed around the tasks that need to be completed in the most
efficient way

Embedded System: Architecture


Embedded System: Characteristics/ Design Metrics:
 NRE cost (nonrecurring engineering cost): It is one-time cost of designing the
system. Once the system is designed, any number of units can be manufactured without
incurring any additional design cost; hence the term nonrecurring.
 Unit cost: The monetary cost of manufacturing each copy of the system, excluding
NRE cost.
 Size: The physical space required by the system, often measured in bytes for software,
and gates or transistors for hardware.
 Performance: The execution time of the system.
 Power Consumption: It is the amount of power consumed by the system, which may
determine the lifetime of a battery, or the cooling requirements of the IC, since more
power means more heat.
 Flexibility: The ability to change the functionality of the system without incurring
heavy NRE cost. Software is typically considered very flexible.
 Maintainability: It is the ability to modify the system after its initial release, especially
by designers who did not originally design the system.
 Correctness: This is the measure of the confidence that we have implemented the
system’s functionality correctly. We can check the functionality throughout the process
of designing the system, and we can insert test circuitry to check that manufacturing
was correct.
 Latency: This is the time between the start of the task’s execution and the end. For
example, processing an image may take 0.25 second.
 Throughput: This is the number of tasks that can be processed per unit time. For
example, a camera may be able to process 4 images per second.
History of Embedded Systems:
 Embedded systems date back to the 1960s. Charles Stark Draper developed an
integrated circuit in 1961 to reduce the size and weight of the Apollo Guidance
Computer, the digital system installed on the Apollo Command Module and Lunar
Module. The first computer to use ICs, it helped astronauts collect real-time flight data.
 In 1965, Autonetics [Division of North American Aviation], now a part of Boeing,
developed the D-17B, the computer used in the Minuteman-I missile guidance system.
It is widely recognized as the first mass-produced embedded system.
 When the Minuteman II went into production in 1966, the D-17B was replaced with
the NS-17 missile guidance system, known for its high-volume use of integrated
circuits.
 In 1968, the first embedded system for a vehicle was released; the Volkswagen 1600
used an embedded electronic control unit for its electronic fuel injection system.
 By the late 1960s and early 1970s, the price of integrated circuits dropped and usage
surged. The first microcontroller was developed by Texas Instruments in 1971. The
TMS 1000 series, which became commercially available in 1974, contained a 4-bit
processor, read-only memory (ROM) and random-access memory (RAM), and cost
around $2 apiece in bulk orders.
 Also in 1971, Intel released what is widely recognized as the first commercially
available processor, the 4004. The 4-bit microprocessor was designed for use in
calculators and small electronics, though it required external memory and support chips.
 The 8-bit Intel 8008, released in 1972, could address 16 KB of memory; the Intel 8080
followed in 1974 and could address 64 KB. The 8080's successor, the x86 series, was
released in 1978 and is still largely in use today.
 In 1987, the first embedded operating system, the real-time VxWorks [real-time
operating system], was released by Wind River. VxWorks is designed for use in
embedded systems requiring real-time, deterministic performance and, in many cases,
safety and security certification, for industries, such as aerospace and defence, medical
devices, industrial equipment, robotics, energy, transportation, network infrastructure,
automotive, and consumer electronics.
 It was followed by Microsoft's Windows CE in 1996.
 By the late 1990s, the first embedded Linux products began to appear. Today, Linux is
used in almost all embedded devices.

Classification of Embedded Systems:


 Standalone embedded system: These systems do not require a host system such as a
computer; they work by themselves. They take analog or digital inputs from their input
ports, then process, compute and transfer the data, and present the result through a
connected device that controls, drives or displays the associated output. Examples of
standalone embedded systems are MP3 players, digital cameras, video game
consoles, microwave ovens and temperature measurement systems.
 Real time embedded systems: A real time embedded system is one that produces the
required output within a specified time. These types of embedded systems follow
time deadlines for the completion of a task. Real time embedded systems are classified
into two types, soft real time and hard real time embedded systems, based on how
precisely the deadline must be met.
 Networked embedded system: Networked embedded systems are connected to a
network to access resources. The connected network can be a LAN, a WAN or the
internet, and the connection can be wired or wireless. This kind of embedded system is
the fastest growing area of embedded system applications. The embedded web
server is a type of system wherein all embedded devices are connected to a web server
and are accessed and controlled by a web browser. An example of a LAN networked
embedded system is a home security system wherein all sensors are connected and
communicate over the TCP/IP protocol.
 Mobile Embedded Systems: Mobile embedded systems are used in portable
embedded devices like cell phones, digital cameras, wireless MP3 players, personal
digital assistants, etc. The basic limitation of these devices is the availability of
memory and other resources.
 Small Scale Embedded Systems: These types of embedded systems are designed with
a single 8-bit or 16-bit microcontroller. They have little hardware and software
complexity and involve board-level design. They may even be battery operated. When
embedded software is being developed for this small scale hardware, an editor and an
assembler or cross-assembler, specific to the microcontroller or processor used, are the
main programming tools. Usually, the C programming language is used for developing
these systems. The C program is compiled into assembly, and the executable code is
then appropriately located in the system memory.

 Medium Scale Embedded Systems: These systems are usually designed with a
single, or a few, 16-bit or 32-bit microcontrollers, Digital Signal Processors (DSPs) or
Reduced Instruction Set Computer (RISC) processors. These systems have both
hardware and software complexity. For the complex software design of a medium scale
embedded system, the following programming tools are used: RTOS, source code
engineering tool, simulator, debugger and Integrated Development Environment
(IDE). Software tools also help manage the hardware complexity. An
assembler is of only slight use as a programming tool. These systems may also use
readily available Application-Specific Standard Products (ASSPs) and IPs for
various functions, for example bus interfacing, encryption and decryption,
discrete cosine transformation and its inverse, TCP/IP protocol stacking
and network connection functions.
 Sophisticated Embedded Systems: Sophisticated embedded systems have massive
hardware and software complexity and may require ASIPs, IPs, scalable or
configurable processors and programmable logic arrays (PLAs). They are used for
cutting edge applications that require hardware and software co-design and
integration in the final system. They are constrained by the processing speeds
available in their hardware units. Certain software functions, such as encryption and
decryption algorithms, discrete cosine transformation and inverse transformation
algorithms, TCP/IP protocol stacking and network driver functions, are implemented
in hardware to obtain additional speed. Some of the functions of the hardware
resources in the system are also implemented in software.

Applications of Embedded Systems:


 Embedded Systems in Automobiles and in telecommunications
 Motor and cruise control system
 Body or Engine safety
 Entertainment and multimedia in car
 E-Com and Mobile access
 Robotics in assembly line
 Wireless communication
 Mobile computing and networking
 Embedded Systems in Smart Cards, Missiles and Satellites
 Security systems
 Telephone and banking
 Defence and aerospace
 Communication
 Embedded Systems in Peripherals & Computer Networking
 Displays and Monitors
 Networking Systems
 Image Processing
 Network cards and printers
 Embedded Systems in Consumer Electronics
 Digital Cameras
 Set top Boxes
 High Definition TVs
 DVDs
Purpose of Embedded Systems:
 Data collection/Storage/Representation
Data: text, voice, image, video, electrical signal etc., in analog or digital form.
Storage:
 Non-storage type: analog/digital CRO, medical monitoring equipment
etc.
 Storage type: digital camera, DSO, data loggers, medical equipment
like EMG, EEG etc.
 Representation: digital camera, medical equipment etc.
 Data Communication
 Data facilitation, e.g. packetizing, encryption, decryption
 Wired/wireless interfaces: ZigBee, Bluetooth, Wi-Fi, EDGE, GPRS etc.
 Different protocols: RS232, RS485, USB, Ethernet, I2C, CAN, PS/2
 Data communication, security over network
 Data Processing
 Data: text, voice, image, video, electrical signal etc. In analog/digital form.
 Digital hearing aid.
 Data Monitoring
 ECG, CRO
 Control
 Application specific user interface
 Mobile phones, adidas Smart Shoes etc.

The advantages of Embedded Systems are:


 They are convenient for mass production. This results in low price per piece.
 These systems are highly stable and reliable.
 Embedded systems are made for specific tasks.
 The embedded systems are very small in size, hence can be carried and loaded
anywhere.
 These systems are fast. They also use less power.
 The embedded systems optimize the use of resources available.
 They improve the product quality.
The disadvantages of Embedded Systems are as follows:
 Once configured, these systems cannot be changed. Hence, no improvement or
upgrade can be made to the ones already designed and created.
 They are hard to maintain. It is also difficult to take a back-up of embedded files.
 Troubleshooting is difficult for embedded systems. Transferring data from one system
to another is also quite problematic.
 Because these systems are made for specific tasks, hardware is limited.
Chapter 2: Open source Embedded Development Board (Arduino)
 Arduino: Birth, open source community
 Functional block diagram of Arduino
 Functions of each pins of Arduino
 Arduino development board diagram
 I/O functions, Looping Techniques, Decision making Techniques
 Programming of an Arduino
 Arduino bootloader
 Serial protocol (initialization)
 Basic Circuit for Arduino

Arduino: Birth, open source community:


 In the year 2005, the first ever Arduino board was born in the classrooms of the
Interaction Design Institute Ivrea, in Ivrea, Italy.
 An Arduino is an Open Source microcontroller based development board that has
opened the doors of electronics to a number of designers and creative engineers.
 It was at the Interaction Design Institute that a hardware thesis on a wiring design
was contributed by a Colombian student named Hernando Barragán. The title of the
thesis was “The Revolution of Open Hardware”.
 A team of five developers worked on this thesis, and when the new Wiring platform
was complete, they worked to make it much lighter, less expensive, and available to the
open source community.
 The new prototype board, the Arduino, created by Massimo Banzi and other founders,
is a low cost microcontroller board that allows even a novice to do great things in
electronics. An Arduino can be connected to all kinds of lights, motors, sensors and
other devices, and its easy-to-learn programming language can be used to program how
the new creation behaves. Using an Arduino, you can build an interactive display, a
mobile robot, or anything else that you can imagine.
 Inexpensive - Arduino boards are relatively inexpensive compared to other
microcontroller platforms.
 Cross-platform - The Arduino Software (IDE) runs on Windows, Mac OS X, and
Linux operating systems. Most other microcontroller systems are limited to Windows.
 Simple, clear programming environment - The Arduino Software (IDE) is easy-to-use
for beginners, yet flexible enough for advanced users to take advantage of as well.
 Open source and extensible software - The Arduino software is published as open
source tools, available for extension by experienced programmers. The language can
be expanded through C++ libraries, and people wanting to understand the technical
details can make the leap from Arduino to the AVR C programming language on which
it's based. Similarly, you can add AVR-C code directly into your Arduino programs if
you want to.
 Open source and extensible hardware - The plans of the Arduino boards are published
under a Creative Commons license, so experienced circuit designers can make their
own version of the module, extending it and improving it. Even relatively inexperienced
users can build the breadboard version of the module in order to understand how it
works and save money.

Features of Arduino: Atmega328P:


 High performance, low power AVR 8-bit microcontroller
 Advanced RISC architecture
 131 powerful instructions – most single clock cycle execution
 32 x 8 general purpose working registers
 Fully static operation
 Up to 16MIPS throughput at 16MHz
 On-chip 2-cycle multiplier
 High endurance non-volatile memory segments
 32K bytes of in-system self-programmable flash program memory
 1Kbytes EEPROM
 2Kbytes internal SRAM
 Write/erase cycles: 10,000 flash/100,000 EEPROM
 Optional boot code section with independent lock bits
 In-system programming by on-chip boot program
 True read-while-write operation
 Programming lock for software security
 Peripheral features
 Two 8-bit Timer/Counters with separate prescaler and compare mode
 One 16-bit Timer/Counter with separate prescaler, compare mode, and capture
mode
 Real time counter with separate oscillator
 Six PWM channels
 8-channel 10-bit ADC in TQFP and QFN/MLF package, 6 in PDIP
 Temperature measurement
 Programmable serial USART
 Master/slave SPI serial interface
 Byte-oriented 2-wire serial interface (Philips I2C compatible)
 Programmable watchdog timer with separate on-chip oscillator
 On-chip analog comparator
 Interrupt and wake-up on pin change
 Special microcontroller features
 Power-on reset and programmable brown-out detection
 Internal calibrated oscillator
 External and internal interrupt sources
 Six sleep modes: Idle, ADC noise reduction, power-save, power-down, standby,
and extended standby
 Operating voltage:
 2.7V to 5.5V for ATmega328P
 Temperature range:
 Automotive temperature range: –40°C to +125°C
 Speed grade:
 0 to 8MHz at 2.7 to 5.5V (automotive temperature range: –40°C to +125°C)
 0 to 16MHz at 4.5 to 5.5V (automotive temperature range: –40°C to +125°C)
 Low power consumption
 Active mode: 1.5mA at 3V - 4MHz
 Power-down mode: 1µA at 3V
Functional Block Diagram of Arduino:

Functional Block Diagram of Atmega 328P:


Architecture of Arduino: Atmega328P:
• In order to maximize performance and parallelism, the AVR uses a Harvard architecture
– with separate memories and buses for program and data.
• Instructions in the program memory are executed with a single level pipelining. While
one instruction is being executed, the next instruction is pre-fetched from the program
memory. This concept enables instructions to be executed in every clock cycle. The
program memory is In-System Reprogrammable Flash memory.
• The AVR core combines a rich instruction set with 32 general purpose working
registers.
• All the 32 registers are directly connected to the Arithmetic Logic Unit (ALU), allowing
two independent registers to be accessed in one single instruction executed in one clock
cycle.
• The resulting architecture is more code efficient while achieving throughputs up to ten
times faster than conventional CISC microcontrollers.
• Six of the 32 registers can be used as three 16-bit indirect address register pointers for
Data Space addressing – enabling efficient address calculations.
• One of these address pointers can also be used as an address pointer for look up tables
in Flash program memory. These added function registers are the 16-bit X-, Y-, and Z-
register.
• Most AVR instructions have a single 16-bit word format. Every program memory
address contains a 16- or 32-bit instruction.
• Program Flash memory space is divided in two sections, the Boot Program section and
the Application Program section. Both sections have dedicated Lock bits for write and
read/write protection.
• During interrupts and subroutine calls, the return address Program Counter (PC) is
stored on the Stack. The Stack is effectively allocated in the general data SRAM, and
consequently the Stack size is only limited by the total SRAM size and the usage of the
SRAM.
• All user programs must initialize the SP in the Reset routine
• A flexible interrupt module has its control registers in the I/O space with an additional
Global Interrupt Enable bit in the Status Register.
• All interrupts have a separate Interrupt Vector in the Interrupt Vector table. The
interrupts have priority in accordance with their Interrupt Vector position. The lower
the Interrupt Vector address, the higher the priority.
• The I/O memory space contains 64 addresses for CPU peripheral functions as Control
Registers, SPI, and other I/O functions. The I/O Memory can be accessed directly, or
as the Data Space locations following those of the Register File, 0x20 - 0x5F.
• ALU:
• The high-performance AVR ALU operates in direct connection with all the 32
general purpose working registers.
• Within a single clock cycle, arithmetic operations between general purpose
registers or between a register and an immediate are executed.
• The ALU operations are divided into three main categories – arithmetic, logical,
and bit-functions. Some implementations of the architecture also provide a
powerful multiplier supporting both signed/unsigned multiplication and
fractional format
• Status Register
• The Status Register contains information about the result of the most recently
executed arithmetic instruction. This information can be used for altering
program flow in order to perform conditional operations.
• Note that the Status Register is updated after all ALU operations, as specified
in the Instruction Set Reference.
• The Status Register is not automatically stored when entering an interrupt
routine and restored when returning from an interrupt. This must be handled by
software.

Status Register:
 Bit 7 – I: Global Interrupt Enable
 Bit 6 – T: Bit Copy Storage
The Bit Copy instructions BLD (Bit LoaD) and BST (Bit STore) use the T-bit as source or
destination.
 Bit 5 – H: Half Carry Flag
 Bit 4 – S: Sign Bit, S = N ⊕ V
 Bit 3 – V: Two’s Complement Overflow Flag
 Bit 2 – N: Negative Flag
 Bit 1 – Z: Zero Flag
 Bit 0 – C: Carry Flag
General Purpose Registers:

– Most of the instructions operating on the Register File have direct access to all
registers, and most of them are single cycle instructions.
– Each register is also assigned a data memory address, mapping them directly
into the first 32 locations of the user Data Space.
– Although not being physically implemented as SRAM locations, this memory
organization provides great flexibility in access of the registers, as the X-, Y-
and Z-pointer registers can be set to index any register in the file.
– The registers R26..R31 have some added functions beyond their general purpose
usage: they serve as 16-bit address pointers for indirect addressing of the data
space, forming the three indirect address registers X, Y, and Z.
– In the different addressing modes these address registers have functions as fixed
displacement, automatic increment, and automatic decrement.
– Stack pointer:
– The Stack is mainly used for storing temporary data, for storing local variables
and for storing return addresses after interrupts and subroutine calls.
– Here Stack is implemented as growing from higher to lower memory locations.
– The Stack Pointer Register always points to the top of the Stack.
– The Stack Pointer points to the data SRAM Stack area where the Subroutine
and Interrupt Stacks are located.
– A Stack PUSH command will decrease the Stack Pointer.
– The Stack in the data SRAM must be defined by the program before any
subroutine calls are executed or interrupts are enabled.
– The initial Stack Pointer value equals the last address of the internal SRAM, and
the Stack Pointer must be set to point above the start of the SRAM.
– Interrupt Handling:
– Global Interrupt Enable bit
– Depending on the Program Counter value, interrupts may be automatically
disabled when Boot Lock bits BLB02 or BLB12 are programmed. This feature
improves software security.
– When an interrupt occurs, the Global Interrupt Enable I-bit is cleared and all
interrupts are disabled.
– The user software can write logic one to the I-bit to enable nested interrupts. All
enabled interrupts can then interrupt the current interrupt routine. The I-bit is
automatically set when a Return from Interrupt instruction – RETI – is executed.
– AVR Memories :
– The ATmega328P contains 32K bytes On-chip In-System Reprogrammable
Flash memory for program storage.
– Since all AVR instructions are 16 or 32 bits wide, the Flash is organized as 16K
x 16.
– For software security, the Flash Program memory space is divided into two
sections, Boot Loader Section and Application Program Section
– The Flash memory has an endurance of at least 10,000 write/erase cycles.
– The lower 2304 data memory locations address the Register File, the I/O
memory, the Extended I/O memory, and the internal data SRAM.
– The first 32 locations address the Register File, the next 64 location the standard
I/O memory, then 160 locations of Extended I/O memory, and the next 2048
locations address the internal data SRAM.

– The five different addressing modes for the data memory cover: Direct, Indirect
with Displacement, Indirect, Indirect with Pre-decrement, and Indirect with
Post-increment.
– In the Register File, registers R26 to R31 feature the indirect addressing pointer
registers.
– The direct addressing reaches the entire data space.
– The Indirect with Displacement mode reaches 63 address locations from the
base address given by the Y- or Z-register.
– When using register indirect addressing modes with automatic pre-decrement
and post-increment, the address registers X, Y, and Z are decremented or
incremented.
– The ATmega328P contains 1K bytes of data EEPROM memory.
– It is organized as a separate data space, in which single bytes can be read and
written.
– The EEPROM has an endurance of at least 100,000 write/erase cycles.
– I/O memory- 64 bytes
– BOD:
– When the Brown-out Detector (BOD) is enabled by BODLEVEL fuses, the
BOD is actively monitoring the power supply voltage during a sleep period. To
save power, it is possible to disable the BOD by software for some of the sleep
modes. The sleep mode power consumption will then be at the same level as
when BOD is globally disabled by fuses. If BOD is disabled in software, the
BOD function is turned off immediately after entering the sleep mode. Upon
wake-up from sleep, BOD is automatically enabled again. This ensures safe
operation in case the VCC level has dropped during the sleep period.
– Idle Mode:
– Stops CPU but allowing the SPI, USART, Analog Comparator, ADC, 2-wire
Serial Interface, Timer/Counters, Watchdog, and the interrupt system to
continue operating. This sleep mode basically halts clkCPU and clkFLASH,
while allowing the other clocks to run.
– Idle mode enables the MCU to wake up from external triggered interrupts as
well as internal ones like the Timer Overflow and USART Transmit Complete
interrupts.
– Power Down Mode:
– In this mode, the external Oscillator is stopped, while the external interrupts,
the 2- wire Serial Interface address watch, and the Watchdog continue operating
(if enabled). Only an External Reset, a Watchdog System Reset, a Watchdog
Interrupt, a Brown-out Reset, a 2-wire Serial Interface address match, an
external level interrupt on INT0 or INT1, or a pin change interrupt can wake up
the MCU. This sleep mode basically halts all generated clocks, allowing
operation of asynchronous modules only.
– Power Save Mode:
– This mode is identical to Power-down, with one exception: If Timer/Counter2
is enabled, it will keep running during sleep. The device can wake up from either
Timer Overflow or Output Compare event from Timer/Counter2 if the
corresponding Timer/Counter2 interrupt enable bits are set in TIMSK2, and the
Global Interrupt Enable bit in SREG is set.
Power Management and Sleep Modes:
• ADC Noise Reduction Mode:
– Noise Reduction mode, stopping the CPU but allowing the ADC, the external
interrupts, the 2- wire Serial Interface address watch, Timer/Counter2(1), and
the Watchdog to continue operating (if enabled). This sleep mode basically halts
clkI/O, clkCPU, and clkFLASH, while allowing the other clocks to run.
• Standby Mode:
– This mode is identical to Power-down with the exception that the Oscillator is
kept running.
• Extended Standby Mode:
– This mode is identical to Power-save with the exception that the Oscillator is
kept running.

Pin Diagram of Arduino:

Pin 1: PC6 (RESET) - Pin6 of PORTC. By default used as the RESET pin; PC6 can
only be used as an I/O pin when the RSTDISBL fuse is programmed.
Pin 2: PD0 (RXD) - Pin0 of PORTD. RXD (Data Input Pin for the USART serial
communication interface).
Pin 3: PD1 (TXD) - Pin1 of PORTD. TXD (Data Output Pin for the USART serial
communication interface).
Pin 4: PD2 (INT0) - Pin2 of PORTD. INT0 (External Interrupt source 0).
Pin 5: PD3 (INT1/OC2B) - Pin3 of PORTD. INT1 (External Interrupt source 1);
OC2B (PWM - Timer/Counter2 Output Compare Match B Output).
Pin 6: PD4 (XCK/T0) - Pin4 of PORTD. T0 (Timer0 External Counter Input);
XCK (USART External Clock I/O).
Pin 7: VCC. Connected to the positive supply voltage.
Pin 8: GND. Connected to ground.
Pin 9: PB6 (XTAL1/TOSC1) - Pin6 of PORTB. XTAL1 (Chip Clock Oscillator pin 1
or external clock input); TOSC1 (Timer Oscillator pin 1).
Pin 10: PB7 (XTAL2/TOSC2) - Pin7 of PORTB. XTAL2 (Chip Clock Oscillator
pin 2); TOSC2 (Timer Oscillator pin 2).
Pin 11: PD5 (T1/OC0B) - Pin5 of PORTD. T1 (Timer1 External Counter Input);
OC0B (PWM - Timer/Counter0 Output Compare Match B Output).
Pin 12: PD6 (AIN0/OC0A) - Pin6 of PORTD. AIN0 (Analog Comparator Positive
Input); OC0A (PWM - Timer/Counter0 Output Compare Match A Output).
Pin 13: PD7 (AIN1) - Pin7 of PORTD. AIN1 (Analog Comparator Negative Input).
Pin 14: PB0 (ICP1/CLKO) - Pin0 of PORTB. ICP1 (Timer/Counter1 Input Capture
Pin); CLKO (the divided system clock can be output on this pin).
Pin 15: PB1 (OC1A) - Pin1 of PORTB. OC1A (Timer/Counter1 Output Compare
Match A Output).
Pin 16: PB2 (SS/OC1B) - Pin2 of PORTB. SS (SPI Slave Select Input; this pin is
low when the controller acts as a slave) [SPI for programming]; OC1B
(Timer/Counter1 Output Compare Match B Output).
Pin 17: PB3 (MOSI/OC2A) - Pin3 of PORTB. MOSI (Master Output Slave Input;
when the controller acts as a slave, data is received on this pin) [SPI for
programming]; OC2A (Timer/Counter2 Output Compare Match A Output).
Pin 18: PB4 (MISO) - Pin4 of PORTB. MISO (Master Input Slave Output; when the
controller acts as a slave, data is sent to the master through this pin) [SPI for
programming].
Pin 19: PB5 (SCK) - Pin5 of PORTB. SCK (SPI Bus Serial Clock; the clock shared
between this controller and other systems for accurate data transfer) [SPI for
programming].
Pin 20: AVCC. Power supply for the internal ADC.
Pin 21: AREF. Analog reference pin for the ADC.
Pin 22: GND. Ground.
Pin 23: PC0 (ADC0) - Pin0 of PORTC. ADC0 (ADC Input Channel 0).
Pin 24: PC1 (ADC1) - Pin1 of PORTC. ADC1 (ADC Input Channel 1).
Pin 25: PC2 (ADC2) - Pin2 of PORTC. ADC2 (ADC Input Channel 2).
Pin 26: PC3 (ADC3) - Pin3 of PORTC. ADC3 (ADC Input Channel 3).
Pin 27: PC4 (ADC4/SDA) - Pin4 of PORTC. ADC4 (ADC Input Channel 4); SDA
(Two-wire Serial Bus Data Input/Output Line).
Pin 28: PC5 (ADC5/SCL) - Pin5 of PORTC. ADC5 (ADC Input Channel 5); SCL
(Two-wire Serial Bus Clock Line).

What is Arduino Not?


• It is not a chip (IC)
• It is not a board (PCB)
• It is not a company or a manufacturer
• It is not a programming language
• It is not a computer architecture
(although it involves all of these things...)

So what is Arduino?
• Based on “Wiring Platform”
• Open-source hardware platform
• Open source development environment
– Easy-to-learn language and libraries (based on the Wiring language)
– Integrated development environment (based on Processing programming environment)
– Available for Windows / Mac / Linux

The Many Flavors of Arduino:


• Arduino Uno
• Arduino Leonardo
• Arduino LilyPad
• Arduino Mega
• Arduino Nano
• Arduino Mini
• Arduino Pro Mini
• Arduino BT
Getting to know the Arduino:

Arduino programming Functions:


• Constants
Floating Point Constants
Integer Constants
HIGH | LOW
INPUT | OUTPUT | INPUT_PULLUP
LED_BUILTIN
true | false
• Conversion
(unsigned int)
(unsigned long)
byte()
char()
float()
int()
long()
word()
• Data Types
String()
array
bool
boolean
byte
char
double
float
int
long
short
size_t
string
unsigned char
unsigned int
unsigned long
void
word
• Variable Scope & Qualifiers
const
scope
static
volatile
• Utilities
PROGMEM
sizeof()

pinMode():
Description: Configures the specified pin to behave either as an input or an output. As of
Arduino 1.0.1, it is possible to enable the internal pullup resistors with the mode
INPUT_PULLUP. Additionally, the INPUT mode explicitly disables the internal
pullups.
Syntax: pinMode(pin, mode)
Parameters:
pin: the number of the pin whose mode you wish to set
mode: INPUT, OUTPUT, or INPUT_PULLUP.
Returns: None
digitalWrite():
Description: Write a HIGH or a LOW value to a digital pin. If the pin has been configured
as an OUTPUT with pinMode(), its voltage will be set to the corresponding value: 5V
(or 3.3V on 3.3V boards) for HIGH, 0V (ground) for LOW.
If the pin is configured as an INPUT, digitalWrite() will enable (HIGH) or disable (LOW)
the internal pullup on the input pin. It is recommended to set the pinMode() to
INPUT_PULLUP to enable the internal pull-up resistor.
NOTE: If you do not set the pinMode() to OUTPUT, and connect an LED to a pin, when
calling digitalWrite(HIGH), the LED may appear dim. Without explicitly setting
pinMode(), digitalWrite() will have enabled the internal pull-up resistor, which acts like
a large current-limiting resistor.
Syntax: digitalWrite(pin, value)
Parameters:
pin: the pin number
value: HIGH or LOW
Returns: none
digitalRead():
Description: Reads the value from a specified digital pin, either HIGH or LOW.
Syntax: digitalRead(pin)
Parameters:
pin: the number of the digital pin you want to read (int)
Returns: HIGH or LOW
analogReference():
Description: Configures the reference voltage used for analog input (i.e. the value used
as the top of the input range). The options are:
– DEFAULT: the default analog reference of 5 volts (on 5V Arduino boards) or 3.3 volts
(on 3.3V Arduino boards)
– INTERNAL: a built-in reference, equal to 1.1 volts on the ATmega168 or ATmega328
and 2.56 volts on the ATmega8 (not available on the Arduino Mega)
– INTERNAL1V1: a built-in 1.1V reference (Arduino Mega only)
– INTERNAL2V56: a built-in 2.56V reference (Arduino Mega only)
– EXTERNAL: the voltage applied to the AREF pin (0 to 5V only) is used as the reference.
Syntax: analogReference(type)
Parameters:
type: which type of reference to use (DEFAULT, INTERNAL, INTERNAL1V1,
INTERNAL2V56, or EXTERNAL).
Returns: None.
analogWrite():
Description: Writes an analog value (PWM wave) to a pin. Can be used to light an LED
at varying brightness or drive a motor at various speeds. After a call to analogWrite(),
the pin will generate a steady square wave of the specified duty cycle until the next call
to analogWrite() (or a call to digitalRead() or digitalWrite() on the same pin). The
frequency of the PWM signal on most pins is approximately 490 Hz. On the Uno and
similar boards, pins 5 and 6 have a frequency of approximately 980 Hz.
You do not need to call pinMode() to set the pin as an output before calling analogWrite().
The analogWrite function has nothing to do with the analog pins or the analogRead
function.
Syntax: analogWrite(pin, value)
Parameters:
pin: the pin to write to.
value: the duty cycle: between 0 (always off) and 255 (always on).
Returns: nothing
Notes and Known Issues
The PWM outputs generated on pins 5 and 6 will have higher-than-expected duty cycles.
This is because of interactions with the millis() and delay() functions, which share the
same internal timer used to generate those PWM outputs.
analogRead():
Description: Reads the value from the specified analog pin. The Arduino board contains
a 6 channel (8 channels on the Mini and Nano, 16 on the Mega), 10-bit analog to digital
converter. This means that it will map input voltages between 0 and 5 volts into integer
values between 0 and 1023. This yields a resolution between readings of: 5 volts / 1024
units or, .0049 volts (4.9 mV) per unit. The input range and resolution can be changed
using analogReference().
It takes about 100 microseconds (0.0001 s) to read an analog input, so the maximum
reading rate is about 10,000 times a second.
Syntax: analogRead(pin)
Parameters:
pin: the number of the analog input pin to read from (0 to 5 on most boards, 0 to 7 on the
Mini and Nano, 0 to 15 on the Mega)
Returns: int (0 to 1023)
Note:
If the analog input pin is not connected to anything, the value returned by analogRead()
will fluctuate based on a number of factors.
shiftOut():
Description: Shifts out a byte of data one bit at a time. Starts from either the most (i.e.
the leftmost) or least (rightmost) significant bit. Each bit is written in turn to a data pin,
after which a clock pin is pulsed (taken high, then low) to indicate that the bit is available.
Note: if you're interfacing with a device that's clocked by rising edges, you'll need to
make sure that the clock pin is low before the call to shiftOut(), e.g. with a call to
digitalWrite (clockPin, LOW).
Syntax: shiftOut(dataPin, clockPin, bitOrder, value)
Parameters
dataPin: the pin on which to output each bit (int)
clockPin: the pin to toggle once the dataPin has been set to the correct value (int)
bitOrder: which order to shift out the bits; either MSBFIRST or LSBFIRST.
value: the data to shift out. (byte)
Returns: None
Note
The dataPin and clockPin must already be configured as outputs by a call to pinMode().
shiftOut is currently written to output 1 byte (8 bits) so it requires a two step operation to
output values larger than 255.
shiftIn():
Description: Shifts in a byte of data one bit at a time. Starts from either the most (i.e. the
leftmost) or least (rightmost) significant bit. For each bit, the clock pin is pulled high, the
next bit is read from the data line, and then the clock pin is taken low.
If you're interfacing with a device that's clocked by rising edges, you'll need to make sure
that the clock pin is low before the first call to shiftIn(), e.g. with a call to
digitalWrite(clockPin, LOW).
Syntax:
byte incoming = shiftIn(dataPin, clockPin, bitOrder)
Parameters
dataPin: the pin on which to input each bit (int)
clockPin: the pin to toggle to signal a read from dataPin
bitOrder: which order to shift in the bits; either MSBFIRST or LSBFIRST.
(Most Significant Bit First, or, Least Significant Bit First)
Returns : the value read (byte)
pulseIn():
Description: Reads a pulse (either HIGH or LOW) on a pin. For example, if value is
HIGH, pulseIn() waits for the pin to go HIGH, starts timing, then waits for the pin to go
LOW and stops timing. Returns the length of the pulse in microseconds or 0 if no
complete pulse was received within the timeout.
The timing of this function has been determined empirically and will probably show
errors in shorter pulses. Works on pulses from 10 microseconds to 3 minutes in length.
Please also note that if the pin is already high when the function is called, it will wait for
the pin to go LOW and then HIGH before it starts counting. This routine can be used
only if interrupts are activated. Furthermore the highest resolution is obtained with short
intervals.
Syntax:
pulseIn(pin,value)
pulseIn(pin, value, timeout)
Parameters:
pin: the number of the pin on which you want to read the pulse. (int)
value: type of pulse to read: either HIGH or LOW. (int)
timeout (optional): the number of microseconds to wait for the pulse to be completed:
the function returns 0 if no complete pulse was received within the timeout. Default is
one second (unsigned long).
Returns: the length of the pulse (in microseconds) or 0 if no pulse is completed before
the timeout (unsigned long)
millis():
Description: Returns the number of milliseconds since the Arduino board began running
the current program. This number will overflow (go back to zero), after approximately
50 days.
Parameters: None
Returns: Number of milliseconds since the program started (unsigned long)
Note: Please note that the return value of millis() is an unsigned long; logic errors may
occur if a programmer tries to do arithmetic with smaller data types such as ints. Even a
signed long may encounter errors, as its maximum value is half that of its unsigned
counterpart.
micros():
Description: Returns the number of microseconds since the Arduino board began running
the current program. This number will overflow (go back to zero), after approximately
70 minutes. On 16 MHz Arduino boards (e.g. Duemilanove and Nano), this function has
a resolution of four microseconds (i.e. the value returned is always a multiple of four).
On 8 MHz Arduino boards (e.g. the LilyPad), this function has a resolution of eight
microseconds.
Note: there are 1,000 microseconds in a millisecond and 1,000,000 microseconds in a
second.
Parameters: None
Returns: Number of microseconds since the program started (unsigned long)
delay()/delayMicroseconds():
Description: Pauses the program for the amount of time (in milliseconds/microseconds)
specified as the parameter.
Syntax: delay(ms) / delayMicroseconds(us)
Parameters: ms: the number of milliseconds to pause (unsigned long) OR
us: the number of microseconds to pause (unsigned int)
Returns: nothing
Serial.println():
Description: Prints data to the serial port as human-readable ASCII text followed by a
carriage return character (ASCII 13, or '\r') and a newline character (ASCII 10, or '\n').
This command takes the same forms as Serial.print().
Syntax:
Serial.println(val)
Serial.println(val, format)
Parameters:
Serial: serial port object.
val: the value to print. Allowed data types: any data type.
format: specifies the number base (for integral data types) or number of decimal places
(for floating point types).
Returns: println() returns the number of bytes written, though reading that number is
optional. Data type: size_t.
if(Serial):
Description:
Indicates if the specified Serial port is ready. On the boards with native USB, if
(Serial) indicates whether or not the USB CDC serial connection is open. For all other
boards, and the non-USB CDC ports, this will always return true.
Syntax: if (Serial)
Parameters: None
Returns: Returns true if the specified serial port is available. Data type: bool.
Serial.available():
Description: Get the number of bytes (characters) available for reading from the serial
port. This is data that’s already arrived and stored in the serial receive buffer (which
holds 64 bytes).
Syntax: Serial.available()
Parameters:
Serial: serial port object.
Returns: The number of bytes available to read.
Serial.print():
Description: Prints data to the serial port as human-readable ASCII text. This command
can take many forms. Numbers are printed using an ASCII character for each digit. Floats
are similarly printed as ASCII digits, defaulting to two decimal places. Bytes are sent as
a single character. Characters and strings are sent as is. For example-
Serial.print(78) gives "78", Serial.print(1.23456) gives "1.23",
Serial.print('N') gives "N", Serial.print("Hello world.") gives "Hello world."
An optional second parameter specifies the base (format) to use; permitted values
are BIN (binary, or base 2), OCT (octal, or base 8), DEC (decimal, or base
10), HEX (hexadecimal, or base 16). For floating point numbers, this parameter
specifies the number of decimal places to use. For example-
Serial.print(78, BIN) gives "1001110", Serial.print(78, OCT) gives "116",
Serial.print(78, DEC) gives "78", Serial.print(78, HEX) gives "4E",
Serial.print(1.23456, 0) gives "1", Serial.print(1.23456, 2) gives "1.23",
Serial.print(1.23456, 4) gives "1.2346"
You can pass flash-memory based strings to Serial.print() by wrapping them with F().
For example:
Serial.print(F("Hello World"))
To send data without conversion to its representation as characters, use Serial.write().
Syntax:
Serial.print(val)
Serial.print(val, format)
Parameters:
Serial: serial port object.
val: the value to print. Allowed data types: any data type.
Returns: print() returns the number of bytes written, though reading that number is
optional. Data type: size_t.
Serial.write():
Description: Writes binary data to the serial port. This data is sent as a byte or series of
bytes; to send the characters representing the digits of a number use the print() function
instead.
Syntax:
Serial.write(val)
Serial.write(str)
Serial.write(buf, len)
Parameters:
Serial: serial port object.
val: a value to send as a single byte.
str: a string to send as a series of bytes.
buf: an array to send as a series of bytes.
len: the number of bytes to be sent from the array.
Returns: write() will return the number of bytes written, though reading that number is
optional. Data type: size_t.
Serial.read():
Description: Reads incoming serial data.
Syntax: Serial.read()
Parameters:
Serial: serial port object.
Returns: The first byte of incoming serial data available (or -1 if no data is available).
Data type: int.
Serial.begin():
Description: Sets the data rate in bits per second (baud) for serial data transmission. For
communicating with Serial Monitor, make sure to use one of the baud rates listed in the
menu at the bottom right corner of its screen. You can, however, specify other rates -
for example, to communicate over pins 0 and 1 with a component that requires a
particular baud rate.
An optional second argument configures the data, parity, and stop bits. The default is
8 data bits, no parity, one stop bit.
Syntax:
Serial.begin(speed)
Serial.begin(speed, config)
Parameters:
Serial: serial port object.
speed: in bits per second (baud). Allowed data types: long.
config: sets data, parity, and stop bits, e.g. SERIAL_8N1
Returns:Nothing
sq():
Description: Calculates the square of a number: the number multiplied by itself.
Syntax: sq(x)
Parameters:
x: the number. Allowed data types: any data type.
Returns: The square of the number. Data type: double.
Notes and Warnings:
Because the sq() function is implemented as a macro, avoid using other functions
inside the brackets; the argument may be evaluated twice, leading to incorrect results.
This code will yield incorrect results:
int inputSquared = sq(Serial.parseInt()); // avoid this
sqrt():
Description: Calculates the square root of a number.
Syntax: sqrt(x)
Parameters:
x: the number. Allowed data types: any data type.
Returns: The number’s square root.
Data type: double.
pow():
Description: Calculates the value of a number raised to a power. pow() can be used to
raise a number to a fractional power. This is useful for generating exponential mapping
of values or curves.
Syntax: pow(base, exponent)
Parameters:
base: the number. Allowed data types: float.
exponent: the power to which the base is raised. Allowed data types: float.
Returns: The result of the exponentiation. Data type: double.
map():
Description: Re-maps a number from one range to another. That is, a value
of fromLow would get mapped to toLow, a value of fromHigh to toHigh, values in-
between to values in-between, etc. Does not constrain values to within the range,
because out-of-range values are sometimes intended and useful.
The constrain() function may be used either before or after this function, if limits to the
ranges are desired.
Note that the "lower bounds" of either range may be larger or smaller than the "upper
bounds" so the map() function may be used to reverse a range of numbers, for example
y = map(x, 1, 50, 50, 1);
The function also handles negative numbers well, so that this example y = map(x, 1,
50, 50, -100); is also valid.
The map() function uses integer math so will not generate fractions, when the math
might indicate that it should do so. Fractional remainders are truncated, and are not
rounded or averaged.
Syntax: map(value, fromLow, fromHigh, toLow, toHigh)
Parameters:
value: the number to map.
fromLow: the lower bound of the value’s current range.
fromHigh: the upper bound of the value’s current range.
toLow: the lower bound of the value’s target range.
toHigh: the upper bound of the value’s target range.
Returns: The mapped value.

Arduino Bootloader:
• Almost all microcontroller (and microprocessor) development systems use some form
of a bootloader (often, mistakenly, called firmware), and the Arduino bootloader is one
example. Since Arduino is a rather popular platform, let's use it to discuss what a
bootloader does and how it works.
• When a microcontroller turns on, it only knows how to do one thing. Typically, that
one thing is to run an instruction found at a specific memory location. Often this
location is address 0x0000, but not always. Usually, this memory location will contain a
jump instruction to another place in memory, which is the start of the user program.
The bootloader, however, exists in a slightly separate memory space from the user
program.
• On power-up or reset, a bootloader is a section of program memory that runs before the
main code runs. It can be used to set up the microcontroller or provide a limited ability
to update the main program's code.

How the Arduino Bootloader works:


• The Arduino bootloader supports re-programming the program memory (or Flash) over
serial. Without it, you would need a dedicated hardware programmer, like the USBtiny,
to change the code in the Uno’s ATmega328p.
• Like many microcontrollers, the ATmega328p dedicates a portion of its program
memory for bootloading code. In the case of the Uno, the Arduino bootloader waits a
short time while watching the UART pins. If the bootloader does not receive a particular
sequence of bytes over the serial port, then the processor jumps to the “user” program
section to load whatever is already in program memory. This jump loads your program
(or sketch).
• If the pre-defined sequence is received, the Arduino bootloader pretends to be an AVR
programmer. In the case of the Uno, a unique programming protocol called “arduino”
is used. This “special protocol” is a variant of the stk500 hardware programmer. Other
Arduino boards have other programmer types they emulate.
• At this point avrdude will begin sending your program’s binary data with a protocol
over the virtual serial port. The program memory (PROGMEM) of the ATmega328p
stores the incoming byte stream. Once completed, the chip resets and starts the new
code.
• Re-programming the FLASH, without a dedicated programmer, is only possible when
the code is running during the “bootload” phase. The idea is to prevent malicious (or
buggy) code from changing the contents of Flash.
• In short:
– The Arduino bootloader is always the first stuff to run.
– If the bootloader receives a unique sequence, then the byte stream is
programmed into FLASH
– Else, the bootloader turns control over to the existing user code.
Bootloader Advantages:
• At a higher level, bootloaders can be used to either pre-configure a microcontroller /
microprocessor or, in the case of the Arduino bootloader, allow “field programming”
of an end device. While you might use the Arduino bootloader to test and develop code
in your project, you could also use the bootloader to update code in a shipping product.
• While bootloaders offer incredible advantages, there are some aspects you should
consider if you keep a bootloader installed on a finished project.
Chapter 3: Core of Embedded Systems
 CPU Architectures
 Von Neumann Architecture
 Harvard Architecture
 Difference between the Two
 Instruction Set Architecture
 Reduced Instruction Set Architecture
 Complex Instruction Set Architecture
 Difference between the Two
 Endian
 Little Endian Processor
 Big Endian Processor
 Application Specific Integrated Circuits(ASIC)
 Processor Performance Enhancement
 Pipelining
 Super Scalar Execution
 CPU Power Consumption

Von Neumann and Harvard Architecture:


 There are two types of digital computer architectures that describe the functionality
and implementation of computer systems.
 One is the Von Neumann architecture that was designed by the renowned physicist
and mathematician John Von Neumann in the late 1940s, and the other one is the
Harvard architecture which was based on the original Harvard Mark I relay-based
computer which employed separate memory systems to store data and instructions.
 The original Harvard architecture used to store instructions on punched tape and data
in electro-mechanical counters. The Von Neumann architecture forms the basis of
modern computing and is easier to implement.
Von Neumann Architecture:
 The architecture was designed by the renowned mathematician and physicist John Von
Neumann in 1945.
 In this architecture, one data path or bus exists for both instruction and data. As a result,
the CPU does one operation at a time. It either fetches an instruction from memory, or
performs read/write operation on data. So an instruction fetch and a data operation
cannot occur simultaneously, sharing a common bus.
 Von-Neumann architecture supports simple hardware. It allows the use of a single,
sequential memory. Today's processing speeds vastly outpace memory access times,
and we employ a very fast but small amount of memory (cache) local to the processor.
 It’s a theoretical design based on the concept of stored-program computers where
program data and instruction data are stored in the same memory.
 Until the Von Neumann concept of computer design, computing machines were
designed for a single predetermined purpose that would lack sophistication because of
the manual rewiring of circuitry.

 The idea behind the Von Neumann architectures is the ability to store instructions in
the memory along with the data on which the instructions operate. In short, the Von
Neumann architecture refers to a general framework that a computer’s hardware,
programming, and data should follow.
 The Von Neumann architecture consists of three distinct components: a central
processing unit (CPU), memory unit, and input/output (I/O) interfaces. The CPU is the
heart of the computer system that consists of three main components: the Arithmetic
and Logic Unit (ALU), the control unit (CU), and registers.
 The Von Neumann bottleneck occurs when data taken in or out of memory must wait
while the current memory operation is completed. That is, if the processor just
completed a computation and is ready to perform the next, it has to write the finished
computation into memory (which occupies the bus) before it can fetch new data out of
memory (which also uses the bus). The Von Neumann bottleneck has increased over
time because processors have improved in speed while memory has not progressed as
fast. Some techniques to reduce the impact of the bottleneck are to keep memory in
cache to minimize data movement, hardware acceleration, and speculative execution.
Harvard Architecture:
 The Harvard architecture stores machine instructions and data in separate memory units
that are connected by different busses.
 This architecture has data storage entirely contained within the CPU, and there is no
access to the instruction storage as data. Computers have separate memory areas for
program instructions and data using internal data buses, allowing simultaneous access
to both instructions and data.
 In the original Harvard machine, programs needed to be loaded by an operator; the
processor could not boot itself. In a Harvard architecture, there is no need to make the
two memories share properties.
 In this case, there are at least two memory address spaces to work with, so there is a
memory register for machine instructions and another memory register for data.
 Computers designed with the Harvard architecture are able to run a program and access
data independently, and therefore simultaneously.
 Harvard architecture has a strict separation between data and code. Thus, Harvard
architecture is more complicated but separate pipelines remove the bottleneck that Von
Neumann creates.

Von-Neumann Architecture | Harvard Architecture

Single memory is shared by both code and data. | Separate memories for code and data.

Processor needs to fetch code in one clock cycle and data in another clock cycle, so it requires two clock cycles. | A single clock cycle is sufficient, as separate buses are used to access code and data.

Slower in speed, thus more time-consuming. | Higher speed, thus less time-consuming.

Simple in design. | Complex in design.

It is an older computer architecture based on the stored-program computer concept. | It is a computer architecture based on the Harvard Mark I relay-based model.

There is a common bus for data and instruction transfer. | Separate buses are used for transferring data and instructions.

It is cheaper in cost. | It is costlier than the Von Neumann architecture.

The CPU cannot access instructions and read/write data at the same time. | The CPU can access instructions and read/write data at the same time.

It is used in personal computers and small computers. | It is used in microcontrollers and signal processing.
Reduced Instruction Set Architecture (RISC) –
 A reduced instruction set computer is a computer which only uses simple commands
that can be divided into several instructions which achieve low-level operation within
a single CLK cycle, as its name proposes “Reduced Instruction Set”.
 The term RISC stands for ‘’Reduced Instruction Set Computer’’. It is a CPU design
plan based on simple orders and acts fast.
 This is a small or reduced set of instructions. Here, every instruction is expected to
perform a very small job. The instruction sets are modest and simple, and more
complex commands are composed from them. Each instruction is of similar length;
these are strung together to get compound tasks done. Most commands are completed
in one machine cycle. Pipelining is a crucial technique used to speed up RISC
machines.
 The main idea behind is to make hardware simpler by using an instruction set composed
of a few basic steps for loading, evaluating, and storing operations just like a load
command will load data, store command will store the data.
 Reduce the cycles per instruction at the cost of the number of instructions per program.
 Examples: ARM, PA-RISC, Power Architecture, Alpha, AVR, ARC and SPARC.

Characteristic of RISC –
 Simpler instructions, hence simple instruction decoding.
 Instructions fit within the size of one word.
 An instruction takes a single clock cycle to execute.
 More general-purpose registers.
 Simple addressing modes.
 Fewer data types.
 Pipelining can be achieved.

Complex Instruction Set Architecture (CISC) –


 A complex instruction set computer is a computer where single instructions can perform
numerous low-level operations like a load from memory, an arithmetic operation, and
a memory store or are accomplished by multi-step processes or addressing modes in
single instructions, as its name proposes “Complex Instruction Set”.
 It is a CPU design plan based on single commands, which are skilled in executing multi-
step operations.
 CISC computers have small programs. They have a huge number of compound
instructions, which take a long time to perform. Here, a single instruction is carried
out in several steps; an instruction set may have more than 300 separate instructions.
Most instructions are finished in two to ten machine cycles. In CISC, instruction
pipelining is not easily implemented.
 The main idea is that a single instruction will do all loading, evaluating, and storing
operations just like a multiplication command will do stuff like loading data, evaluating,
and storing it, hence it’s complex.
 The CISC approach attempts to minimize the number of instructions per program but
at the cost of increase in number of cycles per instruction.
 Examples: VAX, the Motorola 68000 family, System/360, AMD and Intel x86 CPUs.

Characteristic of CISC –
 Complex instructions, hence complex instruction decoding.
 Instructions are larger than one word.
 An instruction may take more than a single clock cycle to execute.
 Fewer general-purpose registers, as operations are performed in memory itself.
 Complex addressing modes.
 More data types.

RISC Architecture | CISC Architecture

Focus on software. | Focus on hardware.

Uses only a hardwired control unit. | Uses both hardwired and microprogrammed control units.

Complex design of compiler. | Simpler design of compiler, considering the larger set of instructions.

Transistors are used for more registers. | Transistors are used for storing complex instructions.

Fixed-size instructions. | Variable-size instructions.

Can perform only register-to-register arithmetic operations. | Can perform REG to REG, REG to MEM, or MEM to MEM operations.

Requires more registers. | Requires fewer registers.

Code size is large. | Code size is small.

An instruction executes in a single clock cycle. | An instruction takes more than one clock cycle.

An instruction fits in one word. | Instructions are larger than the size of one word.

Pipelining of instructions is possible, considering the single clock cycle. | Pipelining is not easily implemented.

DSP processors:
 These are microprocessors designed for efficient mathematical manipulation of digital
signals. DSP evolved from Analog Signal Processors (ASPs), using analog hardware to
transform physical signals
 ASPs gave way to DSPs because
 DSPs are insensitive to the environment
 DSP performance is identical even with variations in components; two analog
systems behave differently even if built with the same components having 1% variations
 DSPs tend to run one program, not many programs. – Hence OSes are much simpler,
there is no virtual memory or protection, ...
 DSPs usually run applications with hard real-time constraints:
 You must account for anything that could happen in a time slot
 All possible interrupts or exceptions must be accounted for and their collective
time be subtracted from the time interval.
 Therefore, exceptions are bad.
 DSPs usually process infinite continuous data streams.
 The design of DSP architectures and ISAs driven by the requirements of DSP
algorithms.

ENDIAN:
 Endianness is a term that describes the order in which a sequence of bytes is stored in
computer memory. Endianness can be either big or little, with the adjective referring
to which value is stored first.
 Big-endian is an order in which the "big end" (most significant value in the sequence)
is stored first (at the lowest storage address). For example, in a big-endian computer,
the two bytes required for the hexadecimal number 4F52 would be stored as 4F52 in
storage (if 4F is stored at storage address 1000, for example, 52 will be at address 1001).
 Little-endian is an order in which the "little end" (least significant value in the
sequence) is stored first. For example, in a little-endian computer, the two bytes
required for the hexadecimal number 4F52 would be stored as 524F (52 at address
1000, 4F at 1001).
 For people who use languages that read left-to-right, big-endian seems like the natural
way to think of storing a string of characters or numbers - in the same order you expect
to see it presented to you. Many of us would thus think of big-endian as storing
something in forward fashion, just as we read.
 Note that within both big-endian and little-endian byte orders, the bits within each byte
are big-endian. That is, there is no attempt to be big- or little-endian about the entire bit
stream represented by a given number of stored bytes. For example, whether
hexadecimal 4F is put in storage first or last with other bytes in a given storage address
range, the bit order within the byte will be:
01001111
 It is possible to be big-endian or little-endian about the bit order, but CPUs and
programs are almost always designed for a big-endian bit order. In data transmission,
however, it is possible to have either bit order.
 Experts observe that Internet domain name addresses and e-mail addresses are little-
endian. For example, a big-endian version of our domain name address would be:
com.whatis.www
 IBM's 370 mainframes, most RISC-based computers, and Motorola
microprocessors use the big-endian approach. TCP/IP also uses the big-endian
approach (and thus big-endian is sometimes called network order).
 On the other hand, Intel processors (CPUs) and DEC Alphas and at least some
programs that run on them are little-endian.
 The diagram shows the different storage formats for big and little endian double words,
words, half words and bytes.
 The most significant byte in each pair is shaded to highlight its position.
 Note that there is no difference when storing individual bytes
ASIC: Application Specific IC:
 An application-specific integrated circuit (ASIC) is a kind of integrated circuit that is
specially built for a specific application or purpose. Compared to a programmable logic
device or a standard logic integrated circuit, an ASIC can improve speed because it is
specifically designed to do one thing and it does this one thing well. It can also be made
smaller and use less electricity.
 The disadvantage of this circuit is that it can be more expensive to design and
manufacture, particularly if only a few units are needed.
 An ASIC can be found in almost any electronic device and its uses can range from
custom rendering of images to sound conversion.
 Because ASICs are all custom-made and thus only available to the company that
designed them, they are considered to be proprietary technology.
ASIC: Types
There are three different categories of ASICs:
 Full-Custom ASICs: These are custom-made from scratch for a specific application.
Their ultimate purpose is decided by the designer. All the photolithographic layers of
this integrated circuit are already fully defined, leaving no room for modification during
manufacturing.
 Semi-Custom ASICs: These are partly customized to perform different functions within
the field of their general area of application. These ASICs are designed to allow some
modification during manufacturing, although the masks for the diffused layers are
already fully defined.
 Platform ASICs: These are designed and produced from a defined set of methodologies,
intellectual properties and a well-defined design of silicon that shortens the design cycle
and minimizes development costs. Platform ASICs are made from predefined platform
slices, where each slice is a pre-manufactured device, platform logic or entire system.
The use of pre-manufactured materials reduces development costs for these circuits.
Processor Performance:
A. Pipelining:
 Pipelining is the process of accumulating instruction from the processor through a
pipeline. It allows storing and executing instructions in an orderly process. It is also
known as pipeline processing.
 Pipelining is a technique where multiple instructions are overlapped during execution.
Pipeline is divided into stages and these stages are connected with one another to form
a pipe like structure. Instructions enter from one end and exit from another end.
 Pipelining increases the overall instruction throughput.
 In pipeline system, each segment consists of an input register followed by a
combinational circuit. The register is used to hold data and combinational circuit
performs operations on it. The output of combinational circuit is applied to the input
register of the next segment.
 It allows storing, prioritizing, managing and executing tasks and instructions in an
orderly process.
 A stream of instructions can be executed by
overlapping fetch, decode and execute phases of an instruction cycle. This type of
technique is used to increase the throughput of the computer system.
 An instruction pipeline reads instruction from the memory while previous instructions
are being executed in other segments of the pipeline. Thus we can execute multiple
instructions simultaneously.
 The pipeline is most efficient when the instruction cycle is divided into segments of
equal duration.
 The cycle time of the processor is reduced.
 It increases the throughput of the system.
 It keeps the processor's functional units busy, improving hardware utilization.
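The throughput gain can be put in numbers with the standard textbook idealization (which ignores hazards and stalls): with a k-stage pipeline and stage time t, n instructions finish in (k + n - 1)·t cycles, versus n·k·t without pipelining. The figures below are illustrative, not from the text:

```python
def unpipelined_time(n, k, t):
    # Each of n instructions passes through all k stages serially.
    return n * k * t

def pipelined_time(n, k, t):
    # k cycles to fill the pipe, then one instruction completes per cycle.
    return (k + n - 1) * t

n, k, t = 100, 5, 1          # 100 instructions, 5 stages, 1 time unit per stage
speedup = unpipelined_time(n, k, t) / pipelined_time(n, k, t)
print(round(speedup, 2))     # 4.81, approaching k = 5 as n grows
```

As n grows the fill time is amortized and the speedup approaches the number of stages k.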
B. Superscalar Processor:
 A Scalar processor is a normal processor, which works on one simple instruction at a
time, operating on single data items.
 But in today's world, this technique will prove to be highly inefficient, as the overall
processing of instructions will be very slow. An instruction pipeline reads instruction
from the memory while previous instructions are being executed in other segments of
the pipeline. Thus we can execute multiple instructions simultaneously.
 Vector processors: There is a class of computational problems that are beyond the
capabilities of a conventional computer. These problems require vast number of
computations on multiple data items that will take a conventional computer (with scalar
processor) days or even weeks to complete.
 Such complex instructions, which operate on multiple data items at the same time,
require a better way of instruction execution, which was achieved by vector processors.
 Scalar CPUs can manipulate one or two data items at a time, which is not very efficient.
Also, simple instructions like ADD A to B, and store into C are not practically efficient.
 Addresses are used to point to the memory location where the data to be operated on
will be found, which leads to the added overhead of data lookup. So until the data is
found, the CPU would be sitting idle, which is a big performance issue.
 Hence, the concept of the Instruction Pipeline comes into the picture, in which the
instruction passes through several sub-units in turn. A vector processor not only uses
an instruction pipeline, but also pipelines the data, working on multiple data items at
the same time.
 A normal scalar processor instruction would be ADD A, B, which leads to the addition
of two operands, but what if we could instruct the processor to ADD a group of numbers
(from memory locations 0 to n) to another group of numbers (say, memory locations n
to k)? This can be achieved by vector processors.
 In vector processor a single instruction, can ask for multiple data operations, which
saves time, as instruction is decoded once, and then it keeps on operating on different
data items.
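The scalar-versus-vector contrast can be illustrated in plain Python: the scalar style issues one add per element, decoded over and over, while the vector style expresses the whole operand range in a single operation (simulated here with a comprehension; on real vector hardware this is one decoded instruction):

```python
def scalar_add(a, b):
    # Scalar style: one ADD per iteration of the loop.
    result = []
    for i in range(len(a)):
        result.append(a[i] + b[i])
    return result

def vector_add(a, b):
    # Vector style: one operation applied across whole operand ranges.
    return [x + y for x, y in zip(a, b)]

a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
assert scalar_add(a, b) == vector_add(a, b) == [11, 22, 33, 44]
```

Both produce the same result; the difference lies in how many instructions must be fetched and decoded to get there.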
 Superscalar Processors: It was first invented in 1987. It is a machine which is
designed to improve the performance of the scalar processor. In most applications, most
of the operations are on scalar quantities. Superscalar approach produces the high
performance general purpose processors.
 The main principle of superscalar approach is that it executes instructions
independently in different pipelines. As we already know, that Instruction pipelining
leads to parallel processing thereby speeding up the processing of instructions. In
Superscalar processor, multiple such pipelines are introduced for different operations,
which further improves parallel processing.
 There are multiple functional units each of which is implemented as a pipeline. Each
pipeline consists of multiple stages to handle multiple instructions at a time which
support parallel execution of instructions.
 It increases the throughput because the CPU can execute multiple instructions per clock
cycle. Thus, superscalar processors are much faster than scalar processors.
 A scalar processor works on one or two data items, while the vector processor works
with multiple data items. A superscalar processor is a combination of both. Each
instruction processes one data item, but there are multiple execution units within each
CPU thus multiple instructions can be processing separate data items concurrently.
 While a superscalar CPU is also pipelined, there are two different performance
enhancement techniques. It is possible to have a non-pipelined superscalar CPU or
pipelined non-superscalar CPU. The superscalar technique is associated with some
characteristics, these are:
 Instructions are issued from a sequential instruction stream.
 CPU must dynamically check for data dependencies.
 Should accept multiple instructions per clock cycle.
C. CPU Power Consumption:
 Embedded systems are becoming more and more complex. Due to technological
improvements, it is now possible to integrate a lot of components in a unique circuit.
 Nowadays, homogeneous or heterogeneous multiprocessor architectures within SoC
(System on Chip) or SiP (System in a Package) offer increasing computing capacities.
Meanwhile, applications are growing in complexity.
 Thus, embedded systems commonly have to perform different multiple tasks, from
control oriented (innovative user interfaces, adaptation to the environment, compliance
to new formats, quality of service management) to data intensive (multimedia, audio
and video coding/decoding, software radio, 3D image processing, communication
streaming), and to sustain high throughputs and bandwidths.
 One side effect of this global evolution is a drastic increase of the circuit’s power
consumption.
 Leakage power increases exponentially as the process evolves to finer technologies.
 Dynamic power is proportional to the operating frequency.
 With higher chip densities, thermal dissipation may involve costly cooling devices, and
battery life is definitely shortened.
 The role of an Operating System (OS) is essential in such a context. It also offers power
management services which may exploit low-level mechanisms (low-operating/low-
standby power modes, voltage/frequency scaling, clock gating, etc.) to reduce the
system’s energy consumption.
 But the Operating System itself has a non-negligible impact on the energy consumption.
The Operating System’s energy overhead depends on the complexity of the applications
and the number of services called.
 It has been observed that, depending on the application, the Operating System can
account for 6% to 50% of an embedded system’s energy consumption. This ratio gets
higher if the frequency and supply voltage of the processor increase.
 Power consumption is now a major constraint in many designs. Being able to estimate
this consumption for the whole system and for all its components is now compulsory.
Estimating energy consumption due to the Operating Systems is thus unavoidable. It is
the first step towards the application of off-line or on-line power optimization
technique.
Chapter 4: Embedded Systems Memory
 CPU Bus:
 BUS Protocol
 Bus Organization
 Memory System Architecture
 Cache Memory, Virtual memory
 Memory Management Unit
 Address Translation
 Memory devices & their Characteristics
 SRAM, DRAM
 ROM, UVROM, EEPROM
 Flash Memory
CPU Bus:
 The bus is the mechanism by which the CPU communicates with memory and devices.
 A bus is, at a minimum, a collection of wires but it also defines a protocol by which the
CPU, memory, and devices communicate.
 One of the major roles of the bus is to provide an interface to memory. (Of course, I/O
devices also connect to the bus.) Based on understanding of the bus, we study the
characteristics of memory components in this section, focusing on DMA.
CPU Bus & Organization:
 A bus is a common connection between components in a system. The CPU, memory,
and I/O devices are all connected to the bus.
 The signals that make up the bus provide the necessary communication: the data itself,
addresses, a clock, some control signals.
 In a typical bus system, the CPU serves as the bus master and initiates all transfers. If
any device could request a transfer, then other devices might be starved of bus
bandwidth. As bus master, the CPU reads and writes data and instructions from
memory. It also initiates all reads or writes on I/O devices.
 The basic building block of most bus protocols is the four-cycle handshake. The
handshake ensures that when two devices want to communicate, one is ready to transmit
and the other is ready to receive.
 The handshake uses a pair of wires dedicated to the handshake: enq (meaning enquiry)
and ack (meaning acknowledge). Extra wires are used for the data transmitted during
the handshake. The four cycles are described below.
 Device 1 raises its output to signal an enquiry, which tells device 2 that it should
get ready to listen for data.
 When device 2 is ready to receive, it raises its output to signal an
acknowledgment. At this point, devices 1 and 2 can transmit or receive.
 Once the data transfer is complete, device 2 lowers its output, signalling that it
has received the data.
 After seeing that ack has been released, device 1 lowers its output.
 At the end of the handshake, both handshaking signals are low, just as they were at the
start of the handshake. The system has thus returned to its original state in readiness for
another handshake-enabled data transfer.
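The four steps above can be modelled as a tiny signal-level simulation. This sketch records only the enq/ack levels at each step, not real electrical timing:

```python
def four_cycle_handshake(data):
    """Simulate the enq/ack sequence; returns (signal trace, received data)."""
    trace = []
    enq = ack = 0

    enq = 1                      # 1. device 1 raises enq (enquiry)
    trace.append((enq, ack))
    ack = 1                      # 2. device 2 raises ack; data is transferred
    received = data
    trace.append((enq, ack))
    ack = 0                      # 3. device 2 lowers ack: data received
    trace.append((enq, ack))
    enq = 0                      # 4. device 1 sees ack released, lowers enq
    trace.append((enq, ack))
    return trace, received

trace, received = four_cycle_handshake(0x4F)
assert trace[-1] == (0, 0)       # both signals low again, ready for the next transfer
```

The final state (0, 0) matches the observation in the text: the system returns to its original state in readiness for another transfer.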
 Microprocessor buses build on the handshake for communication between the CPU and
other system components. The term bus is used in two ways.
 The most basic use is as a set of related wires, such as address wires. However, the term
may also mean a protocol for communicating between components.
 To avoid confusion, we will use the term bundle to refer to a set of related signals. The
fundamental bus operations are reading and writing. Figure shows the structure of a
typical bus that supports reads and writes.
 The major components follow:
 Clock provides synchronization to the bus components,
 R/W is true when the bus is reading and false when the bus is writing,
 Address is an a-bit bundle of signals that transmits the address for an access,
 Data is an n-bit bundle of signals that can carry data to or from the CPU, and
 Data ready signals when the values on the data bundle are valid.
[Figure: structure of a typical bus supporting reads and writes]
Types of Buses:
• ISA: Industry Standard architecture
• EISA: Extended Industry Standard architecture
• MCA: Micro Channel Architecture
• VESA: Video Electronics Standards Association
• PCI: Peripheral Component Interconnect
• PCI-X: PCI eXtended
• PCMCIA: Personal Computer Memory Card International Association
• AGP: Accelerated Graphics Port
• SCSI: Small Computer Systems Interface
• USB
• IEEE1394
Memory System Architecture:
 Volatile/Non Volatile
 Auxiliary Memory:
 Magnetic Disks
 Magnetic tapes
 Primary Memory: Main Memory
 Cache Memory
 Register Memory
 The total memory capacity of a computer can be visualized by hierarchy of components.
 The memory hierarchy system consists of all storage devices contained in a computer
system from the slow Auxiliary Memory to fast Main Memory and to smaller Cache
memory.
 Auxiliary memory access time is generally 1000 times that of the main memory, hence
it is at the bottom of the hierarchy.
 The main memory occupies the central position because it is equipped to communicate
directly with the CPU and with auxiliary memory devices through Input/output
processor (I/O).
 When programs not residing in main memory are needed by the CPU, they are brought
in from auxiliary memory. Programs not currently needed in main memory are
transferred into auxiliary memory to provide space in main memory for other programs
that are currently in use.
 The cache memory is used to store program data which is currently being executed in
the CPU. The approximate access-time ratio between cache memory and main memory
is about 1:7 to 1:10.
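The effect of this speed gap can be quantified with the average memory access time (AMAT) formula for a single cache level. The figures below are assumed for illustration (1-cycle cache hit, 8-cycle miss penalty, consistent with the roughly 1:7 to 1:10 ratio above):

```python
def amat(hit_time, miss_penalty, hit_rate):
    """Average memory access time for a single cache level."""
    return hit_time + (1.0 - hit_rate) * miss_penalty

# Assumed figures: cache hit in 1 cycle, 8 extra cycles on a miss,
# and 95% of accesses hitting in the cache.
print(amat(hit_time=1, miss_penalty=8, hit_rate=0.95))  # about 1.4 cycles on average
```

Even a modest hit rate brings the average access time close to the cache speed, which is why the hierarchy works.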

Main Memory:
 The memory unit that communicates directly within the CPU, Auxiliary memory and
Cache memory, is called main memory. It is the central storage unit of the computer
system.
 It is a large and fast memory used to store data during computer operations. Main
memory is made up of RAM and ROM, with RAM integrated circuit chips holding the
major share.
 RAM: Random Access Memory
 DRAM: Dynamic RAM, is made of capacitors and transistors, and must be
refreshed every 10~100ms. It is slower and cheaper than SRAM.
 SRAM: Static RAM, has a six-transistor circuit in each cell and retains data until
powered off.
 NVRAM: Non-Volatile RAM, retains its data, even when turned off. Example:
Flash memory.
 ROM: Read Only Memory, is non-volatile and is more like a permanent storage for
information. It also stores the bootstrap loader program, to load and start the operating
system when computer is turned on. PROM(Programmable ROM), EPROM(Erasable
PROM) and EEPROM(Electrically Erasable PROM) are some commonly used ROMs.
Auxiliary & Cache Memory:
 Auxiliary Memory
 Devices that provide backup storage are called auxiliary memory. For example:
Magnetic disks and tapes are commonly used auxiliary devices. Other devices
used as auxiliary memory are magnetic drums, magnetic bubble memory and
optical disks.
 It is not directly accessible to the CPU, and is accessed using the Input/Output
channels.
 Cache Memory
 The data or contents of the main memory that are used again and again by CPU,
are stored in the cache memory so that we can easily access that data in shorter
time.
 Whenever the CPU needs to access memory, it first checks the cache memory.
If the data is not found in cache memory then the CPU moves onto the main
memory. It also transfers block of recent data into the cache and keeps on
deleting the old data in cache to accommodate the new one.
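The check-cache-first behaviour described above can be sketched with a tiny fully associative cache in front of a "main memory" dictionary, evicting the least recently used entry when full (a simplification; real caches are organized into sets with tags):

```python
from collections import OrderedDict

class SimpleCache:
    """Tiny fully associative LRU cache in front of a 'main memory' dict."""
    def __init__(self, main_memory, capacity=4):
        self.main = main_memory
        self.capacity = capacity
        self.cache = OrderedDict()   # address -> data, oldest entry first
        self.hits = self.misses = 0

    def read(self, addr):
        if addr in self.cache:       # the CPU checks the cache first
            self.hits += 1
            self.cache.move_to_end(addr)
            return self.cache[addr]
        self.misses += 1             # miss: fall back to main memory
        data = self.main[addr]
        if len(self.cache) >= self.capacity:
            self.cache.popitem(last=False)  # delete old data to make room
        self.cache[addr] = data      # bring the new data into the cache
        return data

mem = {a: a * 10 for a in range(16)}
c = SimpleCache(mem, capacity=2)
c.read(0); c.read(1); c.read(0); c.read(2); c.read(1)
print(c.hits, c.misses)  # 1 4
```

The second read of address 0 hits in the cache; the re-read of address 1 misses because it was deleted to accommodate newer data, exactly the behaviour described in the text.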
Types of Cache –
 Primary Cache – A primary cache is always located on the processor chip. This cache
is small and its access time is comparable to that of processor registers.
 Secondary Cache – Secondary cache is placed between the primary cache and the rest
of the memory. It is referred to as the level 2 (L2) cache. Often, the Level 2 cache is
also housed on the processor chip.
Virtual Memory:
 Virtual memory is the separation of logical memory from physical memory. This
separation provides large virtual memory for programmers when only small physical
memory is available.
 Virtual memory is used to give programmers the illusion that they have a very large
memory even though the computer has a small main memory. It makes the task of
programming easier because the programmer no longer needs to worry about the
amount of physical memory available.
Difference Virtual & Cache memory:
 Virtual memory increases the capacity of main memory, while cache memory
increases the accessing speed of the CPU.
 Virtual memory is not a memory unit but a technique; cache memory is an actual
memory unit.
 The size of virtual memory is greater than that of cache memory.
 The Operating System manages virtual memory; hardware manages the cache memory.
 Virtual memory allows programs larger than main memory to be executed, while
cache memory holds copies of recently used data.
Memory Management Unit:
 Memory management is the functionality of an operating system which handles or
manages primary memory and moves processes back and forth between main memory
and disk during execution.
 Memory management keeps track of each and every memory location, regardless of
either it is allocated to some process or it is free. It checks how much memory is to be
allocated to processes. It decides which process will get memory at what time.
 It tracks whenever some memory gets freed or unallocated and correspondingly it
updates the status.
 The process address space is the set of logical addresses that a process references in its
code. For example, when 32-bit addressing is in use, addresses can range from 0 to
0x7fffffff; that is, 2^31 possible numbers, for a total theoretical size of 2 gigabytes.
 The operating system takes care of mapping the logical addresses to physical addresses
at the time of memory allocation to the program. There are three types of addresses
used in a program before and after memory is allocated −
 Symbolic addresses: The addresses used in a source code. The variable names,
constants, and instruction labels are the basic elements of the symbolic address
space.
 Relative addresses: At the time of compilation, a compiler converts symbolic
addresses into relative addresses.
 Physical addresses: The loader generates these addresses at the time when a
program is loaded into main memory.
 The set of all logical addresses generated by a program is referred to as a logical
address space. The set of all physical addresses corresponding to these logical
addresses is referred to as a physical address space.
 The runtime mapping from virtual to physical address is done by the memory
management unit (MMU) which is a hardware device. MMU uses following
mechanism to convert virtual address to physical address.
 The value in the base register is added to every address generated by a user process,
which is treated as offset at the time it is sent to memory. For example, if the base
register value is 10000, then an attempt by the user to use address location 100 will be
dynamically reallocated to location 10100.
 The user program deals with virtual addresses; it never sees the real physical addresses.
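The base-register example above can be written out directly (a sketch; a real MMU also uses a limit register, shown here as an assumed parameter, to enforce memory protection):

```python
def relocate(base, limit, logical_addr):
    """Map a logical address to a physical one via a base register."""
    if not (0 <= logical_addr < limit):
        raise MemoryError("logical address outside the process address space")
    return base + logical_addr   # offset added to the base register value

# The example from the text: base register 10000, user address 100.
print(relocate(base=10000, limit=4096, logical_addr=100))  # 10100
```

The user program only ever sees the logical address 100; the physical address 10100 is produced by the hardware at access time.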
 A computer’s memory management unit (MMU) is the physical hardware that handles
its virtual memory and caching operations. The MMU is usually located within the
computer’s central processing unit (CPU), but sometimes operates in a separate
integrated chip (IC). All data request inputs are sent to the MMU, which in turn
determines whether the data needs to be retrieved from RAM or ROM storage.
 A memory management unit is also known as a paged memory management unit.
 The memory management unit performs three major functions:
 Hardware memory management
 Operating system (OS) memory management
 Application memory management
 Hardware memory management deals with a system's RAM and cache memory, OS
memory management regulates resources among objects and data structures, and
application memory management allocates and optimizes memory among programs.
 The MMU also includes a section of memory that holds a table that matches virtual
addresses to physical addresses, called the translation lookaside buffer (TLB).
 MMUs mainly allow for the flexibility in a system of having a larger virtual memory
space within an actual smaller physical memory.
 An MMU can exist outside the master processor and is used to:
 Translate logical (virtual) addresses into physical addresses (memory mapping)
 Handle memory security (memory protection)
 Control the cache
 Handle bus arbitration between the CPU and memory
 Generate appropriate exceptions
 In the case of translated addresses, the MMU can use level 1 cache or portions of cache
allocated as buffers for caching address translations, commonly referred to as the
translation lookaside buffer (TLB), on the processor to store the mappings of logical
addresses to physical addresses.
 MMUs also must support the various schemes in translating addresses, mainly
segmentation, paging, or some combination of both schemes. In general, segmentation
is the division of logical memory into large variable size sections, whereas paging is
the dividing up of logical memory into smaller fixed size units. When both schemes are
implemented, logical memory is first divided into segments, and segments are then
divided into pages.
 The memory protection schemes then provide shared, read/write or read-only
accessibility to the various pages and/or segments. If a memory access is not defined or
allowed, an interrupt is typically triggered. An interrupt is also triggered if a page or
segment isn’t accessible during address translation—for example, in the case of a
paging scheme, or a page fault. At that point the interrupt would need to be handled
(e.g., the page or segment would have to be retrieved from secondary memory).
Paging
 A computer can address more memory than the amount physically installed on
the system. This extra memory is called virtual memory, and it is a section of a
hard disk that is set up to emulate the computer's RAM. The paging technique
plays an important role in implementing virtual memory.
 Paging is a memory management technique in which process address space is
broken into blocks of the same size called pages (size is power of 2, between
512 bytes and 8192 bytes). The size of the process is measured in the number
of pages.
 Similarly, main memory is divided into small fixed-sized blocks of (physical)
memory called frames and the size of a frame is kept the same as that of a page
to have optimum utilization of the main memory and to avoid external
fragmentation.
Address Translation:
 A page address is called a logical address and is represented by a page number and
an offset:
Logical Address = Page number + Page offset
 A frame address is called a physical address and is represented by a frame number
and an offset:
Physical Address = Frame number + Page offset
 A data structure called the page map table is used to keep track of the relation between
a page of a process and a frame in physical memory.
 When the system allocates a frame to a page, it translates the logical address into a
physical address and creates an entry in the page table, to be used throughout
execution of the program.
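Splitting a logical address into page number and offset, and substituting the frame number from the page map table, can be sketched as follows (a 4 KB page size and the table contents are assumptions for illustration; the page size is a power of two, as the text requires):

```python
PAGE_SIZE = 4096                      # assumed page size, a power of 2

def translate(logical_addr, page_table):
    page_number = logical_addr // PAGE_SIZE
    offset = logical_addr % PAGE_SIZE
    frame_number = page_table[page_number]   # page map table lookup
    return frame_number * PAGE_SIZE + offset

# Hypothetical page map table: page 0 -> frame 5, page 1 -> frame 2.
page_table = {0: 5, 1: 2}
print(hex(translate(0x1234, page_table)))  # 0x2234: page 1, offset 0x234, frame 2
```

Because the page size is a power of two, the split into page number and offset is just a division and a remainder (equivalently, a shift and a mask in hardware).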
 When a process is to be executed, its corresponding pages are loaded into any
available memory frames. Suppose you have a program of 8Kb but your memory can
accommodate only 5Kb at a given point in time, then the paging concept will come
into picture. When a computer runs out of RAM, the operating system (OS) will move
idle or unwanted pages of memory to secondary memory to free up RAM for other
processes and brings them back when needed by the program.
 This process continues during the whole execution of the program where the OS keeps
removing idle pages from the main memory and write them onto the secondary
memory and bring them back when required by the program.
Advantages and Disadvantages of Paging
 Paging reduces external fragmentation, but still suffers from internal fragmentation.
 Paging is simple to implement and is regarded as an efficient memory management
technique.
 Due to equal size of the pages and frames, swapping becomes very easy.
 Page table requires extra memory space, so may not be good for a system having
small RAM.
Demand Paging:
 A demand paging system is quite similar to a paging system with swapping where
processes reside in secondary memory and pages are loaded only on demand, not in
advance. When a context switch occurs, the operating system does not copy any of
the old program’s pages out to the disk or any of the new program’s pages into main
memory. Instead, it just begins executing the new program after loading the first page
and fetches that program’s pages as they are referenced.
 While executing a program, if the program references a page which is not available in
the main memory because it was swapped out a little earlier, the processor treats this
invalid memory reference as a page fault and transfers control from the program to
the operating system to demand the page back into the memory.
 Advantages
 Large virtual memory.
 More efficient use of memory.
 There is no limit on degree of multiprogramming.
 Disadvantages
 Number of tables and the amount of processor overhead for handling page
interrupts are greater than in the case of the simple paged management
techniques.
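Demand paging's load-on-first-reference behaviour can be sketched with a resident set and a page-fault counter (a simplification that assumes enough free frames, so no replacement policy is needed):

```python
def run_with_demand_paging(reference_string):
    """Count page faults when pages are loaded only on first reference."""
    resident = set()      # pages currently in main memory
    faults = 0
    for page in reference_string:
        if page not in resident:
            faults += 1   # page fault: the OS loads the page from disk
            resident.add(page)
    return faults

# Pages are faulted in on first touch, then reused from memory.
print(run_with_demand_paging([0, 1, 0, 2, 1, 3, 0]))  # 4
```

Only the first reference to each of pages 0, 1, 2 and 3 faults; every later reference is served from main memory, which is where the efficiency of demand paging comes from.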
SRAM: Static RAM
 Static Random Access Memory (SRAM) is a type of semiconductor memory. There
are two key features to SRAM - Static random Access Memory, and these set it out
against other types of memory that are available:
 The data is held statically: This means that the data is held in the semiconductor
memory without the need to be refreshed as long as the power is applied to the memory.
 SRAM memory is a form of random access memory: A random access memory is one
in which the locations in the semiconductor memory can be written to or read from in
any order, regardless of the last memory location that was accessed.
 The circuit for an individual SRAM memory cell typically comprises four transistors
configured as two cross-coupled inverters.
 In this format the circuit has two stable states, and these equate to the logical "0" and
"1" states. In addition to the four transistors in the basic memory cell, an additional
two transistors are required to control access to the memory cell during the read and
write operations. This makes a total of six transistors, giving what is termed a 6T
memory cell.
 Sometimes further transistors are used to give either 8T or 10T memory cells. These
additional transistors are used for functions such as implementing additional ports in a
register file, etc., for the SRAM memory.
 Although any three terminal switch device can be used in an SRAM, MOSFETs and in
particular CMOS technology is normally used to ensure that very low levels of power
consumption are achieved.
 The operation of the SRAM memory cell is relatively straightforward. When the cell is
selected, the value to be written is stored in the cross-coupled flip-flops. The cells are
arranged in a matrix, with each cell individually addressable. Most SRAM memories
select an entire row of cells at a time, and read out the contents of all the cells in the
row along the column lines.
 While it is not strictly necessary to have two bit lines carrying the signal and its
inverse, this is normal practice, as it improves the noise margins and the data integrity.
The two bit lines are passed to two input ports on a comparator so that the advantages
of the differential data mode can be exploited, and the small voltage swings that are
present can be more accurately detected.
 Access to the SRAM memory cell is enabled by the Word Line. This controls the two
access control transistors which control whether the cell should be connected to the bit
lines. These two lines are used to transfer data for both read and write operations.
DRAM: Dynamic RAM
 Dynamic RAM, or DRAM is a form of random access memory, RAM which is used in
many processor systems to provide the working memory.
 DRAM is widely used in digital electronics where low-cost and high-capacity memory
is required.
 Dynamic RAM, DRAM, is used where very high levels of memory density are needed,
although against this it is quite power hungry, so this needs to be considered if it is to
be used.
 DRAM, or dynamic random access memory stores each bit of data on a small capacitor
within the memory cell. The capacitor can be either charged or discharged and this
provides the two states, "1" or "0" for the cell.
 Since the charge within the capacitor leaks, it is necessary to refresh each memory cell
periodically. This refresh requirement gives rise to the term dynamic - static memories
do not have a need to be refreshed.
 The advantage of a DRAM is the simplicity of the cell - it only requires a single
transistor compared to around six in a typical static RAM, SRAM memory cell. In view
of its simplicity, the costs of DRAM are much lower than those for SRAM, and they
are able to provide much higher levels of memory density. However the DRAM has
disadvantages as well, and as a result, most computers use both DRAM technology and
SRAM, but in different areas.
 In view of the fact that power is required for the DRAM to maintain its data, it is what
is termed a volatile memory.
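The leak-and-refresh behaviour can be caricatured in a few lines: a cell's stored charge is only readable if the cell was refreshed within its retention window (the 64 ms figure and the model itself are illustrative assumptions, not real device parameters):

```python
RETENTION_MS = 64          # assumed retention time before the charge leaks away

class DramCell:
    def __init__(self, bit):
        self.bit = bit
        self.last_refresh = 0   # time (ms) of last refresh or write

    def refresh(self, now_ms):
        if now_ms - self.last_refresh <= RETENTION_MS:
            self.bit is not None  # (no-op check) value still present
            self.last_refresh = now_ms   # recharge the capacitor in time
        else:
            self.bit = None              # refreshed too late: data already lost

    def read(self, now_ms):
        # Readable only while the charge is within the retention window.
        return self.bit if now_ms - self.last_refresh <= RETENTION_MS else None

cell = DramCell(1)
cell.refresh(60)              # refreshed inside the retention window
print(cell.read(100))         # 1: data survives (100 - 60 <= 64)
print(DramCell(1).read(100))  # None: never refreshed, charge lost
```

This is exactly why DRAM is called dynamic and volatile: without periodic refresh within the retention window, the stored bit simply disappears.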
 Asynchronous DRAM
This was the first type of DRAM in use but was gradually replaced by synchronous
DRAM. This was called asynchronous because the memory access was not
synchronized with the system clock.
 Synchronous DRAM
This DRAM replaced asynchronous RAM and is used in most computer systems
today. In synchronous DRAM, the memory interface is synchronised with the system
clock. All the signals are processed on the rising edge of the clock.
 Graphics DRAM
There are many graphics related tasks that can be accomplished with both synchronous
and asynchronous DRAM. Some of the DRAM used for these tasks are Video DRAM,
Window DRAM, Multibank DRAM etc.
Difference : SRAM Vs. DRAM
 Both static and dynamic RAM are types of RAM but SRAM is formed using flip flops
and DRAM using capacitors.
 It is necessary that the data in DRAM is refreshed periodically to store it correctly. This
is not necessary for SRAM.
 SRAM is normally only used in Cache memory while DRAM is used in main memory.
 Static RAM is much faster and more expensive as compared to Dynamic RAM.
 Since SRAM is used as Cache memory, its size is 1MB to 16MB. On the other hand,
dynamic memory is larger as it is used as main memory. Its size is 4GB to 16GB in
computers and laptops.
 SRAM is usually present on processors or between processors and main memory.
DRAM is present on the motherboard.
ROM Memories:
 Mask ROM (MROM). Data bits are permanently programmed into a microchip by the
manufacturer of the external MROM chip. MROM designs are usually based upon
MOS (NMOS, CMOS) or bipolar transistor-based circuitry. This was the original type
of ROM design. Because of expensive setup costs for a manufacturer of MROMs, it is
usually only produced in high volumes and there is a wait time of several weeks to
several months. However, using MROMs in the design of products is a cheaper solution.
 One-Time Programmable ROM (OTP or OTPRom/PROM). This type of ROM can
only be programmed (permanently) one time as its name implies, but it can be
programmed outside the manufacturing factory, using a ROM burner. OTPs are based
upon bipolar transistors, in which the ROM burner burns out fuses of cells to program
them to “1” using high voltage/current pulses.
 Erasable Programmable ROM (EPROM). An EPROM can be erased more than one
time using a device that outputs intense short-wavelength, ultraviolet light into the
EPROM package’s built-in transparent window. (OTPs are one-time programmable
EPROMs without the window to allow for erasure; the packaging without the window
used in OTPs is cheaper)
 EPROMs are made up of MOS (i.e., CMOS, NMOS) transistors whose extra “floating
gate” (gate capacitance) is electrically charged, and the charge trapped, to store a “0”
by the Romizer through “avalanche induced migration”—a method in which a high
voltage is used to inject charge onto the floating gate.
 The floating gate is made up of a conductor floating within the insulator, which allows
enough of a current flow to allow for electrons to be trapped within the gate, with the
insulator of that gate preventing electron leakage.
 The floating gates are discharged via UV light, to store a “1” for instance. This is
because the high-energy photons emitted by UV light provide enough energy for
electrons to escape the insulating portion of the floating gate. The total number of
erasures and rewrites is limited depending on the EPROM.
 Electrically Erasable Programmable ROM (EEPROM). Like EPROM, EEPROMs
can be erased and reprogrammed more than once. The number of times erasure and
reuse occur depends on the EEPROMs.
 Unlike EPROMs, the content of EEPROM can be written and erased “in bytes” without
using any special devices. In other words, the EEPROM can stay on its residing board,
and the user can connect to the board interface to access and modify an EEPROM.
 EEPROMs are based upon NMOS transistor circuitry, except insulation of the floating
gate in an EEPROM is thinner than that of the EPROM, and the method used to charge
the floating gates is called the Fowler–Nordheim tunneling method (in which the
electrons are trapped by passing through the thinnest section of the insulating material).
 Erasing an EEPROM which has been programmed electrically is a matter of using a
high-reverse polarity voltage to release the trapped electrons within the floating gate.
 Electronically discharging an EEPROM can be tricky, though, in that any physical
defects in the transistor gates can result in an EEPROM not being discharged
completely before a new reprogram.
 EEPROMs typically have more erase/write cycles than EPROMs, but are also usually
more expensive. A cheaper and faster variation of the EEPROM is Flash memory.
Flash Memory:
 Flash memory storage is a form of non-volatile memory that was born out of a
combination of the traditional EPROM and E2PROM.
 In essence it uses the same method of programming as the standard EPROM and the
erasure method of the E2PROM.
 One of the main advantages that flash memory has when compared to EPROM is its
ability to be erased electrically. However it is not possible to erase each cell in a flash
memory individually unless a large amount of additional circuitry is added into the chip.
This would add significantly to the cost and accordingly most manufacturers dropped
this approach in favour of a system whereby the whole chip, or a large part of it is block
or flash erased - hence the name.
 Flash memory is able to provide high density memory because it requires only a few
components to make up each memory cell. In fact the structure of the memory cell is
very similar to the EPROM.
 Each Flash memory cell consists of the basic channel with the source and drain
electrodes separated by the channel about 1 µm long. Above the channel in the Flash
memory cell there is a floating gate which is separated from the channel by an
exceedingly thin oxide layer which is typically only 100 Å thick. It is the quality of this
layer which is crucial to the reliable operation of the memory.
 Above the floating gate there is the control gate. This is used to charge up the gate
capacitance during the write cycle.
 The Flash memory cell functions by storing charge on the floating gate. The presence
of charge will then determine whether the channel will conduct or not. During the read
cycle a "1" at the output corresponds to the channel being in its low resistance or ON
state.
 Programming the Flash memory cell is a little more complicated, and involves a process
known as hot-electron injection. When programming the control gate is connected to a
"programming voltage". The drain will then see a voltage of around half this value
while the source is at ground. The voltage on the control gate is coupled to the floating
gate through the dielectric, raising the floating gate to the programming voltage and
inverting the channel underneath. This results in the channel electrons having a higher
drift velocity and increased kinetic energy.
 Collisions between the energetic electrons and the crystal lattice dissipate heat which
raises the temperature of the silicon. At the programming voltage it is found that the
electrons cannot transfer their kinetic energy to the surrounding atoms fast enough and
they become "hotter" and scatter further afield, many towards the oxide layer. These
electrons gain the 3.1 eV (electron volts) needed to overcome the barrier and they
accumulate on the floating gate. As there is no way of escape they remain there until
they are removed by an erase cycle.
 The erase cycle for Flash memory uses a process called Fowler-Nordheim tunnelling.
The process is initiated by routing the programming voltage to the source, grounding
the control gate and leaving the drain floating. In this condition electrons are attracted
towards the source and they tunnel off the floating gate, passing through the thin oxide
layer. This leaves the floating gate devoid of charge.
I/O devices
5.1 I/O Devices
5.1.1 Timers and Counters.
5.1.2 Watchdog Timers.
5.1.3 Interrupt Controllers.
5.1.4 DMA Controllers.
5.1.5 A/D and D/A Converters.
5.1.6 Displays.
5.1.7 Keyboards.
5.1.8 Infrared devices.
5.2 Component Interfacing.
5.2.1 Memory Interfacing.
5.2.2 I/O Device Interfacing.
5.3 Interfacing Protocols
5.3.1 GPIB.
5.3.2 FIREWIRE
5.3.3 USB
5.3.4 IRDA
Timers/Counters:
 Counter/timer hardware is a crucial component of most embedded systems. In some
cases a timer is needed to measure elapsed time; in others we want to count or time
some external events.
 The names counter and timer can be used interchangeably when talking about the
hardware. The difference in terminology has more to do with how the hardware is
used in a given application.
 Digital timer/counters are used throughout embedded designs to provide a series of
time or count related events within the system with the minimum of processor and
software overhead.
 Most embedded systems have a time component within them such as timing
references for control sequences, to provide system ticks for operating systems and
even the generation of waveforms for serial port baud rate generation and audible
tones.
 They are available in several different types but are essentially based around a simple
structure as shown.
 The central timing is derived from a clock input. This clock may be internal to the
timer/counter or be external and thus connected via a separate pin.
 The clock may be divided using a simple divider which can provide limited division
normally based on a power of two or through a pre-scalar which effectively scales
down or divides the clock by the value that is written into the pre-scalar register.
 The divided clock is then passed to a counter which is normally configured in a count-
down operation, i.e. it is loaded with a preset value which is clocked down towards
zero. When a zero count is reached, this causes an event to occur such as an interrupt
or an external line changing state.
 The final block is loosely described as an I/O control block but can be more
sophisticated than that. It generates interrupts and can control the counter based on
external signals which can gate the count-down and provide additional control. This
expands the functionality that the timer can offer.
 Auto Reload: A timer with automatic reload capability will have a latch register to hold
the count written by the processor. When the processor writes to the latch, the count
register is written as well. When the timer later overflows, it first generates an output
signal. Then, it automatically reloads the contents of the latch into the count register.
Since the latch still holds the value written by the processor, the counter will begin
counting again from the same initial value.
 Such a timer will produce a regular output with the same accuracy as the input clock.
This output could be used to generate a periodic interrupt like a real-time operating
system (RTOS) timer tick, provide a baud rate clock to a UART, or drive any device
that requires a regular pulse.
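The prescaler and auto-reload arithmetic described above can be sketched in C. The function below computes the reload value a processor would write to the latch for a desired periodic tick rate; the register names and the specific clock, prescaler and tick values used in the example are illustrative assumptions, not taken from any particular timer peripheral.

```c
#include <stdint.h>

/* Given the timer's input clock, a prescaler division and a desired tick
 * rate, compute the value to load into the auto-reload latch.  The
 * divided clock feeds the down-counter, which reloads this value every
 * time it reaches zero, producing a regular periodic output. */
uint32_t reload_value(uint32_t clock_hz, uint32_t prescaler, uint32_t tick_hz)
{
    uint32_t counter_clock = clock_hz / prescaler; /* clock after division */
    return counter_clock / tick_hz;                /* counts per tick      */
}
```

For example, a 16 MHz clock with a /16 prescaler and a 1 kHz RTOS tick would give a reload value of 1000.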
Watchdog Timers:
 In a small embedded device, it is easy to find the exact cause of a bug, but not in a
complex embedded system; even a perfectly designed and tested embedded system
running well-tested code can fail in the field due to a small bug.
 Most embedded systems need to be self-reliant. It's not usually possible to wait for
someone to reboot them if the software hangs. Some embedded designs, such as space
probes, are simply not accessible to human operators. If their software ever hangs, such
systems are permanently disabled. In other cases, the speed with which a human
operator might reset the system would be too slow to meet the uptime requirements of
the product.
 A watchdog timer (WDT) is a safety mechanism that brings the system back to life
when it crashes.
 A watchdog timer is a piece of hardware that can be used to automatically detect
software anomalies and reset the processor if any occur. Generally speaking, a
watchdog timer is based on a counter that counts down from some initial value to zero.
The embedded software selects the counter's initial value and periodically restarts it. If
the counter ever reaches zero before the software restarts it, the software is presumed
to be malfunctioning and the processor's reset signal is asserted. The processor (and the
embedded software it's running) will be restarted as if a human operator had cycled the
power.
 WDT is a hardware that contains a timing device and clock source(besides system
clock).
 A timing device is a free-running timer, which is set to a certain value that gets
decremented continuously. When the value reaches zero, a short pulse is generated by
WDT circuitry that resets and restarts the system.
 In short, WDT constantly watches the execution of the code and resets the system if
software is hung or no longer executing the correct sequence of the code. Reloading of
WDT value by the software is called kicking the watchdog.
 These are of two types :
 External WDT
 Internal WDT
 As shown, the watchdog timer is a chip external to the processor. However, it could
also be included within the same chip as the CPU. This is done in many
microcontrollers. In either case, the output from the watchdog timer is tied directly to
the processor's reset signal.
 The process of restarting the watchdog timer's counter is sometimes called "kicking the
dog." The appropriate visual metaphor is that of a man being attacked by a vicious dog.
If he keeps kicking the dog, it can't ever bite him. But he must keep kicking the dog at
regular intervals to avoid a bite. Similarly, the software must restart the watchdog timer
at a regular rate, or risk being restarted.
 Software Problems:
 A watchdog timer can get a system out of a lot of dangerous situations.
However, if it is to be effective, resetting the watchdog timer must be considered
within the overall software design. Designers must know what kinds of things
could go wrong with their software, and ensure that the watchdog timer will
detect them, if any occur.
 Systems hang for any number of reasons. A logic error resulting in the
execution of an infinite loop is the simplest.
 Another possibility is that an unusual number of interrupts arrives during one
pass of the loop. Any extra time spent in ISRs is time not spent executing the
main loop. A dangerous delay in feeding the motor new control instructions
could result.
 When multitasking kernels are used, deadlocks can occur. For example, a group
of tasks might get stuck waiting on each other and some external signal that one
of them needs, leaving the whole set of tasks hung indefinitely.
 If such faults are transient, the system may function perfectly for some length
of time after each watchdog-induced reset. However, failed hardware could lead
to a system that constantly resets. For this reason it may be wise to count the
number of watchdog-induced resets, and give up trying after some fixed number
of failures.
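The behaviour described above can be modelled as a simple down-counter that asserts reset when it reaches zero unless the software reloads ("kicks") it first. This is a purely illustrative sketch, not any particular vendor's WDT; a real watchdog is a hardware peripheral with its own clock source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Minimal software model of a watchdog timer. */
typedef struct {
    uint32_t reload; /* value written on each kick */
    uint32_t count;  /* current down-counter value */
} watchdog_t;

/* "Kicking the dog": reload the counter before it expires. */
void wdt_kick(watchdog_t *w) { w->count = w->reload; }

/* Advance the watchdog by one clock; returns true when it fires (reset). */
bool wdt_tick(watchdog_t *w)
{
    if (w->count == 0) return true;
    return --w->count == 0;
}
```

As long as `wdt_kick()` is called from the main loop more often than `reload` ticks apart, the counter never reaches zero and the system is never reset.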
Interrupt Controller:
 In a typical SoC, there could be hundreds of peripherals – interrupt sources that may
vie for processor attention. It is impossible to route all these signals to the CPU. So
there is another special purpose peripheral called Interrupt Controller to which all the
peripheral interrupts signals are connected.
 The controller is in turn connected to the CPU with one or few lines of signals, with
which it can interrupt the CPU. Controller is provided with various registers to
mask/unmask the interrupts, read pending/raw status, set priority etc.
 The controller monitors the signals from peripherals and if there are any active
interrupts, based on the preconfigured priority decides the source to be processed first.
Then it signals the CPU with interrupt. The CPU, in the ISR, can then read the pending
interrupt register and process the same.
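The priority decision the controller makes can be sketched as a pure function over hypothetical pending and mask registers. The register layout and the convention that bit 0 is the highest priority are assumptions for illustration, not those of a specific controller.

```c
#include <stdint.h>

/* Return the number of the highest-priority pending, unmasked interrupt
 * (bit 0 = highest priority here), or -1 if none is active. */
int highest_pending(uint32_t pending, uint32_t mask)
{
    uint32_t active = pending & ~mask;  /* masked interrupts are ignored */
    for (int irq = 0; irq < 32; irq++)
        if (active & (1u << irq))
            return irq;                 /* first (highest-priority) bit  */
    return -1;                          /* nothing to service            */
}
```

In the ISR, the CPU would read the pending register, call a routine like this to pick the source, service it, and then clear the corresponding pending bit.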
 Interrupt Mechanism:
 The first part of the sequence is the recognition of the interrupt or exception. This in
itself does not necessarily immediately trigger any processor reaction. If the interrupt
is not an error condition or the error condition is not connected with the currently
executing instruction, the interrupt will not be internally processed until the currently
executing instruction has completed.
 At this point, known as an instruction boundary, the processor will start to internally
process the interrupt. If, on the other hand, the interrupt is due to an error with the
currently executing instruction, the instruction will be aborted to reach the instruction
boundary.
 At the instruction boundary, the processor must now save certain state information to
allow it to continue its previous execution path prior to the interrupt.
 This will typically include a copy of the condition code register, the program counter
and the return address.
 The next phase is to get the location of the ISR to service the interrupt. This is normally
kept in a vector table somewhere in memory and the appropriate vector can be supplied
by the peripheral or pre-assigned, or a combination of both approaches.
 Once the vector has been identified, the processor starts to execute the code within the
ISR until it reaches a return from interrupt type of instruction. At this point, the
processor, reloads the status information and processing continues the previous
instruction stream.
 Interrupt controllers
 In many embedded systems there are more external sources for interrupts than interrupt
pins on the processor. In this case, it is necessary to use an interrupt controller to provide
a larger number of interrupt signals. An interrupt controller performs several functions:
 It provides a large number of interrupt pins that can be allocated to many external
devices. Typically this is at least eight and higher numbers can be supported by
cascading two or more controllers together. This is done on the IBM PC AT where two
8 port controllers are cascaded to give 15 interrupt levels.
 It orders the interrupt pins in a priority level so that a high level interrupt will inhibit a
lower level interrupt.
 It may provide registers for each interrupt pin which contain the vector number to be
used during an acknowledge cycle. This allows peripherals that do not have the ability
to provide a vector to do so.
 It can provide interrupt masking. This allows the system software to decide when
and if an interrupt is allowed to be serviced. The controller, through the use of masking
bits within a controller, can prevent an interrupt request from being passed through to
the processor. In this way, the system has a multi-level approach to screening interrupts.
It uses the screening provided by the processor to provide coarse grain granularity while
the interrupt controller provides a finer level.
 Interrupt latency
 This is usually defined as the time taken by the processor from recognition of the
interrupt to the start of the ISR. It consists of several stages and is dependent on both
hardware and software factors. Its importance is that it defines several aspects of an
embedded system with reference to its ability to respond to real-time events. The stages
involved in calculating a latency are:
 The time taken to recognise the interrupt. Do not assume that this is instantaneous, as
it will depend on the processor design and its own interrupt recognition mechanism. As
previously mentioned, some processors will repeatedly sample an interrupt signal to
ensure that it is a real one and not a false one.
 The time taken by the CPU to complete the current instruction. This will also vary
depending on what the CPU is doing and its complexity. For a simple CISC processor,
this time will vary as its instructions all take a different number of clocks to complete.
Usually the most time-consuming instructions are those that perform multiplication or
division or some complex data manipulation such as bit field operations. For RISC
processors with single cycle execution, the time is usually that to clear the execution
pipeline and is 1 or 2 clocks.
 The time for the CPU to perform a context switch. This is the time taken by the
processor to save its internal context information such as its program counter, internal
data registers and anything else it needs. For CISC processors, this can involve creating
blocks of data on the stack by writing the information externally. For RISC processors
this may mean simply switching registers internally without explicitly saving any
information. Register windowing or shadowing is normally used.
 The time taken to fetch the interrupt vector. This is normally the time to fetch a single
value from memory but even this time can be longer than you think! We will come back
to this topic.
 The time taken to start the interrupt service routine execution. This is typically very
short. However, remember that because the pipeline is cleared, the instruction will need to be
clocked through to execute it and this can take a few extra clocks, even with a RISC
architecture.
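Putting the stages together, a worst-case latency budget is simply their sum. The cycle counts below are invented for illustration; real numbers come from the processor's datasheet.

```c
/* Back-of-the-envelope worst-case interrupt latency, in clock cycles,
 * as the sum of the stages described above (all values hypothetical). */
#define RECOGNITION_CYCLES   2  /* sampling/synchronising the request */
#define LONGEST_INSTR_CYCLES 40 /* e.g. a CISC division instruction   */
#define CONTEXT_SAVE_CYCLES  12 /* stacking PC, status, registers     */
#define VECTOR_FETCH_CYCLES  4  /* one (possibly slow) memory read    */
#define ISR_START_CYCLES     2  /* refilling the pipeline             */

enum {
    WORST_CASE_LATENCY = RECOGNITION_CYCLES + LONGEST_INSTR_CYCLES
                       + CONTEXT_SAVE_CYCLES + VECTOR_FETCH_CYCLES
                       + ISR_START_CYCLES
};
```

With these example numbers the budget is 60 cycles; at a 10 MHz clock that would be 6 µs of worst-case latency.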
DMA Controller:
 Direct memory access (DMA) controllers are frequently an elegant hardware solution
to a recurring software/system problem of providing an efficient method of transferring
data from a peripheral to memory.
 In systems without DMA, the solution is to use the processor to either regularly poll the
peripheral to see if it needs servicing or to wait for an interrupt to do so.
 Interrupts are a far better solution. An interrupt is sent from the peripheral to the
processor to request servicing. In many cases, all that is needed is to simply empty or
load a buffer. This solution starts becoming an issue as the servicing rate increases.
 With high speed ports, the cost of interrupting the processor can be higher than the
couple of instructions that it executes to empty a buffer. In these cases, the limiting
factor for the data transfer is the time to recognise, process and return from the interrupt.
If the data needs to be processed on a byte by byte basis in real-time, this may have to
be tolerated but with high speed transfers this is often not the case as the data is treated
in packets.
 This is where the DMA controller comes into its own. It is a device that can initiate and
control bus accesses between I/O devices and memory, and between two memory areas.
With this type of facility, the DMA controller acts as a hardware implementation of the
low-level buffer filling or emptying interrupt routine.
A generic controller consists of several components which control the operation:
 Address generator: This is the most important part of a DMA controller and typically
consists of a base address register and an auto-incrementing counter which increments
the address after every transfer. The generated addresses are used within the actual bus
transfers to access memory and/or peripherals. When a predefined number of bytes have
been transferred, the base address is reloaded and the count cleared to zero ready to
repeat the operation.
 Address bus: This is where the address created by the address generator is used to
access a specific memory location or peripheral.
 Data bus: This is the data bus that is used to transfer data from the DMA controller to
the destination location. In some cases, the data transfer may be made direct from the
peripheral to the memory with the DMA controller directly selecting the peripheral.
 Bus requester: This is used to request the bus from the main CPU.
 Local peripheral control: This allows the DMA controller to select the peripheral and
get it to accept or provide data directly or for a peripheral to request a data transfer,
depending on the DMA controller’s design.
 Interrupt signals: Most DMA controllers can interrupt the processor when the data
transfers are complete or if an error has occurred. This prompts the processor to either
reprogram the DMA controller for a different transfer or acts as a signal that a new
batch of data has been transferred and is ready for processing.
DMA Controller: operation
 Program the controller: Prior to using the DMA controller, it must be configured with
parameters that define the addressing such as base address and byte count that will be
used to transfer the data. In addition, the device will be configured in terms of its
communication with the processor and peripheral. Processor communication will
normally include defining the conditions that will generate an interrupt. The peripheral
communication may include defining which request pin is used by the peripheral and
any arbitration mechanism that is used to reconcile simultaneous requests for DMA
from two or more peripherals. The final part of this process is to define how the
controller will transfer blocks of data: all at once or individually or some other
combination.
 Start a transfer: A DMA transfer is normally initiated in response to a peripheral request
to start a transfer. It usually assumes that the controller has been correctly configured
to support this request. With a peripheral and processor, the processor will normally
request a service by asserting an interrupt pin which is connected to the processor’s
interrupt input(s). With a DMA controller, this peripheral interrupt signal can be used
to directly initiate a transfer or if it is left attached to the processor, the interrupt service
routine can start the DMA transfers by writing to the controller.
 Request the bus: The next stage is to request the bus from the processor. With most
modern processors supporting bus arbitration directly, the DMA controller issues a bus
request signal to the processor which will release the bus when convenient and allow the
DMA controller to proceed. Without this support, the DMA controller has to cycle steal
from the processor so that it is held off the bus while the DMA controller uses it. As
will be described later on in this chapter, most DMA controllers provide some
flexibility concerning how they use and compete with bus bandwidth with the
processor and other bus masters.
 Issue the address: Assuming the controller has the bus, it will then issue the address to
activate the target memory location. A variety of interfaces are used — usually
dependent on the number of pins that are available and include both non-multiplexed
and multiplexed buses. In addition, the controller provides other signals such as
read/write and strobe signals that can be used to work with the bus. DMA controllers
tend to be designed for a specific processor family bus but most recent devices are also
generic enough to be used with nearly any bus.
 Transfer the data: The data is transferred either from a holding buffer within the DMA
controller or directly from a peripheral.
 Update address generator: Once the data transfer has been completed, the address
generator uses the completion to calculate the address for the next transfer and update
the byte/transfer counters.
 Update processor: Depending on how the DMA controller has been programmed it can
notify the processor using interrupts of events within the transfer process such as an
address error or the completion of a data or block transfer.
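The address-generator behaviour in the sequence above can be modelled as an auto-incrementing counter that reloads the base address once a block of transfers completes. The field names are illustrative, not taken from a real controller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Model of a DMA address generator with auto-reload. */
typedef struct {
    uint32_t base;      /* base address register          */
    uint32_t addr;      /* address used for next transfer */
    uint32_t count;     /* transfers done in this block   */
    uint32_t block_len; /* transfers per block            */
} dma_addrgen_t;

/* One transfer step; returns true when a block completes (the point at
 * which a completion interrupt would be raised). */
bool dma_step(dma_addrgen_t *d)
{
    d->addr++;                        /* advance to next location */
    if (++d->count == d->block_len) {
        d->addr  = d->base;           /* reload for the next block */
        d->count = 0;
        return true;
    }
    return false;
}
```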
 For devices with caches and/or internal memory, their external bus bandwidth
requirements are a lot lower and thus the DMA controller can use bus cycles without
impeding the processor’s performance. This last statement depends on the chances of
the DMA controller using the bus at the same time as the processor. This in turn depends
on the frequency and size of the DMA transfers. To provide some form of flexibility
for the designer so that a suitable trade-off can be made, most DMA controllers support
different types of bus utilisation.
 Single transfer: Here the bus is returned back to the processor after every transfer so
that the longest delay it will suffer in getting access to memory will be a bus cycle.
 Block transfer: Here the bus is returned back to the processor after the complete block
has been sent so that the longest delay the processor will suffer will be the time of a bus
cycle multiplied by the number of transfers to move the block. In effect, the DMA
controller has priority over the CPU in using the bus.
 Demand transfer: In this case, the DMA controller will hold the bus for as long as an
external device requests it to do so. While the bus is held, the DMA controller is at
liberty to transfer data as and when needed. In this respect, there may be gaps when the
bus is retained but no data is transferred.
A/D Converter:
 Multiplexer: The device contains an 8-channel single-ended analog signal multiplexer.
A particular input channel is selected by using the address decoder. The address is
latched into the decoder on the low-to-high transition of the address latch enable signal.
 The Converter: The heart of this single chip data acquisition system is its 8-bit analog-
to-digital converter. The converter is designed to give fast, accurate, and repeatable
conversions over a wide range of temperatures. The converter is partitioned into 3 major
sections: the 256R ladder network, the successive approximation register, and the
comparator. The converter’s digital outputs are positive true.
 The 256R ladder network approach (Figure 1) was chosen over the conventional R/2R
ladder because of its inherent monotonicity, which guarantees no missing digital codes.
Monotonicity is particularly important in closed loop feedback control systems. A non-
monotonic relationship can cause oscillations that will be catastrophic for the system.
Additionally, the 256R network does not cause load variations on the reference voltage.
 The successive approximation register (SAR) performs 8 iterations to approximate the
input voltage. For any SAR type converter, n-iterations are required for an n-bit
converter. Figure 2 shows a typical example of a 3-bit converter. In the ADC0808,
ADC0809, the approximation technique is extended to 8 bits using the 256R network.
 The A/D converter’s successive approximation register (SAR) is reset on the positive
edge of the start conversion (SC) pulse. The conversion is begun on the falling edge of
the start conversion pulse. A conversion in process will be interrupted by receipt of a
new start conversion pulse. Continuous conversion may be accomplished by tying the
end-of-conversion (EOC) output to the SC input. If used in this mode, an external start
conversion pulse should be applied after power up. End-of-conversion will go low
between 0 and 8 clock pulses after the rising edge of start conversion.
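The eight SAR iterations can be simulated in software. The millivolt integer arithmetic and the ladder model `code * vref / 256` below are simplifications of the real 256R network, used only to show how each bit is tried from MSB to LSB and kept or dropped by the comparator.

```c
#include <stdint.h>

/* Software model of 8-bit successive approximation.
 * vin_mv / vref_mv are the input and reference voltages in millivolts. */
uint8_t sar_convert(uint32_t vin_mv, uint32_t vref_mv)
{
    uint8_t code = 0;
    for (int bit = 7; bit >= 0; bit--) {
        uint8_t trial = code | (uint8_t)(1u << bit); /* try this bit set */
        uint32_t ladder_mv = trial * vref_mv / 256;  /* ladder output    */
        if (vin_mv >= ladder_mv)                     /* comparator says  */
            code = trial;                            /* keep the bit     */
    }
    return code;
}
```

Note that n iterations yield an n-bit result, matching the statement above that any SAR converter needs n iterations for n bits.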
D/A Converter:
 Input Circuit: It receives the binary digital inputs and performs any conditioning or
filtering required. It does not play a vital role in the overall circuit.
 Voltage Switching Circuit: It switches each input bit between the reference voltage
source and ground, and passes the result to the main resistive network.
 Resistive Network: It is the main part of the digital to analog converter circuit. It
weights and sums the digital inputs before the amplifier stage. There are two types of
DAC according to the resistive network: the weighted resistor network and the R-2R
network. The R-2R network has more advantages than the weighted resistor network.
 Amplifier: Generally, a differential or operational amplifier is used in the DAC system.
It not only amplifies the signal but can also process it, for example by performing
summation or differencing.
Digital to Analog Converter(DAC) Operation:
As noted above, there are two types of DAC. The weighted resistor network generates an
analog signal proportional to the input digital code and works with an inverting adder
circuit. Its main disadvantage is that the spread between the resistance values for the
LSB and the MSB grows as the number of input bits increases.
On the other hand, the R-2R network has several advantages over the weighted resistor
network. It likewise generates an analog signal proportional to the input binary code.
Its main advantage is that it uses only two resistor values, R and 2R, so it is easy to
design and to select the resistors. Also, increasing the number of bits in the input
signal simply means adding more R-2R sections, with no growth in the spread of
resistor values.
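The ideal transfer function of an n-bit R-2R ladder DAC is simply the reference voltage scaled by the input code over 2^n. A minimal sketch in integer millivolts (output-amplifier gain ignored):

```c
#include <stdint.h>

/* Ideal n-bit DAC output: Vout = Vref * code / 2^n, in millivolts. */
uint32_t dac_out_mv(uint32_t code, uint32_t vref_mv, unsigned bits)
{
    return (code * vref_mv) / (1u << bits);
}
```

For an 8-bit DAC with a 5 V reference, code 128 gives 2500 mV (half scale) and the full-scale code 255 gives just under the reference, as expected.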
I/O Devices: Keyboard:
 A keyboard is basically an array of switches, but it may include some internal logic to
help simplify the interface to the microprocessor. In this chapter, we build our
understanding from a single switch to a microprocessor-controlled keyboard.
 A switch uses a mechanical contact to make or break an electrical circuit. The major
problem with mechanical switches is that they bounce as shown in Figure.
 When the switch is depressed by pressing on the button attached to the switch’s arm,
the force of the depression causes the contacts to bounce several times until they settle
down. If this is not corrected, it will appear that the switch has been pressed several
times, giving false inputs.
 A hardware debouncing circuit can be built using a one-shot timer. Software can also
be used to debounce switch inputs. A raw keyboard can be assembled from several
switches. Each switch in a raw keyboard has its own pair of terminals, making raw
keyboards impractical when a large number of keys is required.
 More expensive keyboards, such as those used in PCs, actually contain a
microprocessor to preprocess button inputs. PC keyboards typically use a 4-bit
microprocessor to provide the interface between the keys and the computer.
 The microprocessor can provide debouncing, but it also provides other functions as
well. An encoded keyboard uses some code to represent which switch is currently
being depressed. At the heart of the encoded keyboard is the scanned array of switches
shown in Figure.
Memory and IO Interfacing:
 Memory Interfacing
 When we are executing any instruction, we need the microprocessor to access
the memory for reading instruction codes and the data stored in the memory.
For this, both the memory and the microprocessor require some signals to read
from and write to registers.
 The interfacing process includes some key factors to match with the memory
requirements and microprocessor signals. The interfacing circuit therefore
should be designed in such a way that it matches the memory signal
requirements with the signals of the microprocessor.
 IO Interfacing
 There are various communication devices like the keyboard, mouse, printer, etc.
So, we need to interface the keyboard and other devices with the microprocessor
by using latches and buffers. This type of interfacing is known as I/O
interfacing.
GPIB: General Purpose Interface Bus
 The GPIB or IEEE 488 bus is a very flexible system, allowing data to flow between
any of the instruments on the bus, at a speed suitable for the slowest active instrument.
Up to fifteen instruments may be connected together with a maximum bus length not
exceeding 20 m.
 A further requirement for the bus is that there must also be no more than 2 m between
two adjacent test instruments.
 Devices have a unique address on the bus. Test instruments are allocated addresses in
the range 0 to 30, and no two instruments on the same bus are allowed to have the same
address. The addresses on the instruments can be changed and this may typically be
done via the front panel, or by using switches often located on the rear panel.
 Active extenders are available and these items allow longer buses: up to 31 devices
theoretically possible, along with a greater overall length dependent upon the extender.
 Within IEEE 488, the equipment on the bus falls into three categories, although items
can fulfil more than one function:
 Controller: As the name suggests, the controller is the entity that controls the
operation of the bus. It is usually a computer and it signals that instruments are to
perform the various functions. The GPIB controller also ensures that no conflicts occur
on the bus. If two talkers tried to talk at the same time then data would become corrupted
and the operation of the whole system would be seriously impaired. It is possible for
multiple controllers to share the same bus; but only one can act as a controller at any
particular time.
 Listener: A listener is an entity connected to the bus that accepts instructions from the
bus. An example of a listener is an item such as a printer that only accepts data from
the bus. It could also be a test instrument such as a power supply or switching matrix
that does not take measurements.
 Talker: This is an entity on the bus that issues instructions / data onto the bus. Many
items of test equipment will fulfil more than one function. For example a voltmeter
which is controlled over the bus will act as a listener when it is being set up, and then
when it is returning the data, it will act as a talker. As such it is known as a talker /
listener.

 Advantages
 Simple & standard hardware interface
 Interface present on many bench instruments
 Rugged connectors and cables (although insulation-displacement cables
appear occasionally).
 Possible to connect multiple instruments to a single controller
 Disadvantages
 Bulky connectors
 Cable reliability poor - often as a result of the bulky cables.
 Low bandwidth - slow compared to more modern interfaces
 Basic IEEE 488 does not mandate a command language (SCPI is used in later
implementations but is not included on all instruments).
Features GPIB:

PARAMETER                            DETAILS
Max length of bus                    20 metres
Max distance between instruments     2 metres average; 4 metres maximum in any instance
Maximum number of instruments        14 plus controller, i.e. 15 devices in total, with
                                     at least two-thirds of the devices powered on
Data bus width                       8 lines
Handshake lines                      3
Bus management lines                 5
Connector                            24-pin Amphenol (typical); D-type occasionally used
Max data rate                        ~1 Mbyte/s (HS-488 allows up to ~8 Mbyte/s)

Firewire:
 IEEE 1394 is an interface standard for a serial bus for high-speed communications
and isochronous real-time data transfer. It was developed in the late 1980s and early
1990s by Apple in cooperation with a number of companies,
primarily Sony and Panasonic.
 Apple called the interface FireWire. It is also known by the brands i.LINK (Sony),
and Lynx (Texas Instruments).
 The copper cable used in its most common implementation can be up to 4.5 metres
(15 ft) long.
 Power and data are carried over this cable, allowing devices with moderate power
requirements to operate without a separate power supply. FireWire is also available
in Cat 5 and optical fiber versions.
 The 1394 interface is comparable to USB. USB was developed subsequently and
gained much greater market share.
 USB requires a master controller whereas IEEE 1394 is cooperatively managed by the
connected devices.
 The process of the bus deciding which node gets to transmit data at what time is known
as arbitration. Each arbitration round lasts about 125 microseconds. During the round,
the root node (device nearest the processor) sends a cycle start packet. All nodes
requiring data transfer respond, with the closest node winning. After the node is
finished, the remaining nodes take turns in order. This repeats until all the devices have
used their portion of the 125 microseconds, with isochronous transfers having priority.

 Firewire: Specifications/features
 FireWire can connect up to 63 peripherals in a tree or daisy-chain topology.
 It allows peer-to-peer device communication — such as communication between a
scanner and a printer — to take place without using system memory or the CPU.
 FireWire also supports multiple hosts per bus.
 It is designed to support plug and play and hot swapping.
 The copper cable it uses in its most common implementation can be up to 4.5 metres
(15 ft) long and is more flexible than most parallel SCSI cables.
 In its six-conductor or nine-conductor variations, it can supply up to 45 watts of power
per port at up to 30 volts, allowing moderate-consumption devices to operate without a
separate power supply.
 FireWire is capable of both asynchronous and isochronous transfer methods at once.
 Firewire Application:
 Consumer automobiles: IDB-1394 Customer Convenience Port (CCP) was the
automotive version of the 1394 standard.
 Consumer audio and video: IEEE 1394 was the High-Definition Audio-Video Network
Alliance (HANA) standard connection interface for A/V (audio/visual) component
communication and control.
 Military and aerospace vehicles: SAE Aerospace standard AS5643 originally released
in 2004 and reaffirmed in 2013 establishes IEEE-1394 standards as a military and
aerospace databus network in those vehicles.
 General networking: FireWire can be used for ad hoc (terminals only, no routers except
where a FireWire hub is used) computer networks. Specifically, RFC 2734 specifies
how to run IPv4 over the FireWire interface, and RFC 3146 specifies how to run IPv6.
 IIDC: IIDC (Instrumentation & Industrial Digital Camera) is the FireWire data format
standard for live video, and is used by Apple's iSight A/V camera.
 DV: Digital Video (DV) is a standard protocol used by some digital camcorders. All
DV cameras that recorded to tape media had a FireWire interface.
USB:
 Universal Serial Bus (USB) communications has greatly advanced the versatility of
PCs and embedded systems with external peripherals. USB provides the next-
generation interface for all devices that at one time communicated through the serial
port, parallel port, or utilized a custom connector interface to the host computer system.
 USB, Universal Serial Bus is one of the most common interfaces for connecting a
variety of peripherals to computers and providing relatively local and small levels of
data transfer.
 USB interfaces are found on everything from personal computers and laptops, to
peripheral devices, mobile phones, cameras, flash memory sticks, back up hard-drives
and very many other devices. Its combination of convenience and performance has
meant that it is now one of the most widely used computer interfaces.
 The Universal Serial Bus, USB provides a very simple and effective means of providing
connectivity, and as a result it is very widely used.
 Whilst USB provides a sufficiently fast serial data transfer mechanism for data
communications, it is also possible to obtain power through the connector making it
possible to power small devices via the connector and this makes it even more
convenient to use, especially ‘on-the-go.’
 USB is an asynchronous serial interconnect between a host and up to a total of 127
devices.
 It utilizes a four wire shielded cable with a length of up to 5m between a host and a
device or hub.
 Two of the wires are power and ground which are sourced by the host or a hub providing
limited supply power for attached devices.
 All USB information is transmitted on the other two wires. These wires are a twisted
pair and the USB data is transmitted as a differential signal.
 USB 1.0 was released in January 1996. In September 1998, USB 1.1 was released
which added an additional method of transferring data between the host and device.
USB 1.0 and 1.1 supported two data transfer speeds; a low speed of 1.5 Mbps and a full
speed of 12 Mbps.
 USB 2.0 was released in April 2000 and added a faster data transfer speed of 480 Mbps
while retaining all the characteristics of the original USB 1.1. This introduction of the
faster transfer speed allowed devices such as external hard drives, video transport,
scanners, and the newer printers that require high data throughput to utilize USB
communications.
 USB utilizes a tiered star topology in which each device is connected to hubs which
may be connected to other hubs or directly connected to the host which is the PC or
primary embedded system.
 The host controls all communications on a USB bus and only one device at a time
communicates with the host. This is referred to as a "speak when spoken to"
communications protocol.
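The host-driven, "speak when spoken to" model can be sketched as a simple polling loop: the host addresses one device at a time, and a device only transmits when asked. The device addresses and responses below are hypothetical:

```python
def usb_host_poll(devices):
    """Sketch of host-scheduled bus traffic: the host initiates every
    transfer; a device only sends data when the host addresses it.
    `devices` maps a bus address to a callable modelling that device's
    response when polled."""
    log = []
    for addr, respond in devices.items():  # host addresses one device at a time
        data = respond()                   # device answers only when asked
        log.append((addr, data))
    return log

# Two hypothetical devices on the bus:
devices = {
    1: lambda: "keyboard: key 'a'",
    2: lambda: "mouse: dx=3, dy=-1",
}
print(usb_host_poll(devices))
```

Because no device transmits unprompted, two devices can never collide on the bus; the cost is that all bandwidth scheduling falls to the host.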
 USB Data Transfer:
 Control transfer: Control transfers are the only type of transfer that all USB devices
must support. A device can utilize one of the other transfer types that may be better
suited for continued communications with the host but it must support control transfers
at a minimum. Control transfers are utilized during the enumeration stage of a USB
communication. After enumeration the device can continue to utilize the control
transfer type for communications. Control transfer type supports low, full and high
transmission speeds and includes error correction.
 Bulk transfer: Bulk transfers are used for moving large amounts of data between a
host and a device. Examples of devices that utilize bulk transfers are printers, scanners,
hard drives and flash drives. Bulk transfers occur on the USB bus when no other types
of transfers are occurring and transmission of the data is not time critical. Bulk transfer
type is only available for devices that support full- and high-speed USB
communications; it includes error correction.
 Interrupt transfer: The interrupt transfer is not the same as the interrupts that are
present in an embedded system or PC. Interrupt transfers are utilized by devices that
require the host to periodically poll the device to see if any information is required to
be transferred. Examples of such devices are the keyboard and mouse that need to
transfer small pieces of information when a key is depressed or mouse movement is
detected. Interrupt transfers can be utilized at any of the USB bus transmission speeds
and include error correction.
 Isochronous transfer: Isochronous transfer allows devices to transfer large amounts
of data in real time. They guarantee a delivery time or bandwidth for the device data to
be transmitted but it does not incorporate any type of error correction as the other
transfer types support. Devices that utilize isochronous transfer have to be tolerant of
occasional corrupt data packets arriving at the destination. Examples of devices that
would utilize isochronous transfer would be a web camera or USB speakers.
Isochronous transfer type is only available for devices that support full- or high-speed
USB communications.
 Features:
 A maximum of 127 peripherals can be connected to a single USB host controller.
 USB device has a maximum speed up to 480 Mbps (for USB 2.0).
 Length of individual USB cable can reach up to 5 meters without a hub and 40
meters with hub.
 USB acts as "plug and play" device and hot swappable.
 USB devices can draw power from their own supply or from the computer. The bus
supplies 5 V and can deliver up to 500 mA per port (USB 2.0).
 If the computer enters power-saving mode, some USB devices automatically switch
into a "sleep" mode as well.
 USB: versions:

Version    Year   Max data rate
USB 1.0    1996   1.5 Mbit/s (Low Speed); 12 Mbit/s (Full Speed)
USB 1.1    1998   1.5 Mbit/s (Low Speed); 12 Mbit/s (Full Speed)
USB 2.0    2000   480 Mbit/s (High Speed)
USB 3.0    2008   5 Gbit/s (SuperSpeed)
USB 3.1    2013   10 Gbit/s (SuperSpeed+)
USB 3.2    2017   20 Gbit/s (SuperSpeed+)
USB4       2019   40 Gbit/s (SuperSpeed+ and Thunderbolt 3)
 USB: Advantages & Limitations

IrDA:
 The Infrared Data Association (IrDA) is an industry-driven interest group that was
founded in 1993 by around 50 companies.
 IrDA provides specifications for a complete set of protocols for wireless infrared
communications, and the name "IrDA" also refers to that set of protocols.
 The main reason for using the IrDA protocols had been wireless data transfer over the
"last one meter" using point-and-shoot principles.
 Thus, it has been implemented in portable devices such as mobile telephones, laptops,
cameras, printers, and medical devices.
 Main characteristics of this kind of wireless optical communication is physically secure
data transfer, line-of-sight (LOS) and very low bit error rate (BER) that makes it very
efficient.
 IrDA devices provide a walk-up, point-to-point method of data transfer that is adaptable
to a broad range of computing and communicating devices. The first version of the
IrDA specification (version 1.0) provides communication at data rates up to 115.2
Kbps.
 Later versions (version 1.1) extended the data rate to 4 Mbps, while maintaining
backward compatibility with version 1.0 interfaces.
 The protocol described in this application note is only for 115.2 Kbps. The 4 Mbps
interface uses a pulse position modulation scheme which sends two bits per light pulse.
 The IrDA standard contains three specifications. These relate to the Physical Layer, the
Link Access Protocol, and the Link Management Protocol.
 The data is first encoded before being transmitted as IR pulses. As shown in Figure 2,
the serial encoding of the UART is NRZ (non return to zero) encoding. NRZ encoded
outputs do not transition during the bit period, and may remain High or Low for
consecutive bit periods.
 This is not an efficient method for IR data transmission with LEDs. To limit the power
consumption of the LED, IrDA requires pulsing the LED in a RZI (return to zero,
inverted) modulation scheme so that the peak power to average power ratio can be
increased.
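The RZI scheme can be illustrated with a small encoder sketch. It follows the common IrDA SIR convention of a pulse lasting 3/16 of a bit period for each logic 0 and no pulse for a logic 1; the function name and slot-based representation are assumptions of this sketch:

```python
def irda_rzi_encode(bits):
    """IrDA SIR (115.2 kbit/s) RZI modulation sketch: each UART bit period
    is divided into 16 slots; a logic 0 is sent as a single IR pulse of
    3/16 of the bit time, while a logic 1 produces no pulse. This keeps
    the LED duty cycle (and hence power) low compared with plain NRZ."""
    chips = []
    for bit in bits:
        if bit == 0:
            chips.extend([1, 1, 1] + [0] * 13)  # 3/16-wide pulse
        else:
            chips.extend([0] * 16)              # no pulse for a 1
    return chips

frame = irda_rzi_encode([0, 1, 0])
# Each 0 bit contributes exactly three pulse slots.
print(sum(frame))  # → 6
```

Even in the worst case (all zeros), the LED is on for only 3/16 of the time, which is the power saving the RZI scheme is designed to achieve.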

 The mandatory IrPHY (Infrared Physical Layer Specification) is the physical layer of
the IrDA specifications. It comprises optical link definitions, modulation,
coding, cyclic redundancy check (CRC) and the framer.
 Range:
 standard: 2 m
 low-power to low-power: 0.2 m
 standard to low-power: 0.3 m
 The 10 Giga-IR specification also defines new usage models that support
link distances of up to several metres.
 Angle: minimum cone ±15°
 Speed: 2.4 kbit/s to 1 Gbit/s
 Modulation: baseband, no carrier
 Infrared window (part of the device body transparent to infrared light beam)
 Wavelength: 850–900 nm
 The mandatory IrLAP (Infrared Link Access Protocol) is the second layer of the IrDA
specifications. It lies on top of the IrPHY layer and below the IrLMP layer. It represents
the data link layer of the OSI model.
 Access control
 Discovery of potential communication partners
 Establishing of a reliable bidirectional connection
 Distribution of the primary/secondary device roles
 Negotiation of QoS parameters
 The mandatory IrLMP (Infrared Link Management Protocol) is the third layer of the
IrDA specifications. It can be broken down into two parts. First, the LM-MUX (Link
Management Multiplexer), which lies on top of the IrLAP layer. Its most important
functions are:
 Provides multiple logical channels
 Allows change of primary/secondary devices
 IrDA: Features
 IrDA compatible for communication with other standard compliant products
 1200 bps to 115.2 Kbps data rate designed for lower speed applications
 3 V and 5 V operation suitable for low power and portable applications
 Decodes negative or positive pulses for increased system flexibility
Basics of RTOS:
 Real-time concepts.
 Hard Real time and Soft Real-time.
 Differences between General Purpose OS & RTOS.
 Basic architecture of an RTOS.
 Scheduling Systems.
 Inter-process communication.
 Performance metrics in scheduling models.
 Interrupt management in RTOS environment.
 Memory management, File systems, I/O Systems.
 Advantage and disadvantage of RTOS.
 Overview of Open source RTOS for Embedded systems and application
development.
RTOS: Basics
 Real-time embedded systems have become pervasive. They are in your cars, cell
phones, Personal Digital Assistants (PDAs), watches, televisions, and home electrical
appliances. There are also larger and more complex real-time embedded systems, such
as air-traffic control systems, industrial process control systems, networked multimedia
systems, and real-time database applications. In fact, our daily life has become more
and more dependent on real-time embedded applications.
 An embedded system is a microcomputer system embedded in a larger system and
designed for one or two dedicated services. It is embedded as part of a complete device
that often has hardware and mechanical parts. Examples include the controllers built
inside our home electrical appliances. Most embedded systems have real-time
computing constraints. Therefore, they are also called real-time embedded systems.
Compared with general-purpose computing systems that have multiple functionalities,
embedded systems are often dedicated to specific tasks. For example, the embedded
airbag control system is only responsible for detecting collision and inflating the airbag
when necessary, and the embedded controller in an air conditioner is only responsible
for monitoring and regulating the temperature of a room.
 Real-time systems, however, are required to compute and deliver correct results within
a specified period of time. In other words, a job of a real-time system has a deadline,
be it hard or soft.
 Real-Time: Real-Time indicates an expectant response or reaction to an event on the
instant of its evolution. The expectant response depicts the logical correctness of the
result produced. The instant of the events’ evolution depicts deadline for producing the
result.
 Operating System (OS) is a system program that provides an interface between
hardware and application programs. OS is commonly equipped with features like:
Multitasking, Synchronization, Interrupt and Event Handling, Input/ Output, Inter-task
Communication, Timers and Clocks and Memory Management to fulfill its primary
role of managing the hardware resources to meet the demands of application programs.
 RTOS is therefore an operating system that supports real-time applications and
embedded systems by providing logically correct result within the deadline required.
Such capabilities define its deterministic timing behavior and limited resource
utilization nature.
DIFFERENCE: RTOS v/s General Purpose OS
 Determinism – The key difference between general-computing operating systems and
real-time operating systems is the “deterministic ” timing behavior in the real-time
operating systems. “Deterministic” timing means that OS consume only known and
expected amounts of time. RTOS have their worst-case latency defined; latency is not
a concern for a general-purpose OS.
 Task Scheduling – General purpose operating systems are optimized to run a variety
of applications and processes simultaneously, thereby ensuring that all tasks receive at
least some processing time. As a consequence, low-priority tasks may have their
priority boosted above other higher priority tasks, which the designer may not want.
However, RTOS uses priority-based preemptive scheduling, which allows high-priority
threads to meet their deadlines consistently. All system calls are deterministic, implying
time bounded operation for all operations and ISRs. This is important for embedded
systems where delay could cause a safety hazard. The scheduling in RTOS is time
based. In case of General purpose OS, like Windows/Linux, scheduling is process
based.
 Preemptive kernel – In RTOS, all kernel operations are preemptible
 Priority Inversion – RTOS have mechanisms to prevent priority inversion
 Usage – RTOS are typically used for embedded applications, while general-purpose
OS are used for desktop PCs and other general-purpose computers.
RTOS Characteristics:
 Reliability: Embedded systems must be reliable. Depending on the application, the
system might need to operate for long periods without human intervention. Different
degrees of reliability may be required. For example, a digital solar-powered calculator
might reset itself if it does not get enough light, yet the calculator might still be
considered acceptable. On the other hand, a telecom switch cannot reset during
operation without incurring high associated costs for down time. The RTOSes in these
applications require different degrees of reliability. Although different degrees of
reliability might be acceptable, in general, a reliable system is one that is available
(continues to provide service) and does not fail.
 Predictability: Because many embedded systems are also real-time systems, meeting
time requirements is key to ensuring proper operation. The RTOS used in this case
needs to be predictable to a certain degree. The term deterministic describes RTOSes
with predictable behavior, in which the completion of operating system calls occurs
within known timeframes. Developers can write simple benchmark programs to
validate the determinism of an RTOS. The result is based on timed responses to specific
RTOS calls. In a good deterministic RTOS, the variance of the response times for each
type of system call is very small.
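A determinism benchmark of the kind described can be sketched as follows. This is illustrative only: the function and variable names are assumptions, and running it on a desktop Python interpreter says nothing about a real RTOS, but it shows the mean/worst-case/jitter measurement pattern:

```python
import time

def call_jitter(operation, runs=1000):
    """Time the same operation repeatedly and report (mean, worst, jitter).
    On a deterministic RTOS the spread (jitter) of a given service call is
    very small; on a general-purpose OS it can vary widely."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        operation()
        samples.append(time.perf_counter() - start)
    mean = sum(samples) / runs
    worst = max(samples)
    jitter = worst - min(samples)
    return mean, worst, jitter

# Example: a trivial operation standing in for a timed system call.
mean, worst, jitter = call_jitter(lambda: sum(range(100)))
```

In a real benchmark the timestamps would be taken around actual RTOS calls, and the worst-case figure matters more than the mean.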
 Performance: This requirement dictates that an embedded system must perform fast
enough to fulfill its timing requirements. Typically, the more deadlines to be met-and
the shorter the time between them-the faster the system's CPU must be. Although
underlying hardware can dictate a system's processing power, its software can also
contribute to system performance. Typically, the processor's performance is expressed
in million instructions per second (MIPS). Throughput also measures the overall
performance of a system, with hardware and software combined. One definition of
throughput is the rate at which a system can generate output based on the inputs coming
in. Throughput also means the amount of data transferred divided by the time taken to
transfer it. Data transfer throughput is typically measured in multiples of bits per second
(bps). Sometimes developers measure RTOS performance on a call-by-call basis.
Benchmarks are written by producing timestamps when a system call starts and when
it completes. Although this step can be helpful in the analysis stages of design, true
performance testing is achieved only when the system performance is measured as a
whole.
 Compactness: Application design constraints and cost constraints help determine how
compact an embedded system can be. For example, a cell phone clearly must be small,
portable, and low cost. These design requirements limit system memory, which in turn
limits the size of the application and operating system. In such embedded systems,
where hardware real estate is limited due to size and costs, the RTOS clearly must be
small and efficient. In these cases, the RTOS memory footprint can be an important
factor. To meet total system requirements, designers must understand both the static
and dynamic memory consumption of the RTOS and the application that will run on it.
 Scalability: Because RTOSes can be used in a wide variety of embedded systems, they
must be able to scale up or down to meet application-specific requirements. Depending
on how much functionality is required, an RTOS should be capable of adding or
deleting modular components, including file systems and protocol stacks. If an RTOS
does not scale up well, development teams might have to buy or build the missing
pieces. Suppose that a development team wants to use an RTOS for the design of a
cellular phone project and a base station project. If an RTOS scales well, the same
RTOS can be used in both projects, instead of two different RTOSes, which saves
considerable time and money.
RTOS CLASSFICATION:
RTOS specifies a known maximum time for each of the operations that it performs. Based
upon the degree of tolerance in meeting deadlines, RTOS are classified into following
categories
 Hard real-time: Degree of tolerance for missed deadlines is negligible. A missed
deadline can result in catastrophic failure of the system
 Firm real-time: Missing a deadline might result in an unacceptable quality
reduction but may not lead to failure of the complete system
 Soft real-time: Deadlines may be missed occasionally, but the system does not
fail and its quality remains acceptable
For a life-saving device such as an automatic parachute-opening device for skydivers,
delay can be fatal. The device deploys the parachute at a specific altitude
based on various conditions. If it fails to respond in specified time, parachute may not get
deployed at all leading to casualty. Similar situation exists during inflation of air bags, used
in cars, at the time of accident. If airbags don’t get inflated at appropriate time, it may be
fatal for a driver. So such systems must be hard real time systems, whereas for TV live
broadcast, delay can be acceptable. In such cases, soft real time systems can be used.
RTOS: Architecture
The architecture of an RTOS is dependent on the complexity of its deployment. Good RTOSs
are scalable to meet different sets of requirements for different applications. For simple
applications, an RTOS usually comprises only a kernel. For more complex embedded systems,
an RTOS can be a combination of various modules, including the kernel, networking protocol
stacks, and other components as illustrated in Figure:

Kernel:
For simpler applications, the RTOS is usually just a kernel, but as complexity increases,
various modules, such as networking protocol stacks, debugging facilities, and device I/O,
are included in addition to the kernel. The RTOS kernel acts as an abstraction layer between the hardware and the
applications. There are three broad categories of kernels
 Monolithic kernel: Monolithic kernels are part of Unix-like operating systems like
Linux, FreeBSD etc. A monolithic kernel is one single program that contains all of the
code necessary to perform every kernel related task. It runs all basic system services
(i.e. process and memory management, interrupt handling and I/O communication, file
system, etc) and provides powerful abstractions of the underlying hardware. Amount
of context switches and messaging involved are greatly reduced which makes it run
faster than microkernel.
 Microkernel: It runs only basic process communication (messaging) and I/O control.
It normally provides only the minimal services such as managing memory protection,
Inter process communication and the process management. The other functions such as
running the hardware processes are not handled directly by microkernels. Thus, micro
kernels provide a smaller set of simple hardware abstractions. A microkernel is more
stable than a monolithic kernel because the kernel itself is unaffected even if a server
(e.g. the file system) fails. Microkernels are found in operating systems such as AIX,
BeOS, Mach, Mac OS X, MINIX, and QNX.
 Hybrid Kernel: Hybrid kernels are extensions of microkernels with some properties
of monolithic kernels. Hybrid kernels are similar to microkernels, except that they
include additional code in kernel space so that such code can run more swiftly than it
would were it in user space. Hybrid kernels are used in operating systems such as
Microsoft Windows NT, 2000 and XP, and in DragonFly BSD.
 Exokernel: Exokernels provides efficient control over hardware. It runs only services
protecting the resources (i.e. tracking the ownership, guarding the usage, revoking
access to resources, etc) by providing low-level interface for library operating systems
and leaving the management to the application.
Task Management:
 In RTOS, The application is decomposed into small, schedulable, and sequential
program units known as “Task”, a basic unit of execution and is governed by three
time-critical properties; release time, deadline and execution time.
 Release time refers to the point in time from which the task can be executed.
 Deadline is the point in time by which the task must complete.
 Execution time denotes the time the task takes to execute.
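The three timing properties can be tied together in a one-line feasibility check. This is a simplified single-task sketch (function and parameter names are assumptions; real schedulability analysis must account for interference from other tasks):

```python
def meets_deadline(release, execution, deadline, start=None):
    """Single-task timing sketch using the properties defined above:
    the task cannot start before its release time, and it meets its
    (absolute) deadline only if start + execution time <= deadline."""
    start = release if start is None else max(start, release)
    finish = start + execution
    return finish <= deadline

# Task released at t=2, needs 3 time units, must finish by t=6.
print(meets_deadline(release=2, execution=3, deadline=6))  # → True
print(meets_deadline(2, 3, 6, start=4))  # starts late: 4+3 = 7 > 6 → False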
Each task may exist in one of the following states:
 Dormant: Task doesn't require computer time
 Ready: Task is ready to go to the active state, waiting for processor time
 Active: Task is running
 Suspended: Task put on hold temporarily
 Pending: Task waiting for a resource

During the execution of an application program, individual tasks are continuously
changing from one state to another. However, only one task is in the running mode (i.e.
given CPU control) at any point of the execution. In the process where CPU control is
changed from one task to another, the context of the to-be-suspended task is saved
while the context of the to-be-executed task is retrieved, a process referred to as
context switching.
A task object is defined by the following set of components:
 Task Control block: Task uses TCBs to remember its context. TCBs are data
structures residing in RAM, accessible only by RTOS
 Task Stack: These reside in RAM, accessible by stack pointer.
 Task Routine: Program code residing in ROM
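The task object above can be sketched as a data structure. The field names are illustrative and not taken from any particular RTOS; in a real kernel the TCB is a C struct in RAM and the routine is code in ROM:

```python
from dataclasses import dataclass, field

@dataclass
class TaskControlBlock:
    """Minimal sketch of the task object described above: the TCB holds
    the task's saved context and bookkeeping, the stack pointer locates
    its stack in RAM, and `routine` names the code it executes."""
    name: str
    priority: int
    state: str = "dormant"      # dormant / ready / active / suspended / pending
    stack_pointer: int = 0
    context: dict = field(default_factory=dict)  # saved CPU registers
    routine: str = ""           # entry point of the task code

# A hypothetical task, moved from dormant to ready by the kernel:
tcb = TaskControlBlock(name="sensor_poll", priority=3, routine="poll_loop")
tcb.state = "ready"
```

The RTOS is the only code that touches TCBs; application tasks see their own context only indirectly, through the registers restored for them.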
Multitasking:
 Real-world events may occur simultaneously. Multitasking refers to the capability of
an OS that supports multiple independent programs running on the same computer.
 It is mainly achieved through time-sharing, which means that each program uses a share
of the computer’s time to execute. How to share processors’ time among multiple tasks
is addressed by schedulers, which follow scheduling algorithms to decide when to
execute which task on which processor.
 Each task has a context, which is the data indicating its execution state and stored in
the task control block (TCB), a data structure that contains all the information that is
pertinent to the execution of the task.
 When a scheduler switches a task out of the CPU, its context has to be stored; when the
task is selected to run again, the task’s context is restored so that the task can be
executed from the last interrupted point.
 The process of storing and restoring the context of a task during a task switch is called
a context switch, which is illustrated in Figure.
 Context switches are the overhead of multitasking. They are usually computationally
intensive. Context switch optimization is one of the tasks of OS design. This is
particularly the case in RTOS design.
Context Switching:
 A task is a sequential piece of code – it does not know when it is going to get suspended
(swapped out or switched out) or resumed (swapped in or switched in) by the kernel
and does not even know when this has happened.
 Consider the example of a task being suspended immediately before executing an
instruction that sums the values contained within two processor registers. While the
task is suspended other tasks will execute and may modify the processor register values.
Upon resumption the task will not know that the processor registers have been altered
– if it used the modified values the summation would result in an incorrect value.
 To prevent this type of error it is essential that upon resumption a task has a context
identical to that immediately prior to its suspension. The operating system kernel is
responsible for ensuring this is the case – and does so by saving the context of a task as
it is suspended. When the task is resumed its saved context is restored by the operating
system kernel prior to its execution. The process of saving the context of a task being
suspended and restoring the context of a task being resumed is called context switching.
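The register-corruption scenario above can be simulated in a few lines. This is a toy illustration (register names and the two-task setup are invented), showing why saving and restoring the context keeps the suspended task's summation correct.

```python
# Simulate a context switch: save the outgoing task's registers into
# its TCB, then load the incoming task's saved registers into the CPU.
def context_switch(cpu_regs, outgoing_tcb, incoming_tcb):
    outgoing_tcb["regs"] = dict(cpu_regs)   # save outgoing context
    cpu_regs.clear()
    cpu_regs.update(incoming_tcb["regs"])   # restore incoming context

task_a = {"regs": {"r0": 2, "r1": 3}}   # about to compute r0 + r1
task_b = {"regs": {"r0": 99, "r1": 99}}
cpu = dict(task_a["regs"])

context_switch(cpu, task_a, task_b)     # task A suspended, B runs
cpu["r0"] = 7                           # task B clobbers the registers
context_switch(cpu, task_b, task_a)     # task A resumed, context restored
assert cpu["r0"] + cpu["r1"] == 5       # A's summation is still correct
```

Without the save/restore, task A would resume with r0 = 7 and compute the wrong sum.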
Scheduler:
• The scheduler keeps a record of the state of each task, selects from among those that
are ready to execute, and allocates the CPU to one of them.
• A scheduler helps to maximize CPU utilization among different tasks in a multi-tasking
program and to minimize waiting time.
• There are generally two types of schedulers: non-preemptive and priority-based
preemptive.
• Non-preemptive scheduling or cooperative multitasking requires the tasks to cooperate
with each other to explicitly give up control of the processor. When a task releases the
control of the processor, the next most important task that is ready to run will be
executed. A task that is newly assigned with a higher priority will only gain control of
the processor when the current executing task voluntarily gives up the control. Figure
gives an example of a non-preemptive scheduling
Non-preemptive scheduling:
Priority-based preemptive scheduling:
Priority-based preemptive scheduling requires that control of the processor be given to the
task of the highest priority at all times. When an event makes a higher-priority task ready to
run, the current task is immediately suspended and control of the processor is given to the
higher-priority task. Figure shows an example of preemptive scheduling.
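The preemptive dispatch decision described above can be sketched as follows. This is a hypothetical illustration (task names and the dictionary representation are invented); a real scheduler runs inside the kernel and also handles blocking, time slicing, and context switches.

```python
# Priority-based preemptive dispatch: always run the highest-priority
# ready task (here, a larger number means higher priority).
def pick_task(ready_tasks):
    return max(ready_tasks, key=lambda t: t["priority"])

running = {"name": "logger", "priority": 1}
ready = [running]

# A higher-priority task becomes ready: it must preempt immediately.
ready.append({"name": "motor_ctrl", "priority": 5})
nxt = pick_task(ready)
assert nxt["name"] == "motor_ctrl"   # logger is suspended
```

Under non-preemptive (cooperative) scheduling, by contrast, `pick_task` would only be consulted when the running task voluntarily yields.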
Intertask Communication:
 The multi-tasking model, we have seen that each task is a quasi-independent program.
Although tasks in an embedded application have a degree of independence, it does not
mean that they have no “awareness” of one another. Although some tasks will be truly
isolated from others, the requirement for communication and synchronization between
tasks is very common.
 This represents a key part of the functionality provided by an RTOS. The actual range
of options offered by a different RTOSes may vary quite widely.
 Task-owned facilities – attributes that an RTOS imparts to tasks that provide
communication (input) facilities. The example we will look at some more is signals.
 Kernel objects – facilities provided by the RTOS which represent stand-alone
communication or synchronization facilities. Examples include: event flags, mailboxes,
queues/pipes, semaphores and mutexes.
 Message passing – a rationalized scheme where an RTOS allows the creation of
message objects, which may be sent from one task to another or to several others. This
is fundamental to the kernel design and leads to the description of such a product as
being a “message passing RTOS”.
 Signals- Signals are probably the simplest inter-task communication facility offered in
conventional RTOSes. They consist of a set of bit flags – there may be 8, 16 or 32,
depending on the specific implementation – which is associated with a specific task.
 A signal flag (or several flags) may be set by any task using an OR type of operation.
Only the task that owns the signals can read them. The reading process is generally
destructive – i.e. the flags are also cleared.
 In some systems, signals are implemented in a more sophisticated way such that a
special function – nominated by the signal owning task – is automatically executed
when any signal flags are set. This removes the necessity for the task to monitor the
flags itself. This is somewhat analogous to an interrupt service routine.
 Event Flag Groups: Event flag groups are like signals in that they are a bit-oriented
inter-task communication facility. They may similarly be implemented in groups of 8,
16 or 32 bits. They differ from signals in being independent kernel objects; they do not
“belong” to any specific task.
 Any task may set and clear event flags using OR and AND operations. Likewise, any
task may interrogate event flags using the same kind of operation. In many RTOSes, it
is possible to make a blocking API call on an event flag combination; this means that a
task may be suspended until a specific combination of event flags has been set. There
may also be a “consume” option available, when interrogating event flags, such that all
read flags are cleared.
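The set/interrogate/consume behavior of an event flag group can be sketched as below. This is a simplified, non-blocking illustration (class and method names are invented); real RTOS APIs additionally allow a task to block until a flag combination is set.

```python
class EventFlagGroup:
    # Independent kernel object: any task may set, clear, or poll flags.
    def __init__(self):
        self.flags = 0

    def set(self, mask):
        self.flags |= mask                   # OR-in new flags

    def check(self, mask, consume=False):
        ready = (self.flags & mask) == mask  # full combination set?
        if ready and consume:
            self.flags &= ~mask              # "consume": clear read flags
        return ready

g = EventFlagGroup()
g.set(0b01)
assert not g.check(0b11)            # combination not complete yet
g.set(0b10)
assert g.check(0b11, consume=True)  # both flags set: succeeds
assert g.flags == 0                 # flags cleared after consumption
```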
 Semaphores: Semaphores are independent kernel objects, which provide a flagging
mechanism that is generally used to control access to a resource. There are broadly two
types: binary semaphores (that just have two states) and counting semaphores (that have
an arbitrary number of states). Some processors support (atomic) instructions that
facilitate the easy implementation of binary semaphores. Binary semaphores may also
be viewed as counting semaphores with a count limit of 1.
 Any task may attempt to obtain a semaphore in order to gain access to a resource. If the
current semaphore value is greater than 0, the obtain will be successful, which
decrements the semaphore value. In many OSes, it is possible to make a blocking call
to obtain a semaphore; this means that a task may be suspended until the semaphore is
released by another task. Any task may release a semaphore, which increments its
value.
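The obtain/release rules above can be sketched as a counting semaphore. This is a non-blocking illustration (names invented); where the sketch returns False, a real kernel would typically suspend the calling task instead.

```python
class CountingSemaphore:
    def __init__(self, count):
        self.count = count

    def obtain(self):
        if self.count > 0:
            self.count -= 1    # successful obtain decrements the value
            return True
        return False           # a blocking call would suspend here

    def release(self):
        self.count += 1        # any task may release, incrementing it

# A binary semaphore is just a counting semaphore with a count limit of 1.
sem = CountingSemaphore(count=1)
assert sem.obtain()
assert not sem.obtain()        # resource busy: obtain fails
sem.release()
assert sem.obtain()            # available again
```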
 Mailboxes: Mailboxes are independent kernel objects, which provide a means for tasks
to transfer messages. The message size depends on the implementation, but will
normally be fixed. One to four pointer-sized items are typical message sizes.
Commonly, a pointer to some more complex data is sent via a mailbox. Some kernels
implement mailboxes so that the data is just stored in a regular variable and the kernel
manages access to it. Mailboxes may also be called “exchanges”, though this name is
now uncommon.
 Any task may send to a mailbox, which is then full. If a task then tries to send to a full
mailbox, it will receive an error response. In many RTOSes, it is possible to make a
blocking call to send to a mailbox; this means that a task may be suspended until the
mailbox is read by another task. Any task may read from a mailbox, which renders it
empty again. If a task tries read from an empty mailbox, it will receive an error
response. In many RTOSes, it is possible to make a blocking call to read from a
mailbox; this means that a task may be suspended until the mailbox is filled by another
task.
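A one-message mailbox with the full/empty error behavior described above might look like this sketch. Names are invented, and the error-return style stands in for the blocking variants many RTOSes also offer.

```python
class Mailbox:
    # One-message-deep kernel object; sending to a full mailbox or
    # reading an empty one returns an error (False/None) here.
    EMPTY = object()

    def __init__(self):
        self.slot = Mailbox.EMPTY

    def send(self, msg):
        if self.slot is not Mailbox.EMPTY:
            return False                  # error: mailbox is full
        self.slot = msg
        return True

    def read(self):
        if self.slot is Mailbox.EMPTY:
            return None                   # error: mailbox is empty
        msg, self.slot = self.slot, Mailbox.EMPTY
        return msg

mb = Mailbox()
assert mb.send("ptr-to-data")     # typically a pointer to larger data
assert not mb.send("second")      # full: second send rejected
assert mb.read() == "ptr-to-data"
assert mb.read() is None          # reading renders it empty again
```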
 Queues: Queues are independent kernel objects, that provide a means for tasks to
transfer messages. They are a little more flexible and complex than mailboxes. The
message size depends on the implementation, but will normally be a fixed size and
word/pointer oriented.
 Any task may send to a queue and this may occur repeatedly until the queue is full, after
which time any attempts to send will result in an error. The depth of the queue is
generally user specified when it is created or the system is configured. In many
RTOSes, it is possible to make a blocking call to send to a queue; this means that, if the
queue is full, a task may be suspended until the queue is read by another task. Any task
may read from a queue. Messages are read in the same order as they were sent – first
in, first out (FIFO). If a task tries to read from an empty queue, it will receive an error
response. In many RTOSes, it is possible to make a blocking call to read from a queue;
this means that, if the queue is empty, a task may be suspended until a message is sent
to the queue by another task.
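The fixed-depth FIFO behavior can be sketched as follows (invented names, non-blocking error returns standing in for the blocking calls described above):

```python
from collections import deque

class MessageQueue:
    # Depth is user-specified at creation, as described above.
    def __init__(self, depth):
        self.depth = depth
        self.q = deque()

    def send(self, msg):
        if len(self.q) >= self.depth:
            return False                 # error: queue full
        self.q.append(msg)
        return True

    def read(self):
        # FIFO: messages are read in the order they were sent.
        return self.q.popleft() if self.q else None  # None: queue empty

mq = MessageQueue(depth=2)
assert mq.send(1) and mq.send(2)
assert not mq.send(3)                    # depth exceeded
assert mq.read() == 1                    # first in, first out
```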
 Mutexes: Mutual exclusion semaphores – mutexes – are independent kernel objects,
which behave in a very similar way to normal binary semaphores. They are slightly
more complex and incorporate the concept of temporary ownership (of the resource,
access to which is being controlled). If a task obtains a mutex, only that same task can
release it again – the mutex (and, hence, the resource) is temporarily owned by the task.
 Mutexes are not provided by all RTOSes, but it is quite straightforward to adapt a
regular binary semaphore. It would be necessary to write a “mutex obtain” function,
which obtains the semaphore and notes the task identifier. Then a complementary
“mutex release” function would check the calling task’s identifier and release the
semaphore only if it matches the stored value, otherwise it would return an error.
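The adaptation just described, a binary semaphore plus a stored task identifier, can be sketched directly (illustrative names; a real "mutex obtain" would use the kernel's task-ID query rather than a passed-in string):

```python
class Mutex:
    # Binary semaphore with temporary ownership: only the task that
    # obtained it may release it.
    def __init__(self):
        self.available = True
        self.owner = None

    def obtain(self, task_id):
        if not self.available:
            return False
        self.available = False
        self.owner = task_id       # note the obtaining task's identifier
        return True

    def release(self, task_id):
        if task_id != self.owner:
            return False           # error: caller does not own the mutex
        self.available = True
        self.owner = None
        return True

m = Mutex()
assert m.obtain("task_a")
assert not m.release("task_b")     # rejected: task_b is not the owner
assert m.release("task_a")         # the owner may release
```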
Inter-process Communication:
 Inter Process Communication (IPC) is a mechanism that involves communication of
one process with another process. This usually occurs only in one system.
 Communication can be of two types −
Between related processes initiating from only one process, such as parent and child
processes.
Between unrelated processes, or two or more different processes.
Following are some important terms that we need to know before proceeding further on this
topic.
 Pipes − Communication between two related processes. The mechanism is half duplex,
meaning data flows in one direction: the first process communicates with the second. To
achieve full duplex, i.e., for the second process to also communicate with the first, another
pipe is required.
 FIFO − Communication between two unrelated processes. FIFO is a full duplex,
meaning the first process can communicate with the second process and vice versa at
the same time.
 Message Queues − Communication between two or more processes with full duplex
capacity. The processes will communicate with each other by posting a message and
retrieving it out of the queue. Once retrieved, the message is no longer available in the
queue.
 Shared Memory − Communication between two or more processes is achieved
through a shared piece of memory among all processes. The shared memory needs to
be protected from each other by synchronizing access to all the processes.
 Semaphores − Semaphores are meant for synchronizing access among multiple processes.
When one process wants to access the shared memory (for reading or writing), the memory
needs to be locked (or protected) and released when the access is complete. This needs to be
repeated by all the processes to secure the data.
 Signals − A signal is a mechanism for communication between multiple processes by way
of signaling. A source process sends a signal (recognized by number) and the
destination process handles it accordingly.
Interrupt Management:
• An interrupt is a signal from a device attached to a computer or from a running process
within the computer, indicating an event that needs immediate attention. The processor
responds by suspending its current activity, saving its state, and executing a function
called an interrupt handler (also called interrupt service routine, ISR) to deal with the
event.
• Modern OSs are interrupt-driven. Virtually, all activities are initiated by the arrival of
interrupts. Interrupt transfers control to the ISR, through the interrupt vector, which
contains the addresses of all the service routines. Interrupt architecture must save the
address of the interrupted instruction.
• Incoming interrupts are disabled while another interrupt is being processed. A system
call is a software-generated interrupt caused either by an error or by a user request.
Memory Management:
 Main memory is the most critical resource in a computer system in terms of speed at
which programs run. The kernel of an OS is responsible for all system memory that is
currently in use by programs. Entities in memory are data and instructions.
 Each memory location has a physical address. In most computer architectures, memory
is byte-addressable, meaning that data can be accessed 8 bits at a time, irrespective of
the width of the data and address buses. Memory addresses are fixed-length sequences
of digits. In general, only system software such as Basic Input/ Output System (BIOS)
and OS can address physical memory.
 Most application programs do not have knowledge of physical addresses. Instead, they
use logic addresses. A logical address is the address at which a memory location
appears to reside from the perspective of an executing application program. A logical
address may be different from the physical address due to the operation of an address
translator or mapping function.
 In a computer supporting virtual memory, the term physical address is used mostly to
differentiate from a virtual address. In particular, in computers utilizing a memory
management unit (MMU) to translate memory addresses, the virtual and physical
addresses refer to an address before and after translation performed by the MMU,
respectively. There are several reasons to use virtual memory.
 Among them is memory protection. If two or more processes are running at the same
time and use direct addresses, a memory error in one process (e.g., reading a bad
pointer) could destroy the memory being used by the other process, taking down
multiple programs due to a single crash. The virtual memory technique, on the other
hand, can ensure that each process is running in its own dedicated address space.
File Management:
 Files are the fundamental abstraction of secondary storage devices. Each file is a named
collection of data stored in a device. An important component of an OS is the file
system, which provides capabilities of file management, auxiliary storage management,
file access control, and integrity assurance.
 File management is concerned with providing the mechanisms for files to be stored,
referenced, shared, and secured. When a file is created, the file system allocates an
initial space for the data. Subsequent incremental allocations follow as the file grows.
 When a file is deleted or its size is shrunk, the space that is freed up is considered
available for use by other files. This creates alternating used and unused areas of various
sizes.
 When a file is created and there is not an area of contiguous space available for its initial
allocation, the space must be assigned in fragments. Because files do tend to grow or
shrink over time, and because users rarely know in advance how large their files will
be, it makes sense to adopt noncontiguous storage allocation schemes.
 Figure illustrates the block chaining scheme. The initial address of storage of a file is
identified by its file name. Typically, files on a computer are organized into directories,
which constitute a hierarchical system of tree structure.
I/O Management:
 Modern computers interact with a wide range of I/O devices. Keyboards, mice, printers,
disk drives, USB drives, monitors, networking adapters, and audio systems are among
the most common ones. One purpose of an OS is to hide peculiarities of hardware I/O
devices from the user.
 In memory-mapped I/O, each I/O device occupies some locations in the I/O address
space. Communication between the I/O device and the processor is enabled through
physical memory locations in the I/O address space. By reading from or writing to those
addresses, the processor gets information from or sends commands to I/O devices.
 Most systems use device controllers. A device controller is primarily an interface unit.
The OS communicates with the I/O device through the device controller. Nearly all
device controllers have direct memory access (DMA) capability, meaning that they can
directly access the memory in the system, without the intervention by the processor.
This frees up the processor of the burden of data transfer from and to I/O devices.
• Interrupts allow devices to notify the processor when they have data to transfer or when
an operation is complete, allowing the processor to perform other duties when no I/O
transfers need its immediate attention. The processor has an interrupt request line that
it senses after executing every instruction. When a device controller raises an interrupt
by asserting a signal on the interrupt request line, the processor catches it, saves the
state, and then transfers control to the interrupt handler. The interrupt handler
determines the cause of the interrupt, performs the necessary processing, and executes
a return-from-interrupt instruction to return control to the processor.
• I/O operations often have high latencies. Most of this latency is due to the slow speed
of peripheral devices. For example, information cannot be read from or written to a
hard disk until the spinning of the disk brings the target sectors directly under the
read/write head. The latency can be alleviated by having one or more input and output
buffers associated with each device.
RTOS: Performance Metrics:
 Memory – how much ROM and RAM does the kernel need and how is this affected by
options and configuration?
 ROM, which is normally flash memory, is used for the kernel, along with code for the
runtime library and any middleware components. This code – or parts of it – may be
copied to RAM on boot up, as this can offer improved performance. There is also likely
to be some read only data. If the kernel is statically configured, this data will include
extensive information about kernel objects. However, nowadays, most kernels are
dynamically configured.
 RAM space will be used for kernel data structures, including some or all of the kernel
object information, again depending upon whether the kernel is statically or
dynamically configured. There will also be some global variables. If code is copied
from flash to RAM, that space must also be accounted for.
 There are a number of factors that affect the memory footprint of an RTOS. The CPU
architecture is key. The number of instructions can vary drastically from one processor
to another, so looking at size figures for, say, PowerPC gives no indication of what the
ARM version might be like.
 Embedded compilers generally have a large number of optimization settings. These can
be used to reduce code size, but that will most likely affect performance. Optimizations
affect ROM footprint, and also RAM. Data size can also be affected by optimization,
as data structures can be packed or unpacked. Again both ROM and RAM can be
affected. Packing data has an adverse effect on performance.
 Latency, which is broadly the delay between something happening and the response to
that occurrence. This is a particular minefield of terminology and misinformation, but
there are two essential latencies to consider: interrupt response and task scheduling.
• Interrupt Latency: The time related performance measurements are probably of most
concern to developers using an RTOS. A key characteristic of a real time system is its
timely response to external events and an embedded system is typically notified of an
event by means of an interrupt, so interrupt latency is critical.
• System: the total delay between the interrupt signal being asserted and the start of the
interrupt service routine execution.
• OS: the time between the CPU interrupt sequence starting and the initiation of the ISR.
This is really the operating system overhead, but many people refer to it as the latency.
This means that some vendors claim zero interrupt latency.
• Interrupt latency is the sum of the hardware dependent time, which depends on the
interrupt controller as well as the type of the interrupt, and the OS induced overhead.
• Scheduling latency: A key part of the functionality of an RTOS is its ability to support
a multi-threading execution environment. Being real time, the efficiency at which
threads or tasks are scheduled is of some importance and the scheduler is at the core of
an RTOS. It is hard to get a clear picture of performance, as there is a wide variation in
the techniques employed to make measurements and in the interpretation of the results.
• There are really two separate measurements to consider:
– The context switch time
– The time overhead that the RTOS introduces when scheduling a task
• The scheduling latency is the maximum of two times: τSO, the scheduling overhead
(the end of the ISR to the start of the task schedule), and τCS, the time taken to save and
restore the thread context.
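Putting the two latency definitions above into a small worked example (all timing values are made up for illustration):

```python
# Interrupt latency = hardware-dependent time + OS-induced overhead.
hw_interrupt_time = 2.0   # interrupt controller / CPU sequence, in µs
os_overhead      = 3.5    # the part some vendors quote as "latency"
interrupt_latency = hw_interrupt_time + os_overhead

# Scheduling latency = max of scheduling overhead and context-switch time.
t_so = 4.0   # τSO: end of ISR to start of task schedule
t_cs = 6.0   # τCS: time to save and restore thread context
scheduling_latency = max(t_so, t_cs)

assert interrupt_latency == 5.5
assert scheduling_latency == 6.0
```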
 Performance of kernel services. How long does it take to perform specific actions?
• Timing kernel services
• An RTOS is likely to have a great many system service API (application program
interface) calls, probably numbering into the hundreds. To assess timing, it is not useful
to try to analyze the timing of every single call. It makes more sense to focus on the
frequently used services.
• For most RTOSes, there are four key categories of service call:
– Threading services
– Synchronization services
– Inter-process communication services
– Memory services
RTOS: Open Source:
FreeRTOS
• Platforms: MSP430, ARM, AVR, ColdFire, PIC, x86
• Built-in components: file system, network, TLS/SSL, command line interface, runtime analysis
• FreeRTOS is a market-leading RTOS from Amazon Web Services that supports more than 35 architectures. It is distributed under the MIT license.
RIOT
• Platforms: MSP430, ARM, AVR, MIPS, RISC-V
• Built-in components: BLE, LoRaWAN, file system, network, 6LoWPAN, GUI, TLS/SSL, USB device, OTA
• RIOT is a real-time multi-threading operating system that supports a range of devices typically found in the Internet of Things (IoT): 8-bit, 16-bit and 32-bit microcontrollers.
TinyOS
• Platforms: MSP430, AVR
• TinyOS is an open-source, BSD-licensed operating system designed for low-power wireless devices, such as those used in sensor networks, ubiquitous computing, personal area networks, smart buildings, and smart meters.
Applications of RTOS:
Real-time systems are used in:
 Airlines reservation system.
 Air traffic control system.
 Systems that provide immediate updating.
 Used in any system that provides up to date and minute information on stock prices.
 Defense application systems like RADAR.
 Networked Multimedia Systems
 Command Control Systems
 Internet Telephony
 Anti-lock Brake Systems
 Heart Pacemaker
Advantages of RTOS:
 An RTOS makes maximum use of the system and its devices, so more output is obtained
from all resources.
 There is little chance of error while a task is running, since an RTOS kernel is designed
and tested to be error-free.
 Memory allocation is best managed in this type of system.
 The task-shifting (context-switch) time is very small.
 Because of the small program size, an RTOS is used in embedded systems such as
transport and others.
Disadvantages of RTOS:
 An RTOS can run only a minimal number of tasks together; it concentrates on a few
critical applications so that errors can be avoided.
 Because it concentrates on only a few tasks, it is hard for such systems to do
large-scale multitasking.
 Specific drivers are required so that the RTOS can offer a fast response time to
interrupt signals, which helps maintain its speed.
 An RTOS uses plenty of resources, which makes the system expensive and sometimes
unsuitable.
 Low-priority tasks may need to wait a long time, as the RTOS maintains the accuracy
of the programs currently under execution.
 Minimum switching of tasks is done in real-time operating systems.
 It uses complex algorithms which are difficult to understand.
Chapter 2: Communication Interfaces
 Modes of communication: Serial/Parallel, Synchronous/Asynchronous
 Onboard Communication Interfaces: I²C, CAN, SPI, PSI
 External Communication Interfaces: RS232, USB
 Wireless Communication Interfaces: IrDA, Bluetooth, Zigbee
Communication Interfaces:
Digital communication can be considered as the communication happening between two (or
more) devices in terms of bits. This transferring of data, either wirelessly or through wires, can
be either one bit at a time or the entire data (depending on the size of the processor inside i.e.,
8 bit, 16 bit etc.) at once. Based on this, we can have the following classification namely, Serial
Communication and Parallel Communication.
Serial Communication implies transferring data bit by bit, sequentially. This is the most
common form of communication used in the digital world. Contrary to parallel
communication, serial communication needs only one line for the data transfer. Thereby, the
cost of the communication line as well as the space required is reduced.
Parallel communication implies transferring bits in a parallel fashion at a time. This
communication comes to the rescue when speed, rather than space, is the main objective. The
transfer of data is at high speed owing to the fact that no bus buffer is present.
Serial/ Parallel Communication:
 For an 8-bit data transfer in serial communication, one bit is sent at a time. The
entire data is first fed into the serial port buffer, from which one bit is sent at
a time. Only after the last bit is received can the transferred data be forwarded for
processing.
 In parallel communication, a serial port buffer is not required. A number of bus lines
equal to the length of the data is available, plus a synchronization line
for synchronized transmission of data.
 Thus we can state that, for the same frequency of data transmission, serial
communication is slower than parallel communication.
Serial Transmission | Parallel Transmission
Data is transferred bit by bit over a single line. | Data flows over multiple lines simultaneously.
Serial transmission is cost-efficient. | Parallel transmission is not cost-efficient.
One bit is transferred per clock pulse. | Eight bits are transferred per clock pulse.
Serial transmission is slow in comparison with parallel transmission. | Parallel transmission is fast in comparison with serial transmission.
Generally, serial transmission is used for long distances. | Generally, parallel transmission is used for short distances.
The circuit used in serial transmission is simple. | The circuit used in parallel transmission is relatively complex.
Why is serial communication preferred over parallel?
 While parallel communication is faster when the frequency of transmission is same, it
is cumbersome when the transmission is long distance.
 Also with the number of data channels it should also have a synchronous channel or a
clock channel to keep the data synchronized.
 In serial communication, the data is sent sequentially and latched at the receiving end,
recovering the entire data from the data bus using a USART/UART (Universal
Synchronous/Asynchronous Receiver Transmitter) without any loss of synchronization. In
parallel communication, if even one wire takes longer to recover, the received data will be faulty.
 The length of the wire for parallel interface is usually small. It is due to a phenomenon
called crosstalk. “In electronics, crosstalk is any phenomenon by which a signal
transmitted on one circuit or channel of a transmission system creates an undesired
effect in another circuit or channel.”
 This crosstalk worsens over length hence putting an upper limit on the length of the
parallel transmission. Thus it finds its place in the computer peripheral buses and RAM
where speed cannot be compromised and the length has to be short.
 The Serial Communication gives an option to increase the frequency of transmission
without hampering the data while it is not possible in case of a parallel communication.
The window for data collection at the receiving end reduces with every step increase in
the frequency. This increases the difficulty and eventually results in gibberish data.
Synchronous/ Asynchronous:
 The term synchronous is used to describe a continuous and consistent timed transfer of
data blocks.
 Synchronous data transmission is a data transfer method in which a continuous
stream of data signals is accompanied by timing signals (generated by an electronic
clock) to ensure that the transmitter and the receiver are in step (synchronized) with one
another. The data is sent in blocks (called frames or packets) spaced by fixed time
intervals.
 Synchronous transmission modes are used when large amounts of data must be
transferred very quickly from one location to the other. The speed of the synchronous
connection is attained by transferring data in large blocks instead of individual
characters.
 Synchronous transmission synchronizes transmission speeds at both the receiving and
sending end of the transmission using clock signals built into each component. A
continual stream of data is then sent between the two nodes.
 The data blocks are grouped and spaced in regular intervals and are preceded by special
characters called sync or synchronous idle characters.
 After the sync characters are received by the remote device, they are decoded and used
to synchronize the connection. After the connection is correctly synchronized, data
transmission may begin.
 Most network protocols (such as Ethernet, SONET, Token Ring) use synchronous
transmission.
 Asynchronous transmission works in spurts and must insert a start bit before each
data character and a stop bit at its termination to inform the receiver where it begins
and ends.
 The term asynchronous is used to describe the process where transmitted data is
encoded with start and stop bits, specifying the beginning and end of each character.
 These additional bits provide the timing or synchronization for the connection by
indicating when a complete character has been sent or received; thus, timing for each
character begins with the start bit and ends with the stop bit.
 When gaps appear between character transmissions, the asynchronous line is said to be
in a mark state. A mark is a binary 1 (or negative voltage) that is sent during periods of
inactivity on the line as shown in the following figure.
 The following is a list of characteristics specific to asynchronous communication:
 Each character is preceded by a start bit and followed by one or more stop bits.
 Gaps or spaces between characters may exist.
 With asynchronous transmission, a large text document is organized into long strings
of letters (or characters) that make up the words within the sentences and paragraphs.
These characters are sent over the communication link one at a time and reassembled
at the remote location.
 Asynchronous transmission is used commonly for communications over telephone
lines.
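The start-bit/stop-bit framing described above can be made concrete with a small sketch. This is an illustrative model (function names invented) of a common asynchronous format: one start bit (0), 8 data bits sent LSB-first, and one stop bit (1), with the idle line at 1 (the mark state).

```python
def frame_char(byte):
    """Frame one character for asynchronous transmission."""
    bits = [0]                                   # start bit
    bits += [(byte >> i) & 1 for i in range(8)]  # 8 data bits, LSB first
    bits += [1]                                  # stop bit
    return bits

def unframe_char(bits):
    """Recover the character; timing begins at the start bit and ends
    at the stop bit, as described above."""
    assert bits[0] == 0 and bits[9] == 1         # validate framing
    return sum(b << i for i, b in enumerate(bits[1:9]))

frame = frame_char(ord("A"))
assert len(frame) == 10            # 8 data bits + start + stop overhead
assert unframe_char(frame) == ord("A")
```

The two extra bits per character are the price asynchronous transmission pays for not needing a shared clock.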
Synchronous Transmission | Asynchronous Transmission
Data is sent in the form of blocks or frames. | Data is sent in the form of bytes or characters.
Synchronous transmission is fast. | Asynchronous transmission is slow.
Synchronous transmission is costly. | Asynchronous transmission is economical.
The time interval of transmission is constant. | The time interval of transmission is not constant; it is random.
There is no gap between data. | There are gaps between data.
The transmission line is used efficiently. | The transmission line remains empty during the gaps between characters.
Precisely synchronized clocks are needed to delimit new bytes. | No synchronized clocks are needed, as start and stop bits frame each character.
I2C: Inter-Integrated Circuit:
 I2C combines the best features of SPI and UARTs. With I2C, you can connect multiple
slaves to a single master (like SPI) and you can have multiple masters controlling single,
or multiple slaves.
 This is really useful when you want to have more than one microcontroller logging data
to a single memory card or displaying text to a single LCD.
 I2C is a serial communication protocol, so data is transferred bit by bit along a single
wire (the SDA line).
 Like SPI, I2C is synchronous, so the output of bits is synchronized to the sampling of
bits by a clock signal shared between the master and the slave. The clock signal is
always controlled by the master.
 SDA (Serial Data) – The line for the master and slave to send and receive data.

 SCL (Serial Clock) – The line that carries the clock signal.

 With I2C, data is transferred in messages. Messages are broken up into frames of data.
Each message has an address frame that contains the binary address of the slave, and
one or more data frames that contain the data being transmitted.
 The message also includes start and stop conditions, read/write bits, and ACK/NACK
bits between each data frame:
 Start Condition: The SDA line switches from a high voltage level to a low voltage
level before the SCL line switches from high to low.
 Stop Condition: The SDA line switches from a low voltage level to a high voltage
level after the SCL line switches from low to high.
 Address Frame: A 7 or 10 bit sequence unique to each slave that identifies the slave
when the master wants to talk to it.
 Read/Write Bit: A single bit specifying whether the master is sending data to the slave
(low voltage level) or requesting data from it (high voltage level).
 ACK/NACK Bit: Each frame in a message is followed by an acknowledge/no-
acknowledge bit. If an address frame or data frame was successfully received, an ACK
bit is returned to the sender from the receiving device.

 The master sends the start condition to every connected slave by switching the SDA
line from a high voltage level to a low voltage level before switching the SCL line from
high to low:
 The master sends each slave the 7 or 10 bit address of the slave it wants to communicate
with, along with the read/write bit.
 Each slave compares the address sent from the master to its own address. If the address
matches, the slave returns an ACK bit by pulling the SDA line low for one bit. If the
address from the master does not match the slave’s own address, the slave leaves the
SDA line high.
 The master sends or receives the data frame:
 After each data frame has been transferred, the receiving device returns another ACK
bit to the sender to acknowledge successful receipt of the frame.
 To stop the data transmission, the master sends a stop condition to the slave by
switching SCL high before switching SDA high.
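The address phase of the transaction above can be modelled in a few lines of Python. This is an illustrative sketch of the bit-level logic, not a real driver API; the function names are invented here:

```python
def i2c_address_frame(addr7, read):
    """Bits clocked out after the START condition: a 7-bit address
    (MSB first) followed by the R/W bit (1 = read, 0 = write)."""
    bits = [(addr7 >> i) & 1 for i in range(6, -1, -1)]  # MSB first
    bits.append(1 if read else 0)                        # R/W bit
    return bits

def slave_response(frame_bits, my_addr7):
    """A slave compares the received address with its own: it pulls
    SDA low (ACK = 0) on a match, otherwise leaves SDA high (NACK = 1)."""
    received = 0
    for b in frame_bits[:7]:
        received = (received << 1) | b
    return 0 if received == my_addr7 else 1

frame = i2c_address_frame(0x50, read=False)   # 0x50: common EEPROM address
assert slave_response(frame, 0x50) == 0       # addressed slave ACKs
assert slave_response(frame, 0x68) == 1       # other slaves stay silent
```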

ADVANTAGES
 Only uses two wires
 Supports multiple masters and multiple slaves
 ACK/NACK bit gives confirmation that each frame is transferred successfully
 Hardware is less complicated than with UARTs
 Well known and widely used protocol
DISADVANTAGES
 Slower data transfer rate than SPI
 The size of the data frame is limited to 8 bits
 More complicated hardware needed to implement than SPI

SPI: Serial Peripheral Interface:


 Serial Peripheral Interface or SPI is a synchronous serial communication protocol that
provides full – duplex communication at very high speeds. Serial Peripheral Interface
(SPI) is a master – slave type protocol that provides a simple and low cost interface
between a microcontroller and its peripherals.
 The SPI bus is commonly used for interfacing a microprocessor or microcontroller with memories like EEPROM and Flash, RTCs, ADCs, DACs, displays like LCDs, audio ICs, sensors for temperature and pressure, memory cards like MMC or SD cards, or even other microcontrollers.
 Asynchronous protocols like UART have no shared clock line. To overcome this, UART uses synchronization bits, i.e. start and stop bits, together with a pre-agreed data transfer speed (typically 9600 bps). If the baud rates of transmitter and receiver do not match, the data sent from the transmitter will not reach the receiver properly and garbage or junk values are often received.
 SPI is a Synchronous type serial communication i.e. it uses a dedicated clock signal to
synchronize the transmitter and receiver or Master and Slave, speaking in SPI terms.
The transmitter and receiver are connected with separate data and clock lines and the
clock signal will help the receiver when to look for data on the bus.
 The clock signal must be supplied by the Master to the slave (or to all the slaves in case of a multiple-slave setup). There are two types of triggering mechanisms on the clock signal used to inform the receiver about the data: edge triggering and level triggering.

 The most commonly used triggering is edge triggering, of which there are two types: rising edge and falling edge. Depending on how the receiver is configured, upon detecting the edge, the receiver will look for data on the data bus from the next bit.
 SPI or Serial Peripheral Interface was developed by Motorola in the 1980’s as a
standard, low – cost and reliable interface between the Microcontroller
(microcontrollers by Motorola in the beginning) and its peripheral ICs.
 In SPI protocol, the devices are connected in a Master – Slave relationship in a multi –
point interface. In this type of interface, one device is considered the Master of the bus
and all the other devices are considered as slaves.
 The SPI bus consists of 4 signals or pins. They are
 Master – Out / Slave – In (MOSI)
 Master – In / Slave – Out (MISO)
 Serial Clock (SCLK) and
 Chip Select (CS) or Slave Select (SS)
 Master-Out/Slave-In (MOSI) carries the data generated by the Master and received by the Slave. Hence, the MOSI pins on both the master and slave are connected together. Master-In/Slave-Out (MISO) is the data generated by the Slave that must be transmitted to the Master.
 The MISO pins on both the master and slave are tied together. Even though the signal on MISO is produced by the Slave, the line is controlled by the Master. The Master generates a clock signal at SCLK, which is supplied to the clock input of the slave. Chip Select (CS) or Slave Select (SS) is used by the master to select a particular slave.
 Since the clock is generated by the Master, the flow of data is controlled by the master.
For every clock cycle, one bit of data is transmitted from master to slave and one bit of
data is transmitted from slave to master.
 This process happens simultaneously, and after 8 clock cycles a byte of data has been transmitted in both directions; hence, SPI is a full-duplex communication.
 If data has to be transmitted by only one device, the other device still has to send something (even garbage or junk data), and it is up to the receiving device to decide whether the transmitted data is actual data or not.
 This means that for every bit transmitted by one device, the other device has to send
one bit data i.e. the Master simultaneously transmits data on MOSI line and receive data
from slave on MISO line.
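This simultaneous exchange is just two shift registers swapping their contents. A minimal Python sketch, assuming 8-bit, MSB-first transfers:

```python
def spi_transfer(master_byte, slave_byte, bits=8):
    """Full-duplex SPI exchange: on every clock cycle the master shifts
    one bit out on MOSI and reads one bit in on MISO, while the slave
    does the opposite. After 8 clocks the two bytes have swapped."""
    m, s = master_byte, slave_byte
    for _ in range(bits):
        mosi = (m >> (bits - 1)) & 1      # master's MSB goes out on MOSI
        miso = (s >> (bits - 1)) & 1      # slave's MSB goes out on MISO
        m = ((m << 1) & 0xFF) | miso      # master shifts in the MISO bit
        s = ((s << 1) & 0xFF) | mosi      # slave shifts in the MOSI bit
    return m, s  # (byte received by master, byte received by slave)

assert spi_transfer(0xA5, 0x3C) == (0x3C, 0xA5)
```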
 If the slave wants to transmit the data, the master has to generate the clock signal
accordingly by knowing when the slave wants to send the data in advance. If more than
one slave has to be connected to the master, then the setup will be something similar to
the following image.
 Even though multiple slaves are connected to the master in the SPI bus, only one slave
will be active at any time. In order to select the slave, the master will pull down the SS
(Slave Select) or CS (Chip Select) line of the corresponding slave.
 Hence, there must be a separate CS pin on the Master for each slave device. We need to pull the SS or CS line low to select a slave because this line is active low.

 There are two types of configurations in which the SPI devices can be connected in an
SPI bus. They are Independent Slave Configuration and Daisy Chain Configuration.
 In Independent Slave Configuration, the master has dedicated Slave Select Lines for
all the slaves and each slave can be selected individually. All the clock signals of the
slaves are connected to the master SCK.
 Similarly, all the MOSI pins of all the slaves are connected to the MOSI pin of the
master and all the MISO pins of all the slaves are connected to the MISO pin of the
master.

 In Daisy Chain Configuration, only a single Slave Select line is connected to all the
slaves. The MOSI of the master is connected to the MOSI of slave 1. MISO of slave 1
is connected to MOSI of slave 2 and so on. The MISO of the final slave is connected
to the MISO of the master.
 Consider the master transmits 3 bytes of data in to the SPI bus. First, the 1st byte of
data is shifted to slave 1. When the 2nd byte of data reaches slave 1, the first byte is
pushed in to slave 2.
 Finally, when the 3rd byte of data arrives in to the first slave, the 1st byte of data is
shifted to slave 3 and the second byte of data is shifted in to second slave.
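The daisy-chain behaviour above amounts to one long shift register spanning all slaves. A toy Python model of it, assuming three slaves as in the example:

```python
def daisy_chain_shift(tx_bytes, num_slaves=3):
    """Model of a daisy-chained SPI bus: each byte the master clocks
    out pushes the bytes already in the chain one slave further along.
    Returns the byte held by each slave shift register afterwards."""
    chain = [None] * num_slaves          # slave 1 .. slave N registers
    for b in tx_bytes:
        chain = [b] + chain[:-1]         # new byte enters slave 1
    return chain

# Master clocks out three bytes; the first ends up in the last slave.
assert daisy_chain_shift([0x11, 0x22, 0x33]) == [0x33, 0x22, 0x11]
```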
Advantages
 SPI is very simple to implement and the hardware requirements are not that
complex.
 Supports full – duplex communication at all times.
 Very high speed of data transfer.
 No need for individual addresses for slaves as CS or SS is used.
 Only one master device is supported and hence there is no chance of conflicts.
 Clock from the master is configured based on speed of the slave and hence slave
doesn’t have to worry about clock.
Disadvantages
 Each additional slave requires an additional dedicated pin on master for CS or SS.
 There is no acknowledgement mechanism and hence there is no confirmation of
receipt of data.
 Slowest device determines the speed of transfer.
 There are no official standards and hence often used in application specific
implementations.
 There is no flow control.
Applications of SPI
 Memory: SD Card , MMC , EEPROM , Flash
 Sensors: Temperature and Pressure
 Control Devices: ADC , DAC , digital POTS and Audio Codec.
 Others: Camera Lens Mount, touchscreen, LCD, RTC, video game controller, etc.

CAN: Controller Area Network:


 CAN is an International Standardization Organization (ISO) defined serial
communications bus originally developed for the automotive industry to replace the
complex wiring harness with a two-wire bus. Such as the engine-management systems,
active suspension, central locking, air conditioning, airbags, etc.
 The idea was initiated by Robert Bosch GmbH in 1983 to improve the quality and safety
of automobiles, enhancing automobile reliability and fuel efficiency.
 The specification calls for high immunity to electrical interference and the ability to
self-diagnose and repair data errors. These features have led to CAN’s popularity in a
variety of industries including building automation, medical, and manufacturing.
 The CAN communication protocol is a carrier-sense, multiple-access protocol with
collision detection and arbitration on message priority (CSMA/CD+AMP). CSMA
means that each node on a bus must wait for a prescribed period of inactivity before
attempting to send a message. CD+AMP means that collisions are resolved through a
bit-wise arbitration, based on a preprogrammed priority of each message in the
identifier field of a message.
 The higher-priority identifier always wins bus access. The identifier with the lowest binary value has the highest priority: a node transmitting a recessive bit (logic 1) that reads back a dominant bit (logic 0) knows it has lost arbitration and stops transmitting. Since every node on the bus reads every bit "as it is being written," an arbitrating node always knows whether its own bit made it onto the bus.
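A minimal Python sketch of this arbitration, modelling the bus level as the AND of all transmitted bits (a dominant 0 overrides a recessive 1) over an 11-bit identifier:

```python
def arbitrate(identifiers):
    """Bitwise CAN arbitration sketch. On each bit time the bus carries
    the AND of all transmitted bits. A node that sends recessive (1)
    but reads back dominant (0) drops out, so the lowest identifier,
    i.e. the highest-priority message, wins the bus."""
    contenders = list(identifiers)
    for bit in range(10, -1, -1):                 # 11-bit ID, MSB first
        sent = {i: (i >> bit) & 1 for i in contenders}
        bus = min(sent.values())                  # AND of all bits
        contenders = [i for i in contenders if sent[i] == bus]
    return contenders[0]

assert arbitrate([0x1A0, 0x0FF, 0x345]) == 0x0FF  # lowest ID wins
```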

An example of point-to-point wiring connection in a CAN protocol.

An algorithmic diagram that shows the connectivity between devices using the CAN protocol.
 SOF–The single dominant start of frame (SOF) bit marks the start of a message, and is
used to synchronize the nodes on a bus after being idle.
 Identifier-The Standard CAN 11-bit identifier establishes the priority of the message.
The lower the binary value, the higher its priority.
 RTR–The single remote transmission request (RTR) bit is dominant when information
is required from another node. All nodes receive the request, but the identifier
determines the specified node. The responding data is also received by all nodes and
used by any node interested. In this way, all data being used in a system is uniform.
 IDE–A dominant single identifier extension (IDE) bit means that a standard CAN
identifier with no extension is being transmitted.
 r0–Reserved bit (for possible use by future standard amendment).
 DLC–The 4-bit data length code (DLC) contains the number of bytes of data being
transmitted.
 Data–Up to 64 bits of application data may be transmitted.
 CRC–The 16-bit (15 bits plus delimiter) cyclic redundancy check (CRC) contains a checksum computed over the preceding portion of the frame, used for error detection.
 ACK–Every node receiving an accurate message overwrites this recessive bit in the original message with a dominant bit, indicating that an error-free message has been sent. Should a receiving node detect an error and leave this bit recessive, it discards the message and the sending node repeats it after re-arbitration. In this way, each node acknowledges (ACK) the integrity of the data. ACK is 2 bits: one is the acknowledgment bit and the second is a delimiter.

 EOF–This end-of-frame (EOF), 7-bit field marks the end of a CAN frame (message)
and disables bit stuffing, indicating a stuffing error when dominant. When 5 bits of the
same logic level occur in succession during normal operation, a bit of the opposite logic
level is stuffed into the data.
 IFS–This 7-bit inter frame space (IFS) contains the time required by the controller to
move a correctly received frame to its proper position in a message buffer area.
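The bit-stuffing rule mentioned under EOF can be sketched as follows (transmitter side only; every receiver applies the inverse operation to recover the original bit stream):

```python
def bit_stuff(bits):
    """CAN bit stuffing: whenever five bits of the same level occur in
    succession, the transmitter inserts one bit of the opposite level."""
    out, run = [], 0
    for b in bits:
        out.append(b)
        run = run + 1 if len(out) >= 2 and out[-2] == b else 1
        if run == 5:
            out.append(1 - b)   # stuff the opposite level
            run = 1             # the stuffed bit starts a new run
    return out

# Six zeros in a row: a 1 is stuffed after the fifth zero.
assert bit_stuff([0, 0, 0, 0, 0, 0]) == [0, 0, 0, 0, 0, 1, 0]
```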
CAN Messages:
 The CAN bus is a broadcast type of bus. This means that all nodes can ‘hear’ all
transmissions. There is no way to send a message to just a specific node; all nodes will
invariably pick up all traffic. The CAN hardware, however, provides local filtering so
that each node may react only on the interesting messages. CAN uses short messages –
the maximum utility load is 94 bits. There is no explicit address in the messages;
instead, the messages can be said to be contents-addressed, that is, their contents
implicitly determines their address.
The Data Frame:
 The Data Frame is the most common message type. It comprises;
 The Arbitration Field, which determines the priority of the message when two or more
nodes are contending for the bus. The Arbitration Field contains:
 For CAN 2.0A, an 11-bit Identifier and one bit, the RTR bit, which is dominant
for data frames.
 For CAN 2.0B, a 29-bit Identifier (which also contains two recessive bits: SRR
and IDE) and the RTR bit.
 The Data Field, which contains zero to eight bytes of data.
 The CRC Field, which contains a 15-bit checksum calculated on most parts of the
message. This checksum is used for error detection.
 an Acknowledgement Slot; any CAN controller that has been able to correctly receive
the message sends an Acknowledgement bit at the end of each message. The transmitter
checks for the presence of the Acknowledge bit and retransmits the message if no
acknowledge was detected.
The Remote Frame
 The intended purpose of the remote frame is to solicit the transmission of data from
another node. The remote frame is similar to the data frame, with two important
differences. First, this type of message is explicitly marked as a remote frame by a
recessive RTR bit in the arbitration field, and secondly, there is no data.
The Error Frame
 The error frame is a special message that violates the formatting rules of a CAN
message. It is transmitted when a node detects an error in a message, and causes all
other nodes in the network to send an error frame as well. The original transmitter then
automatically retransmits the message. An elaborate system of error counters in the
CAN controller ensures that a node cannot tie up a bus by repeatedly transmitting error
frames.
The Overload Frame
 The overload frame is mentioned for completeness. It is similar to the error frame with
regard to the format, and it is transmitted by a node that becomes too busy. It is
primarily used to provide for an extra delay between messages.
A Valid Frame
 A message is considered to be error free when the last bit of the ending EOF field of a
message is received in the error-free recessive state. A dominant bit in the EOF field
causes the transmitter to repeat a transmission.
Advantages:
 Low use of wiring due to its distributed control.
 Can be applied to many electrical environments without any issues.
 Multi-master and multicast features can be applied.
 High-speed data rate.
 Low cost, light weight and robust.
 Supports automatic retransmission of messages that lose arbitration.
 Built-in error detection capabilities (ACK error, form error, stuff error, etc.).
Disadvantages:
 Limited cable length.
 Network must be wired in a topology that limits stubs as much as possible.
 Undesirable interaction between nodes is possible.
 Limited number of nodes (up to 64 nodes).
 High cost for software development and maintenance.
 Possibility of signal integrity issues.

PCI: Peripheral Components Interconnects


 PCI Bus Architecture is based on ISA (Industry Standard Architecture) Bus. PCI is
a local computer bus for attaching hardware devices to a computer.
The PCI bus connects the processor-memory subsystem to fast devices, while an expansion bus connects relatively slow devices such as a keyboard and serial and parallel ports. For example, several devices can be connected on a SCSI bus that is plugged into a SCSI (Small Computer System Interface) controller.
 A controller is a collection of electronics that can operate a port, a bus, or a device. A
serial port controller is an example of a single device controller. It is a single chip in a
computer that controls the signals on the wires of a serial port. The controller has one
or more registers for data and control signals. The processor communicates with the
controller by reading and writing bit patterns in those registers.

 PCI supports both 32-bit and 64-bit data widths; therefore it is compatible with 486s and Pentiums. The bus data width matches that of the processor: for example, a 32-bit processor would have a 32-bit PCI bus, operating at 33 MHz.
 PCI was used in developing Plug and Play (PnP) and all PCI cards support PnP i.e. the
user can plug a new card into the computer, power it on and it will “self identify” and
“self specify” and start working without manual configuration using jumpers.

RS232 Bus Protocol:


 RS232 is one of the most widely used techniques to interface external equipment with
computers. RS232 is a Serial Communication Standard developed by the Electronic
Industry Association (EIA) and Telecommunications Industry Association (TIA).
 RS232 was introduced in the 1960s and was originally known as EIA Recommended Standard 232. It is one of the oldest serial communication standards, ensuring simple connectivity and compatibility across different manufacturers.
 RS232 has become a de facto standard for computer and instrumentation devices since
it was standardized in the year 1962 by EIA and as a result, it became the most widely
used communication standard.
 In RS232, the data is transmitted serially in one direction over a single data line. In order to establish two-way communication, we need at least three wires (RX, TX and GND) apart from the control signals. A byte of data can be transmitted at any time, provided the previous byte has already been transmitted.
 RS232 follows asynchronous communication protocol i.e. there is no clock signal to
synchronize transmitter and receiver. Hence, it uses start and stop bits to inform the
receiver when to check for data.
 There is a delay of certain time between the transmissions of each bit. This delay is
nothing but an inactive state i.e. the signal is set to logic ‘1’ i.e. -12V (if you remember,
logic ‘1’ in RS232 is -12V and logic ‘0’ is +12V).
 First, the transmitter i.e. the DTE sends a Start bit to the receiver i.e. the DCE to inform
it that data transmission starts from next bit. The Start bit is always ‘0’ i.e. +12V. The
next 5 to 9 bits are data bits.
 If a parity bit is used, a maximum of 8 data bits can be transmitted; if parity isn't used, then up to 9 data bits can be transmitted. After the data is transmitted, the transmitter sends the stop bits, which can be 1, 1.5 or 2 bits long. The following image shows the frame format of the RS232 protocol.
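The frame format can be sketched in Python. The even-parity option and the plain ±12 V mapping are simplifications for illustration (real RS232 levels may range roughly from ±3 V to ±15 V):

```python
def rs232_frame(byte, parity="even", stop_bits=1):
    """Logical frame for one RS232 character: start bit (logic 0),
    8 data bits LSB first, optional parity bit, stop bit(s) (logic 1)."""
    data = [(byte >> i) & 1 for i in range(8)]
    frame = [0] + data                      # start bit + data bits
    if parity == "even":
        frame.append(sum(data) % 2)         # make the count of 1s even
    elif parity == "odd":
        frame.append(1 - sum(data) % 2)
    frame += [1] * stop_bits
    return frame

def to_voltage(bits):
    """Logic 1 is a negative voltage, logic 0 a positive one."""
    return [-12 if b else +12 for b in bits]

f = rs232_frame(0x55)            # 0b01010101: four 1s -> parity bit 0
assert f == [0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1]
assert to_voltage([1, 0]) == [-12, 12]
```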
 Handshaking is a process of dynamically setting the parameters of a communication
between the transmitter and receiver before the communication begins.
 The need for handshaking is dictated by the speed at which the transmitter (DTE) transmits the data, the speed at which the receiver (DCE) receives it, and the rate at which the data is transmitted.
 In an asynchronous data transmission system, there are three options: no handshaking, hardware handshaking, and software handshaking.
 In Hardware Handshaking, the transmitter first asks the receiver whether it is ready to
receive the data. The receiver then checks its buffer and if the buffer is empty, it will
then tell the transmitter that it is ready to receive.
 The transmitter will transmit the data and it is loaded into the receiver buffer. During
this time, the receiver tells the transmitter not to send any further data until the data in
the buffer has been read by the receiver.
 The RS232 Protocol defines four signals for the purpose of Handshaking:
 Request to Send (RTS)
 Clear to Send (CTS)
 Data Terminal Ready (DTR) and
 Data Set Ready (DSR)
 With the help of Hardware Handshaking, the data from the transmitter is never lost or
overwritten in the receiver buffer. When the transmitter (DTE) wants to send data, it
pulls the RTS (Request to Send) line high. Then the transmitter waits for CTS (Clear
to Send) to go high and hence it keeps on monitoring it. If the CTS line is low, it means
that the receiver (DCE) is busy and not yet ready to receive data.
 When the receiver is ready, it pulls the CTS line to high. The transmitter then transmits
the data. This method is also called as RTS/CTS Handshaking.
 Additionally, there are two other wires used in Handshaking. They are DTR (Data
Terminal Ready) and DSR (Data Set Ready). These two signals are used by the DTE
and DCE to indicate their individual status. Often, these two signals are used in modem
communication.
Limitations of RS232
 RS232 Protocol requires a common ground between the transmitter (DTE) and receiver
(DCE). Hence, the reason for shorter cables between DTE and DCE in RS232 Protocol.
 The signal in the line is highly susceptible to noise. The noise can be either internal or
external.
 If there is an increase in baud rate and length of the cable, there is a chance of cross talk
introduced by the capacitance between the cables.
 The voltage levels in RS232 are not compatible with modern TTL or CMOS logics. We
need an external level converter.
Applications
 Though RS232 is a very famous serial communication protocol, it has now largely been replaced by advanced protocols like USB.
 Previously, it was used for serial peripherals like mice and modems.
 But RS232 is still used in some servo controllers, CNC machines, PLCs and some microcontroller boards.

USB Bus Protocol:


 The USB protocol, also known as Universal Serial Bus, was first created and introduced
in 1996 as a way to institutionalize a more widespread, uniform cable and connector
that could be used across a multitude of different devices.
 USB, Universal Serial Bus is very easy to use providing a reliable and effective means
of transferring data. Whether USB 1, USB 2, USB 3 or even USB 4, the data requires
a standardized method of transfer over the USB interface along with a standard format
for the USB data packets.
 Host: The host is the computer or item that acts as the main element or controller for
the USB system. The host has a hub contained within it and this is called the Root Hub.
 Hub: The hub is a device that effectively expands the number of ports available - it
will have one connection, upstream connection, and several downstream. It is possible
to plug one hub into another to expand the capability and connectivity further.
 Port: This is the socket through which access to the USB network is gained. It can be
on a host, or a hub.
 Function: These are the peripherals or items to which the USB link is connected.
Mice, keyboards, Flash memories, etc.
 Device: This term is collectively used for hubs and functions.
 The use of twisted pairs and differential signalling reduces the effects of external interference that may be picked up. It also reduces the effect of any hum loops that could cause issues. Since the signal is carried as the difference between the two lines rather than relative to ground, the effects of hum are significantly reduced.
 The data uses an NRZI (non-return-to-zero inverted) encoding. In terms of operation, when the USB host powers up, it polls each of the slave devices in turn.
 The USB host has address 0, and then assigns addresses to each device as well as
discovering the slave device capabilities in a process called enumeration.
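NRZI encoding can be sketched as follows (USB additionally stuffs a bit after six consecutive 1s, which is omitted in this sketch):

```python
def nrzi_encode(bits, initial_level=1):
    """USB-style NRZI: a logical 0 is sent as a transition of the line
    level, a logical 1 as no transition."""
    level, out = initial_level, []
    for b in bits:
        if b == 0:
            level ^= 1       # toggle the line level on a 0
        out.append(level)
    return out

assert nrzi_encode([0, 1, 1, 0, 0]) == [0, 0, 0, 1, 0]
```

Because long runs of 1s produce no transitions at all, the receiver could lose bit synchronization; that is exactly why USB adds bit stuffing on top of NRZI.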
 Transactions between the host and device comprise a number of packets. As there are
several different types of data that can be sent, a token indicating the type is required,
and sometimes an acknowledgement is also returned.
 Each packet that is sent is preceded by a sync field and followed by an end of packet
marker. This defines the start and end of the packet and also enables the receiving node
to synchronize properly so that the various data elements fall into place.
 There are four basic types of data transaction that can be made within USB.
 Control: This type of data transaction within the overall USB protocol is used by the
host to send commands or query parameters. The packet lengths are defined within the
protocol as 8 bytes for Low speed, 8-64 bytes for Full, and 64 bytes for High Speed
devices.
 Interrupt: The USB protocol defines an interrupt message. This is often used by
devices sending small amounts of data, e.g. mice or keyboards. It is a polled message
from the host which has to request specific data of the remote device
 Bulk: This USB protocol message is used by devices like printers for which much
larger amounts of data are required. In this form of data transfer, variable length blocks
of data are sent or requested by the Host. The maximum length is 64-byte for full speed
Devices or 512 bytes for high speed ones. The data integrity is verified using cyclic
redundancy checking, CRC and an acknowledgement is sent. This USB data transfer
mechanism is not used by time critical peripherals because it utilizes bandwidth not
used by the other mechanisms.
 Isochronous: This form of data transfer is used to stream real-time data, for applications like live audio channels. It does not use any data checking, as there is no time to resend data packets with errors; lost data can be accommodated better than the delays incurred by resending data. Packet sizes can be up to 1024 bytes.
USB data packets
Within the USB system, there are four different types of data packets each used for
different types of data transfer.
 Token Packets: Essentially a Token USB data packet indicates the type of transaction
is to follow.
 Data Packets: The USB data packets carry the payload data, carrying the data as
required.
 Handshake Packets: The handshake packets are used for acknowledging data packets received or for reporting errors, etc.
 Start of Frame Packets: The Start of Frame packets used to indicate the start of a new
frame of data.

Bluetooth:
 Bluetooth is a short-range wireless technology standard that is used for exchanging
data between fixed and mobile devices over short distances using UHF radio waves in
the ISM bands, from 2.402 GHz to 2.48 GHz, and building personal area
networks (PANs), using methods like spread spectrum, frequency hopping and full
duplex signals.
 It was originally conceived as a wireless alternative to RS-232 data cables. It is mainly
used as an alternative to wire connections, to exchange files between nearby portable
devices and connect cell phones and music players with wireless headphones.

 In the most widely used mode, transmission power is limited to 2.5 milliwatts, giving
it a very short range of up to 10 meters (30 feet).
 Frequency-hopping spread spectrum (FHSS) is a method of transmitting radio
signals by rapidly changing the carrier frequency among many distinct frequencies
occupying a large spectral band. The changes are controlled by a code known to
both transmitter and receiver. FHSS is used to avoid interference, to prevent
eavesdropping, and to enable code-division multiple access (CDMA) communications.
 The available frequency band is divided into smaller sub-bands. Signals rapidly change
("hop") their carrier frequencies among the center frequencies of these sub-bands in a
predetermined order. Interference at a specific frequency will only affect the signal
during a short interval
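The shared-code hopping idea can be illustrated with a toy Python model. This is not Bluetooth's actual hop-selection algorithm; it only demonstrates that a shared seed (the "code") makes transmitter and receiver visit the same channels in the same order:

```python
import random

def hop_sequence(channels, seed, length):
    """Pseudo-random hop sequence over the sub-band center frequencies,
    reproducible by anyone who knows the shared seed."""
    rng = random.Random(seed)
    return [rng.choice(channels) for _ in range(length)]

# 79 1-MHz-spaced Bluetooth channels starting at 2402 MHz
channels = [2402 + k for k in range(79)]
tx = hop_sequence(channels, seed=0xB07, length=8)
rx = hop_sequence(channels, seed=0xB07, length=8)
assert tx == rx                            # same code -> same sequence
assert all(2402 <= f <= 2480 for f in tx)  # stays inside the ISM band
```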
 Piconet:
A piconet is a type of Bluetooth network that contains one primary node, called the master node, and up to seven active secondary nodes, called slave nodes. Thus there are a total of 8 active nodes, all within a distance of about 10 meters.
 The communication between the primary and secondary nodes can be one-to-one or one-to-many. Communication is possible only between the master and a slave; slave-to-slave communication is not possible. A piconet can also have up to 255 parked nodes; these are secondary nodes that cannot take part in communication unless they are converted to the active state.
 Scatternet:
 It is formed by combining several piconets. A slave present in one piconet can act as the master (primary) in another piconet. Such a node can receive messages from the master in one piconet and deliver them to its slaves in the other piconet. This type of node is referred to as a bridge node.
 A station cannot be master in two piconets.
The core specifications of Bluetooth consists of 5 layers
 Radio: Radio specifies the requirements for radio transmission – including frequency,
modulation, and power characteristics – for a Bluetooth transceiver.
 Baseband Layer: It defines physical and logical channels and link types (voice or data);
specifies various packet formats, transmit and receive timing, channel control, and the
mechanism for frequency hopping (hop selection) and device addressing. It specifies
point to point or point to multipoint links. The length of a packet can range from 68 bits
(shortened access code) to a maximum of 3071 bits.
 LMP- Link Manager Protocol (LMP): It defines the procedures for link setup and
ongoing link management.
 Logical Link Control and Adaptation Protocol (L2CAP): It is responsible for
adapting upper-layer protocols to the baseband layer.
 Service Discovery Protocol (SDP): – Allows a Bluetooth device to query other
Bluetooth devices for device information, services provided, and the characteristics of
those services.

Advantages:

 Low cost.
 Easy to use.
 It can also penetrate through walls.
 It creates an adhoc connection immediately without any wires.
 It is used for voice and data transfer.
Disadvantages:
 It can be hacked and hence, less secure.
 It has slow data transfer rate: 3 Mbps.
 It has small range: 10 meters.
Zigbee:
 Zigbee is an IEEE 802.15.4-based specification for a suite of high-level
communication protocols used to create personal area networks with small, low-
power digital radios, such as for home automation, medical device data collection, and
other low-power low-bandwidth needs, designed for small scale projects which need
wireless connection.
 The technology defined by the Zigbee specification is intended to be simpler and less
expensive than other wireless personal area networks (WPANs), such as Bluetooth or
more general wireless networking such as Wi-Fi. Applications include wireless light
switches, home energy monitors, traffic management systems, and other consumer and
industrial equipment that requires short-range low-rate wireless data transfer.
 Zigbee was conceived in 1998, standardized in 2003, and revised in 2006. The name
refers to the waggle dance of honey bees after their return to the beehive.
 This communication standard defines the physical and Media Access Control (MAC)
layers to handle many devices at low data rates. Zigbee WPANs operate at 868
MHz, 902–928 MHz, and 2.4 GHz. The data rate of 250 kbps is best suited
for periodic as well as intermittent two-way transmission of data between sensors and
controllers.
 Zigbee works with digital radios, allowing different devices to communicate with
one another. The devices used in this network are routers, a coordinator, and end
devices. Their main function is to deliver instructions and messages from the
coordinator to individual end devices such as a light bulb.
 In this network, the coordinator performs several tasks: it scans the channels to choose
the most suitable one (the one with the least interference), and it allocates a unique
address to every device within the network so that messages and instructions can be
transferred within the network.
 Routers sit between the coordinator and the end devices and are responsible for
routing messages among the various nodes. Routers receive messages from the
coordinator and store them until their end devices are ready to receive them.
Routers can also permit other end devices and routers to join the network.
 In this network, end devices handle small amounts of information by communicating
with a parent node (a router or the coordinator, depending on the Zigbee network type).
End devices do not communicate directly with each other.
 The Zigbee system structure consists of three different types of devices: the Zigbee
coordinator, router, and end device. Every Zigbee network must have at least
one coordinator, which acts as the root and bridge of the network. The coordinator is
responsible for handling and storing information while performing receive and
transmit data operations.
 Zigbee routers act as intermediary devices that pass data to and from other
devices. End devices have limited functionality and communicate only with their parent
nodes, so that battery power is saved. The number of routers, coordinators, and end
devices depends on the network type, such as star, tree, or mesh.
 Physical Layer: This layer performs modulation and demodulation on
transmitted and received signals, respectively, using the frequency bands, data rates,
and channels noted above (868 MHz, 902–928 MHz, and 2.4 GHz).
 MAC Layer: This layer is responsible for reliable transmission of data, accessing
the channel using carrier sense multiple access with collision avoidance (CSMA/CA).
It also transmits beacon frames for synchronizing communication.
 Network Layer: This layer takes care of all network-related operations such as
network setup, end device connection, and disconnection to network, routing, device
configurations, etc.
 Application Support Sub-Layer: This layer enables the services necessary for Zigbee
device objects and application objects to interface with the network layers for data
managing services. This layer is responsible for matching two devices according to
their services and needs.
 Application Framework: It provides two types of data services: key-value pair and
generic message services. The generic message is a developer-defined structure,
whereas the key-value pair is used for getting attributes within the application objects.
The ZDO (Zigbee Device Object) provides an interface between application objects
and the APS layer in Zigbee devices. It is responsible for detecting, initiating, and
binding other devices to the network.
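The MAC layer's CSMA/CA channel access can be sketched roughly as follows. The constants follow the IEEE 802.15.4 defaults (macMinBE = 3, macMaxBE = 5, up to 4 extra backoff attempts), and `channel_is_clear`/`send` are hypothetical callbacks standing in for the radio's clear-channel assessment (CCA) and transmit primitives:

```python
import random

# Simplified unslotted CSMA-CA in the style of the IEEE 802.15.4 MAC.
# Constants follow the 802.15.4 defaults; callbacks are hypothetical.
MIN_BE, MAX_BE, MAX_BACKOFFS = 3, 5, 4

def csma_ca_transmit(channel_is_clear, send, rng=random):
    be = MIN_BE                        # backoff exponent
    for _ in range(MAX_BACKOFFS + 1):
        # Wait a random number of backoff periods in [0, 2^BE - 1].
        delay = rng.randrange(2 ** be)
        # (A real MAC would actually sleep for `delay` backoff periods here.)
        if channel_is_clear():         # clear-channel assessment (CCA)
            send()
            return True                # channel was free: frame transmitted
        be = min(be + 1, MAX_BE)       # channel busy: widen the backoff window
    return False                       # give up: channel access failure
```

Each failed CCA doubles the backoff window (up to the cap), which is what spreads contending nodes out in time and avoids collisions.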
 Zigbee Coordinator
Every network has exactly one coordinator: an FFD (full-function device) acting as
PAN coordinator, which forms the network. Once the network is established, it assigns
network addresses to the devices in the network and routes messages among the end devices.
 Zigbee Router
A Zigbee router is an FFD that extends the range of the Zigbee network and is used
to add more devices to the network. Sometimes it also acts as a Zigbee end
device.
 Zigbee End Device
An end device is neither a router nor a coordinator; it interfaces physically with a
sensor or performs a control operation. Based on the application, it can be either an
RFD (reduced-function device) or an FFD.
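The three device roles above can be modelled as a simple tree, with the coordinator as the root and end devices reaching the network only through a parent (a router or the coordinator). All class and method names here are illustrative simplifications, not a Zigbee stack API:

```python
# Toy model of the Zigbee device roles: coordinator (root), routers
# (intermediaries that may have children), and end devices (leaves that
# communicate only via their parent).

class ZigbeeNode:
    def __init__(self, name, role, parent=None):
        assert role in ("coordinator", "router", "end_device")
        if role == "coordinator":
            assert parent is None                 # coordinator is the root
        else:
            # End devices cannot be parents, so a parent is a router
            # or the coordinator.
            assert parent is not None and parent.role != "end_device"
        self.name, self.role, self.parent = name, role, parent

    def route_to_root(self):
        """Path from this node up to the coordinator, parent by parent."""
        path, node = [self.name], self
        while node.parent is not None:
            node = node.parent
            path.append(node.name)
        return path
```

For example, an end device ("bulb") attached to a router attached to the coordinator routes every message up through that chain, which mirrors the rule that end devices never converse with each other directly.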
Advantages:
 This network has a flexible network structure.
 Battery life is good.
 Power consumption is low.
 Very simple to install.
 It supports approximately 65,000 nodes.
 Low cost.
 It is self-healing as well as more reliable.
 Network setup is very easy and simple.
 Loads are evenly distributed across the network because there is no central
controller.
 Monitoring and controlling home appliances is extremely simple using a remote.
 The network is scalable, and it is easy to add/remove a Zigbee end device to/from the network.
Disadvantages:
 The owner needs knowledge of the system to control Zigbee-based devices.
 Compared with Wi-Fi, it is less secure.
 Replacement cost is high once any issue occurs within Zigbee-based home
appliances.
 The transmission rate of Zigbee is low.
 Only a limited range of end devices is available.
 It is too risky to be used for official private information.
 It is not used as an outdoor wireless communication system because it has limited
coverage.
 Like other types of wireless systems, the Zigbee communication system is prone
to attack by unauthorized people.
Bluetooth vs. Zigbee:
 Frequency range: Bluetooth – 2.4 GHz to 2.483 GHz; Zigbee – 2.4 GHz
 RF channels: Bluetooth – 79; Zigbee – 16
 Modulation: Bluetooth – GFSK; Zigbee – different techniques such as BPSK, QPSK and GFSK
 Nodes: Bluetooth – 8 nodes per piconet; Zigbee – more than 65,000 nodes
 Specification: Bluetooth – IEEE 802.15.1; Zigbee – IEEE 802.15.4
 Radio signal coverage: Bluetooth – up to 10 meters; Zigbee – up to 100 meters
 Time to join a network: Bluetooth – about 3 seconds; Zigbee – about 3 seconds
 Network range: Bluetooth – 1 to 100 meters depending on radio class; Zigbee – up to 70 meters
 Protocol stack size: Bluetooth – about 250 Kbytes; Zigbee – about 28 Kbytes
 Antenna height: both – TX antenna 6 meters, RX antenna 1 meter
 Batteries: Bluetooth – uses rechargeable batteries; Zigbee – does not use rechargeable batteries
Features:
Zigbee :
 Support for multiple network topologies such as point-to-point,
point-to-multipoint and mesh networks
 Low duty cycle – provides long battery life
 Low latency
 Direct Sequence Spread Spectrum (DSSS)
 Up to 65,000 nodes per network
 128-bit AES encryption for secure data connections
 Collision avoidance, retries and acknowledgements
Bluetooth:
 It operates in the 2.4 GHz frequency band, which does not require a license for wireless
communication.
 Data can be transferred in real time over a range of 10–100 meters.
 Close proximity and precise alignment are not required for Bluetooth, unlike infrared
(IrDA) communication devices. Bluetooth does not suffer from interference from
obstacles such as walls, whereas infrared does.
 Bluetooth supports both point-to-point and point-to-multipoint wireless connections
without cables between mobile phones and personal computers.
 The data transfer rate of Bluetooth varies from version to version: 1 Mbps for
version 1.2, and up to 3 Mbps for version 2.0.
CAN :
 The physical layer uses differential transmission on a twisted pair wire
 A non-destructive bit-wise arbitration is used to control access to the bus
 The messages are small (at most eight data bytes) and are protected by a checksum
 There is no explicit address in the messages, instead, each message carries a numeric
value which controls its priority on the bus, and may also serve as an identification of
the contents of the message
 An elaborate error handling scheme that results in retransmitted messages when they
are not properly received
 There are effective means for isolating faults and removing faulty nodes from the bus
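The non-destructive bit-wise arbitration described above can be sketched as follows: the bus acts as a wired-AND, so a dominant bit (0) overrides a recessive bit (1); a node that sends recessive but reads dominant has lost arbitration and backs off, and the node with the numerically lowest identifier (highest priority) wins without its frame being destroyed. This is a simplified model for illustration, not a CAN driver:

```python
# Sketch of CAN non-destructive bit-wise arbitration over 11-bit identifiers.
# The bus is modelled as a wired-AND: dominant (0) beats recessive (1).

def arbitrate(ids, width=11):
    contenders = list(ids)            # nodes currently driving the bus
    for bit in reversed(range(width)):            # identifiers sent MSB-first
        bus = min((i >> bit) & 1 for i in contenders)   # wired-AND bus level
        # A node that transmitted recessive (1) but reads dominant (0)
        # has lost arbitration and stops transmitting.
        contenders = [i for i in contenders if (i >> bit) & 1 == bus]
    assert len(contenders) == 1       # identifiers on a CAN bus are unique
    return contenders[0]              # lowest ID = highest priority wins
```

Because the winner's bit pattern is never corrupted during arbitration, its frame goes out intact with no retransmission needed, which is what "non-destructive" means here.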