ES Notes
Definition of ES.
Block Diagram of ES
Embedded System Architectures: Von-Neumann/Harvard, RISC/CISC, DSP
Characteristics of Embedded Systems: size, performance, flexibility,
maintainability, latency, throughput, correctness, processor power, power
consumption, safety, NRE cost, cost
Classification of Embedded Systems:
Based on Performance of microcontroller: Small scale, Medium scale,
Sophisticated
Based on performance and functional requirements: Real time,
Standalone, Networked, Mobile
History of ES.
Applications of ES
Purpose of ES
Medium Scale Embedded Systems: These systems are usually designed with a
single or a few 16-bit or 32-bit microcontrollers, Digital Signal Processors (DSPs) or
Reduced Instruction Set Computers (RISCs). These systems have both hardware and
software complexities. For the complex software design of a medium-scale embedded
system, the following programming tools are used: RTOS, source-code engineering
tool, simulator, debugger and Integrated Development Environment (IDE). Software
tools also provide solutions to the hardware complexities. An assembler is of little use
as a programming tool here. These systems may also utilize readily available
Application-Specific Standard Products (ASSPs) and IPs for various functions, for
example, bus interfacing, encrypting, deciphering, discrete cosine transformation and
its inverse, TCP/IP protocol stacking and network connecting functions.
Sophisticated Embedded Systems: Sophisticated embedded systems have massive
hardware and software complexities and may require ASIPs, IPs, scalable or
configurable processors and programmable logic arrays (PLAs). They are used for
cutting-edge applications that require hardware and software co-design and integration
in the final system. They are constrained by the processing speeds available in their
hardware units. Certain software functions, such as encryption and deciphering
algorithms, discrete cosine transformation and its inverse, TCP/IP protocol stacking
and network driver functions, are implemented in hardware to obtain additional speed.
Conversely, some functions of the hardware resources in the system are implemented
in software.
Status Register:
Bit 7 – I: Global Interrupt Enable
Bit 6 – T: Bit Copy Storage
The Bit Copy instructions BLD (Bit LoaD) and BST (Bit STore) use the T-bit as source or
destination.
Bit 5 – H: Half Carry Flag
Bit 4 – S: Sign Bit, S = N ⊕ V
Bit 3 – V: Two’s Complement Overflow Flag
Bit 2 – N: Negative Flag
Bit 1 – Z: Zero Flag
Bit 0 – C: Carry Flag
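The I-bit above is the flag manipulated by the standard avr-libc helpers cli() and
sei(). As an illustration, here is a minimal plain-AVR-C sketch (assuming avr-gcc
and the ATmega328P headers; the critical section is a placeholder) that guards a
timing-critical region by clearing and setting the I-bit:

#include <avr/io.h>
#include <avr/interrupt.h>

int main(void) {
    cli();                 // clear the I-bit: globally disable interrupts
    // ... timing-critical code here ...
    sei();                 // set the I-bit: globally enable interrupts again
    uint8_t saved = SREG;  // the whole Status Register can also be read/saved
    (void)saved;
    for (;;) { }
}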
General Purpose Registers:
– Most of the instructions operating on the Register File have direct access to all
registers, and most of them are single cycle instructions.
– Each register is also assigned a data memory address, mapping them directly
into the first 32 locations of the user Data Space.
– Although not physically implemented as SRAM locations, this memory
organization provides great flexibility in access of the registers, as the X-, Y-
and Z-pointer registers can be set to index any register in the file.
– The registers R26..R31 have some added functions beyond their general-purpose
usage: they serve as 16-bit address pointers for indirect addressing of the data
space, forming the three indirect address registers X, Y, and Z.
– In the different addressing modes these address registers provide fixed
displacement, automatic increment, and automatic decrement functions.
– Stack pointer:
– The Stack is mainly used for storing temporary data, for storing local variables
and for storing return addresses after interrupts and subroutine calls.
– The Stack is implemented as growing from higher to lower memory locations.
– The Stack Pointer Register always points to the top of the Stack.
– The Stack Pointer points to the data SRAM Stack area where the Subroutine
and Interrupt Stacks are located.
– A Stack PUSH command will decrease the Stack Pointer.
– The Stack in the data SRAM must be defined by the program before any
subroutine calls are executed or interrupts are enabled.
– The initial Stack Pointer value equals the last address of the internal SRAM, and
the Stack Pointer must be set to point above the start of the SRAM.
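On AVR-GCC the C runtime performs this initialization before main() runs, but a
bare-metal program must do it itself. A minimal sketch of the idea, using the SP
and RAMEND symbols from the device header:

#include <avr/io.h>

// Point the Stack Pointer at the last internal SRAM address (RAMEND).
// The avr-libc startup code normally does exactly this before main().
void init_stack(void) {
    SP = RAMEND;
}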
– Interrupt Handling:
– Global Interrupt Enable bit
– Depending on the Program Counter value, interrupts may be automatically
disabled when Boot Lock bits BLB02 or BLB12 are programmed. This feature
improves software security.
– When an interrupt occurs, the Global Interrupt Enable I-bit is cleared and all
interrupts are disabled.
– The user software can write logic one to the I-bit to enable nested interrupts. All
enabled interrupts can then interrupt the current interrupt routine. The I-bit is
automatically set when a Return from Interrupt instruction – RETI – is executed.
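As an illustration of this behaviour, a minimal avr-libc sketch of the nesting
pattern (the vector name assumes an ATmega328P Timer0 overflow; the handler body
is a placeholder):

#include <avr/interrupt.h>

ISR(TIMER0_OVF_vect) {     // entered with the I-bit cleared by hardware
    sei();                 // write logic one to the I-bit: nesting allowed
    // ... longer, lower-priority work that other ISRs may now preempt ...
}                          // the compiler-generated RETI sets the I-bit again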
– AVR Memories :
– The ATmega328P contains 32K bytes On-chip In-System Reprogrammable
Flash memory for program storage.
– Since all AVR instructions are 16 or 32 bits wide, the Flash is organized as 16K
x 16.
– For software security, the Flash Program memory space is divided into two
sections, Boot Loader Section and Application Program Section
– The Flash memory has an endurance of at least 10,000 write/erase cycles.
– The first 2304 data memory locations (addresses 0x0000 to 0x08FF) address the
Register File, the I/O memory, the Extended I/O memory, and the internal data
SRAM.
– The first 32 locations address the Register File, the next 64 locations the standard
I/O memory, then 160 locations of Extended I/O memory, and the next 2048
locations address the internal data SRAM.
– The five different addressing modes for the data memory cover: Direct, Indirect
with Displacement, Indirect, Indirect with Pre-decrement, and Indirect with
Post-increment.
– In the Register File, registers R26 to R31 feature the indirect addressing pointer
registers.
– The direct addressing reaches the entire data space.
– The Indirect with Displacement mode reaches 63 address locations from the
base address given by the Y- or Z-register.
– When using register indirect addressing modes with automatic pre-decrement
and post-increment, the address registers X, Y, and Z are decremented or
incremented.
– The ATmega328P contains 1K bytes of data EEPROM memory.
– It is organized as a separate data space, in which single bytes can be read and
written.
– The EEPROM has an endurance of at least 100,000 write/erase cycles.
– I/O memory: 64 bytes
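The data EEPROM described above can be exercised from a sketch through the
standard Arduino EEPROM library; a minimal example (address 0 and the value 42 are
arbitrary choices):

#include <EEPROM.h>

void setup() {
    Serial.begin(9600);
    EEPROM.update(0, 42);          // writes only if the byte differs, which
                                   // spares the ~100,000-cycle endurance
    byte stored = EEPROM.read(0);  // single-byte read from the EEPROM space
    Serial.println(stored);
}

void loop() {}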
– BOD:
– When the Brown-out Detector (BOD) is enabled by BODLEVEL fuses, the
BOD is actively monitoring the power supply voltage during a sleep period. To
save power, it is possible to disable the BOD by software for some of the sleep
modes. The sleep mode power consumption will then be at the same level as
when BOD is globally disabled by fuses. If BOD is disabled in software, the
BOD function is turned off immediately after entering the sleep mode. Upon
wake-up from sleep, BOD is automatically enabled again. This ensures safe
operation in case the VCC level has dropped during the sleep period.
– Idle Mode:
– Stops the CPU while allowing the SPI, USART, Analog Comparator, ADC, 2-wire
Serial Interface, Timer/Counters, Watchdog, and the interrupt system to
continue operating. This sleep mode basically halts clkCPU and clkFLASH,
while allowing the other clocks to run.
– Idle mode enables the MCU to wake up from externally triggered interrupts as
well as internal ones like the Timer Overflow and USART Transmit Complete
interrupts.
– Power Down Mode:
– In this mode, the external Oscillator is stopped, while the external interrupts,
the 2-wire Serial Interface address watch, and the Watchdog continue operating
(if enabled). Only an External Reset, a Watchdog System Reset, a Watchdog
Interrupt, a Brown-out Reset, a 2-wire Serial Interface address match, an
external level interrupt on INT0 or INT1, or a pin change interrupt can wake up
the MCU. This sleep mode basically halts all generated clocks, allowing
operation of asynchronous modules only.
– Power Save Mode:
– This mode is identical to Power-down, with one exception: If Timer/Counter2
is enabled, it will keep running during sleep. The device can wake up from either
Timer Overflow or Output Compare event from Timer/Counter2 if the
corresponding Timer/Counter2 interrupt enable bits are set in TIMSK2, and the
Global Interrupt Enable bit in SREG is set.
Power Management and Sleep Modes:
• ADC Noise Reduction Mode:
– ADC Noise Reduction mode stops the CPU but allows the ADC, the external
interrupts, the 2-wire Serial Interface address watch, Timer/Counter2, and
the Watchdog to continue operating (if enabled). This sleep mode basically halts
clkI/O, clkCPU, and clkFLASH, while allowing the other clocks to run.
• Standby Mode:
– This mode is identical to Power-down with the exception that the Oscillator is
kept running.
• Extended Standby Mode:
– This mode is identical to Power-save with the exception that the Oscillator is
kept running.
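These modes can be entered from a sketch with avr-libc's sleep API; a minimal
Power-down example (configuring a concrete wake-up source, such as INT0 or the
Watchdog, is assumed and left out):

#include <avr/sleep.h>
#include <avr/interrupt.h>

void setup() {
    // ... configure a wake-up source such as INT0 or the Watchdog here ...
    set_sleep_mode(SLEEP_MODE_PWR_DOWN);  // pick one of the modes above
    sei();                                // wake-up requires interrupts enabled
    sleep_enable();
    sleep_cpu();                          // execution stops here until wake-up
    sleep_disable();                      // reached after the wake-up interrupt
}

void loop() {}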
Pin No. | Pin Name | Description | Secondary Function
1 | PC6 (RESET) | Pin 6 of PORTC | By default used as the RESET pin; PC6 can be used as an I/O pin only when the RSTDISBL fuse is programmed.
2 | PD0 (RXD) | Pin 0 of PORTD | RXD (data input pin for the USART serial communication interface).
3 | PD1 (TXD) | Pin 1 of PORTD | TXD (data output pin for the USART serial communication interface); INT2 (External Interrupt 2 input).
22 | GND | GROUND |
So what is Arduino?
• Based on “Wiring Platform”
• Open-source hardware platform
• Open source development environment
– Easy-to-learn language and libraries (based on the Wiring language)
– Integrated development environment (based on Processing programming environment)
– Available for Windows / Mac / Linux
pinMode():
Description: Configures the specified pin to behave either as an input or an output. As of
Arduino 1.0.1, it is possible to enable the internal pullup resistors with the mode
INPUT_PULLUP. Additionally, the INPUT mode explicitly disables the internal
pullups.
Syntax: pinMode(pin, mode)
Parameters:
pin: the number of the pin whose mode you wish to set
mode: INPUT, OUTPUT, or INPUT_PULLUP.
Returns: None
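A minimal setup() exercising all three modes (the pin numbers are arbitrary
choices):

void setup() {
    pinMode(13, OUTPUT);        // drive an output such as an LED
    pinMode(2, INPUT);          // plain high-impedance input
    pinMode(3, INPUT_PULLUP);   // input with the internal pull-up enabled
}

void loop() {}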
digitalWrite():
Description: Write a HIGH or a LOW value to a digital pin. If the pin has been configured
as an OUTPUT with pinMode(), its voltage will be set to the corresponding value: 5V
(or 3.3V on 3.3V boards) for HIGH, 0V (ground) for LOW.
If the pin is configured as an INPUT, digitalWrite() will enable (HIGH) or disable (LOW)
the internal pullup on the input pin. It is recommended to set the pinMode() to
INPUT_PULLUP to enable the internal pull-up resistor.
NOTE: If you do not set the pinMode() to OUTPUT, and connect an LED to a pin, when
calling digitalWrite(HIGH), the LED may appear dim. Without explicitly setting
pinMode(), digitalWrite() will have enabled the internal pull-up resistor, which acts like
a large current-limiting resistor.
Syntax: digitalWrite(pin, value)
Parameters:
pin: the pin number
value: HIGH or LOW
Returns: none
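A minimal blink sketch tying pinMode() and digitalWrite() together (pin 13, the
Uno's built-in LED, is an assumption):

void setup() {
    pinMode(13, OUTPUT);      // per the NOTE above, set the mode first
}

void loop() {
    digitalWrite(13, HIGH);   // LED on: pin driven to 5V (or 3.3V)
    delay(1000);
    digitalWrite(13, LOW);    // LED off: pin driven to 0V
    delay(1000);
}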
digitalRead():
Description: Reads the value from a specified digital pin, either HIGH or LOW.
Syntax: digitalRead(pin)
Parameters:
pin: the number of the digital pin you want to read (int)
Returns: HIGH or LOW
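A minimal sketch echoing a push button to an LED (the pins are arbitrary; the
button is assumed wired between pin 2 and GND, so INPUT_PULLUP reads LOW while it
is pressed):

void setup() {
    pinMode(2, INPUT_PULLUP);   // HIGH when released, LOW when pressed
    pinMode(13, OUTPUT);
}

void loop() {
    int state = digitalRead(2);
    digitalWrite(13, state == LOW ? HIGH : LOW);   // LED on while pressed
}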
analogReference():
Description: Configures the reference voltage used for analog input (i.e. the value used
as the top of the input range). The options are:
– DEFAULT: the default analog reference of 5 volts (on 5V Arduino boards) or 3.3 volts
(on 3.3V Arduino boards)
– INTERNAL: a built-in reference, equal to 1.1 volts on the ATmega168 or ATmega328
and 2.56 volts on the ATmega8 (not available on the Arduino Mega)
– INTERNAL1V1: a built-in 1.1V reference (Arduino Mega only)
– INTERNAL2V56: a built-in 2.56V reference (Arduino Mega only)
– EXTERNAL: the voltage applied to the AREF pin (0 to 5V only) is used as the reference.
Syntax: analogReference(type)
Parameters:
type: which type of reference to use (DEFAULT, INTERNAL, INTERNAL1V1,
INTERNAL2V56, or EXTERNAL).
Returns: None.
analogWrite():
Description: Writes an analog value (PWM wave) to a pin. Can be used to light an LED
at varying brightnesses or drive a motor at various speeds. After a call to analogWrite(),
the pin will generate a steady square wave of the specified duty cycle until the next call
to analogWrite() (or a call to digitalRead() or digitalWrite() on the same pin). The
frequency of the PWM signal on most pins is approximately 490 Hz. On the Uno and
similar boards, pins 5 and 6 have a frequency of approximately 980 Hz.
You do not need to call pinMode() to set the pin as an output before calling analogWrite().
The analogWrite function has nothing to do with the analog pins or the analogRead
function.
Syntax: analogWrite(pin, value)
Parameters:
pin: the pin to write to.
value: the duty cycle: between 0 (always off) and 255 (always on).
Returns: nothing
Notes and Known Issues
The PWM outputs generated on pins 5 and 6 will have higher-than-expected duty cycles.
This is because of interactions with the millis() and delay() functions, which share the
same internal timer used to generate those PWM outputs.
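A minimal fade sketch (pin 9 is one of the Uno's PWM pins; the step size and delay
are arbitrary):

void setup() {}   // no pinMode() call is needed before analogWrite()

void loop() {
    for (int duty = 0; duty <= 255; duty += 5) {
        analogWrite(9, duty);   // duty cycle from 0 (off) to 255 (fully on)
        delay(30);
    }
}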
analogRead():
Description: Reads the value from the specified analog pin. The Arduino board contains
a 6 channel (8 channels on the Mini and Nano, 16 on the Mega), 10-bit analog to digital
converter. This means that it will map input voltages between 0 and 5 volts into integer
values between 0 and 1023. This yields a resolution between readings of 5 volts / 1024
units, or 0.0049 volts (4.9 mV) per unit. The input range and resolution can be changed
using analogReference().
It takes about 100 microseconds (0.0001 s) to read an analog input, so the maximum
reading rate is about 10,000 times a second.
Syntax: analogRead(pin)
Parameters:
pin: the number of the analog input pin to read from (0 to 5 on most boards, 0 to 7 on the
Mini and Nano, 0 to 15 on the Mega)
Returns: int (0 to 1023)
Note:
If the analog input pin is not connected to anything, the value returned by analogRead()
will fluctuate based on a number of factors.
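A minimal sketch converting the raw reading on A0 to volts, assuming the default
5 V reference of an Uno-class board:

void setup() {
    Serial.begin(9600);
}

void loop() {
    int raw = analogRead(A0);             // integer in 0..1023
    float volts = raw * (5.0 / 1024.0);   // about 4.9 mV per unit
    Serial.println(volts);
    delay(200);
}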
shiftOut():
Description: Shifts out a byte of data one bit at a time. Starts from either the most (i.e.
the leftmost) or least (rightmost) significant bit. Each bit is written in turn to a data pin,
after which a clock pin is pulsed (taken high, then low) to indicate that the bit is available.
Note: if you're interfacing with a device that's clocked by rising edges, you'll need to
make sure that the clock pin is low before the call to shiftOut(), e.g. with a call to
digitalWrite (clockPin, LOW).
Syntax: shiftOut(dataPin, clockPin, bitOrder, value)
Parameters
dataPin: the pin on which to output each bit (int)
clockPin: the pin to toggle once the dataPin has been set to the correct value (int)
bitOrder: which order to shift out the bits; either MSBFIRST or LSBFIRST.
value: the data to shift out. (byte)
Returns: None
Note
The dataPin and clockPin must already be configured as outputs by a call to pinMode().
shiftOut is currently written to output 1 byte (8 bits) so it requires a two step operation to
output values larger than 255.
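A minimal sketch clocking one byte into a 74HC595-style shift register (the wiring
is an assumption: data on pin 11, clock on pin 12, latch on pin 8; the byte 0xA5 is
arbitrary):

int dataPin = 11, clockPin = 12, latchPin = 8;

void setup() {
    pinMode(dataPin, OUTPUT);    // both pins must be outputs (see note above)
    pinMode(clockPin, OUTPUT);
    pinMode(latchPin, OUTPUT);
}

void loop() {
    digitalWrite(latchPin, LOW);
    shiftOut(dataPin, clockPin, MSBFIRST, 0xA5);  // send one byte, MSB first
    digitalWrite(latchPin, HIGH);                 // latch the outputs
    delay(500);
}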
shiftIn():
Description: Shifts in a byte of data one bit at a time. Starts from either the most (i.e. the
leftmost) or least (rightmost) significant bit. For each bit, the clock pin is pulled high, the
next bit is read from the data line, and then the clock pin is taken low.
If you're interfacing with a device that's clocked by rising edges, you'll need to make sure
that the clock pin is low before the first call to shiftIn(), e.g. with a call to
digitalWrite(clockPin, LOW).
Syntax:
byte incoming = shiftIn(dataPin, clockPin, bitOrder)
Parameters
dataPin: the pin on which to input each bit (int)
clockPin: the pin to toggle to signal a read from dataPin
bitOrder: which order to shift in the bits; either MSBFIRST or LSBFIRST.
(Most Significant Bit First, or, Least Significant Bit First)
Returns : the value read (byte)
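A minimal sketch reading one byte from a 74HC165-style shift register (an assumed
wiring: data on pin 11, clock on pin 12; the parallel-load step of such a device is
omitted here):

int dataPin = 11, clockPin = 12;

void setup() {
    Serial.begin(9600);
    pinMode(dataPin, INPUT);
    pinMode(clockPin, OUTPUT);
    digitalWrite(clockPin, LOW);   // clock must idle low (see note above)
}

void loop() {
    byte incoming = shiftIn(dataPin, clockPin, MSBFIRST);
    Serial.println(incoming, BIN);
    delay(500);
}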
pulseIn():
Description: Reads a pulse (either HIGH or LOW) on a pin. For example, if value is
HIGH, pulseIn() waits for the pin to go HIGH, starts timing, then waits for the pin to go
LOW and stops timing. Returns the length of the pulse in microseconds or 0 if no
complete pulse was received within the timeout.
The timing of this function has been determined empirically and will probably show
errors in shorter pulses. Works on pulses from 10 microseconds to 3 minutes in length.
Please also note that if the pin is already high when the function is called, it will wait for
the pin to go LOW and then HIGH before it starts counting. This routine can be used
only if interrupts are activated. Furthermore the highest resolution is obtained with short
intervals.
Syntax:
pulseIn(pin,value)
pulseIn(pin, value, timeout)
Parameters:
pin: the number of the pin on which you want to read the pulse. (int)
value: type of pulse to read: either HIGH or LOW. (int)
timeout (optional): the number of microseconds to wait for the pulse to be completed:
the function returns 0 if no complete pulse was received within the timeout. Default is
one second (unsigned long).
Returns: the length of the pulse (in microseconds) or 0 if no pulse is completed before
the timeout (unsigned long)
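A minimal sketch that measures the width of a HIGH pulse on pin 7 (the pin choice
and the one-second timeout are arbitrary assumptions):

int pulsePin = 7;

void setup() {
    Serial.begin(9600);
    pinMode(pulsePin, INPUT);
}

void loop() {
    unsigned long duration = pulseIn(pulsePin, HIGH, 1000000UL); // µs; 0 on timeout
    Serial.println(duration);
}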
millis():
Description: Returns the number of milliseconds since the Arduino board began running
the current program. This number will overflow (go back to zero), after approximately
50 days.
Parameters: None
Returns: Number of milliseconds since the program started (unsigned long)
Note: Please note that the return value of millis() is an unsigned long; logic errors may
occur if a programmer tries to do arithmetic with smaller data types such as int. Even a
signed long may encounter errors as its maximum value is half that of its unsigned
counterpart.
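A minimal non-blocking blink built on millis() (pin 13 and the 1000 ms interval are
arbitrary choices); the unsigned subtraction keeps working correctly across the
overflow mentioned above:

const unsigned long interval = 1000;   // ms
unsigned long previousMillis = 0;
int ledState = LOW;

void setup() {
    pinMode(13, OUTPUT);
}

void loop() {
    unsigned long now = millis();
    if (now - previousMillis >= interval) {   // unsigned math survives rollover
        previousMillis = now;
        ledState = (ledState == LOW) ? HIGH : LOW;
        digitalWrite(13, ledState);
    }
}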
micros():
Description: Returns the number of microseconds since the Arduino board began running
the current program. This number will overflow (go back to zero), after approximately
70 minutes. On 16 MHz Arduino boards (e.g. Duemilanove and Nano), this function has
a resolution of four microseconds (i.e. the value returned is always a multiple of four).
On 8 MHz Arduino boards (e.g. the LilyPad), this function has a resolution of eight
microseconds.
Note: there are 1,000 microseconds in a millisecond and 1,000,000 microseconds in a
second.
Parameters: None
Returns: Number of microseconds since the program started (unsigned long)
delay()/delayMicroseconds():
Description: Pauses the program for the amount of time (in milliseconds/microseconds)
specified as parameter.
Syntax: delay(ms) / delayMicroseconds(us)
Parameters: ms: the number of milliseconds to pause (unsigned long) OR
us: the number of microseconds to pause (unsigned int)
Returns: nothing
Serial.println():
Description: Prints data to the serial port as human-readable ASCII text followed by a
carriage return character (ASCII 13, or '\r') and a newline character (ASCII 10, or '\n').
This command takes the same forms as Serial.print().
Syntax:
Serial.println(val)
Serial.println(val, format)
Parameters:
Serial: serial port object.
val: the value to print. Allowed data types: any data type.
format: specifies the number base (for integral data types) or number of decimal places
(for floating point types).
Returns: println() returns the number of bytes written, though reading that number is
optional. Data type: size_t.
if(Serial):
Description:
Indicates if the specified Serial port is ready. On the boards with native USB, if
(Serial) indicates whether or not the USB CDC serial connection is open. For all other
boards, and the non-USB CDC ports, this will always return true.
Syntax: if (Serial)
Parameters: None
Returns: Returns true if the specified serial port is available. Data type: bool.
Serial.available():
Description: Get the number of bytes (characters) available for reading from the serial
port. This is data that’s already arrived and stored in the serial receive buffer (which
holds 64 bytes).
Syntax: Serial.available()
Parameters:
Serial: serial port object.
Returns: The number of bytes available to read.
Serial.print():
Description: Prints data to the serial port as human-readable ASCII text. This command
can take many forms. Numbers are printed using an ASCII character for each digit. Floats
are similarly printed as ASCII digits, defaulting to two decimal places. Bytes are sent as
a single character. Characters and strings are sent as is. For example-
Serial.print(78) gives "78" OR Serial.print(1.23456) gives "1.23" OR
Serial.print('N') gives "N" OR Serial.print("Hello world.") gives "Hello world."
An optional second parameter specifies the base (format) to use; permitted values
are BIN(binary, or base 2), OCT(octal, or base 8), DEC(decimal, or base
10), HEX(hexadecimal, or base 16). For floating point numbers, this parameter
specifies the number of decimal places to use. For example-
Serial.print(78, BIN) gives "1001110" OR Serial.print(78, OCT) gives "116" OR
Serial.print(78, DEC) gives "78" OR Serial.print(78, HEX) gives "4E" OR
Serial.print(1.23456, 0) gives "1" OR Serial.print(1.23456, 2) gives "1.23" OR
Serial.print(1.23456, 4) gives "1.2346"
You can pass flash-memory based strings to Serial.print() by wrapping them with F().
For example:
Serial.print(F("Hello World"))
To send data without conversion to its representation as characters, use Serial.write().
Syntax:
Serial.print(val)
Serial.print(val, format)
Parameters:
Serial: serial port object.
val: the value to print. Allowed data types: any data type.
Returns: print() returns the number of bytes written, though reading that number is
optional. Data type: size_t.
Serial.write():
Description: Writes binary data to the serial port. This data is sent as a byte or series of
bytes; to send the characters representing the digits of a number use the print() function
instead.
Syntax:
Serial.write(val)
Serial.write(str)
Serial.write(buf, len)
Parameters:
Serial: serial port object.
val: a value to send as a single byte.
str: a string to send as a series of bytes.
buf: an array to send as a series of bytes.
len: the number of bytes to be sent from the array.
Returns: write() will return the number of bytes written, though reading that number is
optional. Data type: size_t.
Serial.read():
Description: Reads incoming serial data.
Syntax: Serial.read()
Parameters:
Serial: serial port object.
Returns: The first byte of incoming serial data available (or -1 if no data is available).
Data type: int.
Serial.begin():
Description: Sets the data rate in bits per second (baud) for serial data transmission. For
communicating with Serial Monitor, make sure to use one of the baud rates listed in the
menu at the bottom right corner of its screen. You can, however, specify other rates -
for example, to communicate over pins 0 and 1 with a component that requires a
particular baud rate.
An optional second argument configures the data, parity, and stop bits. The default is
8 data bits, no parity, one stop bit.
Syntax:
Serial.begin(speed)
Serial.begin(speed, config)
Parameters:
Serial: serial port object.
speed: in bits per second (baud). Allowed data types: long.
config: sets data, parity, and stop bits, e.g. SERIAL_8N1
Returns: Nothing
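A minimal echo sketch tying the Serial calls above together: begin() opens the
port, available() checks the receive buffer, read() pulls one byte, and print()/
println() send data back (9600 baud is an arbitrary choice):

void setup() {
    Serial.begin(9600);   // 8 data bits, no parity, 1 stop bit by default
}

void loop() {
    if (Serial.available() > 0) {
        int inByte = Serial.read();   // never -1 here, since we checked first
        Serial.print("Received: ");
        Serial.println(inByte, DEC);
    }
}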
sq():
Description: Calculates the square of a number: the number multiplied by itself.
Syntax: sq(x)
Parameters:
x: the number. Allowed data types: any data type.
Returns: The square of the number. Data type: double.
Notes and Warnings:
Because of the way the sq() function is implemented (as a macro, so its argument is
evaluated more than once), avoid using other functions inside the brackets; it may
lead to incorrect results.
This code will yield incorrect results:
int inputSquared = sq(Serial.parseInt()); // avoid this
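The usual fix is to evaluate the expression once into a variable and square that; a
minimal sketch of the pattern:

int input = Serial.parseInt();   // call the function exactly once
int inputSquared = sq(input);    // the macro now squares a plain variable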
sqrt():
Description: Calculates the square root of a number.
Syntax: sqrt(x)
Parameters:
x: the number. Allowed data types: any data type.
Returns: The number’s square root.
Data type: double.
pow():
Description: Calculates the value of a number raised to a power. pow() can be used to
raise a number to a fractional power. This is useful for generating exponential mapping
of values or curves.
Syntax: pow(base, exponent)
Parameters:
base: the number. Allowed data types: float.
exponent: the power to which the base is raised. Allowed data types: float.
Returns: The result of the exponentiation. Data type: double.
map():
Description: Re-maps a number from one range to another. That is, a value
of fromLow would get mapped to toLow, a value of fromHigh to toHigh, values in-
between to values in-between, etc. Does not constrain values to within the range,
because out-of-range values are sometimes intended and useful.
The constrain() function may be used either before or after this function, if limits to the
ranges are desired.
Note that the "lower bounds" of either range may be larger or smaller than the "upper
bounds" so the map() function may be used to reverse a range of numbers, for example
y = map(x, 1, 50, 50, 1);
The function also handles negative numbers well, so that this example y = map(x, 1,
50, 50, -100); is also valid.
The map() function uses integer math so will not generate fractions, when the math
might indicate that it should do so. Fractional remainders are truncated, and are not
rounded or averaged.
Syntax: map(value, fromLow, fromHigh, toLow, toHigh)
Parameters:
value: the number to map.
fromLow: the lower bound of the value’s current range.
fromHigh: the upper bound of the value’s current range.
toLow: the lower bound of the value’s target range.
toHigh: the upper bound of the value’s target range.
Returns: The mapped value.
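A minimal example of typical map() usage: rescaling a 10-bit ADC reading (0..1023)
to the 8-bit PWM range (0..255); the pins A0 and 9 are arbitrary Uno choices:

void setup() {}

void loop() {
    int val = analogRead(A0);          // 0..1023
    val = map(val, 0, 1023, 0, 255);   // rescale to 0..255
    analogWrite(9, val);               // pin 9 supports PWM on the Uno
}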
Arduino Bootloader:
• Almost all microcontroller (and microprocessor) development systems use some form
of a bootloader; the Arduino bootloader, often mistakenly called firmware, is one
example. Since it is a rather popular platform, let's use it as an example and talk about
what a bootloader does and how it works.
• When a microcontroller turns on, it only knows how to do one thing: run an
instruction found at a specific memory location. Often this location is address 0x0000,
but not always. Usually, this memory location will contain a jump instruction to
another place in memory, which is the start of the user program. The bootloader,
however, exists in a slightly separate memory space from the user program.
• On power-up or reset, a bootloader is a section of program memory that runs before
the main code. It can be used to set up the microcontroller or provide a limited ability
to update the main program's code.
The idea behind the Von Neumann architecture is the ability to store instructions in
memory along with the data on which the instructions operate. In short, the Von
Neumann architecture refers to a general framework that a computer's hardware,
programming, and data should follow.
The Von Neumann architecture consists of three distinct components: a central
processing unit (CPU), memory unit, and input/output (I/O) interfaces. The CPU is the
heart of the computer system that consists of three main components: the Arithmetic
and Logic Unit (ALU), the control unit (CU), and registers.
The Von Neumann bottleneck occurs when data taken in or out of memory must wait
while the current memory operation is completed. That is, if the processor just
completed a computation and is ready to perform the next, it has to write the finished
computation into memory (which occupies the bus) before it can fetch new data out of
memory (which also uses the bus). The Von Neumann bottleneck has grown worse over
time because processors have improved in speed while memory has not progressed as
fast. Some techniques to reduce the impact of the bottleneck are caching to minimize
data movement, hardware acceleration, and speculative execution.
Harvard Architecture:
The Harvard architecture stores machine instructions and data in separate memory units
that are connected by different busses.
This architecture has data storage entirely contained within the CPU, and there is no
access to the instruction storage as data. Computers have separate memory areas for
program instructions and data using internal data buses, allowing simultaneous access
to both instructions and data.
Programs needed to be loaded by an operator; the processor could not boot itself. In a
Harvard architecture, there is no need to make the two memories share properties.
In this case, there are at least two memory address spaces to work with, so there is a
memory register for machine instructions and another memory register for data.
Computers designed with the Harvard architecture are able to run a program and access
data independently, and therefore simultaneously.
Harvard architecture has a strict separation between data and code. Thus, Harvard
architecture is more complicated but separate pipelines remove the bottleneck that Von
Neumann creates.
Von Neumann architecture | Harvard architecture
Single memory is shared by both code and data. | Separate memories for code and data.
The processor needs to fetch code in one clock cycle and data in another, so it requires two clock cycles. | A single clock cycle is sufficient, as separate buses are used to access code and data.
Slower in speed, thus more time-consuming. | Higher speed, thus less time-consuming.
There is a common bus for data and instruction transfer. | Separate buses are used for transferring data and instructions.
The CPU cannot access instructions and read/write data at the same time. | The CPU can access instructions and read/write data at the same time.
Used in personal computers and small computers. | Used in microcontrollers and signal processing.
Reduced Instruction Set Architecture (RISC) –
A reduced instruction set computer is a computer that uses only simple instructions,
each achieving a low-level operation within a single clock cycle, as the name
"Reduced Instruction Set" suggests.
The term RISC stands for "Reduced Instruction Set Computer". It is a CPU design
strategy based on simple instructions that execute fast.
RISC uses a small or reduced set of instructions. Every instruction is expected to
perform a very small job. The instruction sets are modest and simple, and more
complex operations are composed from them. Instructions are of similar length and
are strung together to get compound tasks done. Most instructions complete in one
machine cycle; pipelining is a crucial technique used to speed up RISC machines.
The main idea is to make the hardware simpler by using an instruction set composed
of a few basic steps for loading, evaluating, and storing operations: a load
instruction loads data and a store instruction stores data.
RISC reduces the cycles per instruction at the cost of the number of instructions per
program.
Examples: ARM, PA-RISC, Power Architecture, Alpha, AVR, ARC and SPARC.
Characteristics of RISC –
Simpler instructions, hence simple instruction decoding.
Instructions fit within one word.
An instruction takes a single clock cycle to execute.
A larger number of general-purpose registers.
Simple addressing modes.
Fewer data types.
Pipelining is readily achieved.
Characteristics of CISC –
Complex instructions, hence complex instruction decoding.
Instructions can be larger than one word.
An instruction may take more than a single clock cycle to execute.
Fewer general-purpose registers, as operations can be performed in memory
itself.
Complex addressing modes.
More data types.
RISC | CISC
Can perform only register-to-register arithmetic operations. | Can perform REG-to-REG, REG-to-MEM or MEM-to-MEM operations.
An instruction executes in a single clock cycle. | An instruction may take more than one clock cycle.
DSP processors:
These are microprocessors designed for efficient mathematical manipulation of digital
signals. DSPs evolved from Analog Signal Processors (ASPs), which used analog
hardware to transform physical signals.
The shift from ASP to DSP happened because:
DSP is insensitive to the environment.
DSP performance is identical even with variations in components; two analog
systems behave differently even if built with the same components having 1%
tolerances.
DSPs tend to run one program, not many programs. Hence OSes are much simpler:
there is no virtual memory or protection, etc.
DSPs usually run applications with hard real-time constraints:
You must account for anything that could happen in a time slot.
All possible interrupts or exceptions must be accounted for and their collective
time subtracted from the available time interval.
Therefore, exceptions are undesirable.
DSPs usually process infinite continuous data streams.
The design of DSP architectures and ISAs is driven by the requirements of DSP
algorithms.
ENDIAN:
Endianness is a term that describes the order in which a sequence of bytes is stored in
computer memory. Endianness can be either big or little, with the adjectives referring
to which value is stored first.
Big-endian is an order in which the "big end" (most significant value in the sequence)
is stored first (at the lowest storage address). For example, in a big-endian computer,
the two bytes required for the hexadecimal number 4F52 would be stored as 4F52 in
storage (if 4F is stored at storage address 1000, for example, 52 will be at address 1001).
Little-endian is an order in which the "little end" (least significant value in the
sequence) is stored first. For example, in a little-endian computer, the two bytes
required for the hexadecimal number 4F52 would be stored as 524F (52 at address
1000, 4F at 1001).
For people who use languages that read left-to-right, big-endian seems like the natural
way to think of storing a string of characters or numbers, in the same order you expect
to see it presented to you. Many of us would thus think of big-endian as storing
something in forward fashion, just as we read.
Note that within both big-endian and little-endian byte orders, the bits within each byte
are big-endian. That is, there is no attempt to be big- or little-endian about the entire bit
stream represented by a given number of stored bytes. For example, whether
hexadecimal 4F is put in storage first or last with other bytes in a given storage address
range, the bit order within the byte will be:
01001111
It is possible to be big-endian or little-endian about the bit order, but CPUs and
programs are almost always designed for a big-endian bit order. In data transmission,
however, it is possible to have either bit order.
Experts observe that Internet domain name addresses and e-mail addresses are little-
endian. For example, a big-endian version of our domain name address would be:
com.whatis.www
IBM's 370 mainframes, most RISC-based computers, and Motorola
microprocessors use the big-endian approach. TCP/IP also uses the big-endian
approach (and thus big-endian is sometimes called network order).
On the other hand, Intel processors (CPUs) and DEC Alphas and at least some
programs that run on them are little-endian.
The diagram shows the different storage formats for big and little endian double words,
words, half words and bytes.
The most significant byte in each pair is shaded to highlight its position.
Note that there is no difference when storing individual bytes
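A minimal Arduino-style sketch testing the host CPU's byte order, using the 0x4F52
example from the text (the AVR itself is little-endian):

void setup() {
    Serial.begin(9600);
    uint16_t value = 0x4F52;
    byte *bytes = (byte *)&value;
    // On a little-endian CPU the byte at the lowest address is 0x52;
    // on a big-endian CPU it would be 0x4F.
    Serial.println(bytes[0] == 0x52 ? "little-endian" : "big-endian");
}

void loop() {}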
ASIC: Application Specific IC:
An application-specific integrated circuit (ASIC) is a kind of integrated circuit that is
specially built for a specific application or purpose. Compared to a programmable logic
device or a standard logic integrated circuit, an ASIC can improve speed because it is
specifically designed to do one thing and it does this one thing well. It can also be made
smaller and use less electricity.
The disadvantage of this circuit is that it can be more expensive to design and
manufacture, particularly if only a few units are needed.
An ASIC can be found in almost any electronic device and its uses can range from
custom rendering of images to sound conversion.
Because ASICs are all custom-made and thus only available to the company that
designed them, they are considered to be proprietary technology.
ASIC: Types
There are three different categories of ASICs:
Full-Custom ASICs: These are custom-made from scratch for a specific application.
Their ultimate purpose is decided by the designer. All the photolithographic layers of
this integrated circuit are already fully defined, leaving no room for modification during
manufacturing.
Semi-Custom ASICs: These are partly customized to perform different functions within
their general area of application. These ASICs are designed to allow some
modification during manufacturing, although the masks for the diffused layers are
already fully defined.
Platform ASICs: These are designed and produced from a defined set of methodologies,
intellectual properties and a well-defined design of silicon that shortens the design cycle
and minimizes development costs. Platform ASICs are made from predefined platform
slices, where each slice is a pre-manufactured device, platform logic or entire system.
The use of pre-manufactured materials reduces development costs for these circuits.
Processor Performance:
A. Pipelining:
Pipelining is the process of feeding instructions to the processor through a
pipeline. It allows storing and executing instructions in an orderly process. It is also
known as pipeline processing.
Pipelining is a technique where multiple instructions are overlapped during execution.
Pipeline is divided into stages and these stages are connected with one another to form
a pipe like structure. Instructions enter from one end and exit from another end.
Pipelining increases the overall instruction throughput.
In a pipeline system, each segment consists of an input register followed by a
combinational circuit. The register is used to hold data and combinational circuit
performs operations on it. The output of combinational circuit is applied to the input
register of the next segment.
It allows storing, prioritizing, managing and executing tasks and instructions in an
orderly process.
A stream of instructions can be executed by
overlapping fetch, decode and execute phases of an instruction cycle. This type of
technique is used to increase the throughput of the computer system.
An instruction pipeline reads instruction from the memory while previous instructions
are being executed in other segments of the pipeline. Thus we can execute multiple
instructions simultaneously.
The pipeline will be more efficient if the instruction cycle is divided into segments of
equal duration.
The cycle time of the processor is reduced.
It increases the throughput of the system.
It makes the system reliable.
B. Superscalar Processor:
A Scalar processor is a normal processor, which works on simple instruction at a time,
which operates on single data items.
But in today's world, this technique will prove to be highly inefficient, as the overall
processing of instructions will be very slow. An instruction pipeline reads instruction
from the memory while previous instructions are being executed in other segments of
the pipeline. Thus we can execute multiple instructions simultaneously.
Vector processors: There is a class of computational problems that are beyond the
capabilities of a conventional computer. These problems require vast number of
computations on multiple data items that will take a conventional computer (with scalar
processor) days or even weeks to complete.
Such complex instructions, which operates on multiple data at the same time, requires
a better way of instruction execution, which was achieved by Vector processors.
Scalar CPUs can manipulate one or two data items at a time, which is not very efficient.
Also, simple instructions like ADD A to B, and store into C are not practically efficient.
Addresses are used to point to the memory location where the data to be operated on will
be found, which leads to the added overhead of data lookup. So until the data is found,
the CPU would be sitting idle, which is a big performance issue.
Hence, the concept of the instruction pipeline comes into the picture, in which the
instruction passes through several sub-units in turn. A vector processor not only uses
an instruction pipeline, but also pipelines the data, working on multiple data items at
the same time.
A normal scalar processor instruction would be ADD A, B, which adds
two operands; but what if we could instruct the processor to ADD a group of numbers
(from memory locations 0 to n) to another group of numbers (say, memory locations
n to k)? This can be achieved by vector processors.
In vector processor a single instruction, can ask for multiple data operations, which
saves time, as instruction is decoded once, and then it keeps on operating on different
data items.
Superscalar Processors: The superscalar processor was first introduced in 1987. It is a
machine designed to improve the performance of the scalar processor. In most
applications, most of the operations are on scalar quantities. The superscalar approach
produces high-performance general-purpose processors.
The main principle of superscalar approach is that it executes instructions
independently in different pipelines. As we already know, that Instruction pipelining
leads to parallel processing thereby speeding up the processing of instructions. In
Superscalar processor, multiple such pipelines are introduced for different operations,
which further improves parallel processing.
There are multiple functional units each of which is implemented as a pipeline. Each
pipeline consists of multiple stages to handle multiple instructions at a time which
support parallel execution of instructions.
It increases the throughput because the CPU can execute multiple instructions per clock
cycle. Thus, superscalar processors are much faster than scalar processors.
A scalar processor works on one or two data items, while the vector processor works
with multiple data items. A superscalar processor is a combination of both. Each
instruction processes one data item, but there are multiple execution units within each
CPU thus multiple instructions can be processing separate data items concurrently.
While a superscalar CPU is typically also pipelined, superscalar execution and
pipelining are two different performance enhancement techniques. It is possible to
have a non-pipelined superscalar CPU or a pipelined non-superscalar CPU. The
superscalar technique is associated with some characteristics, these are:
Instructions are issued from a sequential instruction stream.
CPU must dynamically check for data dependencies.
Should accept multiple instructions per clock cycle.
C. CPU Power Consumption:
Embedded systems are becoming more and more complex. Due to technological
improvements, it is now possible to integrate a lot of components in a unique circuit.
Nowadays, homogeneous or heterogeneous multiprocessor architectures within SoC
(System on Chip) or SiP (System in a Package) offer increasing computing capacities.
Meanwhile, applications are growing in complexity.
Thus, embedded systems commonly have to perform different multiple tasks, from
control oriented (innovative user interfaces, adaptation to the environment, compliance
to new formats, quality of service management) to data intensive (multimedia, audio
and video coding/decoding, software radio, 3D image processing, communication
streaming), and to sustain high throughputs and bandwidths.
One side effect of this global evolution is a drastic increase of the circuit’s power
consumption.
Leakage power increases exponentially as the process evolves to finer technologies.
Dynamic power is proportional to the operating frequency.
With higher chip densities, thermal dissipation may involve costly cooling devices, and
battery life is definitely shortened.
The role of an Operating System (OS) is essential in such a context. It also offers power
management services which may exploit low-level mechanisms (low-operating/low-
standby power modes, voltage/frequency scaling, clock gating, etc.) to reduce the
system's energy consumption.
But the Operating System itself has a non-negligible impact on energy consumption.
The Operating System's energy overhead depends on the complexity of the applications
and the number of services called.
It has been observed that, depending on the application, the Operating System's share
of an embedded system's energy consumption can range from 6% to 50%. This ratio
gets higher if the frequency and supply voltage of the processor increase.
Power consumption is now a major constraint in many designs. Being able to estimate
this consumption for the whole system and for all its components is now compulsory.
Estimating the energy consumption due to the Operating System is thus unavoidable.
It is the first step towards the application of off-line or on-line power optimization
techniques.
Chapter 4: Embedded Systems Memory
CPU Bus:
BUS Protocol
Bus Organization
Memory System Architecture
Cache Memory, Virtual memory
Memory Management Unit
Address Translation
Memory devices & their Characteristics
SRAM, DRAM
ROM, UVROM, EEPROM
Flash Memory
CPU Bus:
The bus is the mechanism by which the CPU communicates with memory and devices.
A bus is, at a minimum, a collection of wires but it also defines a protocol by which the
CPU, memory, and devices communicate.
One of the major roles of the bus is to provide an interface to memory. (Of course, I/O
devices also connect to the bus.) Building on an understanding of the bus, we study the
characteristics of memory components in this section, focusing on DMA.
CPU Bus & Organization:
A bus is a common connection between components in a system. The CPU, memory,
and I/O devices are all connected to the bus.
The signals that make up the bus provide the necessary communication: the data itself,
addresses, a clock, and some control signals.
In a typical bus system, the CPU serves as the bus master and initiates all transfers. If
any device could request a transfer, then other devices might be starved of bus
bandwidth. As bus master, the CPU reads and writes data and instructions from
memory. It also initiates all reads or writes on I/O devices.
The basic building block of most bus protocols is the four-cycle handshake. The
handshake ensures that when two devices want to communicate, one is ready to transmit
and the other is ready to receive.
The handshake uses a pair of wires dedicated to the handshake: enq (meaning enquiry)
and ack (meaning acknowledge). Extra wires are used for the data transmitted during
the handshake. The four cycles are described below.
Device 1 raises its output to signal an enquiry, which tells device 2 that it should
get ready to listen for data.
When device 2 is ready to receive, it raises its output to signal an
acknowledgment. At this point, devices 1 and 2 can transmit or receive.
Once the data transfer is complete, device 2 lowers its output, signalling that it
has received the data.
After seeing that ack has been released, device 1 lowers its output.
At the end of the handshake, both handshaking signals are low, just as they were at the
start of the handshake. The system has thus returned to its original state in readiness for
another handshake-enabled data transfer.
Microprocessor buses build on the handshake for communication between the CPU and
other system components. The term bus is used in two ways.
The most basic use is as a set of related wires, such as address wires. However, the term
may also mean a protocol for communicating between components.
To avoid confusion, we will use the term bundle to refer to a set of related signals. The
fundamental bus operations are reading and writing. The figure shows the structure of a
typical bus that supports reads and writes.
The major components follow:
Clock provides synchronization to the bus components,
R/W is true when the bus is reading and false when the bus is writing,
Address is an a-bit bundle of signals that transmits the address for an access,
Data is an n-bit bundle of signals that can carry data to or from the CPU, and
Data ready signals when the values on the data bundle are valid.
Types of Buses:
• ISA: Industry Standard architecture
• EISA: Extended Industry Standard architecture
• MCA: Micro Channel Architecture
• VESA: Video Electronics Standards Association
• PCI: Peripheral Component Interconnect
• PCI-X: PCI eXtended
• PCMCIA: Personal Computer Memory Card Industry Association
• AGP: Accelerated Graphics Port
• SCSI: Small Computer Systems Interface
• USB
• IEEE1394
Main Memory:
The memory unit that communicates directly with the CPU, the auxiliary memory and
the cache memory is called main memory. It is the central storage unit of the computer
system.
It is a large and fast memory used to store data during computer operations. Main
memory is made up of RAM and ROM, with RAM integrated circuit chips holding the
major share.
RAM: Random Access Memory
DRAM: Dynamic RAM is made of capacitors and transistors, and must be
refreshed every 10~100 ms. It is slower and cheaper than SRAM.
SRAM: Static RAM has a six-transistor circuit in each cell and retains data
until powered off.
NVRAM: Non-Volatile RAM, retains its data, even when turned off. Example:
Flash memory.
ROM: Read Only Memory, is non-volatile and is more like a permanent storage for
information. It also stores the bootstrap loader program, to load and start the operating
system when computer is turned on. PROM(Programmable ROM), EPROM(Erasable
PROM) and EEPROM(Electrically Erasable PROM) are some commonly used ROMs.
Auxiliary & Cache Memory:
Auxiliary Memory
Devices that provide backup storage are called auxiliary memory. For example:
Magnetic disks and tapes are commonly used auxiliary devices. Other devices
used as auxiliary memory are magnetic drums, magnetic bubble memory and
optical disks.
It is not directly accessible to the CPU, and is accessed using the Input/Output
channels.
Cache Memory
The data or contents of the main memory that are used again and again by CPU,
are stored in the cache memory so that we can easily access that data in shorter
time.
Whenever the CPU needs to access memory, it first checks the cache memory.
If the data is not found in cache memory, then the CPU moves on to the main
memory. It also transfers blocks of recent data into the cache and keeps
deleting old data in the cache to accommodate the new data.
Types of Cache –
Primary Cache – A primary cache is always located on the processor chip. This cache
is small and its access time is comparable to that of processor registers.
Secondary Cache – Secondary cache is placed between the primary cache and the rest
of the memory. It is referred to as the level 2 (L2) cache. Often, the Level 2 cache is
also housed on the processor chip.
Virtual Memory:
Virtual memory is the separation of logical memory from physical memory. This
separation provides large virtual memory for programmers when only small physical
memory is available.
Virtual memory is used to give programmers the illusion that they have a very large
memory even though the computer has a small main memory. It makes the task of
programming easier because the programmer no longer needs to worry about the
amount of physical memory available.
Virtual memory | Cache memory
The Operating System manages virtual memory. | Hardware manages the cache memory.
With virtual memory, programs larger than the main memory can be executed. | In cache memory, recently used data is copied in.
Paging
A computer can address more memory than the amount physically installed on
the system. This extra memory is actually called virtual memory and it is a
section of a hard disk that's set up to emulate the computer's RAM. The paging
technique plays an important role in implementing virtual memory.
Paging is a memory management technique in which process address space is
broken into blocks of the same size called pages (size is power of 2, between
512 bytes and 8192 bytes). The size of the process is measured in the number
of pages.
Similarly, main memory is divided into small fixed-sized blocks of (physical)
memory called frames and the size of a frame is kept the same as that of a page
to have optimum utilization of the main memory and to avoid external
fragmentation.
Address Translation:
A page address is called a logical address and is represented by a page number
and an offset:
Logical Address = Page number + Page offset
A frame address is called a physical address and is represented by a frame
number and an offset:
Physical Address = Frame number + Page offset
A data structure called the page map table is used to keep track of the relation
between a page of a process and a frame in physical memory.
When the system allocates a frame to any page, it translates this logical address into a
physical address and creates an entry in the page table to be used throughout the
execution of the program.
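A toy worked example of this translation (all parameters are illustrative
assumptions: 4 KB pages and a hypothetical page map table):

#include <stdint.h>

// page i of the process is stored in frame pageTable[i] of physical memory
const uint32_t PAGE_SIZE   = 4096;            // 4 KB pages (a power of 2)
const uint32_t pageTable[] = {5, 9, 7, 3};

uint32_t translate(uint32_t logical) {
    uint32_t page   = logical / PAGE_SIZE;    // high bits: page number
    uint32_t offset = logical % PAGE_SIZE;    // low bits: page offset
    return pageTable[page] * PAGE_SIZE + offset;
}

// e.g. translate(8292) (page 2, offset 100) returns 7*4096 + 100 = 28772.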
When a process is to be executed, its corresponding pages are loaded into any
available memory frames. Suppose you have a program of 8 KB but your memory can
accommodate only 5 KB at a given point in time; then the paging concept comes
into the picture. When a computer runs out of RAM, the operating system (OS) moves
idle or unwanted pages of memory to secondary memory to free up RAM for other
processes, and brings them back when needed by the program.
This process continues during the whole execution of the program: the OS keeps
removing idle pages from the main memory, writing them onto the secondary
memory, and bringing them back when required by the program.
Advantages and Disadvantages of Paging
Paging reduces external fragmentation, but still suffers from internal fragmentation.
Paging is simple to implement and is regarded as an efficient memory management
technique.
Due to equal size of the pages and frames, swapping becomes very easy.
Page table requires extra memory space, so may not be good for a system having
small RAM.
Demand Paging:
A demand paging system is quite similar to a paging system with swapping where
processes reside in secondary memory and pages are loaded only on demand, not in
advance. When a context switch occurs, the operating system does not copy any of
the old program’s pages out to the disk or any of the new program’s pages into the
main memory Instead, it just begins executing the new program after loading the first
page and fetches that program’s pages as they are referenced.
While executing a program, if the program references a page which is not available in
the main memory because it was swapped out a little earlier, the processor treats this
invalid memory reference as a page fault and transfers control from the program to
the operating system to demand the page back into the memory.
Advantages
Large virtual memory.
More efficient use of memory.
There is no limit on degree of multiprogramming.
Disadvantages
Number of tables and the amount of processor overhead for handling page
interrupts are greater than in the case of the simple paged management
techniques.
SRAM: Static RAM
The operation of the SRAM memory cell is relatively straightforward. When the cell is
selected, the value to be written is stored in the cross-coupled flip-flops. The cells are
arranged in a matrix, with each cell individually addressable. Most SRAM memories
select an entire row of cells at a time, and read out the contents of all the cells in the
row along the column lines.
While it is not strictly necessary to have two bit lines, using the signal and its inverse
is normal practice, as it improves the noise margins and the data integrity. The
two bit lines are passed to two input ports on a comparator so that the advantages of
the differential data mode can be exploited, and the small voltage swings that are
present can be more accurately detected.
Access to the SRAM memory cell is enabled by the Word Line. This controls the two
access control transistors which control whether the cell should be connected to the bit
lines. These two lines are used to transfer data for both read and write operations.
DRAM: Dynamic RAM
Dynamic RAM, or DRAM is a form of random access memory, RAM which is used in
many processor systems to provide the working memory.
DRAM is widely used in digital electronics where low-cost and high-capacity memory
is required.
Dynamic RAM, DRAM, is used where very high levels of memory density are needed,
although it is quite power hungry, so this needs to be considered if it is to
be used.
DRAM, or dynamic random access memory stores each bit of data on a small capacitor
within the memory cell. The capacitor can be either charged or discharged and this
provides the two states, "1" or "0" for the cell.
Since the charge within the capacitor leaks, it is necessary to refresh each memory cell
periodically. This refresh requirement gives rise to the term dynamic - static memories
do not have a need to be refreshed.
The advantage of a DRAM is the simplicity of the cell - it only requires a single
transistor compared to around six in a typical static RAM, SRAM memory cell. In view
of its simplicity, the costs of DRAM are much lower than those for SRAM, and they
are able to provide much higher levels of memory density. However the DRAM has
disadvantages as well, and as a result, most computers use both DRAM technology and
SRAM, but in different areas.
Since power is required for the DRAM to maintain its data, it is what is termed a
volatile memory.
Asynchronous DRAM
This was the first type of DRAM in use but was gradually replaced by synchronous
DRAM. This was called asynchronous because the memory access was not
synchronized with the system clock.
Synchronous DRAM
This DRAM replaced asynchronous RAM and is used in most computer systems
today. In synchronous DRAM, the memory interface is synchronised with the system
clock, and all the signals are processed on the rising edge of the clock.
Graphics DRAM
There are many graphics related tasks that can be accomplished with both synchronous
and asynchronous DRAM. Some of the DRAM used for these tasks are Video DRAM,
Window DRAM, Multibank DRAM etc.
Difference : SRAM Vs. DRAM
Both static and dynamic RAM are types of RAM but SRAM is formed using flip flops
and DRAM using capacitors.
It is necessary that the data in DRAM is refreshed periodically to store it correctly. This
is not necessary for SRAM.
SRAM is normally only used in Cache memory while DRAM is used in main memory.
Static RAM is much faster and more expensive than dynamic RAM.
Since SRAM is used as cache memory, its size is typically 1 MB to 16 MB. On the
other hand, dynamic memory is larger as it is used as main memory; sizes of 4 GB to
16 GB are typical in computers and laptops.
SRAM is usually present on processors or between processors and main memory.
DRAM is present on the motherboard.
ROM Memories:
Mask ROM (MROM). Data bits are permanently programmed into a microchip by the
manufacturer of the external MROM chip. MROM designs are usually based upon
MOS (NMOS, CMOS) or bipolar transistor-based circuitry. This was the original type
of ROM design. Because of the expensive setup costs for a manufacturer of MROMs, it is
usually only produced in high volumes, and there is a wait time of several weeks to
several months. For high-volume products, however, using MROMs is a cheaper solution.
One-Time Programmable ROM (OTP or OTPRom/PROM). This type of ROM can
only be programmed (permanently) one time as its name implies, but it can be
programmed outside the manufacturing factory, using a ROM burner. OTPs are based
upon bipolar transistors, in which the ROM burner burns out fuses of cells to program
them to “1” using high voltage/current pulses.
Erasable Programmable ROM (EPROM). An EPROM can be erased more than one
time using a device that outputs intense short-wavelength, ultraviolet light into the
EPROM package’s built-in transparent window. (OTPs are one-time programmable
EPROMs without the window to allow for erasure; the packaging without the window
used in OTPs is cheaper)
EPROMs are made up of MOS (i.e., CMOS, NMOS) transistors whose extra “floating
gate” (gate capacitance) is electrically charged, and the charge trapped, to store a “0”
by the Romizer through “avalanche induced migration”, a method in which a high
voltage is used to charge the floating gate.
The floating gate is a conductor floating within the insulator; enough current can flow
for electrons to be trapped within the gate, while the insulator of that gate prevents
electron leakage.
The floating gates are discharged via UV light, to store a “1” for instance. This is
because the high-energy photons emitted by UV light provide enough energy for
electrons to escape the insulating portion of the floating gate. The total number of
erasures and rewrites is limited depending on the EPROM.
Electrically Erasable Programmable ROM (EEPROM). Like EPROM, EEPROMs
can be erased and reprogrammed more than once. The number of times erasure and
reuse occur depends on the EEPROMs.
Unlike EPROMs, the content of EEPROM can be written and erased “in bytes” without
using any special devices. In other words, the EEPROM can stay on its residing board,
and the user can connect to the board interface to access and modify an EEPROM.
EEPROMs are based upon NMOS transistor circuitry, except insulation of the floating
gate in an EEPROM is thinner than that of the EPROM, and the method used to charge
the floating gates is called the Fowler–Nordheim tunneling method (in which the
electrons are trapped by passing through the thinnest section of the insulating material).
Erasing an EEPROM which has been programmed electrically is a matter of using a
high reverse-polarity voltage to release the trapped electrons within the floating gate.
Electronically discharging an EEPROM can be tricky, though, in that any physical
defects in the transistor gates can result in an EEPROM not being discharged
completely before a new reprogram.
EEPROMs typically have more erase/write cycles than EPROMs, but are also usually
more expensive. A cheaper and faster variation of the EEPROM is Flash memory.
Flash Memory:
Flash memory storage is a form of non-volatile memory that was born out of a
combination of the traditional EPROM and E2PROM.
In essence it uses the same method of programming as the standard EPROM and the
erasure method of the E2PROM.
One of the main advantages that flash memory has when compared to EPROM is its
ability to be erased electrically. However, it is not possible to erase each cell in a flash
memory individually unless a large amount of additional circuitry is added to the chip.
This would add significantly to the cost, and accordingly most manufacturers dropped
this approach in favour of a system whereby the whole chip, or a large part of it, is
erased in a block, or "flash" erased - hence the name.
Flash memory is able to provide high density memory because it requires only a few
components to make up each memory cell. In fact the structure of the memory cell is
very similar to the EPROM.
Each Flash memory cell consists of the basic channel with the source and drain
electrodes separated by the channel about 1 µm long. Above the channel in the Flash
memory cell there is a floating gate which is separated from the channel by an
exceedingly thin oxide layer which is typically only 100 Å thick. It is the quality of this
layer which is crucial to the reliable operation of the memory.
Above the floating gate there is the control gate. This is used to charge up the gate
capacitance during the write cycle.
The Flash memory cell functions by storing charge on the floating gate. The presence
of charge will then determine whether the channel will conduct or not. During the read
cycle a "1" at the output corresponds to the channel being in its low resistance or ON
state.
Programming the Flash memory cell is a little more complicated, and involves a process
known as hot-electron injection. When programming the control gate is connected to a
"programming voltage". The drain will then see a voltage of around half this value
while the source is at ground. The voltage on the control gate is coupled to the floating
gate through the dielectric, raising the floating gate to the programming voltage and
inverting the channel underneath. This results in the channel electrons having a higher
drift velocity and increased kinetic energy.
Collisions between the energetic electrons and the crystal lattice dissipate heat which
raises the temperature of the silicon. At the programming voltage it is found that the
electrons cannot transfer their kinetic energy to the surrounding atoms fast enough and
they become "hotter" and scatter further afield, many towards the oxide layer. These
electrons gain the 3.1 eV (electron volts) needed to overcome the barrier and they
accumulate on the floating gate. As there is no way of escape they remain there until
they are removed by an erase cycle.
The erase cycle for Flash memory uses a process called Fowler-Nordheim tunnelling.
The process is initiated by routing the programming voltage to the source, grounding
the control gate and leaving the drain floating. In this condition electrons are attracted
towards the source and they tunnel off the floating gate, passing through the thin oxide
layer. This leaves the floating gate devoid of charge.
I/O devices
5.1 I/O Devices
5.1.1 Timers and Counters.
5.1.2 Watchdog Timers.
5.1.3 Interrupt Controllers.
5.1.4 DMA Controllers.
5.1.5 A/D and D/A Converters.
5.1.6 Displays.
5.1.7 Keyboards.
5.1.8 Infrared devices.
5.2 Component Interfacing.
5.2.1 Memory Interfacing.
5.2.2 I/O Device Interfacing.
5.3 Interfacing Protocols
5.3.1 GPIB.
5.3.2 FIREWIRE
5.3.3 USB
5.3.4 IRDA
Timers/Counters:
Counter/timer hardware is a crucial component of most embedded systems. In some
cases a timer is needed to measure elapsed time; in others we want to count or time
some external events.
The names counter and timer can be used interchangeably when talking about the
hardware. The difference in terminology has more to do with how the hardware is
used in a given application.
Digital timer/counters are used throughout embedded designs to provide a series of
time or count related events within the system with the minimum of processor and
software overhead.
Most embedded systems have a time component within them such as timing
references for control sequences, to provide system ticks for operating systems and
even the generation of waveforms for serial port baud rate generation and audible
tones.
They are available in several different types but are essentially based around a simple
structure as shown.
The central timing is derived from a clock input. This clock may be internal to the
timer/counter or be external and thus connected via a separate pin.
The clock may be divided using a simple divider which can provide limited division
normally based on a power of two or through a pre-scalar which effectively scales
down or divides the clock by the value that is written into the pre-scalar register.
The divided clock is then passed to a counter which is normally configured in a count-
down operation, i.e. it is loaded with a preset value which is clocked down towards
zero. When a zero count is reached, this causes an event to occur, such as an interrupt
or an external line changing state.
The final block is loosely described as an I/O control block but can be more
sophisticated than that. It generates interrupts and can control the counter based on
external signals which can gate the count-down and provide additional control. This
expands the functionality of the timer.
Auto Reload: A timer with automatic reload capability will have a latch register to hold
the count written by the processor. When the processor writes to the latch, the count
register is written as well. When the timer later overflows, it first generates an output
signal. Then, it automatically reloads the contents of the latch into the count register.
Since the latch still holds the value written by the processor, the counter will begin
counting again from the same initial value.
Such a timer will produce a regular output with the same accuracy as the input clock.
This output could be used to generate a periodic interrupt like a real-time operating
system (RTOS) timer tick, provide a baud rate clock to a UART, or drive any device
that requires a regular pulse.
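As a concrete illustration, the following is a minimal C sketch of such a periodic tick,
assuming an AVR ATmega328P clocked at 16 MHz (the register names are specific to
that part). Timer1 in CTC mode reloads from the OCR1A compare register in hardware,
much like the latch scheme described above; here it generates a 1 ms tick interrupt.

    #include <stdint.h>
    #include <avr/io.h>
    #include <avr/interrupt.h>

    volatile uint32_t tick_count = 0;

    ISR(TIMER1_COMPA_vect)          /* fires each time the counter reaches OCR1A */
    {
        tick_count++;               /* 1 ms system tick */
    }

    void timer_tick_init(void)
    {
        TCCR1A = 0;                              /* no output compare pins used */
        TCCR1B = (1 << WGM12) | (1 << CS11);     /* CTC mode, clk/8 prescaler */
        OCR1A  = 1999;                           /* (16 MHz / 8) / 1 kHz - 1 */
        TIMSK1 = (1 << OCIE1A);                  /* enable compare-match interrupt */
        sei();                                   /* global interrupt enable */
    }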
Watchdog Timers:
In a small embedded device it is easy to find the exact cause of a bug, but not in a
complex embedded system: even a perfectly designed and tested embedded system
running well-tested code can fail due to a small bug.
Most embedded systems need to be self-reliant. It's not usually possible to wait for
someone to reboot them if the software hangs. Some embedded designs, such as space
probes, are simply not accessible to human operators. If their software ever hangs, such
systems are permanently disabled. In other cases, the speed with which a human
operator might reset the system would be too slow to meet the uptime requirements of
the product.
A watchdog timer (WDT) is a safety mechanism that brings the system back to life
when it crashes.
A watchdog timer is a piece of hardware that can be used to automatically detect
software anomalies and reset the processor if any occur. Generally speaking, a
watchdog timer is based on a counter that counts down from some initial value to zero.
The embedded software selects the counter's initial value and periodically restarts it. If
the counter ever reaches zero before the software restarts it, the software is presumed
to be malfunctioning and the processor's reset signal is asserted. The processor (and the
embedded software it's running) will be restarted as if a human operator had cycled the
power.
A WDT is a hardware unit that contains a timing device and a clock source (besides
the system clock).
A timing device is a free-running timer, which is set to a certain value that gets
decremented continuously. When the value reaches zero, a short pulse is generated by
WDT circuitry that resets and restarts the system.
In short, WDT constantly watches the execution of the code and resets the system if
software is hung or no longer executing the correct sequence of the code. Reloading of
WDT value by the software is called kicking the watchdog.
These are of two types :
External WDT
Internal WDT
As shown, the watchdog timer is a chip external to the processor. However, it could
also be included within the same chip as the CPU. This is done in many
microcontrollers. In either case, the output from the watchdog timer is tied directly to
the processor's reset signal.
The process of restarting the watchdog timer's counter is sometimes called "kicking the
dog." The appropriate visual metaphor is that of a man being attacked by a vicious dog.
If he keeps kicking the dog, it can't ever bite him. But he must keep kicking the dog at
regular intervals to avoid a bite. Similarly, the software must restart the watchdog timer
at a regular rate, or risk being restarted.
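In code, kicking the dog is typically a single call in the main loop. A minimal sketch
using avr-libc's watchdog API (<avr/wdt.h>, wdt_enable() and wdt_reset() are the real
AVR interface; the two work functions are hypothetical application stubs):

    #include <avr/wdt.h>

    /* Hypothetical application work, stubbed for illustration. */
    static void read_sensors(void)   { /* application-specific */ }
    static void update_outputs(void) { /* application-specific */ }

    int main(void)
    {
        wdt_enable(WDTO_1S);        /* reset the MCU if not kicked within ~1 s */

        for (;;) {
            read_sensors();
            update_outputs();
            wdt_reset();            /* "kick the dog" once per healthy iteration */
        }
    }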
Software Problems:
A watchdog timer can get a system out of a lot of dangerous situations.
However, if it is to be effective, resetting the watchdog timer must be considered
within the overall software design. Designers must know what kinds of things
could go wrong with their software, and ensure that the watchdog timer will
detect them, if any occur.
Systems hang for any number of reasons. A logic error resulting in the
execution of an infinite loop is the simplest.
Another possibility is that an unusual number of interrupts arrives during one
pass of the loop. Any extra time spent in ISRs is time not spent executing the
main loop. A dangerous delay in feeding the motor new control instructions
could result.
When multitasking kernels are used, deadlocks can occur. For example, a group
of tasks might get stuck waiting on each other and some external signal that one
of them needs, leaving the whole set of tasks hung indefinitely.
If such faults are transient, the system may function perfectly for some length
of time after each watchdog-induced reset. However, failed hardware could lead
to a system that constantly resets. For this reason it may be wise to count the
number of watchdog-induced resets, and give up trying after some fixed number
of failures.
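One way to implement the reset counting just described, sketched for an AVR (the
MCUSR register and its WDRF flag are real; the .noinit placement keeps the counter
from being zeroed by the C startup code after a watchdog reset, though not after a
power cycle):

    #include <stdint.h>
    #include <avr/io.h>
    #include <avr/wdt.h>

    /* Survives a watchdog reset because .noinit data is not cleared at startup. */
    static uint8_t wdt_reset_count __attribute__((section(".noinit")));

    void check_reset_cause(void)
    {
        if (MCUSR & (1 << WDRF)) {      /* last reset caused by the watchdog? */
            MCUSR &= ~(1 << WDRF);
            if (++wdt_reset_count > 3) {
                wdt_disable();          /* give up: stay in a safe state */
                for (;;) ;
            }
        } else {
            wdt_reset_count = 0;        /* clean power-on: clear the counter */
        }
    }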
Interrupt Controller:
In a typical SoC, there could be hundreds of peripherals – interrupt sources that may
vie for processor attention. It is impossible to route all these signals to the CPU. So
there is another special purpose peripheral called Interrupt Controller to which all the
peripheral interrupts signals are connected.
The controller is in turn connected to the CPU with one or few lines of signals, with
which it can interrupt the CPU. Controller is provided with various registers to
mask/unmask the interrupts, read pending/raw status, set priority etc.
The controller monitors the signals from the peripherals and, if there are any active
interrupts, decides based on the preconfigured priority which source is to be processed
first. Then it signals the CPU with an interrupt. The CPU, in the ISR, can then read the
pending interrupt register and process the source accordingly (a dispatch sketch
follows below).
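A minimal dispatch sketch in C for a hypothetical memory-mapped interrupt controller
on a 32-bit SoC; the base address, register offsets and acknowledge scheme are all
invented for illustration and differ on every real controller:

    #include <stdint.h>

    #define INTC_BASE     0x40001000u    /* hypothetical controller base address */
    #define INTC_PENDING  (*(volatile uint32_t *)(INTC_BASE + 0x0))
    #define INTC_ACK      (*(volatile uint32_t *)(INTC_BASE + 0x4))

    typedef void (*irq_handler_t)(void);
    static irq_handler_t irq_table[32];  /* one handler per interrupt source */

    /* Top-level ISR: read the pending register, service the highest-numbered
       source first, acknowledge it, and repeat until nothing is pending. */
    void top_level_isr(void)
    {
        uint32_t pending = INTC_PENDING;
        while (pending) {
            int irq = 31 - __builtin_clz(pending);
            if (irq_table[irq])
                irq_table[irq]();
            INTC_ACK = 1u << irq;        /* clear the source in the controller */
            pending &= ~(1u << irq);
        }
    }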
Interrupt Mechanism:
The first part of the sequence is the recognition of the interrupt or exception. This in
itself does not necessarily immediately trigger any processor reaction. If the interrupt
is not an error condition or the error condition is not connected with the currently
executing instruction, the interrupt will not be internally processed until the currently
executing instruction has completed.
At this point, known as an instruction boundary, the processor will start to internally
process the interrupt. If, on the other hand, the interrupt is due to an error with the
currently executing instruction, the instruction will be aborted to reach the instruction
boundary.
At the instruction boundary, the processor must now save certain state information to
allow it to continue its previous execution path prior to the interrupt.
This will typically include a copy of the condition code register, the program counter
and the return address.
The next phase is to get the location of the ISR to service the interrupt. This is normally
kept in a vector table somewhere in memory and the appropriate vector can be supplied
by the peripheral or pre-assigned, or a combination of both approaches.
Once the vector has been identified, the processor starts to execute the code within the
ISR until it reaches a return from interrupt type of instruction. At this point, the
processor reloads the status information and processing continues with the previous
instruction stream.
Interrupt controllers
In many embedded systems there are more external sources for interrupts than interrupt
pins on the processor. In this case, it is necessary to use an interrupt controller to provide
a larger number of interrupt signals. An interrupt controller performs several functions:
It provides a large number of interrupt pins that can be allocated to many external
devices. Typically this is at least eight and higher numbers can be supported by
cascading two or more controllers together. This is done on the IBM PC AT, where two
8-port controllers are cascaded to give 15 interrupt levels.
It orders the interrupt pins in a priority level so that a high level interrupt will inhibit a
lower level interrupt.
It may provide registers for each interrupt pin which contain the vector number to be
used during an acknowledge cycle. This allows peripherals that do not have the ability
to provide a vector to do so.
They can provide interrupt masking. This allows the system software to decide when
and if an interrupt is allowed to be serviced. The controller, through the use of masking
bits, can prevent an interrupt request from being passed through to the processor. In
this way, the system has a multi-level approach to screening interrupts: the screening
provided by the processor gives coarse-grained control, while the interrupt controller
provides a finer level.
Interrupt latency
This is usually defined as the time taken by the processor from recognition of the
interrupt to the start of the ISR. It consists of several stages and is dependent on both
hardware and software factors. Its importance is that it defines several aspects of an
embedded system with reference to its ability to respond to real-time events. The stages
involved in calculating a latency are:
The time taken to recognise the interrupt. Do not assume that this is instantaneous, as
it will depend on the processor design and its own interrupt recognition mechanism. As
previously mentioned, some processors will repeatedly sample an interrupt signal to
ensure that it is a real one and not a false one.
The time taken by the CPU to complete the current instruction. This will also vary
depending on what the CPU is doing and its complexity. For a simple CISC processor,
this time will vary as its instructions all take a different number of clocks to complete.
Usually the most time-consuming instructions are those that perform multiplication or
division or some complex data manipulation such as bit field operations. For RISC
processors with single cycle execution, the time is usually that to clear the execution
pipeline and is 1 or 2 clocks.
The time for the CPU to perform a context switch. This is the time taken by the
processor to save its internal context information such as its program counter, internal
data registers and anything else it needs. For CISC processors, this can involve creating
blocks of data on the stack by writing the information externally. For RISC processors
this may mean simply switching registers internally without explicitly saving any
information. Register windowing or shadowing is normally used.
The time taken to fetch the interrupt vector. This is normally the time to fetch a single
value from memory, but even this time can be longer than you think! We will come
back to this topic.
The time taken to start the interrupt service routine execution. This is typically very short.
However remember that because the pipeline is cleared, the instruction will need to be
clocked through to execute it and this can take a few extra clocks, even with a RISC
architecture.
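As a trivial worked example, the total latency is just the sum of these stages. The cycle
counts below are made up purely for illustration; real values depend entirely on the
processor:

    #include <stdio.h>

    int main(void)
    {
        unsigned recognise = 2;   /* sample/recognise the interrupt       */
        unsigned finish    = 4;   /* complete the current instruction     */
        unsigned ctx_save  = 6;   /* save PC, status and registers        */
        unsigned vec_fetch = 2;   /* read the vector from memory          */
        unsigned isr_start = 2;   /* refill the pipeline with ISR code    */

        unsigned total = recognise + finish + ctx_save + vec_fetch + isr_start;
        printf("latency = %u clocks\n", total);   /* 16 clocks in this example */
        return 0;
    }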
DMA Controller:
Direct memory access (DMA) controllers are frequently an elegant hardware solution
to a recurring software/system problem of providing an efficient method of transferring
data from a peripheral to memory.
In systems without DMA, the solution is to use the processor to either regularly poll the
peripheral to see if it needs servicing or to wait for an interrupt to do so.
Interrupts are a far better solution. An interrupt is sent from the peripheral to the
processor to request servicing. In many cases, all that is needed is to simply empty or
load a buffer. This solution starts becoming an issue as the servicing rate increases.
With high speed ports, the cost of interrupting the processor can be higher than the
couple of instructions that it executes to empty a buffer. In these cases, the limiting
factor for the data transfer is the time to recognise, process and return from the interrupt.
If the data needs to be processed on a byte by byte basis in real-time, this may have to
be tolerated but with high speed transfers this is often not the case as the data is treated
in packets.
This is where the DMA controller comes into its own. It is a device that can initiate and
control bus accesses between I/O devices and memory, and between two memory areas.
With this type of facility, the DMA controller acts as a hardware implementation of the
low-level buffer filling or emptying interrupt routine.
A generic controller consists of several components which control the operation:
Address generator: This is the most important part of a DMA controller and typically
consists of a base address register and an auto-incrementing counter which increments
the address after every transfer. The generated addresses are used within the actual bus
transfers to access memory and/or peripherals. When a predefined number of bytes have
been transferred, the base address is reloaded and the count cleared to zero ready to
repeat the operation.
Address bus: This is where the address created by the address generator is used to
access a specific memory location or peripheral.
Data bus: This is the data bus that is used to transfer data from the DMA controller to
the destination location. In some cases, the data transfer may be made direct from the
peripheral to the memory with the DMA controller directly selecting the peripheral.
Bus requester: This is used to request the bus from the main CPU.
Local peripheral control: This allows the DMA controller to select the peripheral and
get it to accept or provide data directly or for a peripheral to request a data transfer,
depending on the DMA controller’s design.
Interrupt signals: Most DMA controllers can interrupt the processor when the data
transfers are complete or if an error has occurred. This prompts the processor to either
reprogram the DMA controller for a different transfer or acts as a signal that a new
batch of data has been transferred and is ready for processing.
DMA Controller: operation
Program the controller: Prior to using the DMA controller, it must be configured with
parameters that define the addressing such as base address and byte count that will be
used to transfer the data. In addition, the device will be configured in terms of its
communication with the processor and peripheral. Processor communication will
normally include defining the conditions that will generate an interrupt. The peripheral
communication may include defining which request pin is used by the peripheral and
any arbitration mechanism that is used to reconcile simultaneous requests for DMA
from two or more peripherals. The final part of this process is to define how the
controller will transfer blocks of data: all at once, individually, or in some other
combination (a configuration sketch follows below).
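The configuration step above might look like this for a hypothetical memory-mapped
DMA controller (every register name, offset and control bit here is invented for
illustration; real controllers differ widely):

    #include <stdint.h>

    #define DMA_BASE   0x40002000u    /* hypothetical controller base address */
    #define DMA_SRC    (*(volatile uint32_t *)(DMA_BASE + 0x0))
    #define DMA_DST    (*(volatile uint32_t *)(DMA_BASE + 0x4))
    #define DMA_COUNT  (*(volatile uint32_t *)(DMA_BASE + 0x8))
    #define DMA_CTRL   (*(volatile uint32_t *)(DMA_BASE + 0xC))

    #define DMA_CTRL_ENABLE    (1u << 0)
    #define DMA_CTRL_IRQ_DONE  (1u << 1)   /* interrupt the CPU when finished */

    /* Program a peripheral-to-memory block transfer and start it. */
    void dma_start_rx(uint32_t periph_data_reg, uint8_t *buf, uint32_t len)
    {
        DMA_SRC   = periph_data_reg;           /* fixed peripheral data register */
        DMA_DST   = (uint32_t)(uintptr_t)buf;  /* auto-incrementing destination */
        DMA_COUNT = len;                       /* bytes to transfer */
        DMA_CTRL  = DMA_CTRL_ENABLE | DMA_CTRL_IRQ_DONE;
    }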
Start a transfer: A DMA transfer is normally initiated in response to a peripheral request
to start a transfer. It usually assumes that the controller has been correctly configured
to support this request. With a peripheral and processor, the processor will normally
request a service by asserting an interrupt pin which is connected to the processor’s
interrupt input(s). With a DMA controller, this peripheral interrupt signal can be used
to directly initiate a transfer or if it is left attached to the processor, the interrupt service
routine can start the DMA transfers by writing to the controller.
Request the bus: The next stage is to request the bus from the processor. With most
modern processors supporting bus arbitration directly, the DMA controller issues a bus
request signal to the processor which will release the bus when convenient and allow the
DMA controller to proceed. Without this support, the DMA controller has to cycle steal
from the processor so that it is held off the bus while the DMA controller uses it. As
will be described later on in this chapter, most DMA controllers provide some
flexibility concerning how they use and compete with bus bandwidth with the
processor and other bus masters.
Issue the address: Assuming the controller has the bus, it will then issue the address to
activate the target memory location. A variety of interfaces are used — usually
dependent on the number of pins that are available and include both non-multiplexed
and multiplexed buses. In addition, the controller provides other signals such as
read/write and strobe signals that can be used to work with the bus. DMA controllers
tend to be designed for a specific processor family bus but most recent devices are also
generic enough to be used with nearly any bus.
Transfer the data: The data is transferred either from a holding buffer within the DMA
controller or directly from a peripheral.
Update address generator: Once the data transfer has been completed, the address
generator uses the completion to calculate the address for the next transfer and update
the byte/transfer counters.
Update processor: Depending on how the DMA controller has been programmed it can
notify the processor using interrupts of events within the transfer process such as an
address error or the completion of a data or block transfer.
For devices with caches and/or internal memory, their external bus bandwidth
requirements are a lot lower and thus the DMA controller can use bus cycles without
impeding the processor’s performance. This last statement depends on the chances of
the DMA controller using the bus at the same time as the processor. This in turn depends
on the frequency and size of the DMA transfers. To provide some form of flexibility
for the designer so that a suitable trade-off can be made, most DMA controllers support
different types of bus utilisation.
Single transfer: Here the bus is returned back to the processor after every transfer so
that the longest delay it will suffer in getting access to memory will be a bus cycle.
Block transfer: Here the bus is returned back to the processor after the complete block
has been sent so that the longest delay the processor will suffer will be the time of a bus
cycle multiplied by the number of transfers to move the block. In effect, the DMA
controller has priority over the CPU in using the bus.
Demand transfer: In this case, the DMA controller will hold the bus for as long as an
external device requests it to do so. While the bus is held, the DMA controller is at
liberty to transfer data as and when needed. In this respect, there may be gaps when the
bus is retained but no data is transferred.
A/D Converter:
D/A Converter:
Input Circuit: It receives the binary digital inputs and performs some processing or
filtering if required. It does not have a vital role in the whole circuit.
Voltage Switching Circuit: It switches voltages between input digital signals and
reference voltage sources and passes to the main resistive circuit. It also makes
connection or isolation with the ground.
Resistive Network: It is the main part of the digital to analog converter circuit. It
processes the weighted digital inputs before the amplifier circuit. There are two types
of DAC according to the resistive network - the weighted resistor network and the
R-2R network. The R-2R network has more advantages than the weighted resistor
network.
Amplifier: Generally, a differential or operational amplifier is used in the DAC system.
It not only amplifies the signal but can also process it, performing operations such as
summation.
Digital to Analog Converter (DAC) Operation:
As noted, there are two types of DAC. The weighted resistor network can generate an
analog signal almost equal to the input digital signal, and works with an inverting adder
circuit. The main disadvantage of the weighted resistor network is that the spread of
resistance values grows as the number of bits in the input digital signal increases, from
LSB to MSB.
On the other hand, the R-2R network has many advantages over the weighted resistor
network. It also generates an analog signal almost equal to the input binary digital signal.
The main advantage of the R-2R network is that it contains only two resistor values, R
and 2R, so it is very easy to design and it is easy to select the resistors. Also, an increase
in the number of bits in the signal can be accommodated simply by adding further R-2R
sections (the ideal transfer function is sketched below).
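Whichever network is used, the ideal transfer function of an n-bit DAC is
Vout = Vref x D / 2^n, where D is the input code. A small, self-contained C illustration
of this arithmetic (not tied to any particular part):

    #include <stdio.h>

    /* Ideal output voltage of an n-bit DAC with reference Vref. */
    static double dac_ideal_volts(unsigned code, unsigned bits, double vref)
    {
        return vref * (double)code / (double)(1u << bits);  /* Vref * D / 2^n */
    }

    int main(void)
    {
        /* e.g. an 8-bit DAC with a 5 V reference: code 128 -> 2.5 V */
        printf("%.3f V\n", dac_ideal_volts(128, 8, 5.0));
        return 0;
    }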
I/O Devices: Keyboard:
A keyboard is basically an array of switches, but it may include some internal logic to
help simplify the interface to the microprocessor. In this chapter, we build our
understanding from a single switch to a microprocessor-controlled keyboard.
A switch uses a mechanical contact to make or break an electrical circuit. The major
problem with mechanical switches is that they bounce as shown in Figure.
When the switch is depressed by pressing on the button attached to the switch’s arm,
the force of the depression causes the contacts to bounce several times until they settle
down. If this is not corrected, it will appear that the switch has been pressed several
times, giving false inputs.
A hardware debouncing circuit can be built using a one-shot timer. Software can also
be used to debounce switch inputs (a sketch follows this paragraph). A raw keyboard
can be assembled from several switches. Each switch in a raw keyboard has its own
pair of terminals, making raw keyboards impractical when a large number of keys is
required.
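A common software debounce approach is to accept a new switch state only after
several consecutive identical samples. A minimal sketch, assuming a hypothetical
read_raw_switch() that samples the pin level and a periodic timer tick calling the
function:

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical: sample the instantaneous (bouncy) pin level. */
    static bool read_raw_switch(void) { return false; /* replace with port read */ }

    #define DEBOUNCE_SAMPLES 5    /* e.g. 5 samples at a 2 ms tick = 10 ms settle */

    /* Call once per timer tick; returns the debounced switch state. */
    bool debounced_switch(void)
    {
        static bool    stable_state = false;
        static uint8_t match_count  = 0;

        bool raw = read_raw_switch();
        if (raw != stable_state) {
            if (++match_count >= DEBOUNCE_SAMPLES) {
                stable_state = raw;   /* input has settled: accept the new state */
                match_count  = 0;
            }
        } else {
            match_count = 0;          /* still bouncing: restart the counter */
        }
        return stable_state;
    }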
More expensive keyboards, such as those used in PCs, actually contain a
microprocessor to preprocess button inputs. PC keyboards typically use a 4-bit
microprocessor to provide the interface between the keys and the computer.
The microprocessor can provide debouncing, but it also provides other functions as
well. An encoded keyboard uses some code to represent which switch is currently
being depressed. At the heart of the encoded keyboard is the scanned array of switches
shown in Figure (a scan-routine sketch follows below).
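A scan routine for such an array drives one row at a time and reads the columns,
encoding the pressed key as row*4 + column. A minimal sketch for a 4x4 matrix;
drive_row() and read_columns() are hypothetical port-access helpers:

    #include <stdint.h>

    /* Hypothetical port helpers: drive one row active, read 4 column bits. */
    static void    drive_row(uint8_t row) { (void)row; /* set row pin active */ }
    static uint8_t read_columns(void)     { return 0;  /* read column pins  */ }

    /* Scan a 4x4 matrix; returns the encoded key 0..15, or -1 if none pressed. */
    int scan_keypad(void)
    {
        for (uint8_t row = 0; row < 4; row++) {
            drive_row(row);
            uint8_t cols = read_columns();
            for (uint8_t col = 0; col < 4; col++)
                if (cols & (1u << col))
                    return row * 4 + col;   /* code = row*4 + column */
        }
        return -1;
    }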
Memory and IO Interfacing:
Memory Interfacing
When we are executing any instruction, we need the microprocessor to access
the memory for reading instruction codes and the data stored in the memory.
For this, both the memory and the microprocessor require some signals to read
from and write to registers.
The interfacing process includes some key factors to match with the memory
requirements and microprocessor signals. The interfacing circuit therefore
should be designed in such a way that it matches the memory signal
requirements with the signals of the microprocessor.
IO Interfacing
There are various communication devices like the keyboard, mouse, printer, etc.
So, we need to interface the keyboard and other devices with the microprocessor
by using latches and buffers. This type of interfacing is known as I/O
interfacing.
GPIB:
Advantages
Simple & standard hardware interface
Interface present on many bench instruments
Rugged connectors & cables used (although some insulation displacement
cables appear occasionally).
Possible to connect multiple instruments to a single controller
Disadvantages
Bulky connectors
Cable reliability poor - often as a result of the bulky cables.
Low bandwidth - slow compared to more modern interfaces
Basic IEEE 488 does not mandate a command language (SCPI is used in later
implementations but is not included on all instruments).
GPIB: Features
Max length of bus: 20 metres
Max individual distance between instruments: 2 metres average, 4 metres maximum in
any instance
Maximum number of instruments: 14 plus controller, i.e. 15 instruments total, with at
least two-thirds of the devices powered on
Data bus width: 8 lines
Handshake lines: 3
Bus management lines: 5
Connector: 24-pin Amphenol (typical); D-type occasionally used
Max data rate: ~1 Mbyte/sec (HS-488 allows up to ~8 Mbyte/sec)
Firewire:
IEEE 1394 is an interface standard for a serial bus for high-speed communications
and isochronous real-time data transfer. It was developed in the late 1980s and early
1990s by Apple in cooperation with a number of companies,
primarily Sony and Panasonic.
Apple called the interface FireWire. It is also known by the brands i.LINK (Sony),
and Lynx (Texas Instruments).
The copper cable used in its most common implementation can be up to 4.5 metres
(15 ft) long.
Power and data are carried over this cable, allowing devices with moderate power
requirements to operate without a separate power supply. FireWire is also available
in Cat 5 and optical fiber versions.
The 1394 interface is comparable to USB. USB was developed subsequently and
gained much greater market share.
USB requires a master controller whereas IEEE 1394 is cooperatively managed by the
connected devices.
The process of the bus deciding which node gets to transmit data at what time is known
as arbitration. Each arbitration round lasts about 125 microseconds. During the round,
the root node (device nearest the processor) sends a cycle start packet. All nodes
requiring data transfer respond, with the closest node winning. After the node is
finished, the remaining nodes take turns in order. This repeats until all the devices have
used their portion of the 125 microseconds, with isochronous transfers having priority.
Firewire: Specifications/features
FireWire can connect up to 63 peripherals in a tree or daisy-chain topology.
It allows peer-to-peer device communication — such as communication between a
scanner and a printer — to take place without using system memory or the CPU.
FireWire also supports multiple hosts per bus.
It is designed to support plug and play and hot swapping.
The copper cable it uses in its most common implementation can be up to 4.5 metres
(15 ft) long and is more flexible than most parallel SCSI cables.
In its six-conductor or nine-conductor variations, it can supply up to 45 watts of power
per port at up to 30 volts, allowing moderate-consumption devices to operate without a
separate power supply.
FireWire is capable of both asynchronous and isochronous transfer methods at once.
Firewire Application:
Consumer automobiles: IDB-1394 Customer Convenience Port (CCP) was the
automotive version of the 1394 standard.
Consumer audio and video: IEEE 1394 was the High-Definition Audio-Video Network
Alliance (HANA) standard connection interface for A/V (audio/visual) component
communication and control.
Military and aerospace vehicles: SAE Aerospace standard AS5643 originally released
in 2004 and reaffirmed in 2013 establishes IEEE-1394 standards as a military and
aerospace databus network in those vehicles.
General networking: FireWire can be used for ad hoc (terminals only, no routers except
where a FireWire hub is used) computer networks. Specifically, RFC 2734 specifies
how to run IPv4 over the FireWire interface, and RFC 3146 specifies how to run IPv6.
IIDC: IIDC (Instrumentation & Industrial Digital Camera) is the FireWire data format
standard for live video, and is used by Apple's iSight A/V camera.
DV: Digital Video (DV) is a standard protocol used by some digital camcorders. All
DV cameras that recorded to tape media had a FireWire interface.
USB:
Universal Serial Bus (USB) communications has greatly advanced the versatility of
PCs and embedded systems with external peripherals. USB provides the next-
generation interface for all devices that at one time communicated through the serial
port, parallel port, or utilized a custom connector interface to the host computer system.
USB, the Universal Serial Bus, is one of the most common interfaces for connecting a
variety of peripherals to computers, providing relatively short-range, modest levels of
data transfer.
USB interfaces are found on everything from personal computers and laptops, to
peripheral devices, mobile phones, cameras, flash memory sticks, back up hard-drives
and very many other devices. Its combination of convenience and performance has
meant that it is now one of the most widely used computer interfaces.
The Universal Serial Bus, USB provides a very simple and effective means of providing
connectivity, and as a result it is very widely used.
Whilst USB provides a sufficiently fast serial data transfer mechanism for data
communications, it is also possible to obtain power through the connector, making it
possible to power small devices directly and making USB even more convenient to
use, especially ‘on-the-go.’
USB is an asynchronous serial interconnect between a host and up to a total of 127
devices.
It utilizes a four wire shielded cable with a length of up to 5m between a host and a
device or hub.
Two of the wires are power and ground which are sourced by the host or a hub providing
limited supply power for attached devices.
All USB information is transmitted on the other two wires. These wires are a twisted
pair and the USB data is transmitted as a differential signal.
USB 1.0 was released in January 1996. In September 1998, USB 1.1 was released
which added an additional method of transferring data between the host and device.
USB 1.0 and 1.1 supported two data transfer speeds; a low speed of 1.5 Mbps and a full
speed of 12 Mbps.
USB 2.0 was released in April 2000 and added a faster data transfer speed of 480 Mbps
while retaining all the characteristics of the original USB 1.1. This introduction of the
faster transfer speed allowed devices such as external hard drives, video transport,
scanners, and the newer printers that require high data throughput to utilize USB
communications.
USB utilizes a tiered star topology in which each device is connected to hubs which
may be connected to other hubs or directly connected to the host which is the PC or
primary embedded system.
The host controls all communications on a USB bus and only one device at a time
communicates with the host. This is referred to as a “speak when spoken to”
communications protocol.
USB Data Transfer:
Control transfer: Control transfers are the only type of transfer that all USB devices
must support. A device can utilize one of the other transfer types that may be better
suited for continued communications with the host but it must support control transfers
at a minimum. Control transfers are utilized during the enumeration stage of a USB
communication. After enumeration the device can continue to utilize the control
transfer type for communications. Control transfer type supports low, full and high
transmission speeds and includes error correction.
Bulk transfer: Bulk transfers are used for moving large amounts of data between a
host and a device. Examples of devices that utilize bulk transfers are printers, scanners,
hard drives and flash drives. Bulk transfers occur on the USB bus when no other types
of transfers are occurring and transmission of the data is not time critical. Bulk transfer
type is only available for devices that support full- and high-speed USB
communications; it includes error correction.
Interrupt transfer: The interrupt transfer is not the same as the interrupts that are
present in an embedded system or PC. Interrupt transfers are utilized by devices that
require the host to periodically poll the device to see if any information is required to
be transferred. Examples of such devices are the keyboard and mouse that need to
transfer small pieces of information when a key is depressed or mouse movement is
detected. Interrupt transfers can be utilized at any of the USB bus transmission speeds
and include error correction.
Isochronous transfer: Isochronous transfer allows devices to transfer large amounts
of data in real time. It guarantees a delivery time or bandwidth for the device data to
be transmitted, but it does not incorporate any type of error correction as the other
transfer types do. Devices that utilize isochronous transfer have to be tolerant of
occasional corrupt data packets arriving at the destination. Examples of devices that
would utilize isochronous transfer would be a web camera or USB speakers.
Isochronous transfer type is only available for devices that support full- or high-speed
USB communications.
Features:
A maximum of 127 peripherals can be connected to a single USB host controller.
USB device has a maximum speed up to 480 Mbps (for USB 2.0).
Length of individual USB cable can reach up to 5 meters without a hub and 40
meters with hub.
USB acts as "plug and play" device and hot swappable.
USB devices can be powered from their own supply or from the computer. The
bus supplies power at 5 volts and can deliver up to 500 mA.
If a computer enters power-saving mode, some USB devices will automatically
switch into "sleep" mode.
USB: versions:
Data rate by version:
USB 1.0: 1.5 Mbit/s (Low Speed), 12 Mbit/s (Full Speed)
USB 1.1: 1.5 Mbit/s (Low Speed), 12 Mbit/s (Full Speed)
USB 2.0: 1.5 Mbit/s (Low Speed), 12 Mbit/s (Full Speed), 480 Mbit/s (High Speed)
USB 3.0: 5 Gbit/s (SuperSpeed)
USB 3.1: 10 Gbit/s (SuperSpeed+)
USB 3.2: 20 Gbit/s (SuperSpeed+)
USB4: 40 Gbit/s (SuperSpeed+ and Thunderbolt 3)
USB: Advantages & Limitations
IrDA:
The Infrared Data Association (IrDA) is an industry-driven interest group that was
founded in 1993 by around 50 companies.
IrDA provides specifications for a complete set of protocols for wireless infrared
communications, and the name "IrDA" also refers to that set of protocols.
The main reason for using the IrDA protocols had been wireless data transfer over the
"last one meter" using point-and-shoot principles.
Thus, it has been implemented in portable devices such as mobile telephones, laptops,
cameras, printers, and medical devices.
The main characteristics of this kind of wireless optical communication are physically
secure data transfer, line-of-sight (LOS) operation and a very low bit error rate (BER),
which makes it very efficient.
IrDA devices provide a walk-up, point-to-point method of data transfer that is adaptable
to a broad range of computing and communicating devices. The first version of the
IrDA specification (version 1.0) provides communication at data rates up to 115.2
Kbps.
Later versions (version 1.1) extended the data rate to 4 Mbps, while maintaining
backward compatibility with version 1.0 interfaces.
The protocol described here is only for 115.2 Kbps. The 4 Mbps
interface uses a pulse position modulation scheme which sends two bits per light pulse.
The IrDA standard contains three specifications. These relate to the Physical Layer, the
Link Access Protocol, and the Link Management Protocol.
The data is first encoded before being transmitted as IR pulses. As shown in Figure 2,
the serial encoding of the UART is NRZ (non return to zero) encoding. NRZ encoded
outputs do not transition during the bit period, and may remain High or Low for
consecutive bit periods.
This is not an efficient method for IR data transmission with LEDs. To limit the power
consumption of the LED, IrDA requires pulsing the LED in an RZI (return to zero,
inverted) modulation scheme so that the peak-to-average power ratio can be
increased (a framing sketch follows below).
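The framing rule is simple: in RZI, a logic 0 is sent as a short IR pulse (nominally 3/16
of a bit period at 115.2 Kbps) and a logic 1 as no pulse. A minimal sketch of sending
one UART-style frame this way; emit_pulse() and emit_idle() are hypothetical LED
drivers:

    #include <stdint.h>

    /* Hypothetical LED drivers, stubbed for illustration. */
    static void emit_pulse(void) { /* drive LED for 3/16 of a bit period */ }
    static void emit_idle(void)  { /* LED off for one bit period */ }

    /* Send one frame: start bit, 8 data bits LSB first, stop bit. */
    void irda_send_byte(uint8_t byte)
    {
        emit_pulse();                            /* start bit is a 0 -> pulse */
        for (int i = 0; i < 8; i++) {
            if (byte & (1u << i)) emit_idle();   /* 1 -> no pulse */
            else                  emit_pulse();  /* 0 -> short pulse */
        }
        emit_idle();                             /* stop bit is a 1 -> no pulse */
    }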
The mandatory IrPHY (Infrared Physical Layer Specification) is the physical layer of
the IrDA specifications. It comprises optical link definitions, modulation,
coding, cyclic redundancy check (CRC) and the framer.
Range:
Standard: 2 m;
Low-power to low-power: 0.2 m;
Standard to low-power: 0.3 m.
The 10 GigaIR proposals also define new usage models that support higher link
distances of up to several meters.
Angle: minimum cone ±15°
Speed: 2.4 kbit/s to 1 Gbit/s
Modulation: baseband, no carrier
Infrared window (part of the device body transparent to infrared light beam)
Wavelength: 850–900 nm
The mandatory IrLAP (Infrared Link Access Protocol) is the second layer of the IrDA
specifications. It lies on top of the IrPHY layer and below the IrLMP layer. It represents
the data link layer of the OSI model, providing:
Access control
Discovery of potential communication partners
Establishing of a reliable bidirectional connection
Distribution of the primary/secondary device roles
Negotiation of QoS parameters
The mandatory IrLMP (Infrared Link Management Protocol) is the third layer of the
IrDA specifications. It can be broken down into two parts. First, the LM-MUX (Link
Management Multiplexer), which lies on top of the IrLAP layer. Its most important
functions are:
Provides multiple logical channels
Allows change of primary/secondary devices
IrDA: Features
IrDA compatible for communication with other standard compliant products
1200 bps to 115.2 Kbps data rate designed for lower speed applications
3 V and 5 V operation suitable for low power and portable applications
Decodes negative or positive pulses for increased system flexibility
Basics of RTOS:
Real-time concepts.
Hard Real time and Soft Real-time.
Differences between General Purpose OS & RTOS.
Basic architecture of an RTOS.
Scheduling Systems.
Inter-process communication.
Performance metrics in scheduling models.
Interrupt management in RTOS environment.
Memory management, File systems, I/O Systems.
Advantage and disadvantage of RTOS.
Overview of Open source RTOS for Embedded systems and application
development.
RTOS: Basics
Real-time embedded systems have become pervasive. They are in your cars, cell
phones, Personal Digital Assistants (PDAs), watches, televisions, and home electrical
appliances. There are also larger and more complex real-time embedded systems, such
as air-traffic control systems, industrial process control systems, networked multimedia
systems, and real-time database applications. In fact, our daily life has become more
and more dependent on real-time embedded applications.
An embedded system is a microcomputer system embedded in a larger system and
designed for one or a few dedicated functions. It is embedded as part of a complete device
that often has hardware and mechanical parts. Examples include the controllers built
inside our home electrical appliances. Most embedded systems have real-time
computing constraints. Therefore, they are also called real-time embedded systems.
Compared with general-purpose computing systems that have multiple functionalities,
embedded systems are often dedicated to specific tasks. For example, the embedded
airbag control system is only responsible for detecting collision and inflating the airbag
when necessary, and the embedded controller in an air conditioner is only responsible
for monitoring and regulating the temperature of a room.
Real-time systems, however, are required to compute and deliver correct results within
a specified period of time. In other words, a job of a real-time system has a deadline,
be it hard or soft.
Real-Time: Real-time indicates an expected response or reaction to an event at the
instant of its occurrence. The expected response concerns the logical correctness of the
result produced; the instant of the event’s occurrence sets the deadline for producing
the result.
Operating System (OS) is a system program that provides an interface between
hardware and application programs. OS is commonly equipped with features like:
Multitasking, Synchronization, Interrupt and Event Handling, Input/ Output, Inter-task
Communication, Timers and Clocks and Memory Management to fulfill its primary
role of managing the hardware resources to meet the demands of application programs.
An RTOS is therefore an operating system that supports real-time applications and
embedded systems by providing logically correct results within the deadlines required.
Such capabilities define its deterministic timing behavior and limited resource
utilization nature.
DIFFERENCE: RTOS v/s General Purpose OS
Determinism – The key difference between general-computing operating systems and
real-time operating systems is the “deterministic ” timing behavior in the real-time
operating systems. “Deterministic” timing means that OS services consume only known
and expected amounts of time. RTOSes have their worst-case latency defined; latency is
not a concern for a general purpose OS.
Task Scheduling – General purpose operating systems are optimized to run a variety
of applications and processes simultaneously, thereby ensuring that all tasks receive at
least some processing time. As a consequence, low-priority tasks may have their
priority boosted above other higher priority tasks, which the designer may not want.
However, RTOS uses priority-based preemptive scheduling, which allows high-priority
threads to meet their deadlines consistently. All system calls are deterministic, implying
time bounded operation for all operations and ISRs. This is important for embedded
systems where delay could cause a safety hazard. The scheduling in RTOS is time
based. In case of General purpose OS, like Windows/Linux, scheduling is process
based.
Preemptive kernel – In RTOS, all kernel operations are preemptible
Priority Inversion – RTOS have mechanisms to prevent priority inversion
Usage – RTOSes are typically used for embedded applications, while general purpose
OSes are used for desktop PCs and other general purpose computers.
RTOS Characteristics:
Reliability: Embedded systems must be reliable. Depending on the application, the
system might need to operate for long periods without human intervention. Different
degrees of reliability may be required. For example, a digital solar-powered calculator
might reset itself if it does not get enough light, yet the calculator might still be
considered acceptable. On the other hand, a telecom switch cannot reset during
operation without incurring high associated costs for down time. The RTOSes in these
applications require different degrees of reliability. Although different degrees of
reliability might be acceptable, in general, a reliable system is one that is available
(continues to provide service) and does not fail.
Predictability: Because many embedded systems are also real-time systems, meeting
time requirements is key to ensuring proper operation. The RTOS used in this case
needs to be predictable to a certain degree. The term deterministic describes RTOSes
with predictable behavior, in which the completion of operating system calls occurs
within known timeframes. Developers can write simple benchmark programs to
validate the determinism of an RTOS. The result is based on timed responses to specific
RTOS calls. In a good deterministic RTOS, the variance of the response times for each
type of system call is very small.
Performance: This requirement dictates that an embedded system must perform fast
enough to fulfill its timing requirements. Typically, the more deadlines to be met, and
the shorter the time between them, the faster the system's CPU must be. Although
underlying hardware can dictate a system's processing power, its software can also
contribute to system performance. Typically, the processor's performance is expressed
in million instructions per second (MIPS). Throughput also measures the overall
performance of a system, with hardware and software combined. One definition of
throughput is the rate at which a system can generate output based on the inputs coming
in. Throughput also means the amount of data transferred divided by the time taken to
transfer it. Data transfer throughput is typically measured in multiples of bits per second
(bps). Sometimes developers measure RTOS performance on a call-by-call basis.
Benchmarks are written by producing timestamps when a system call starts and when
it completes. Although this step can be helpful in the analysis stages of design, true
performance testing is achieved only when the system performance is measured as a
whole.
Compactness: Application design constraints and cost constraints help determine how
compact an embedded system can be. For example, a cell phone clearly must be small,
portable, and low cost. These design requirements limit system memory, which in turn
limits the size of the application and operating system. In such embedded systems,
where hardware real estate is limited due to size and costs, the RTOS clearly must be
small and efficient. In these cases, the RTOS memory footprint can be an important
factor. To meet total system requirements, designers must understand both the static
and dynamic memory consumption of the RTOS and the application that will run on it.
Scalability: Because RTOSes can be used in a wide variety of embedded systems, they
must be able to scale up or down to meet application-specific requirements. Depending
on how much functionality is required, an RTOS should be capable of adding or
deleting modular components, including file systems and protocol stacks. If an RTOS
does not scale up well, development teams might have to buy or build the missing
pieces. Suppose that a development team wants to use an RTOS for the design of a
cellular phone project and a base station project. If an RTOS scales well, the same
RTOS can be used in both projects, instead of two different RTOSes, which saves
considerable time and money.
RTOS CLASSIFICATION:
An RTOS specifies a known maximum time for each of the operations that it performs.
Based upon the degree of tolerance in meeting deadlines, RTOSes are classified into the
following categories:
Hard real-time: Degree of tolerance for missed deadlines is negligible. A missed
deadline can result in catastrophic failure of the system
Firm real-time: Missing a deadline might result in an unacceptable quality
reduction but may not lead to failure of the complete system
Soft real-time: Deadlines may be missed occasionally, but system doesn’t fail and
also, system quality is acceptable
For a life-saving device, such as an automatic parachute-opening device for skydivers,
delay can be fatal. The parachute-opening device deploys the parachute at a specific
altitude based on various conditions. If it fails to respond in the specified time, the
parachute may not get deployed at all, leading to casualty. A similar situation exists
during the inflation of air bags, used in cars, at the time of an accident. If the airbags
don’t inflate at the appropriate time, it may be fatal for the driver. Such systems must
therefore be hard real-time systems, whereas for a live TV broadcast a delay can be
acceptable; in such cases, soft real-time systems can be used.
RTOS: Architecture
The architecture of an RTOS is dependent on the complexity of its deployment. Good RTOSs
are scalable to meet different sets of requirements for different applications. For simple
applications, an RTOS usually comprises only a kernel. For more complex embedded systems,
an RTOS can be a combination of various modules, including the kernel, networking protocol
stacks, and other components as illustrated in Figure:
Kernel:
For simpler applications, an RTOS usually comprises only a kernel, but as complexity
increases, various modules such as networking protocol stacks, debugging facilities and
device I/O are included in addition to the kernel. The RTOS kernel acts as an abstraction
layer between the hardware and the applications. There are several broad categories of
kernels:
Monolithic kernel: Monolithic kernels are part of Unix-like operating systems like
Linux, FreeBSD etc. A monolithic kernel is one single program that contains all of the
code necessary to perform every kernel related task. It runs all basic system services
(i.e. process and memory management, interrupt handling and I/O communication, file
system, etc.) and provides powerful abstractions of the underlying hardware. The amount
of context switching and messaging involved is greatly reduced, which makes it run
faster than a microkernel.
Microkernel: It runs only basic process communication (messaging) and I/O control.
It normally provides only the minimal services such as managing memory protection,
Inter process communication and the process management. The other functions such as
running the hardware processes are not handled directly by microkernels. Thus, micro
kernels provide a smaller set of simple hardware abstractions. A microkernel is more stable than
a monolithic kernel because the kernel remains unaffected even if a server (e.g., the file
system) fails. Microkernels are part of operating systems such as AIX, BeOS, Mach, Mac
OS X, MINIX, and QNX.
Hybrid Kernel: Hybrid kernels are extensions of microkernels with some properties
of monolithic kernels. Hybrid kernels are similar to microkernels, except that they
include additional code in kernel space so that such code can run more swiftly than it
would were it in user space. Hybrid kernels are part of operating systems such as Microsoft
Windows NT, 2000, and XP, and DragonFly BSD.
Exokernel: Exokernels provide efficient control over hardware. An exokernel runs only services
protecting the resources (i.e. tracking the ownership, guarding the usage, revoking
access to resources, etc) by providing low-level interface for library operating systems
and leaving the management to the application.
Task Management:
In an RTOS, the application is decomposed into small, schedulable, sequential
program units known as tasks. A task is the basic unit of execution and is governed by three
time-critical properties: release time, deadline, and execution time.
Release time refers to the point in time from which the task can be executed.
Deadline is the point in time by which the task must complete.
Execution time denotes the time the task takes to execute.
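These three properties can be pictured as a small C structure that a scheduler might keep per task (a minimal sketch; the field names are illustrative, not taken from any particular RTOS):

    #include <stdint.h>

    /* Per-task timing properties, expressed in scheduler ticks. */
    typedef struct {
        uint32_t release_time;   /* earliest point the task may start executing */
        uint32_t deadline;       /* point by which the task must complete       */
        uint32_t execution_time; /* time the task takes to execute (worst case) */
    } task_timing_t;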
Intertask Communication:
In the multi-tasking model, each task is a quasi-independent program.
Although tasks in an embedded application have a degree of independence, it does not
mean that they have no “awareness” of one another. Although some tasks will be truly
isolated from others, the requirement for communication and synchronization between
tasks is very common.
This represents a key part of the functionality provided by an RTOS. The actual range
of options offered by different RTOSes varies quite widely.
Task-owned facilities – attributes that an RTOS imparts to tasks to provide
communication (input) facilities. The example examined in more detail below is signals.
Kernel objects – facilities provided by the RTOS which represent stand-alone
communication or synchronization facilities. Examples include: event flags, mailboxes,
queues/pipes, semaphores and mutexes.
Message passing – a rationalized scheme where an RTOS allows the creation of
message objects, which may be sent from one task to another or to several others. This
is fundamental to the kernel design and leads to the description of such a product as
being a “message passing RTOS”.
Signals- Signals are probably the simplest inter-task communication facility offered in
conventional RTOSes. They consist of a set of bit flags – there may be 8, 16 or 32,
depending on the specific implementation – which is associated with a specific task.
A signal flag (or several flags) may be set by any task using an OR type of operation.
Only the task that owns the signals can read them. The reading process is generally
destructive – i.e. the flags are also cleared.
In some systems, signals are implemented in a more sophisticated way such that a
special function – nominated by the signal owning task – is automatically executed
when any signal flags are set. This removes the necessity for the task to monitor the
flags itself. This is somewhat analogous to an interrupt service routine.
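As a concrete sketch, FreeRTOS direct-to-task notifications can be used like the task-owned signal flags described above (an illustration with made-up flag names, not the only way signals appear in RTOSes): the eSetBits action ORs flags in, and the wait clears them, i.e. a destructive read.

    #include <stdint.h>
    #include "FreeRTOS.h"
    #include "task.h"

    #define SIG_SENSOR_READY (1u << 0)

    /* Any task may OR a flag into the owning task's notification value. */
    void raise_signal(TaskHandle_t owner)
    {
        xTaskNotify(owner, SIG_SENSOR_READY, eSetBits);
    }

    /* Only the owning task reads its signals; the read clears all flags. */
    void owner_task(void *arg)
    {
        uint32_t flags;
        for (;;) {
            xTaskNotifyWait(0, UINT32_MAX, &flags, portMAX_DELAY);
            if (flags & SIG_SENSOR_READY) {
                /* handle the event */
            }
        }
    }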
Event Flag Groups: Event flag groups are like signals in that they are a bit-oriented
inter-task communication facility. They may similarly be implemented in groups of 8,
16 or 32 bits. They differ from signals in being independent kernel objects; they do not
“belong” to any specific task.
Any task may set and clear event flags using OR and AND operations. Likewise, any
task may interrogate event flags using the same kind of operation. In many RTOSes, it
is possible to make a blocking API call on an event flag combination; this means that a
task may be suspended until a specific combination of event flags has been set. There
may also be a “consume” option available, when interrogating event flags, such that all
read flags are cleared.
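A minimal sketch with FreeRTOS event groups (flag names invented for illustration): the xClearOnExit argument implements the "consume" option, and xWaitForAllBits selects AND rather than OR interrogation.

    #include "FreeRTOS.h"
    #include "event_groups.h"

    #define EVT_RX_DONE (1u << 0)
    #define EVT_TX_DONE (1u << 1)

    static EventGroupHandle_t evts;  /* independent kernel object, owned by no task */

    void init_events(void) { evts = xEventGroupCreate(); }

    void waiter_task(void *arg)
    {
        for (;;) {
            /* Suspend until BOTH flags are set, then clear (consume) them. */
            xEventGroupWaitBits(evts, EVT_RX_DONE | EVT_TX_DONE,
                                pdTRUE,        /* xClearOnExit: consume */
                                pdTRUE,        /* xWaitForAllBits: AND  */
                                portMAX_DELAY);
            /* both events have occurred here */
        }
    }

    /* Any other task simply calls: xEventGroupSetBits(evts, EVT_RX_DONE); */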
Semaphores: Semaphores are independent kernel objects, which provide a flagging
mechanism that is generally used to control access to a resource. There are broadly two
types: binary semaphores (that just have two states) and counting semaphores (that have
an arbitrary number of states). Some processors support (atomic) instructions that
facilitate the easy implementation of binary semaphores. Binary semaphores may also
be viewed as counting semaphores with a count limit of 1.
Any task may attempt to obtain a semaphore in order to gain access to a resource. If the
current semaphore value is greater than 0, the obtain will be successful, which
decrements the semaphore value. In many OSes, it is possible to make a blocking call
to obtain a semaphore; this means that a task may be suspended until the semaphore is
released by another task. Any task may release a semaphore, which increments its
value.
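For instance, a counting semaphore guarding a pool of two identical resources might look like this in FreeRTOS (a sketch; the pool and task names are invented):

    #include "FreeRTOS.h"
    #include "semphr.h"

    static SemaphoreHandle_t pool;

    void init_pool(void)
    {
        pool = xSemaphoreCreateCounting(2, 2);  /* max count 2, initially 2 free */
    }

    void worker_task(void *arg)
    {
        for (;;) {
            /* Blocking obtain: suspends until the count is greater than 0. */
            if (xSemaphoreTake(pool, portMAX_DELAY) == pdTRUE) { /* count-- */
                /* ... use one of the two resources ... */
                xSemaphoreGive(pool);                            /* count++ */
            }
        }
    }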
Mailboxes: Mailboxes are independent kernel objects, which provide a means for tasks
to transfer messages. The message size depends on the implementation, but will
normally be fixed. One to four pointer-sized items are typical message sizes.
Commonly, a pointer to some more complex data is sent via a mailbox. Some kernels
implement mailboxes so that the data is just stored in a regular variable and the kernel
manages access to it. Mailboxes may also be called “exchanges”, though this name is
now uncommon.
Any task may send to a mailbox, which is then full. If a task then tries to send to a full
mailbox, it will receive an error response. In many RTOSes, it is possible to make a
blocking call to send to a mailbox; this means that a task may be suspended until the
mailbox is read by another task. Any task may read from a mailbox, which renders it
empty again. If a task tries to read from an empty mailbox, it will receive an error
response. In many RTOSes, it is possible to make a blocking call to read from a
mailbox; this means that a task may be suspended until the mailbox is filled by another
task.
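In kernels without a distinct mailbox object, the same behavior is commonly obtained with a queue of depth 1 carrying one pointer-sized message; a FreeRTOS-flavored sketch (helper names invented):

    #include "FreeRTOS.h"
    #include "queue.h"

    static QueueHandle_t mbox;

    void mbox_init(void) { mbox = xQueueCreate(1, sizeof(void *)); }

    /* Non-blocking send: fails immediately if the mailbox is already full. */
    int mbox_send(void *msg)
    {
        return (xQueueSend(mbox, &msg, 0) == pdTRUE) ? 0 : -1;
    }

    /* Blocking read: suspends until another task fills the mailbox. */
    void *mbox_read(void)
    {
        void *msg = NULL;
        xQueueReceive(mbox, &msg, portMAX_DELAY); /* mailbox is empty again */
        return msg;
    }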
Queues: Queues are independent kernel objects that provide a means for tasks to
transfer messages. They are a little more flexible and complex than mailboxes. The
message size depends on the implementation, but will normally be a fixed size and
word/pointer oriented.
Any task may send to a queue and this may occur repeatedly until the queue is full, after
which time any attempts to send will result in an error. The depth of the queue is
generally user specified when it is created or the system is configured. In many
RTOSes, it is possible to make a blocking call to send to a queue; this means that, if the
queue is full, a task may be suspended until the queue is read by another task. Any task
may read from a queue. Messages are read in the same order as they were sent – first
in, first out (FIFO). If a task tries to read from an empty queue, it will receive an error
response. In many RTOSes, it is possible to make a blocking call to read from a queue;
this means that, if the queue is empty, a task may be suspended until a message is sent
to the queue by another task.
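A producer/consumer pair over a FreeRTOS queue illustrates the blocking FIFO behavior just described (a sketch; the message type and queue depth are arbitrary):

    #include <stdint.h>
    #include "FreeRTOS.h"
    #include "queue.h"

    typedef struct { uint8_t sensor_id; uint16_t value; } reading_t;

    static QueueHandle_t q;

    void q_init(void) { q = xQueueCreate(8, sizeof(reading_t)); } /* depth fixed at creation */

    void producer_task(void *arg)
    {
        reading_t r = { .sensor_id = 1, .value = 42 };
        for (;;) {
            xQueueSend(q, &r, portMAX_DELAY);    /* blocks while the queue is full */
        }
    }

    void consumer_task(void *arg)
    {
        reading_t r;
        for (;;) {
            xQueueReceive(q, &r, portMAX_DELAY); /* blocks while empty; FIFO order */
            /* ... process r ... */
        }
    }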
Mutexes: Mutual exclusion semaphores – mutexes – are independent kernel objects,
which behave in a very similar way to normal binary semaphores. They are slightly
more complex and incorporate the concept of temporary ownership (of the resource,
access to which is being controlled). If a task obtains a mutex, only that same task can
release it again – the mutex (and, hence, the resource) is temporarily owned by the task.
Mutexes are not provided by all RTOSes, but it is quite straightforward to adapt a
regular binary semaphore. It would be necessary to write a “mutex obtain” function,
which obtains the semaphore and notes the task identifier. Then a complementary
“mutex release” function would check the calling task’s identifier and release the
semaphore only if it matches the stored value, otherwise it would return an error.
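That adaptation might look like the following sketch (FreeRTOS types are used for concreteness; the wrapper itself is hypothetical and omits protection of the owner field for simplicity):

    #include "FreeRTOS.h"
    #include "task.h"
    #include "semphr.h"

    typedef struct {
        SemaphoreHandle_t sem;    /* underlying binary semaphore          */
        TaskHandle_t      owner;  /* task that currently owns the "mutex" */
    } soft_mutex_t;

    int soft_mutex_obtain(soft_mutex_t *m, TickType_t timeout)
    {
        if (xSemaphoreTake(m->sem, timeout) != pdTRUE)
            return -1;                            /* timed out           */
        m->owner = xTaskGetCurrentTaskHandle();   /* note the owner      */
        return 0;
    }

    int soft_mutex_release(soft_mutex_t *m)
    {
        if (m->owner != xTaskGetCurrentTaskHandle())
            return -1;                            /* caller is not owner */
        m->owner = NULL;
        xSemaphoreGive(m->sem);
        return 0;
    }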
Inter-process Communication:
Inter Process Communication (IPC) is a mechanism by which one process communicates
with another process, usually within a single system.
Communication can be of two types −
Between related processes initiating from only one process, such as parent and child
processes.
Between unrelated processes, or two or more different processes.
Following are some important terms that we need to know before proceeding further on this
topic.
Pipes − Communication between two related processes. The mechanism is half duplex,
meaning the first process communicates with the second process. To achieve full
duplex, i.e., for the second process to communicate with the first process, another pipe
is required (see the sketch after this list).
FIFO − Communication between two unrelated processes. FIFO is a full duplex,
meaning the first process can communicate with the second process and vice versa at
the same time.
Message Queues − Communication between two or more processes with full duplex
capacity. The processes will communicate with each other by posting a message and
retrieving it out of the queue. Once retrieved, the message is no longer available in the
queue.
Shared Memory − Communication between two or more processes achieved
through a shared piece of memory among all processes. Access to the shared memory
must be synchronized so that the processes are protected from each other.
Semaphores − Semaphores are meant for synchronizing access among multiple processes.
When one process wants to access the memory (for reading or writing), the memory must be
locked (or protected), then released when the access is complete. This must be
repeated by every process to keep the data safe.
Signals − A signal is a mechanism for communication between multiple processes by way
of signaling. A source process sends a signal (recognized by number)
and the destination process handles it accordingly.
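The pipe mechanism, for example, looks like this with the POSIX API (a minimal sketch for related processes; error handling is trimmed):

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        int fd[2];
        if (pipe(fd) == -1)          /* fd[0] = read end, fd[1] = write end */
            return 1;

        if (fork() == 0) {           /* child: the reading process */
            char buf[32] = {0};
            close(fd[1]);
            read(fd[0], buf, sizeof(buf) - 1);
            printf("child received: %s\n", buf);
            return 0;
        }

        close(fd[0]);                /* parent: the writing process */
        write(fd[1], "hello", 6);    /* half duplex: one direction only */
        close(fd[1]);
        return 0;
    }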
Interrupt Management:
• An interrupt is a signal from a device attached to a computer or from a running process
within the computer, indicating an event that needs immediate attention. The processor
responds by suspending its current activity, saving its state, and executing a function
called an interrupt handler (also called interrupt service routine, ISR) to deal with the
event.
• Modern OSs are interrupt-driven. Virtually all activities are initiated by the arrival of
interrupts. An interrupt transfers control to the ISR, through the interrupt vector, which
contains the addresses of all the service routines. Interrupt architecture must save the
address of the interrupted instruction.
• Incoming interrupts are disabled while another interrupt is being processed. A system
call is a software-generated interrupt caused either by an error or by a user request.
Memory Management:
Main memory is the most critical resource in a computer system in terms of speed at
which programs run. The kernel of an OS is responsible for all system memory that is
currently in use by programs. Entities in memory are data and instructions.
Each memory location has a physical address. In most computer architectures, memory
is byte-addressable, meaning that data can be accessed 8 bits at a time, irrespective of
the width of the data and address buses. Memory addresses are fixed-length sequences
of digits. In general, only system software such as Basic Input/ Output System (BIOS)
and OS can address physical memory.
Most application programs do not have knowledge of physical addresses. Instead, they
use logical addresses. A logical address is the address at which a memory location
appears to reside from the perspective of an executing application program. A logical
address may be different from the physical address due to the operation of an address
translator or mapping function.
In a computer supporting virtual memory, the term physical address is used mostly to
differentiate from a virtual address. In particular, in computers utilizing a memory
management unit (MMU) to translate memory addresses, the virtual and physical
addresses refer to an address before and after translation performed by the MMU,
respectively. There are several reasons to use virtual memory.
Among them is memory protection. If two or more processes are running at the same
time and use direct addresses, a memory error in one process (e.g., reading a bad
pointer) could destroy the memory being used by the other process, taking down
multiple programs due to a single crash. The virtual memory technique, on the other
hand, can ensure that each process is running in its own dedicated address space.
File Management:
Files are the fundamental abstraction of secondary storage devices. Each file is a named
collection of data stored in a device. An important component of an OS is the file
system, which provides capabilities of file management, auxiliary storage management,
file access control, and integrity assurance.
File management is concerned with providing the mechanisms for files to be stored,
referenced, shared, and secured. When a file is created, the file system allocates an
initial space for the data. Subsequent incremental allocations follow as the file grows.
When a file is deleted or its size is shrunk, the space that is freed up is considered
available for use by other files. This creates alternating used and unused areas of various
sizes.
When a file is created and there is not an area of contiguous space available for its initial
allocation, the space must be assigned in fragments. Because files do tend to grow or
shrink over time, and because users rarely know in advance how large their files will
be, it makes sense to adopt noncontiguous storage allocation schemes.
Figure illustrates the block chaining scheme. The initial address of storage of a file is
identified by its file name. Typically, files on a computer are organized into directories,
which constitute a hierarchical system of tree structure.
I/O Management:
Modern computers interact with a wide range of I/O devices. Keyboards, mice, printers,
disk drives, USB drives, monitors, networking adapters, and audio systems are among
the most common ones. One purpose of an OS is to hide peculiarities of hardware I/O
devices from the user.
In memory-mapped I/O, each I/O device occupies some locations in the I/O address
space. Communication between the I/O device and the processor is enabled through
physical memory locations in the I/O address space. By reading from or writing to those
addresses, the processor gets information from or sends commands to I/O devices.
Most systems use device controllers. A device controller is primarily an interface unit.
The OS communicates with the I/O device through the device controller. Nearly all
device controllers have direct memory access (DMA) capability, meaning that they can
directly access the memory in the system, without the intervention by the processor.
This frees up the processor of the burden of data transfer from and to I/O devices.
• Interrupts allow devices to notify the processor when they have data to transfer or when
an operation is complete, allowing the processor to perform other duties when no I/O
transfers need its immediate attention. The processor has the interrupt request line that
it senses after executing every instruction. When a device controller raises an interrupt
by asserting a signal on the interrupt request line, the processor catches it, saves the
state, and then transfers control to the interrupt handler. The interrupt handler determines
the cause of the interrupt, performs the necessary processing, and executes a
return-from-interrupt instruction to return control to the interrupted program.
• I/O operations often have high latencies. Most of this latency is due to the slow speed
of peripheral devices. For example, information cannot be read from or written to a
hard disk until the spinning of the disk brings the target sectors directly under the
read/write head. The latency can be alleviated by having one or more input and output
buffers associated with each device.
RTOS: Performance Metrics:
Memory – how much ROM and RAM does the kernel need and how is this affected by
options and configuration?
ROM, which is normally flash memory, is used for the kernel, along with code for the
runtime library and any middleware components. This code – or parts of it – may be
copied to RAM on boot up, as this can offer improved performance. There is also likely
to be some read only data. If the kernel is statically configured, this data will include
extensive information about kernel objects. However, nowadays, most kernels are
dynamically configured.
RAM space will be used for kernel data structures, including some or all of the kernel
object information, again depending upon whether the kernel is statically or
dynamically configured. There will also be some global variables. If code is copied
from flash to RAM, that space must also be accounted for.
There are a number of factors that affect the memory footprint of an RTOS. The CPU
architecture is key. The number of instructions can vary drastically from one processor
to another, so looking at size figures for, say, PowerPC gives no indication of what the
ARM version might be like.
Embedded compilers generally have a large number of optimization settings. These can
be used to reduce code size, but that will most likely affect performance. Optimizations
affect ROM footprint, and also RAM. Data size can also be affected by optimization,
as data structures can be packed or unpacked. Again both ROM and RAM can be
affected. Packing data has an adverse effect on performance.
Latency, which is broadly the delay between something happening and the response to
that occurrence. This is a particular minefield of terminology and misinformation, but
there are two essential latencies to consider: interrupt response and task scheduling.
• Interrupt Latency: The time related performance measurements are probably of most
concern to developers using an RTOS. A key characteristic of a real time system is its
timely response to external events and an embedded system is typically notified of an
event by means of an interrupt, so interrupt latency is critical.
• System: the total delay between the interrupt signal being asserted and the start of the
interrupt service routine execution.
• OS: the time between the CPU interrupt sequence starting and the initiation of the ISR.
This is really the operating system overhead, but many people refer to it as the latency.
This means that some vendors claim zero interrupt latency.
• Interrupt latency is the sum of the hardware dependent time, which depends on the
interrupt controller as well as the type of the interrupt, and the OS induced overhead.
• Scheduling latency: A key part of the functionality of an RTOS is its ability to support
a multi-threading execution environment. Being real time, the efficiency at which
threads or tasks are scheduled is of some importance and the scheduler is at the core of
an RTOS. It is hard to get a clear picture of performance, as there is a wide variation in
the techniques employed to make measurements and in the interpretation of the results.
• There are really two separate measurements to consider:
– The context switch time
– The time overhead that the RTOS introduces when scheduling a task
• The scheduling latency is the maximum of two times: τSO, the scheduling overhead
(from the end of the ISR to the start of the task schedule), and τCS, the time taken to
save and restore the thread context.
Performance of kernel services. How long does it take to perform specific actions?
• Timing kernel services
• An RTOS is likely to have a great many system service API (application program
interface) calls, probably numbering into the hundreds. To assess timing, it is not useful
to try to analyze the timing of every single call. It makes more sense to focus on the
frequently used services.
• For most RTOSes, there are four key categories of service call:
– Threading services
– Synchronization services
– Inter-process communication services
– Memory services
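A rough way to time one such service is to bracket it with a cycle counter; the sketch below assumes a Cortex-M target whose CMSIS device header exposes the DWT cycle counter, and the semaphore being timed is arbitrary:

    #include "FreeRTOS.h"
    #include "semphr.h"
    /* plus the CMSIS device header for the target, which defines DWT */

    uint32_t cycles_for_give_take(SemaphoreHandle_t sem)
    {
        CoreDebug->DEMCR |= CoreDebug_DEMCR_TRCENA_Msk; /* enable trace unit   */
        DWT->CYCCNT = 0;
        DWT->CTRL |= DWT_CTRL_CYCCNTENA_Msk;            /* start cycle counter */

        uint32_t start = DWT->CYCCNT;
        xSemaphoreGive(sem);            /* one synchronization service call */
        xSemaphoreTake(sem, 0);         /* and its counterpart              */
        return DWT->CYCCNT - start;     /* cycles for one give + take pair  */
    }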
RTOS: Open Source:
Name: FreeRTOS
Platforms: MSP430, ARM, AVR, ColdFire, PIC, x86
Built-in Components: FileSystem, Network, TLS/SSL, Command Line Interface, Runtime Analysis
Description: FreeRTOS is a market-leading RTOS from Amazon Web Services that supports more
than 35 architectures. It is distributed under the MIT license.

Name: RIOT
Platforms: MSP430, ARM, AVR, MIPS, RISC-V
Built-in Components: BLE, LoRaWAN, FileSystem, Network, 6LoWPAN, GUI, TLS/SSL, USB Device, OTA
Description: RIOT is a real-time multi-threading operating system that supports a range of
devices that are typically found in the Internet of Things (IoT): 8-bit, 16-bit and 32-bit
microcontrollers.

Name: TinyOS
Platforms: MSP430, AVR
Built-in Components: (none listed)
Description: TinyOS is an open source, BSD-licensed operating system designed for low-power
wireless devices, such as those used in sensor networks, ubiquitous computing, personal area
networks, smart buildings, and smart meters.
Applications of RTOS:
Real-time systems are used in:
Airlines reservation system.
Air traffic control system.
Systems that provide immediate updating.
Used in any system that provides up to date and minute information on stock prices.
Defense application systems like RADAR.
Networked Multimedia Systems
Command Control Systems
Internet Telephony
Anti-lock Brake Systems
Heart Pacemaker
Advantages of RTOS:
An RTOS makes maximum use of the system's devices and resources, so it yields more
output from all the resources.
While a task is running there is little chance of error, since these systems are designed
to be error-free.
Memory allocation is well managed in this type of system.
Task-shifting (context-switch) time in this type of system is very small.
Because of the small size of its programs, an RTOS is used in embedded systems such as
transport and others.
Disadvantages of RTOS:
An RTOS can run only a limited number of tasks together; it concentrates on a few
applications in order to avoid errors.
Because an RTOS concentrates on a few tasks, it is hard for these systems to do heavy
multi-tasking.
Specific drivers are required so that the RTOS can offer fast response times to
interrupt signals, which helps maintain its speed.
Plenty of resources are used by an RTOS, which makes the system expensive.
Tasks with a low priority may need to wait a long time, as the RTOS maintains the
accuracy of the programs that are under execution.
Minimal switching of tasks is done in real-time operating systems.
It uses complex algorithms that are difficult to understand.
An RTOS uses a lot of resources, which is sometimes not suitable for the system.
Chapter 2: Communication Interfaces
Modes of communication: Serial/Parallel, Synchronous/Asynchronous
Onboard Communication Interfaces: I²C, CAN, SPI, PCI
External Communication Interfaces: RS232, USB
Wireless Communication Interfaces: IrDA, Bluetooth, Zigbee
Communication Interfaces:
Digital communication can be considered as the communication happening between two (or
more) devices in terms of bits. This transferring of data, either wirelessly or through wires, can
be either one bit at a time or the entire data (depending on the size of the processor inside i.e.,
8 bit, 16 bit etc.) at once. Based on this, we can have the following classification namely, Serial
Communication and Parallel Communication.
Serial Communication implies transferring data bit by bit, sequentially. This is the most
common form of communication used in the digital world. Contrary to parallel
communication, serial communication needs only one line for the data transfer. Thereby, the
cost of the communication line as well as the space required is reduced.
Parallel communication implies transferring several bits in a parallel fashion at a time. This
communication comes to the rescue when speed rather than space is the main objective. The
transfer of data is at high speed, owing to the fact that no bus buffer is present.
Serial/ Parallel Communication:
For an 8-bit data transfer in serial communication, one bit is sent at a time. The
entire data is first fed into the serial port buffer, and from this buffer one bit is sent at
a time. Only after the last bit is received can the transferred data be forwarded for
processing.
In parallel communication, a serial port buffer is not required. As many bus lines as the
data is wide are available, plus a synchronization line for synchronized transmission of
data.
Thus we can state that, for the same frequency of data transmission, serial
communication is slower than parallel communication.
Serial Transmission: data (bits) flows in both directions over a single line; it is generally
used for long distances; the circuit used is simple.
Parallel Transmission: data flows over multiple lines at once; it is generally used for short
distances; the circuit used is relatively complex.
Asynchronous transmission works in spurts and must insert a start bit before each
data character and a stop bit at its termination to inform the receiver where it begins
and ends.
The term asynchronous is used to describe the process where transmitted data is
encoded with start and stop bits, specifying the beginning and end of each character.
These additional bits provide the timing or synchronization for the connection by
indicating when a complete character has been sent or received; thus, timing for each
character begins with the start bit and ends with the stop bit.
When gaps appear between character transmissions, the asynchronous line is said to be
in a mark state. A mark is a binary 1 (or negative voltage) that is sent during periods of
inactivity on the line as shown in the following figure.
The following is a list of characteristics specific to asynchronous communication:
Each character is preceded by a start bit and followed by one or more stop bits.
Gaps or spaces between characters may exist.
With asynchronous transmission, a large text document is organized into long strings
of letters (or characters) that make up the words within the sentences and paragraphs.
These characters are sent over the communication link one at a time and reassembled
at the remote location.
Asynchronous transmission is used commonly for communications over telephone
lines.
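The start/stop framing described above can be made concrete with a bit-banged transmit routine; this sketch assumes hypothetical board-specific helpers set_tx_pin() and bit_delay() (one bit period at the chosen baud rate):

    #include <stdint.h>

    extern void set_tx_pin(int level); /* drive the TX line (assumed helper) */
    extern void bit_delay(void);       /* wait one bit period (assumed)      */

    /* 1 start bit, 8 data bits LSB first, 1 stop bit: the common "8N1" frame. */
    void uart_send_byte(uint8_t b)
    {
        set_tx_pin(0);                 /* start bit marks the beginning      */
        bit_delay();
        for (int i = 0; i < 8; i++) {
            set_tx_pin((b >> i) & 1);  /* data bits, least significant first */
            bit_delay();
        }
        set_tx_pin(1);                 /* stop bit; line then idles at mark  */
        bit_delay();
    }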
I²C (Inter-Integrated Circuit): I²C uses two lines:
SDA (Serial Data) – The line for the master and slave to send and receive data.
SCL (Serial Clock) – The line that carries the clock signal.
With I2C, data is transferred in messages. Messages are broken up into frames of data.
Each message has an address frame that contains the binary address of the slave, and
one or more data frames that contain the data being transmitted.
The message also includes start and stop conditions, read/write bits, and ACK/NACK
bits between each data frame:
Start Condition: The SDA line switches from a high voltage level to a low voltage
level before the SCL line switches from high to low.
Stop Condition: The SDA line switches from a low voltage level to a high voltage
level after the SCL line switches from low to high.
Address Frame: A 7 or 10 bit sequence unique to each slave that identifies the slave
when the master wants to talk to it.
Read/Write Bit: A single bit specifying whether the master is sending data to the slave
(low voltage level) or requesting data from it (high voltage level).
ACK/NACK Bit: Each frame in a message is followed by an acknowledge/no-
acknowledge bit. If an address frame or data frame was successfully received, an ACK
bit is returned to the sender from the receiving device.
The master sends the start condition to every connected slave by switching the SDA
line from a high voltage level to a low voltage level before switching the SCL line from
high to low:
The master sends each slave the 7 or 10 bit address of the slave it wants to communicate
with, along with the read/write bit.
Each slave compares the address sent from the master to its own address. If the address
matches, the slave returns an ACK bit by pulling the SDA line low for one bit. If the
address from the master does not match the slave’s own address, the slave leaves the
SDA line high.
The master sends or receives the data frame:
After each data frame has been transferred, the receiving device returns another ACK
bit to the sender to acknowledge successful receipt of the frame.
To stop the data transmission, the master sends a stop condition to the slave by
switching SCL high before switching SDA high.
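The sequence above can be sketched as bit-banged master routines (an illustration only: sda_set/scl_set/sda_read are assumed open-drain GPIO helpers, and timing delays are omitted):

    #include <stdint.h>

    extern void sda_set(int level);   /* release (1) or pull low (0) SDA */
    extern void scl_set(int level);   /* release (1) or pull low (0) SCL */
    extern int  sda_read(void);       /* sample SDA */

    void i2c_start(void)  /* SDA falls while SCL is high */
    {
        sda_set(1); scl_set(1);
        sda_set(0); scl_set(0);
    }

    void i2c_stop(void)   /* SDA rises while SCL is high */
    {
        sda_set(0); scl_set(1); sda_set(1);
    }

    int i2c_write_byte(uint8_t b)     /* returns 0 on ACK, -1 on NACK */
    {
        for (int i = 7; i >= 0; i--) {         /* MSB first           */
            sda_set((b >> i) & 1);
            scl_set(1); scl_set(0);            /* clock the bit out   */
        }
        sda_set(1);                            /* release SDA for ACK */
        scl_set(1);
        int acked = (sda_read() == 0);         /* slave pulls SDA low */
        scl_set(0);
        return acked ? 0 : -1;
    }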
ADVANTAGES
Only uses two wires
Supports multiple masters and multiple slaves
ACK/NACK bit gives confirmation that each frame is transferred successfully
Hardware is less complicated than with UARTs
Well known and widely used protocol
DISADVANTAGES
Slower data transfer rate than SPI
The size of the data frame is limited to 8 bits
More complicated hardware needed to implement than SPI
The most commonly used triggering is edge triggering, and there are two types: rising
edge and falling edge. Depending on how the receiver is configured, upon detecting
the edge, the receiver will look for data on the data bus from the next bit.
SPI or Serial Peripheral Interface was developed by Motorola in the 1980s as a
standard, low-cost and reliable interface between the microcontroller
(microcontrollers by Motorola in the beginning) and its peripheral ICs.
In the SPI protocol, the devices are connected in a master-slave relationship in a
multi-point interface. In this type of interface, one device is considered the master of the bus
and all the other devices are considered slaves.
The SPI bus consists of 4 signals or pins. They are
Master – Out / Slave – In (MOSI)
Master – In / Slave – Out (MISO)
Serial Clock (SCLK) and
Chip Select (CS) or Slave Select (SS)
Master-Out / Slave-In (MOSI) is the data generated by the master and received by the slave.
Hence, the MOSI pins on both the master and slave are connected together. Master-In /
Slave-Out (MISO) is the data generated by the slave and transmitted to the master, so the
MISO pins on both the master and slave are tied together. Even though the signal on
MISO is produced by the slave, the line is controlled by the master. The master
generates a clock signal at SCLK which is supplied to the clock input of the slave. Chip
Select (CS) or Slave Select (SS) is used by the master to select a particular slave.
Since the clock is generated by the Master, the flow of data is controlled by the master.
For every clock cycle, one bit of data is transmitted from master to slave and one bit of
data is transmitted from slave to master.
This process happens simultaneously, and after 8 clock cycles a byte of data is
transmitted in both directions; hence, SPI is a full-duplex communication.
If the data has to be transmitted by only one device, then the other device has to send
something (even garbage or junk data) and it is up to the device whether the transmitted
data is actual data or not.
This means that for every bit transmitted by one device, the other device has to send
one bit data i.e. the Master simultaneously transmits data on MOSI line and receive data
from slave on MISO line.
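That per-clock exchange is easy to see in a bit-banged sketch of an SPI mode 0 (CPOL=0, CPHA=0) transfer; mosi_set(), miso_read() and sck_set() are assumed GPIO helpers:

    #include <stdint.h>

    extern void mosi_set(int level);
    extern int  miso_read(void);
    extern void sck_set(int level);

    /* Exchange one byte: for every bit shifted out on MOSI,
       one bit is shifted in on MISO. */
    uint8_t spi_transfer(uint8_t out)
    {
        uint8_t in = 0;
        for (int i = 7; i >= 0; i--) {
            mosi_set((out >> i) & 1);  /* master places a bit, MSB first */
            sck_set(1);                /* slave samples on rising edge   */
            in = (uint8_t)((in << 1) | miso_read()); /* master samples MISO */
            sck_set(0);
        }
        return in;                     /* byte received while sending    */
    }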
If the slave wants to transmit the data, the master has to generate the clock signal
accordingly by knowing when the slave wants to send the data in advance. If more than
one slave has to be connected to the master, then the setup will be something similar to
the following image.
Even though multiple slaves are connected to the master in the SPI bus, only one slave
will be active at any time. In order to select the slave, the master will pull down the SS
(Slave Select) or CS (Chip Select) line of the corresponding slave.
Hence, there must be a separate CS pin on the master corresponding to each slave
device. We need to pull down the SS or CS line to select a slave because this line is
active low.
There are two types of configurations in which the SPI devices can be connected in an
SPI bus. They are Independent Slave Configuration and Daisy Chain Configuration.
In Independent Slave Configuration, the master has dedicated Slave Select Lines for
all the slaves and each slave can be selected individually. All the clock signals of the
slaves are connected to the master SCK.
Similarly, all the MOSI pins of all the slaves are connected to the MOSI pin of the
master and all the MISO pins of all the slaves are connected to the MISO pin of the
master.
In Daisy Chain Configuration, only a single Slave Select line is connected to all the
slaves. The MOSI of the master is connected to the MOSI of slave 1. MISO of slave 1
is connected to MOSI of slave 2 and so on. The MISO of the final slave is connected
to the MISO of the master.
Consider that the master transmits 3 bytes of data onto the SPI bus. First, the 1st byte of
data is shifted to slave 1. When the 2nd byte of data reaches slave 1, the first byte is
pushed into slave 2.
Finally, when the 3rd byte of data arrives at the first slave, the 1st byte of data is
shifted into slave 3 and the second byte of data is shifted into the second slave.
Advantages
SPI is very simple to implement and the hardware requirements are not that
complex.
Supports full – duplex communication at all times.
Very high speed of data transfer.
No need for individual addresses for slaves as CS or SS is used.
Only one master device is supported and hence there is no chance of conflicts.
Clock from the master is configured based on speed of the slave and hence slave
doesn’t have to worry about clock.
Disadvantages
Each additional slave requires an additional dedicated pin on master for CS or SS.
There is no acknowledgement mechanism and hence there is no confirmation of
receipt of data.
Slowest device determines the speed of transfer.
There are no official standards and hence often used in application specific
implementations.
There is no flow control.
Applications of SPI
Memory: SD Card , MMC , EEPROM , Flash
Sensors: Temperature and Pressure
Control Devices: ADC , DAC , digital POTS and Audio Codec.
Others: Camera Lens Mount, touchscreen, LCD, RTC, video game controller, etc.
An algorithmic diagram that shows the connectivity between devices using the CAN protocol.
SOF–The single dominant start of frame (SOF) bit marks the start of a message, and is
used to synchronize the nodes on a bus after being idle.
Identifier-The Standard CAN 11-bit identifier establishes the priority of the message.
The lower the binary value, the higher its priority.
RTR–The single remote transmission request (RTR) bit is dominant when information
is required from another node. All nodes receive the request, but the identifier
determines the specified node. The responding data is also received by all nodes and
used by any node interested. In this way, all data being used in a system is uniform.
IDE–A dominant single identifier extension (IDE) bit means that a standard CAN
identifier with no extension is being transmitted.
r0–Reserved bit (for possible use by future standard amendment).
DLC–The 4-bit data length code (DLC) contains the number of bytes of data being
transmitted.
Data–Up to 64 bits of application data may be transmitted.
CRC–The 16-bit (15 bits plus delimiter) cyclic redundancy check (CRC) contains the
checksum (number of bits transmitted) of the preceding application data for error
detection.
ACK–Every node receiving an accurate message overwrites this recessive bit in the
original message with a dominant bit, indicating an error-free message has been sent.
Should a receiving node detect an error and leave this bit recessive, it discards the
message and the sending node repeats the message after re-arbitration. In this way, each
node acknowledges (ACK) the integrity of its data. ACK is 2 bits, one is the
acknowledgment bit and the second is a delimiter.
EOF–This end-of-frame (EOF), 7-bit field marks the end of a CAN frame (message)
and disables bit stuffing, indicating a stuffing error when dominant. When 5 bits of the
same logic level occur in succession during normal operation, a bit of the opposite logic
level is stuffed into the data.
IFS–This 7-bit inter frame space (IFS) contains the time required by the controller to
move a correctly received frame to its proper position in a message buffer area.
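For orientation, the application-visible fields of a classic CAN 2.0A frame can be modeled as a small structure (a sketch; real controllers generate SOF, CRC, ACK, EOF and bit stuffing in hardware):

    #include <stdint.h>

    typedef struct {
        uint16_t id;      /* 11-bit identifier; lower value = higher priority */
        uint8_t  rtr;     /* 1 = remote frame (no data), 0 = data frame       */
        uint8_t  dlc;     /* data length code: number of data bytes, 0..8     */
        uint8_t  data[8]; /* up to 64 bits of application data                */
    } can_frame_t;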
CAN Messages:
The CAN bus is a broadcast type of bus. This means that all nodes can ‘hear’ all
transmissions. There is no way to send a message to just a specific node; all nodes will
invariably pick up all traffic. The CAN hardware, however, provides local filtering so
that each node may react only on the interesting messages. CAN uses short messages –
the maximum utility load is 94 bits. There is no explicit address in the messages;
instead, the messages can be said to be contents-addressed, that is, their contents
implicitly determines their address.
The Data Frame:
The Data Frame is the most common message type. It comprises:
The Arbitration Field, which determines the priority of the message when two or more
nodes are contending for the bus. The Arbitration Field contains:
For CAN 2.0A, an 11-bit Identifier and one bit, the RTR bit, which is dominant
for data frames.
For CAN 2.0B, a 29-bit Identifier (which also contains two recessive bits: SRR
and IDE) and the RTR bit.
The Data Field, which contains zero to eight bytes of data.
The CRC Field, which contains a 15-bit checksum calculated on most parts of the
message. This checksum is used for error detection.
An Acknowledgement Slot: any CAN controller that has been able to correctly receive
the message sends an Acknowledgement bit at the end of each message. The transmitter
checks for the presence of the Acknowledge bit and retransmits the message if no
acknowledge was detected.
The Remote Frame
The intended purpose of the remote frame is to solicit the transmission of data from
another node. The remote frame is similar to the data frame, with two important
differences. First, this type of message is explicitly marked as a remote frame by a
recessive RTR bit in the arbitration field, and secondly, there is no data.
The Error Frame
The error frame is a special message that violates the formatting rules of a CAN
message. It is transmitted when a node detects an error in a message, and causes all
other nodes in the network to send an error frame as well. The original transmitter then
automatically retransmits the message. An elaborate system of error counters in the
CAN controller ensures that a node cannot tie up a bus by repeatedly transmitting error
frames.
The Overload Frame
The overload frame is mentioned for completeness. It is similar to the error frame with
regard to the format, and it is transmitted by a node that becomes too busy. It is
primarily used to provide for an extra delay between messages.
A Valid Frame
A message is considered to be error free when the last bit of the ending EOF field of a
message is received in the error-free recessive state. A dominant bit in the EOF field
causes the transmitter to repeat a transmission.
PCI (Peripheral Component Interconnect):
PCI supports both 32-bit and 64-bit data width; therefore it is compatible with 486s and
Pentiums. The bus data width is equal to the processor, for example, a 32 bit processor
would have a 32 bit PCI bus, and operates at 33MHz.
PCI was used in developing Plug and Play (PnP) and all PCI cards support PnP i.e. the
user can plug a new card into the computer, power it on and it will “self identify” and
“self specify” and start working without manual configuration using jumpers.
Bluetooth:
Bluetooth is a short-range wireless technology standard that is used for exchanging
data between fixed and mobile devices over short distances using UHF radio waves in
the ISM bands, from 2.402 GHz to 2.48 GHz, and building personal area
networks (PANs), using methods like spread spectrum, frequency hopping and full
duplex signals.
It was originally conceived as a wireless alternative to RS-232 data cables. It is mainly
used as an alternative to wire connections, to exchange files between nearby portable
devices and connect cell phones and music players with wireless headphones.
In the most widely used mode, transmission power is limited to 2.5 milliwatts, giving
it a very short range of up to 10 meters (30 feet).
Frequency-hopping spread spectrum (FHSS) is a method of transmitting radio
signals by rapidly changing the carrier frequency among many distinct frequencies
occupying a large spectral band. The changes are controlled by a code known to
both transmitter and receiver. FHSS is used to avoid interference, to prevent
eavesdropping, and to enable code-division multiple access (CDMA) communications.
The available frequency band is divided into smaller sub-bands. Signals rapidly change
("hop") their carrier frequencies among the center frequencies of these sub-bands in a
predetermined order. Interference at a specific frequency will only affect the signal
during a short interval.
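A toy model of the idea: both ends step through the same pseudo-random channel sequence, seeded by the shared code (this is illustrative only, not the real Bluetooth hop-selection algorithm):

    #include <stdint.h>

    /* Next hop channel out of 79 x 1 MHz channels starting at 2402 MHz. */
    uint8_t next_channel(uint32_t *state)
    {
        *state = *state * 1664525u + 1013904223u; /* shared pseudo-random step */
        return (uint8_t)(*state % 79);
    }
    /* carrier frequency in MHz = 2402 + next_channel(&seed) */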
Piconet:
A piconet is a type of Bluetooth network that contains one primary node, called the master
node, and up to seven active secondary nodes, called slave nodes. Thus there is a total of 8
active nodes, present within a distance of 10 meters.
The communication between the primary and secondary nodes can be one-to-one or one-
to-many. Communication is possible only between the master and a slave; slave-to-slave
communication is not possible. A piconet can also have up to 255 parked nodes; these are
secondary nodes that cannot participate in communication unless converted to the
active state.
Scatternet:
A scatternet is formed by combining various piconets. A slave present in one piconet can
act as master (primary) in another piconet. Such a node can receive a message
from the master in one piconet and deliver the message to its slaves in the other piconet
where it is acting as master. This type of node is referred to as a bridge node.
A station cannot be master in two piconets.
The core specification of Bluetooth consists of five layers:
Radio: Radio specifies the requirements for radio transmission – including frequency,
modulation, and power characteristics – for a Bluetooth transceiver.
Baseband Layer: It defines physical and logical channels and link types (voice or data);
specifies various packet formats, transmit and receive timing, channel control, and the
mechanism for frequency hopping (hop selection) and device addressing. It specifies
point to point or point to multipoint links. The length of a packet can range from 68 bits
(shortened access code) to a maximum of 3071 bits.
Link Manager Protocol (LMP): It defines the procedures for link setup and
ongoing link management.
Logical Link Control and Adaptation Protocol (L2CAP): It is responsible for
adapting upper-layer protocols to the baseband layer.
Service Discovery Protocol (SDP): – Allows a Bluetooth device to query other
Bluetooth devices for device information, services provided, and the characteristics of
those services.
Advantages:
Low cost.
Easy to use.
It can also penetrate through walls.
It creates an adhoc connection immediately without any wires.
It is used for voice and data transfer.
Disadvantages:
It can be hacked and hence, less secure.
It has slow data transfer rate: 3 Mbps.
It has small range: 10 meters.
Zigbee:
Zigbee is an IEEE 802.15.4-based specification for a suite of high-level
communication protocols used to create personal area networks with small, low-
power digital radios, such as for home automation, medical device data collection, and
other low-power low-bandwidth needs, designed for small scale projects which need
wireless connection.
The technology defined by the Zigbee specification is intended to be simpler and less
expensive than other wireless personal area networks (WPANs), such as Bluetooth or
more general wireless networking such as Wi-Fi. Applications include wireless light
switches, home energy monitors, traffic management systems, and other consumer and
industrial equipment that requires short-range low-rate wireless data transfer.
Zigbee was conceived in 1998, standardized in 2003, and revised in 2006. The name
refers to the waggle dance of honey bees after their return to the beehive.
This communication standard defines the physical and Media Access Control (MAC)
layers to handle many devices at low data rates. Zigbee WPANs operate at 868
MHz, 902-928 MHz, and 2.4 GHz. The data rate of 250 kbps is best suited
for periodic as well as intermediate two-way transmission of data between sensors and
controllers.
Zigbee technology works with digital radios by allowing different devices to converse
with one another. The devices used in this network are routers, a coordinator, and
end devices. The main function of these devices is to deliver the instructions and
messages from the coordinator to single end devices such as a light bulb.
In this network, the coordinator performs several tasks: it scans the channels to choose
the most suitable one with the least interference, and it allocates a unique address to
every device within the network so that messages and instructions can be transferred
across the network.
Routers are arranged between the coordinator and the end devices and are accountable for
routing messages among the various nodes. Routers receive messages from
the coordinator and store them until their end devices are in a position to receive them.
They can also permit other end devices and routers to join the network.
In this network, end devices handle small amounts of information by communicating with
a parent node, either a router or the coordinator, depending on the Zigbee network type.
End devices do not converse directly with each other.
Zigbee system structure consists of three different types of devices as Zigbee
Coordinator, Router, and End device. Every Zigbee network must consist of at least
one coordinator which acts as a root and bridge of the network. The coordinator is
responsible for handling and storing the information while performing receiving and
transmitting data operations.
Zigbee routers act as intermediary devices that permit data to pass to and fro to other
devices. End devices have limited functionality to communicate with the parent nodes
such that battery power is saved. The number of routers, coordinators, and end devices
depends on the type of network, such as star, tree, and mesh networks.
Physical Layer: This layer performs modulation and demodulation operations upon
transmitting and receiving signals, respectively. This layer's frequencies and data rates
are as noted above (868 MHz, 902-928 MHz, and 2.4 GHz at 250 kbps).
MAC Layer: This layer is responsible for reliable transmission of data by accessing
different networks with the carrier sense multiple access collision avoidances (CSMA).
This also transmits the beacon frames for synchronizing communication.
Network Layer: This layer takes care of all network-related operations such as
network setup, end device connection, and disconnection to network, routing, device
configurations, etc.
Application Support Sub-Layer: This layer enables the services necessary for Zigbee
device objects and application objects to interface with the network layers for data
managing services. This layer is responsible for matching two devices according to
their services and needs.
Application Framework: It provides two types of data services: key-value pair services and
generic message services. The generic message is a developer-defined structure,
whereas the key-value pair service is used for getting attributes within the application objects.
The Zigbee Device Object (ZDO) provides an interface between application objects and the
APS layer in Zigbee devices. It is responsible for detecting, initiating, and binding other
devices to the network.
Zigbee Coordinator
The Zigbee Coordinator is an FFD (Full Function Device) that acts as the PAN coordinator
and forms the network. Once the network is established, it assigns network addresses to the
devices used within the network, and it routes messages among the end devices.
Zigbee Router
A Zigbee Router is an FFD device that extends the range of the Zigbee network. The
router is used to add more devices to the network. Sometimes, it acts as a Zigbee End
Device.
Zigbee End Device
This is neither a router nor a coordinator; it interfaces to a sensor physically or
performs a control operation. Based on the application, it can be either an
RFD (Reduced Function Device) or an FFD.
Advantages:
This network has a flexible network structure
Battery life is good.
Power consumption is less
Very simple to fix.
It supports approximately 6500 nodes.
Less cost.
It is self-healing as well as more reliable.
Network setting is very easy as well as simple.
Loads are evenly distributed across the network because it doesn’t include a central
controller
Monitoring as well as controlling home appliances is extremely simple using a remote.
The network is scalable, and it is easy to add/remove a Zigbee end device to/from the network.
Disadvantages:
The owner needs system knowledge to control Zigbee-based devices.
As compared with WiFi, it is not secure.
Replacement cost is high once any issue occurs within Zigbee-based home
appliances.
The transmission rate of Zigbee is low.
It does not include several end devices.
It is highly risky to use for official, private information.
It is not used as an outdoor wireless communication system because it has a short coverage
limit.
Similar to other types of wireless systems, the Zigbee communication system is prone
to attack from unauthorized people.
Bluetooth: The frequency range of Bluetooth is 2.4 GHz – 2.483 GHz.
Zigbee: The frequency range of Zigbee is 2.4 GHz.
Features:
Zigbee :
Support for multiple network topologies such as point-to-point,
point-to-multipoint and mesh networks
Low duty cycle – provides long battery life
Low latency
Direct Sequence Spread Spectrum (DSSS)
Up to 65,000 nodes per network
128-bit AES encryption for secure data connections
Collision avoidance, retries and acknowledgements
Bluetooth:
It operates in the 2.4 GHz frequency band without requiring a license for wireless
communication.
Data can be transferred in real time over a range of 10-100 meters.
Close proximity and precise alignment are not required for Bluetooth as in the case of an
infrared (IrDA) communication device. Bluetooth does not suffer from interference from
obstacles such as walls, while infrared suffers due to obstacles.
Bluetooth supports both point-to-point and point-to-multipoint wireless connections
without cables between mobile phones and personal computers.
The data transfer rate of Bluetooth varies from version to version: 1 Mbps for
Version 1.2, up to 3 Mbps for Version 2.0.
CAN:
The physical layer uses differential transmission on a twisted pair wire
A non-destructive bit-wise arbitration is used to control access to the bus (see the sketch
at the end of this list)
The messages are small (at most eight data bytes) and are protected by a checksum
There is no explicit address in the messages, instead, each message carries a numeric
value which controls its priority on the bus, and may also serve as an identification of
the contents of the message
An elaborate error handling scheme that results in retransmitted messages when they
are not properly received
There are effective means for isolating faults and removing faulty nodes from the bus
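The wired-AND arbitration can be demonstrated with a toy simulation: the bus carries a dominant 0 whenever any node transmits 0, so the node with the lower identifier keeps the bus (the identifiers below are made up):

    #include <stdio.h>

    int main(void)
    {
        unsigned idA = 0x65A, idB = 0x654;  /* two hypothetical 11-bit IDs */
        for (int bit = 10; bit >= 0; bit--) {
            int a = (idA >> bit) & 1;
            int b = (idB >> bit) & 1;
            int bus = a & b;                /* wired-AND: dominant 0 wins  */
            if (a != bus) { printf("Node A backs off at bit %d\n", bit); break; }
            if (b != bus) { printf("Node B backs off at bit %d\n", bit); break; }
        }
        return 0;  /* prints: Node A backs off at bit 3 (0x654 < 0x65A) */
    }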