unit-3 PROCESSES AND OPERATING SYSTEMS

The document discusses processes and operating systems, focusing on multitasking, multirate systems, and real-time operating systems (RTOS). It covers key concepts such as process management, scheduling policies, and the importance of timing requirements in complex applications. Additionally, it highlights examples of real-time systems and the mechanisms for interprocess communication and performance evaluation.

Uploaded by

lcece
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
4 views

unit-3 PROCESSES AND OPERATING SYSTEMS

The document discusses processes and operating systems, focusing on multitasking, multirate systems, and real-time operating systems (RTOS). It covers key concepts such as process management, scheduling policies, and the importance of timing requirements in complex applications. Additionally, it highlights examples of real-time systems and the mechanisms for interprocess communication and performance evaluation.

Uploaded by

lcece
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 123

UNIT -III

PROCESSES AND OPERATING SYSTEMS


Introduction – Multiple tasks and multiple processes – Multirate
systems- Preemptive real-time operating systems- Priority based
scheduling- Interprocess communication mechanisms – Evaluating
operating system performance- power optimization strategies for
processes – Example Real time operating systems-POSIX-Windows-
CE. Distributed embedded systems – MPSoCs and shared memory
multiprocessors. – Design Example - Audio player, Engine control unit
– Video accelerator.
INTRODUCTION
⚫ Simple applications can be programmed on a microprocessor by writing a single
piece of code.
⚫ But in a complex application, multiple operations must be performed at widely
varying times.
⚫ Two fundamental abstractions allow us to build complex applications on
microprocessors:
1. A process defines the state of an executing program.
2. An operating system (OS) provides the mechanism for switching execution
between the processes.
MULTIPLE TASKS AND MULTIPLE
PROCESSES
⚫ Systems that are capable of multiprocessing are known as multiprocessor
systems.
⚫ A multiprocessor system can execute multiple processes simultaneously with the
help of multiple CPUs.
⚫ Multi-tasking The ability of an operating system to hold multiple processes
in memory and switch the processor among them.
Tasks and Processes
⚫ A task is a distinct part of the functionality of a single system.
⚫ Eg: mobile phones
⚫ When designing a telephone answering machine, we can define recording a
phone call, answering a call, and operating the user's control panel as distinct
tasks that run at different rates.
⚫ Each such application in a system is called a task.
Process
⚫ A process is a single execution of a program.
⚫ If we run the same program two different times, we have created two
different processes.
⚫ Each process has its own state that includes not only its registers but all
of its memory.
⚫ In some OSs, the memory management unit is used to keep each
process in a separate address space.
⚫ In others, particularly lightweight RTOSs, the processes run in the
same address space.
⚫ Processes that share the same address space are often called threads.
⚫ Ex: a data compressor that compresses data being sent to a modem.
⚫ This device is connected to serial ports on both ends.
⚫ The input to the box is an uncompressed stream of bytes.
⚫ The box emits a compressed string of bits, based on a compression table.

⚫ The program needs to receive and send data at different rates.
⚫ Eg: the program may emit 2 bits for the first byte and then 7 bits for the
second byte; this will obviously be reflected in the structure of the code.
⚫ If we spend too much time packaging and emitting output characters, we
may drop an input character.
Asynchronous input
⚫ Ex: A control panel on a machine provides a different type of rate.
⚫ The control panel of the compression box includes a compression mode button
that disables or enables compression, so that the input text is passed through
unchanged when compression is disabled.
⚫ Sampling the button's state too slowly the machine will miss a button
depression entirely.
⚫ Sampling it too frequently the machine will incorrectly compress data.
⚫ To solve this problem, the button is checked once every n times the
compression loop is executed.
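The "check once every n iterations" idea above can be sketched as a simple counter; the interval value and function name below are illustrative, not from the original text.

```c
#include <stdbool.h>

/* Illustrative sketch: sample the mode button only once every
 * CHECK_INTERVAL passes through the compression loop. */
#define CHECK_INTERVAL 8

static int loop_count = 0;   /* passes since the last button poll */

/* Returns true when this pass through the loop should poll the button. */
bool time_to_poll(void) {
    loop_count++;
    if (loop_count >= CHECK_INTERVAL) {
        loop_count = 0;      /* restart the countdown */
        return true;
    }
    return false;
}
```

The compression loop would call `time_to_poll()` each iteration and read the button only when it returns true, bounding both the sampling delay and the time stolen from compression.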
Multi-rate Systems
⚫ Implementing code that satisfies timing requirements is more complex
when multiple rates of computation must be handled.
⚫ Multirate embedded computing systems Ex: automobile engines, printers,
and cell phones.
⚫ In all these systems, certain operations must be executed periodically, each
at its own rate.
⚫ Eg Automotive engine control
⚫ The simplest automotive engine controllers, such as the ignition controller for a
basic motorcycle engine, perform only one task: timing the firing of the spark
plug, which takes the place of a mechanical distributor.
Spark Plug
⚫ The spark plug must be fired at a certain point in the combustion cycle.
Microcontroller
⚫ Using a microcontroller that senses the engine crankshaft position allows the
spark timing to vary with engine speed.
⚫ Firing the spark plug is a periodic process.
Engine controller
⚫ Automobile engine controllers use additional sensors, including the gas pedal
position and an oxygen sensor used to control emissions.
⚫ They also use a multimode control scheme: one mode may be used for engine
warm-up, another for cruise, and yet another for climbing steep hills.
⚫ The engine controller takes a variety of inputs that determine the state of the
engine.
⚫ It then controls two basic engine parameters: the spark plug firings and the
fuel/air mixture.
Task performed by engine controller unit
Timing Requirements on Processes
⚫ Processes can have several different types of timing requirements based on the
application.
⚫ The timing requirements on a set of processes strongly depend on the type of
scheduling.
⚫ A scheduling policy must define the timing requirements that it uses to
determine whether a schedule is valid.
1. Release time
⚫ The time at which the process becomes ready to execute.
⚫ In simpler systems, the process may become ready at the beginning of the period.
⚫ More sophisticated systems set the release time at the arrival time of certain data, at
a time after the start of the period.
2. Deadline
⚫ Specifies when a computation must be finished.
⚫ The deadline for an aperiodic process is generally measured from the release
time or initiation time.
⚫ The deadline for a periodic process may occur at the end of the period.
⚫ The period of a process is the time between successive executions.
⚫ The process's rate is the inverse of its period.
⚫ In a multirate system, each process executes at its own distinct rate.
Example definitions of release times and deadlines
A sequence of processes with a high initiation rate

• In this case, the initiation interval is equal to one fourth of the period.
• It is possible for a process to have an initiation interval shorter than its period even in
single-CPU systems.
• If the process execution time is significantly less than the period, it may be possible to
initiate multiple copies of a program at slightly offset times.
Data dependencies among processes

• The data dependencies define a partial ordering on process execution.


• P1 and P2 can execute in any order but must both complete before P3, and P3
must complete before P4.
• All processes must finish before the end of the period.
Directed Acyclic Graph (DAG)
• It is a directed graph that contains no cycles.
• The data dependencies must form a directed acyclic graph.
• A set of processes with data dependencies is known as a task graph.
Communication among processes at different rates
(MPEG audio/Video)

• The system decoder process demultiplexes the audio and video data and
distributes it to the appropriate processes.
• Missing deadlines
• Missing a deadline in a multimedia system may cause an audio or video glitch.
• The system can be designed to take a variety of actions when a deadline is
missed.
CPU Metrics
⚫ CPU metrics are described by initiation time and completion time.
⚫ Initiation timeIt is the time at which a process actually starts executing on
the CPU.
⚫ Completion timeIt is the time at which the process finishes its work.
⚫ The CPU time of process i is called Ci.
⚫ The CPU time is not equal to the completion time minus initiation time, since
other processes may interrupt execution.
⚫ The total CPU time consumed by a set of n processes is

      T = C1 + C2 + ... + Cn

⚫ The simplest and most direct measure is utilization: the fraction of the available
CPU time spent executing the processes,

      U = T / t,  where t is the length of the interval considered.

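The utilization formula can be sketched directly in C; the function and parameter names are illustrative.

```c
/* Total CPU time T = C1 + ... + Cn; utilization U = T / t, where t is
 * the length of the interval over which the processes execute. */
double cpu_utilization(const double c[], int n, double interval) {
    double total = 0.0;
    for (int i = 0; i < n; i++)
        total += c[i];           /* accumulate T */
    return total / interval;     /* U = T / t */
}
```

For example, processes with CPU times 1, 2, and 3 units over a 12-unit interval give a utilization of 6/12 = 0.5.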
Process State and Scheduling
⚫ The first job of the OS is to determine which process runs next.
⚫ The work of choosing the order of running processes is known as scheduling.
⚫ There are three basic scheduling states: waiting, ready, and executing.

⚫ A process goes into the waiting state when it needs data it has not yet received, or when
it has finished all its work for the current period.
⚫ A process goes into the ready state when it receives its required data or when it enters
a new period.
⚫ Finally, a process can go into the executing state only when it has all its data, is ready to
run, and the scheduler selects it as the next process to run.
Scheduling Policies
⚫ A scheduling policy defines how processes are selected for promotion from the
ready state to the running state.
⚫ Scheduling Allocating time for execution of the processes in a system.
⚫ For periodic processes, the length of time that must be considered is the hyperperiod,
which is the least common multiple of the periods of all the processes.
⚫ Unrolled schedule The complete schedule over one hyperperiod (the least common
multiple of the periods).
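The hyperperiod is just the least common multiple of the process periods, which can be computed as below (a sketch; names are illustrative).

```c
/* gcd via Euclid's algorithm. */
long gcd(long a, long b) {
    while (b != 0) { long t = a % b; a = b; b = t; }
    return a;
}

/* Hyperperiod H = lcm of all process periods. */
long hyperperiod(const long periods[], int n) {
    long h = 1;
    for (int i = 0; i < n; i++)
        h = h / gcd(h, periods[i]) * periods[i];  /* h = lcm(h, period) */
    return h;
}
```

For periods 4, 6, and 12 this gives a hyperperiod of 12, matching the RMS example later in this unit.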
Types of scheduling
1. Cyclostatic scheduling or Time Division Multiple Access (TDMA) scheduling
⚫ The schedule is divided into equal-sized time slots over an interval equal to the length of the
hyperperiod H. (Each process always runs in the same time slot.)

Two factors affect this scheduling:

⚫ The number of time slots used.
⚫ The fraction of each time slot that is used for useful work.
2) Round-robin scheduling
⚫ Uses the same hyperperiod as does cyclostatic scheduling.
⚫ It also evaluates the processes in order.
⚫ But if a process does not have any useful work to do, the scheduler moves on to the next
process in order to fill the time slot with useful work.

⚫ All three processes execute during the first hyperperiod.

⚫ During the second one, P1 has no useful work, so it is skipped and the scheduler moves
directly on to the next process.
Scheduling overhead
⚫ The execution time required to choose the next execution process, which is incurred in
addition to any context-switching overhead, and which must be accounted for when
calculating the utilization of the CPU.
Preemptive Real-Time Operating
Systems(RTOS)
⚫ A preemptive OS solves the fundamental problem of multitasking systems.
⚫ It executes processes based upon timing requirements provided by the system designer.
⚫ To meet timing constraints accurately is to build a preemptive OS and to use priorities to
control what process runs at any given time.
Preemption
⚫ Preemption is an alternative to the C function call as a way to control execution.
⚫ To take full advantage of the timer, we must make the change of process something more
than a function call.
⚫ We must break the assumptions of our high-level programming language.
⚫ We create new routines that allow us to jump from one subroutine to another at any point in
the program.
⚫ The timer then allows us to move between functions whenever necessary, based upon the
system's timing constraints.
Kernel
⚫ It is the part of the OS that determines what process is running.
⚫ The kernel is activated periodically by the timer.
⚫ It determines what process will run next and causes that process to run.
Priorities
⚫ Based on priorities, the kernel decides the order in which processes run.
⚫ The kernel determines which processes actually want to execute and selects the highest-
priority process that is ready to run.
⚫ This mechanism is both flexible and fast.
⚫ The priority is a non-negative integer value.

• When the system begins execution, P2 is the only ready process, so it is selected for execution.
• At T=15, P1 becomes ready; it preempts P2 because P1 has a higher priority, so it executes
immediately.
• P3's data arrive at time 18, but it has the lowest priority.
• P2 is still ready and has higher priority than P3.
• Only after both P1 and P2 finish can P3 execute.
Context Switching

⚫ To understand the basics of a context switch, let’s assume that the set of tasks is
in steady state.
⚫ Everything has been initialized, the OS is running, and we are ready for a timer
interrupt.
⚫ This diagram shows the application tasks, the hardware timer, and all the
functions in the kernel that are involved in the context switch.
⚫ vPreemptiveTick()  it is called when the timer ticks.
⚫ portSAVE_CONTEXT() swaps out the current task context.
⚫ vTaskSwitchContext ( ) chooses a new task.
⚫ portRESTORE_CONTEXT() swaps in the new context
PRIORITY-BASED SCHEDULING
⚫ The operating system's job is to allocate resources in the computing system based on
priorities.
⚫ After assigning priorities, the OS takes care of the rest by choosing the highest-
priority ready process.
⚫ There are two major ways to assign priorities:
⚫ Static priorities that do not change during execution
⚫ Dynamic priorities that do change during execution
⚫ Types of scheduling process
1. Rate-Monotonic Scheduling
2. Earliest-Deadline-First Scheduling
Rate-Monotonic Scheduling (RMS)
⚫ Rate-monotonic scheduling (RMS) is one of the first scheduling policies
developed for real-time systems.
⚫ RMS is a static scheduling policy.
⚫ It assigns fixed priorities, which turn out to be sufficient to efficiently schedule
the processes in many situations.
The theory underlying RMS is known as rate-monotonic analysis (RMA); its model is
summarized below.
⚫ All processes run periodically on a single CPU.
⚫ Context switching time is ignored.
⚫ There are no data dependencies between processes.
⚫ The execution time for a process is constant.
⚫ All deadlines are at the ends of their periods.
⚫ The highest-priority ready process is always selected for execution.
⚫ Priorities are assigned by rank order of period, with the process with the
shortest period being assigned the highest priority.
Example-Rate-monotonic scheduling
⚫ set of processes and their characteristics

⚫ According to RMA, assign the highest priority to the process with the shortest period.


⚫ Hence P1 has the highest priority, P2 the middle priority, and P3 the lowest priority.
⚫ First execute P1, then P2, and finally P3 (T1 < T2 < T3).
⚫ After assigning priorities, construct a timeline equal in length to the hyperperiod, which is 12
in this case.
⚫ Every 4 time units P1 executes for 1 unit (execution intervals for
P1: 0-4, 4-8, 8-12).
⚫ Every 6 time units P2 executes for 2 units (execution intervals
for P2: 0-6, 6-12).
⚫ Every 12 time units P3 executes for 3 units (execution interval for P3:
0-12).
⚫ In the interval from 10 to 12 nothing is scheduled, because no process
is available for execution; all processes have already run in this hyperperiod.
⚫ P1 is the highest-priority process, it can start to execute immediately.
⚫ After one time unit, P1 finishes and goes out of the ready state until the start of its next
period.
⚫ At time 1, P2 starts executing as the highest-priority ready process.
⚫ At time 3, P2 finishes and P3 starts executing.
⚫ P1’s next iteration starts at time 4, at which point it interrupts P3.
⚫ P3 gets one more time unit of execution between the second iterations of P1 and P2, but
P3 does not get to finish until after the third iteration of P1.
⚫ Consider the following different set of execution times.

⚫ In this case, even though each process alone has an execution time significantly less than
its period, combinations of processes can require more than 100% of the available CPU
cycles.
⚫ During one 12-time-unit interval, we must execute P1 three times, requiring 6 units of CPU
time; P2 twice, costing 6 units; and P3 once, costing 3 units.
⚫ The total of 6 + 6 + 3 = 15 units of CPU time is more than the 12 time units available,
clearly exceeding the available CPU capacity (12 units).
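The worked arithmetic above generalizes: over one hyperperiod H, a process with period τi runs H/τi times, each run costing Ci units, so total demand can be checked against H. A sketch (names are illustrative):

```c
/* Total CPU demand over one hyperperiod h: process i with period
 * period[i] and CPU time c[i] executes h/period[i] times.
 * If the returned demand exceeds h, no schedule can meet every
 * deadline, regardless of policy. */
long demand_over_hyperperiod(const long period[], const long c[],
                             int n, long h) {
    long demand = 0;
    for (int i = 0; i < n; i++)
        demand += (h / period[i]) * c[i];
    return demand;
}
```

With periods 4, 6, 12 and CPU times 2, 3, 3 this returns 15 > 12, reproducing the overload above; with CPU times 1, 2, 3 it returns 10 ≤ 12, the feasible case.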
RMA priority assignment analysis
⚫ Response time The time at which the process finishes.
⚫ Critical instant The instant during execution at which the task has the largest response
time.
⚫ Let the periods and computation times of two processes P1 and P2 be τ1, T1 and τ2, T2,
with τ1 < τ2.
⚫ Let P1 have the higher priority. In the worst case we then execute P2 once during its period
and as many iterations of P1 as fit in the same interval.
⚫ There are ⌈τ2/τ1⌉ iterations of P1 during a single period of P2.
⚫ The required constraint on CPU time, ignoring context switching overhead, is

      ⌈τ2/τ1⌉ · T1 + T2 ≤ τ2

⚫ If instead we give the higher priority to P2, then in the worst case we must execute all of
P2 and all of P1 within one of P1's periods:

      T1 + T2 ≤ τ1

⚫ Total CPU utilization for a set of n tasks is

      U = Σ (i = 1..n) Ti / τi

Earliest-Deadline-First Scheduling (EDF)
⚫ Earliest deadline first (EDF) is a dynamic priority scheme.
⚫ It changes process priorities during execution based on the processes' current deadlines.
⚫ As a result, it can achieve higher CPU utilizations than RMS.
⚫ The EDF policy is also very simple.
⚫ It assigns priorities in order of deadline.
⚫ The process with the earliest deadline gets the highest priority.
⚫ The process with the farthest deadline gets the lowest priority.
⚫ After the priorities are assigned, the highest-priority process is chosen for
execution.
⚫ Consider the following example.

⚫ The hyperperiod is 60.
Deadline Table
⚫ There is one time slot left at t = 30, giving a CPU utilization of 59/60.
⚫ EDF can achieve 100% utilization.
RMS vs. EDF
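For periodic tasks with deadlines at the ends of their periods on a single CPU, EDF's schedulability test is simply that total utilization not exceed 1. A sketch (array names are illustrative):

```c
#include <stdbool.h>

/* EDF feasibility on one CPU: a periodic task set with deadlines at
 * period ends is schedulable by EDF exactly when sum(Ci/Ti) <= 1. */
bool edf_schedulable(const double c[], const double t[], int n) {
    double u = 0.0;
    for (int i = 0; i < n; i++)
        u += c[i] / t[i];       /* accumulate utilization */
    return u <= 1.0;
}
```

This is the sense in which EDF "can achieve 100% utilization", whereas RMS is only guaranteed up to the n(2^(1/n) − 1) bound.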
Ex: Priority inversion
⚫ A low-priority process blocks execution of a higher-priority process by keeping hold
of its resource.
Consider a system with two processes:
⚫ the higher-priority P1 and the lower-priority P2.
⚫ Each uses the microprocessor bus to communicate to peripherals.
⚫ When P2 executes, it requests the bus from the operating system and receives it.
⚫ If P1 becomes ready while P2 is using the bus, the OS will preempt P2 for P1,
leaving P2 with control of the bus.
⚫ When P1 requests the bus, it will be denied, since P2 already owns it.
⚫ Unless P1 has a way to take the bus from P2, the two processes may deadlock.
Eg:Data dependencies and scheduling
⚫ Data dependencies imply that certain combinations of processes can never occur. Consider the
simple example.

⚫ We know that P1 and P2 cannot execute at the same time, since P1 must finish before P2 can
begin.
⚫ Although P3 has a higher priority, it will not preempt both P1 and P2 in a single iteration.
⚫ If P3 preempts P1, then P3 will complete before P2 begins.
⚫ If P3 preempts P2, then it will not interfere with P1 in that iteration.
⚫ Because we know that some combinations of processes cannot be ready at the same time,
worst-case CPU requirements are less than would be required if all processes could be ready
simultaneously.
Inter-process communication mechanisms
⚫ Interprocess communication is provided by the operating system as part of the process
abstraction.
⚫ Blocking communication The process goes into the waiting state until it receives a
response.
⚫ Non-blocking communication Allows a process to continue execution after
sending the communication.
Types of inter-process communication
1. Shared Memory Communication
2. Message Passing
3. Signals
Shared Memory Communication
⚫ Shared memory communication between processes maps naturally onto a bus-based system.
⚫ The CPU and an I/O device communicate through a shared memory location.
⚫ The software on the CPU has been designed to know the address of the shared location.
⚫ The shared location has also been loaded into the proper register of the I/O device.
⚫ If the CPU wants to send data to the device, it writes to the shared location.
⚫ The I/O device then reads the data from that location.
⚫ The read and write operations are standard and can be encapsulated in a procedural
interface.
⚫ When the CPU and the I/O device communicate through a shared memory block,
there must be a flag that tells the CPU when the data from the I/O device are ready.
⚫ The flag has a value of 0 when the data are not ready and 1 when the data are ready.
⚫ If the flag is used only by the CPU, then the flag can be implemented using a standard
memory write operation.
⚫ If the same flag is used for bidirectional signaling between the CPU and the I/O device,
care must be taken.
Consider the following scenario involving the flag:
1. The CPU reads the flag location and sees that it is 0.
2. The I/O device reads the flag location and also sees that it is 0.
3. The CPU sets the flag location to 1 and writes data to the shared location.
4. The I/O device, unaware of the CPU's write, erroneously sets the flag to 1 and overwrites
the data left by the CPU.
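One common way to close this race is an atomic test-and-set on the flag: the read and the write happen as one indivisible operation, so only one party can claim the flag. A sketch using C11 atomics; the function names are illustrative, and the original text does not prescribe this mechanism.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag ready = ATOMIC_FLAG_INIT;   /* the shared flag */

/* Returns true if we won the flag and may safely write the shared
 * location. atomic_flag_test_and_set() returns the PREVIOUS value:
 * false means the flag was clear and is now ours. */
bool try_claim(void) {
    return !atomic_flag_test_and_set(&ready);
}

/* Give the flag back once the shared data have been consumed. */
void release(void) {
    atomic_flag_clear(&ready);
}
```

With this protocol, step 4 of the scenario above cannot happen: the second party's test-and-set would return 1 and it would back off instead of overwriting.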
Ex: Elastic buffers as shared memory
⚫ The text compressor is a good example of shared memory communication.
⚫ The text compressor uses the CPU to compress incoming text, which is then sent on a
serial line by a UART.
⚫ The input data arrive at a constant rate and are easy to manage.
⚫ But the output data are consumed at a variable rate, so these data require an elastic buffer.
⚫ The CPU and output UART share a memory area: the CPU writes compressed characters
into the buffer and the UART removes them as necessary to fill the serial line.
⚫ Because the number of bits in the buffer changes constantly, the compression and
transmission processes need to track the buffer's current fill level.
⚫ The CPU writes at one end of the buffer and the UART reads at the other end.
⚫ The only challenge is to make sure that the UART does not overrun the buffer.
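Such an elastic buffer is typically a circular (ring) buffer. A minimal single-producer, single-consumer sketch in C; the size and names are illustrative, not from the original text.

```c
#define BUF_SIZE 64

static unsigned char buf[BUF_SIZE];
static int head = 0, tail = 0, count = 0;

/* CPU side: append one compressed byte at the head.
 * Returns 0 on success, -1 if the buffer is full (would overrun). */
int buf_put(unsigned char c) {
    if (count == BUF_SIZE) return -1;
    buf[head] = c;
    head = (head + 1) % BUF_SIZE;   /* wrap around the end */
    count++;
    return 0;
}

/* UART side: remove one byte from the tail.
 * Returns the byte, or -1 if the buffer is empty. */
int buf_get(void) {
    if (count == 0) return -1;
    unsigned char c = buf[tail];
    tail = (tail + 1) % BUF_SIZE;
    count--;
    return c;
}
```

The explicit count field is the "fill level" mentioned above; checking it in `buf_put()` is exactly what keeps the UART (or CPU) from overrunning the buffer.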
Message Passing
⚫ Here each communicating entity has its own message send/receive unit.
⚫ The message is not stored on the communications link, but rather at the senders/ receivers
at the end points.
⚫ Ex:Home control system
⚫ It has one microcontroller per household device—lamp, thermostat, faucet, appliance.
⚫ The devices must communicate relatively infrequently.
⚫ Their physical separation is large enough that we would not naturally think of them as
sharing a central pool of memory.
⚫ Passing communication packets among the devices is a natural way to describe
coordination between these devices.
Signals
⚫ Signal-based communication is generally used in Unix.
⚫ A signal is analogous to an interrupt, but it is entirely a software creation.
⚫ A signal is generated by a process and transmitted to another process by the OS.
⚫ A UML signal is actually a generalization of the Unix signal.
⚫ A Unix signal carries no parameters other than a condition code.
⚫ A UML signal is an object and can carry parameters as object attributes.
⚫ The sigbehavior( ) behavior of the class is responsible for throwing the signal,
as indicated by <<send>>.
⚫ The signal object is indicated by the <<signal>> stereotype.
Evaluating operating system performance
⚫ Our analysis of scheduling policies so far has relied on the following simplifying assumptions:
⚫ We assumed that context switches require zero time. Although it is often
reasonable to neglect context switch time when it is much smaller than the
process execution time, context switching can add significant delay in some
cases.
⚫ We have largely ignored interrupts. The latency from when an interrupt is
requested to when the device's service is complete is a critical parameter of real-
time performance.
⚫ We have assumed that we know the execution time of the processes.
⚫ We probably determined worst-case or best-case times for the processes in
isolation.
Context switching time
It depends on following factors
⚫ The amount of CPU context that must be saved.
⚫ Scheduler execution time.
Interrupt latency
⚫ Interrupt latency It is the duration of time from the assertion of a device interrupt to
the completion of the device’s requested operation.
⚫ Interrupt latency is critical because data may be lost when an interrupt is not serviced in
a timely fashion.

⚫ A task is interrupted by a device.


⚫ The interrupt goes to the kernel, which may need to finish a protected operation.
⚫ Once the kernel can process the interrupt, it calls the interrupt service routine (ISR),
which performs the required operations on the device.
⚫ Once the ISR is done, the task can resume execution.
⚫ Several factors in both hardware and software affect interrupt latency:
⚫ The processor interrupt latency
⚫ The execution time of the interrupt handler
⚫ Delays due to RTOS scheduling
⚫ An RTOS can delay the execution of an interrupt handler in two ways.
⚫ Critical sections and interrupt latency
⚫ Critical sections in the kernel will prevent the RTOS from taking interrupts.
⚫ Some operating systems have very long critical sections that disable interrupt handling for
very long periods.
⚫ If a device interrupts during a critical section, that critical section must finish before the
kernel can handle the interrupt.
⚫ The longer the critical section, the greater the potential delay.
⚫ Critical sections are one important source of scheduling jitter because a device may
interrupt at different points in the execution of processes and hit critical sections at
different points.
Interrupt priorities and interrupt latency
⚫ A higher-priority interrupt may delay a lower-priority interrupt.
⚫ A hardware interrupt handler runs as part of the kernel, not as a user thread.
⚫ The priorities for interrupts are determined by hardware.
⚫ Any interrupt handler preempts all user threads because interrupts are part of the CPU’s
fundamental operation.
⚫ We can reduce the effects of hardware preemption by dividing interrupt handling into
two different pieces of code.
⚫ Interrupt service handler (ISH) performs the minimal operations required to
respond to the device.
⚫ Interrupt service routine (ISR) Performs updating user buffers or other more
complex operation.
⚫ RTOS performance evaluation tools
⚫ Some RTOSs provide simulators or other tools that allow us to view the
operation of the processes,context switching time, interrupt response time,
and other overheads.
Windows CE provides several performance analysis tools:
⚫ An instrumentation routine in the kernel that measures both interrupt service
routine and interrupt service thread latency.
⚫ OSBench measures the timing of operating system tasks such as critical
section access, signals, and so on.
⚫ Kernel Tracker provides a graphical user interface for RTOS events.
Power optimization strategies for processes
⚫ A power management policy is a strategy for determining when to perform
certain power management operations.
⚫ The system can be designed based on static and dynamic power
management mechanisms.
Power saving strategies
⚫ Avoiding a power-down mode can cost unnecessary power.
⚫ Powering down too soon can cause severe performance penalties.
⚫ Re-entering run mode typically costs a considerable amount of time.
⚫ A straightforward method is to power up the system only when a request is received.
Predictive shutdown
⚫ The goal is to predict when the next request will be made and to start the
system just before that time, saving the requestor the start-up time.
⚫ Make guesses about activity patterns based on a probabilistic model of
expected behavior.
This can cause two types of problems:
⚫ The requestor may have to wait for an activity period.
⚫ In the worst case, the requestor may not make a deadline due to the delay
incurred by system start-up.
An L-shaped usage distribution
⚫ A very simple technique is to use fixed times.
⚫ If the system does not receive inputs during an interval of length Ton, it shuts down.
⚫ Powered-down system waits for a period Toff before returning to the power-on mode.
⚫ In this distribution, the idle period after a long active period is usually very short, and the
length of the idle period after a short active period is uniformly distributed.
⚫ Based on this distribution, shutdown when the active period length was below a threshold,
putting the system in the vertical portion of the L distribution.
Advanced Configuration and Power Interface (ACPI)
⚫ It is an open industry standard for power management services.
⚫ It is designed to be compatible with a wide variety of OSs.
⚫ A decision module determines power management actions.
ACPI supports the following five basic global power states.
1. G3, the mechanical off state, in which the system consumes no power.
2. G2, the soft off state, which requires a full OS reboot to restore the machine to
working condition.
3. G1, the sleeping state, in which the system appears to be off. This state has four
sub-states:
⚫ S1, a low wake-up latency state with no loss of system context
⚫ S2, a low wake-up latency state with a loss of CPU and system cache state
⚫ S3, a low wake-up latency state in which all system context except main
memory is lost
⚫ S4, the lowest-power sleeping state, in which all devices are turned off
4. G0, the working state, in which the system is fully usable.
5. The legacy state, in which the system does not comply with ACPI.
Example real-time operating systems
POSIX
⚫ POSIX is a version of the Unix operating system interface created by a standards
organization.
⚫ POSIX-compliant operating systems are source-code compatible: an application
can be compiled and run without modification on a new POSIX platform.
⚫ It has been extended to support real-time requirements.
⚫ Many RTOSs are POSIX-compliant, and POSIX serves as a good model for basic
RTOS techniques.
⚫ The Linux operating system has become a popular platform for embedded computing.
⚫ Linux is a POSIX-compliant operating system that is available as open source.
⚫ However, Linux was not originally designed for real-time operation.
⚫ Some versions of Linux may exhibit long interrupt latencies.
⚫ To improve interrupt latency, a dual-kernel approach uses a specialized kernel,
the co-kernel, for real-time processes and the standard kernel for non-real-
time processes.
Processes in POSIX
⚫ A new process is created by making a copy of an existing process.
⚫ The copying process creates two different processes both running the same code.
⚫ The complex task is ensuring that one process runs the code intended for the new process
while the other process continues the work of the old process.
⚫ A process makes a copy of itself by calling the fork() function.
⚫ That function causes the operating system to create a new process (the child process) which is
a nearly exact copy of the process that called fork() (the parent process).
⚫ They both share the same code and the same data values, with one exception: the return value
of fork().
⚫ The parent process is returned the process ID number of the child process, while the child
process gets a return value of 0.
⚫ We can therefore test the return value of fork() to determine which process is the child:
childid = fork();
if (childid == 0) { /* must be the child */
/* do child process here */
}
⚫ The execv() function takes as arguments the name of the file that holds the child's
code and the array of arguments.
⚫ It overlays the process with the new code and starts executing it from the
main() function.
⚫ In the absence of an error, execv() should never return.
⚫ The calls to perror() and exit() that follow execv() take care of the case where
execv() fails and returns to the parent process.
⚫ The exit() function is a C function that is used to leave a process.
childid = fork();
if (childid == 0) { /* must be the child */
    execv("mychild", childargs);
    perror("execv");
    exit(1);
}
⚫ The wait functions not only return the child process's status; in many
implementations of POSIX they also make sure that the child's resources are released.
⚫ The parent_stuff() function performs the work of the parent process.

childid = fork();
if (childid == 0) { /* must be the child */
    execv("mychild", childargs);
    perror("execv");
    exit(1);
}
else { /* is the parent */
    parent_stuff(); /* execute parent functionality */
    wait(&cstatus);
    exit(0);
}
The POSIX process model
⚫ Each POSIX process runs in its own address space and cannot directly access the
data or code of other processes.
Real-time scheduling in POSIX
⚫ POSIX supports real-time scheduling in the POSIX_PRIORITY_SCHEDULING
resource.
⚫ POSIX supports Rate-monotonic scheduling in the SCHED_FIFO scheduling
policy.
⚫ It is a strict priority-based scheduling scheme in which a process runs until it is
preempted or terminates.
⚫ The term FIFO simply refers to the fact that, within a priority level, processes run in
first-come first-served order.
POSIX semaphores
⚫ POSIX supports semaphores and also supports a direct shared memory mechanism.
⚫ POSIX supports counting semaphores in the _POSIX_SEMAPHORES option.
⚫ A counting semaphore allows more than one process access to a resource at a time.
⚫ If the semaphore allows up to N resources, then it will not block until N processes
have simultaneously passed the semaphore.
⚫ A blocked process can resume only after one of the processes holding the
semaphore gives it up.
⚫ When the semaphore value is 0, the process must wait until another process gives up the
semaphore and increments the count.
POSIX pipes
⚫ Parent process uses the pipe() function to create a pipe to talk to a child.
⚫ Each end of a pipe appears to the programs as a file.
⚫ The pipe() function returns an array of file descriptors: the first is the read end and
the second is the write end.
⚫ POSIX also supports message queues under the _POSIX_MESSAGE_PASSING facility.
Windows CE
⚫ Windows CE is designed to run on multiple hardware platforms and
instruction set architectures.
⚫ It supports devices such as smart phones, electronic instruments, etc.
⚫ Applications run under the shell and its user interface.
⚫ The Win32 APIs manage access to the operating system.
⚫ OEM Adaptation Layer (OAL) provides an interface between the hardware and the software
architecture.
⚫ OAL provides services such as a real-time clock, power management, interrupts, and a
debugging interface.
⚫ A Board Support Package (BSP) for a particular hardware platform includes the OAL and
drivers.
Memory Space
⚫ It supports virtual memory with a flat 32-bit virtual address space.
⚫ A virtual address can be statically mapped into main memory for key kernel-mode code.
⚫ An address can also be dynamically mapped, which is used for all user-mode and some
kernel-mode code.
⚫ Flash as well as magnetic disk can be used as a backing store.
⚫ The top 1 GB is reserved for system elements such as DLLs, memory mapped files, and
shared system heap.
⚫ The bottom 1 GB holds user elements such as code, data, stack, and heap.
User address space in windows CE
⚫ Threads are defined by executable files while drivers are defined by
dynamically-linked libraries (DLLs).
⚫ A process can run multiple threads.
⚫ Threads in different processes run in different execution
environments.
⚫ Threads are scheduled directly by the operating system.
⚫ Threads may be launched by a process or a device driver.
⚫ A driver may be loaded into the operating system or a process.
⚫ Drivers can create threads to handle interrupts
⚫ Each thread is assigned an integer priority.
⚫ 0 is the highest priority and 255 is the lowest priority.
⚫ Priorities 248 through 255 are used for non-real-time threads.
⚫ The operating system maintains a queue of ready threads at each
priority level.
⚫ Execution of a thread can also be blocked by a higher-priority thread.
⚫ Tasks may be scheduled using either of two policies: a thread runs until the end
of its quantum; or a thread runs until a higher-priority thread is ready to run.
⚫ Within each priority level, round-robin scheduling is used.
⚫ WinCE supports priority inheritance.
⚫ When priorities become inverted, the kernel temporarily boosts the priority of
the lower-priority thread to ensure that it can complete and release its
resources.
⚫ Kernel will apply priority inheritance to only one level.
⚫ If a thread that suffers from priority inversion in turn causes priority inversion
for another thread, the kernel will not apply priority inheritance to solve the
nested priority inversion.
Sequence diagram for an interrupt
⚫ Interrupt handling is divided among three entities
⚫ The interrupt service handler (ISH) is a kernel service that provides the first
response to the interrupt.
⚫ The ISH selects an interrupt service routine (ISR) to handle the interrupt.
⚫ The ISR in turn calls an interrupt service thread (IST) which performs most of
the work required to handle the interrupt.
⚫ The IST runs in the OAL and so can be interrupted by a higher-priority
interrupt.
⚫ ISRdetermines which IST to use to handle the interrupt and requests the
kernel to schedule that thread.
⚫ The IST then performs its work and signals the application about the updated
device status as appropriate.
⚫ kernel-mode and user-mode drivers use the same API.
Distributed Embedded Systems (DES)
⚫ A distributed embedded system is a collection of hardware and software components
connected by a communication network.
⚫ Distributing computation helps meet control system performance requirements.
⚫ The Processing Element (PE) is the basic unit of a DES.
⚫ PEs communicate with one another over the network.
⚫ A PE is an instruction set processor such as a DSP, CPU, or microcontroller.
Network abstractions
⚫ Networks are complex systems.
⚫ They provide high-level services, such as data transmission, to the other
components in the system.
⚫ ISO has developed a seven-layer model for networks known as the Open Systems
Interconnection (OSI) model.
OSI model layers
⚫ Physical layer defines the basic properties of the
interface between systems, including the physical
connections, electrical properties & basic procedures
for exchanging bits.
⚫ Data link layer used for error detection and control
across a single link.
⚫ Network layer defines the basic end-to-end data
transmission service.
⚫ Transport layer defines connection-oriented
services that ensure that data are delivered in the
proper order.
⚫ Session layer provides mechanisms for controlling
the interaction of end-user services across a network,
such as data grouping and checkpointing.
⚫ Presentation layer layer defines data exchange
formats
⚫ Application layer provides the application interface
between the network and end-user programs.
Controller Area Network (CAN) Bus
⚫ It was designed for automotive electronics
and was first used in production cars in 1991.
⚫ It uses bit-serial transmission.
⚫ CAN can run at rates of 1 Mbps over a twisted
pair connection of 40 meters.
⚫ An optical link can also be used.
Physical-electrical organization of a CAN
bus
⚫ Each node in the CAN bus has its own
electrical drivers and receivers that connect
the node to the bus in wired-AND fashion.
⚫ When all nodes are transmitting 1s, the bus is
said to be in the recessive state.
⚫ When any node transmits a 0, the bus is in the
dominant state.
Data Frame
⚫ Arbitration field → the first field in the packet; contains the packet’s 11-bit destination identifier.
⚫ The Remote Transmission Request (RTR) bit is set to 0 if the data frame is used to request data
from the destination identifier.
⚫ When RTR = 1, the packet is used to write data to the destination identifier.
⚫ Control field → gives the 4-bit length of the data field, with a 1 in between.
⚫ Data field → 0 to 8 bytes of data, depending on the value given in the control field.
⚫ CRC → sent after the data field for error detection.
⚫ Acknowledge field → signals whether the frame was correctly received: the sender puts a
recessive bit (1) in the ACK slot; if the receiver detected an error, it forces a dominant (0) value.
Arbitration
⚫ It uses a technique known as Carrier Sense Multiple Access with Arbitration on Message
Priority (CSMA/AMP).
⚫ When a node hears a dominant bit in the identifier when it tries to send a recessive bit, it
stops transmitting.
⚫ By the end of the arbitration field, only one transmitter will be left.
⚫ The identifier field acts as a priority identifier, with the all-0 identifier having the highest priority
Error handling
⚫ An error frame can be generated by any node that detects an error on the bus.
⚫ Upon detecting an error, a node interrupts the current transmission.
⚫ The error frame starts with an error flag field, followed by an error delimiter field of 8
recessive bits.
⚫ The error delimiter field allows the bus to return to the quiescent state so that data frame
transmission can resume.
⚫ Overload frame signals that a node is overloaded and will not be able to handle the next
message. Hence the node can delay the transmission of the next frame .
Architecture of a CAN controller
⚫ The controller implements the physical and data link layers.
⚫ CAN does not need network layer services to establish end-to-end
connections.
⚫ The protocol control block is responsible for determining when to send
messages, when a message must be resent and when a message should
be received.
I2C bus
⚫ I2C bus used to link microcontrollers
into systems.
⚫ I2C is designed to be low cost, easy to
implement, and of moderate speed (up to
100kbps for the standard bus and up to
400 kbps for the extended bus).
⚫ Serial data line (SDL) for data
transmission.
⚫ Serial clock line (SCL) indicates when
valid data are on the data line.
⚫ Every node in the network is connected to
both SCL and SDL.
⚫ Some nodes may act as bus masters .
⚫ Other nodes may act as slaves that only
respond to requests from masters.
Electrical interface to the I2C bus
⚫ Both bus lines are defined by an electrical signal.
⚫ Both bus signals use open collector/open drain
circuits.
⚫ The open collector/open drain circuitry allows a slave
device to stretch a clock signal during a read.
⚫ The master is responsible for generating the SCL
clock.
⚫ The slave can stretch the low period of the clock.
⚫ It is a multi master bus so different devices may act
as the master at various times.
⚫ Master drives both SCL and SDL when it is sending
data.
⚫ When the bus is idle, both SCL and SDL remain
high.
⚫ When two devices try to drive either SCL or SDL ,
the open collector/open drain circuitry prevents
errors.
⚫ Each master device must make sure that it is not
interfering with another message.
Format of an I2C address transmission
⚫ Every I2C device has a unique address.
⚫ A device address is 7 bits, plus 1 bit that signals the read/write direction.
⚫ The address 0000000 is a general call address that can be used to signal all devices simultaneously.
⚫ The address 11110XX is reserved for the extended 10-bit addressing scheme.
Bus transactions on the I2C bus
⚫ When a master wants to write to a slave, it transmits the slave’s address followed by the data.
⚫ When a master wants to read, it sends a read request with the slave’s address and the slave
transmits the data.
⚫ An address transmission has 7 address bits and 1 bit for data direction (0 for writing from the
master to the slave and 1 for reading from the slave to the master).
⚫ A bus transaction is initiated by a start signal and completed with an end signal.
⚫ A start is signaled by leaving the SCL high and sending a 1 to 0 transition on SDL.
⚫ A stop is signaled by setting the SCL high and sending a 0 to 1 transition on SDL.
State transition graph for an I2C bus master
⚫ Starts and stops must be paired.
⚫ A master can write and then read by sending a start after the data transmission, followed
by another address transmission and then more data.
Transmitting a byte on the I2C bus
⚫ The transmission starts when SDL is pulled low while SCL remains high.
⚫ The clock is pulled low to initiate the data transfer.
⚫ At each bit, the clock goes high while the data line assumes its proper value of 0 or 1.
⚫ An acknowledgment is sent at the end of every 8-bit transmission, whether it is an
address or data.
⚫ After acknowledgment, the SDL goes from low to high while the SCL is high, signaling
the stop condition.
I2C interface in a microcontroller
⚫ The system has a 1-bit hardware interface with routines for byte-level functions.
⚫ The I2C device is used to generate the clock and data.
⚫ Application code calls routines to send an address or a data byte, and to generate the SCL,
SDL, and acknowledge signals.
⚫ Timers are used to control the length of bits on the bus.
⚫ Interrupts may be used in master mode, but polled I/O may be acceptable if no other
pending tasks can be performed, because masters initiate their own transfers.
ETHERNET
⚫ It is widely used as a local area network for general-purpose computing.
⚫ It is also used as a network for embedded computing.
⚫ It is particularly useful when PCs are used as platforms, making it possible to use
standard components, and when the network does not have to meet real-time
requirements.
⚫ It is a bus with a single signal path.
⚫ It supports both twisted pair and coaxial cable.
⚫ Ethernet nodes are not synchronized; if two nodes decide to transmit at the same
time, the message will be ruined.
Ethernet CSMA/CD algorithm
⚫ A node that has a message waits for the
bus to become silent and then starts
transmitting.
⚫ It simultaneously listens, and if it hears
another transmission that interferes with
its transmission, it stops transmitting and
waits to retransmit.
⚫ The waiting time is random, but weighted
by an exponential function of the number
of times the message has been aborted
Ethernet packet format
⚫ Preamble → 56 bits of alternating 1s and 0s, allowing devices on the network
to easily synchronize their receiver clocks.
⚫ SFD → 8 bits; indicates the beginning of the Ethernet frame.
⚫ Physical (MAC) addresses → destination and source addresses (48 bits each).
⚫ Length of data payload → the minimum payload is 42 octets.
INTERNET PROTOCOL(IP)
⚫ It is the fundamental protocol on the Internet.
⚫ It provides connectionless, packet-based
communication.
⚫ It transmits packet over different networks from
source to destination.
⚫ It allows data to flow seamlessly from one end user to
another.
⚫ When node A wants to send data to node B, the data
pass through several layers of the protocol stack to
get to the Internet Protocol.
⚫ IP creates packets for routing to the destination,
which are then sent to the data link and physical
layers.
⚫ A packet may go through many routers to get to its
destination.
⚫ IP works at the network layer → it does not
guarantee that a packet is delivered to its
destination.
⚫ It supports best-effort routing → packets
that do arrive may come out of order.
IP packet structure
⚫ Version it ia s 4-bit field.used to identify v4 or v6.
⚫ Header Length (HL)It is a 4 bits, field.Indicates the length of the header.
⚫ Service Typeit is a 8 bit field ,used to specify the type of service.
⚫ Total lengthIncluding header and data payload is 65,535 bytes.
⚫ Identification identifying the group of fragments of a single IP datagram.
⚫ Flags bit 0 Reserved.
⚫ bit 1: Don't Fragment (DF)
⚫ bit 2: More Fragments (MF)
⚫ Fragment Offset → 13 bits long; specifies the offset of a particular fragment relative to
the beginning of the original unfragmented IP datagram.
⚫ Time To Live (TTL) → an 8-bit field indicating the datagram's lifetime.
⚫ Protocol → the protocol used in the data portion of the IP datagram.
⚫ Header Checksum → a 16-bit field used for error-checking of the header.
⚫ Source address → sender's address (32 bits).
⚫ Destination address → receiver's address (32 bits).
Transmission Control Protocol(TCP)
⚫ It provides a connection-oriented service.
⚫ It ensures that data arrive in the appropriate order.
⚫ It uses an acknowledgment protocol to ensure that packets arrive.
⚫ TCP is used to provide File Transport Protocol (FTP) for batch file transfers.
⚫ Hypertext Transport Protocol (HTTP) for World Wide Web service.
⚫ Simple Mail Transfer Protocol (SMTP) for email.
⚫ Telnet for virtual terminals.
⚫ User Datagram Protocol (UDP), is used to provide connection-less services.
⚫ Simple Network Management Protocol (SNMP) provides the network management services.
MPSoCs and shared memory
multiprocessors
⚫ Shared memory processors are well-suited to applications that require a large amount
of data to be processed (e.g., signal processing systems).
⚫ Most MPSoCs are shared memory systems.
⚫ Shared memory allows for processors to communicate with varying patterns.
⚫ If the pattern of communication is very fixed and if the processing of different steps is
performed in different units, then a networked multiprocessor may be most appropriate.
⚫ If one processing element is used for several different steps, then shared memory also
allows the required flexibility in communication.
Heterogeneous shared memory multiprocessors
⚫ Many high-performance embedded platforms are heterogeneous multiprocessors.
⚫ Different processing elements (PEs) perform different functions.
⚫ PEs may be programmable processors with different instruction sets or specialized
accelerators.
⚫ Processors with different instruction sets can perform different tasks faster and using less
energy.
⚫ Accelerators provide even faster and lower-power operation for a narrow range of
functions.
Accelerators
⚫ It is the important processing element for embedded multiprocessors.
⚫ It can provide large performance increases for applications with computational kernels.
⚫ It can also provide critical speedups for low-latency I/O functions.
⚫ The accelerator is attached to the bus of the CPU, which is known as the host.
⚫ CPU talks to the accelerator through data and control registers in the accelerator.
⚫ Control registers allow the CPU to monitor the accelerator’s operation and to give the
accelerator commands.
⚫ The CPU and accelerator may also communicate via shared memory.
⚫ The accelerator may operate on a large volume of data; it is most efficient to keep the data in memory.
⚫ The accelerator can read and write memory directly.
⚫ The CPU and accelerator use synchronization mechanisms to ensure that they do not
destroy each other’s data.
⚫ An accelerator is not a co-processor.
⚫ A co-processor is connected to the internals of the CPU and processes instructions.
⚫ An accelerator interacts with the CPU through the programming model interface.
⚫ It does not execute instructions.
⚫ Both the CPU and the accelerator perform computations required by the specification.
CPU accelerators in a system
Accelerator Performance Analysis
⚫ The speedup provided by an accelerator depends on the following factors.
⚫ Single-threaded → the CPU is idle while the accelerator runs.
⚫ Multithreaded → the CPU does some useful work in parallel with the accelerator.
⚫ Blocking → the CPU’s scheduler blocks other operations while waiting for the accelerator call to
complete.
⚫ Non-blocking → the CPU runs some other work in parallel with the accelerator.
⚫ Data dependencies allow P2 and P3 to run independently on the CPU.
⚫ P2 relies on the results of the A1 process that is implemented by the accelerator.
⚫ Single-threaded CPU blocks to wait for the accelerator to return the results of its
computation.t, it doesn’t matter whether P2 or P3 runs next on the CPU.
⚫ Multithreaded → the CPU continues to do useful work while the accelerator runs, so the CPU
can start P3 just after starting the accelerator and finish the task earlier.
Components of execution time for an accelerator
⚫ The execution time of an accelerator depends on the
time required to execute the accelerator’s
function.
⚫ It also depends on the time required to get the
data into the accelerator and back out of it.
⚫ Accelerator will read all its input data, perform
the required computation,and write all its results.
⚫ The total execution time is given as
⚫ t_accel = t_x + t_in + t_out
⚫ t_x → execution time of the accelerator core
⚫ t_in → time required for reading the required
variables into the accelerator
⚫ t_out → time required for writing the results
back to memory
System Architecture Framework
⚫ Architectural design depends on the application.
An accelerator can be considered from two angles.
⚫ Accelerator core functionality
⚫ Accelerator interface to the CPU bus.
⚫ The accelerator core typically operates off internal
registers.
⚫ The number of registers required is an important
design decision.
⚫ Main memory accesses will probably take multiple
clock cycles.
⚫ Status registers are used to test the accelerator’s state and
to perform basic operations (starting, stopping, and
resetting the accelerator).
⚫ A register file in the accelerator acts as a buffer between
main memory and the accelerator core.
⚫ Read unit can read the accelerator’s requirements and
load the registers with the next required data.
⚫ Write unit can send recently completed values to main
memory.
Cache problem in an accelerated system
⚫ CPU cache can cause problems for
accelerators.
1. The CPU reads location S.
2. The accelerator writes S.
3. The CPU again reads S.
⚫ If the CPU has cached location S, the
program will not see the value of S
written by the accelerator. It will instead
get the old value of S stored in the cache.
⚫ To avoid this problem, the CPU’s cache
must be updated by marking the cache
entry invalid.
Scheduling and allocation
⚫ Designing a distributed embedded system, depends upon the scheduling and allocation
of resources.
⚫ We must schedule operations in time, including communication on the network and
computations on the processing elements.
⚫ The scheduling of operations on the PEs and the communications between the PEs are
linked.
⚫ If one PE finishes its computations too late, it may interfere with another communication
on the network as it tries to send its result to the PE that needs it.
⚫ This is bad for both the PE that needs the result and the other PEs whose communication
is interfered with.
⚫ We must allocate computations to the processing elements.
⚫ The allocation of computations to the PEs determines what communications are
required—if a value computed on one PE is needed on another PE, it must be
transmitted over the network.
⚫ We can specify the system as a task graph. However, different processes may end up on
different processing elements. Here is a task graph
⚫ We have labeled the data transmissions on each arc. We want to execute the task on the
platform below.
⚫ The platform has two processing elements and a single bus connecting both PEs. Here
are the process speeds:
⚫ As an initial design, let us allocate P1 and P2 to M1 and P3 to M2. This schedule shows
what happens on all the processing elements and the network.
⚫ The schedule has length 19. The d1 message is sent between processes on the same
processing element (M1) and so does not appear on the bus.
⚫ Let’s try a different allocation: P1 on M1, and P2 and P3 on M2. This makes P2 run more
slowly. Here is the new schedule:
⚫ The length of this schedule is 18, or one time unit less than the other schedule. The
increased computation time of P2 is more than made up for by being able to transmit a
shorter message on the bus. If we had not taken communication into account when
analyzing total execution time, we could have made the wrong choice of which processes
to put on the same processing element.
Audio player/MP3 Player
Operation and requirements
⚫ MP3 players use either flash memory or disk drives to store music.
⚫ It performs the following functions such as audio storage, audio decompression, and
user interface.
⚫ Audio compression → a lossy process: the coder eliminates certain features of the audio
stream so that the result can be encoded in fewer bits.
⚫ Audio decompression → the incoming bit stream has been encoded using a Huffman-style
code, which must be decoded.
⚫ Masking → one tone can be masked by another if the tones are sufficiently close in frequency.
Audio compression standards
⚫ Layer 1 (MP1) uses a lossless compression of sub bands and simple masking model.
⚫ Layer 2 (MP2) uses a more advanced masking model.
⚫ Layer 3 (MP3) performs additional processing to provide lower bit rates.
MPEG Layer 1 encoder
⚫ Filter bank splits the signal into a set of 32 sub-
bands that are equally spaced in the frequency
domain and together cover the entire frequency
range of the audio.
⚫ Encoder → reduces the bit rate for the audio
signals.
⚫ Quantizer → scales each sub-band so that it fits within 6
bits, then quantizes based upon the current scale
factor for that sub-band.
⚫ Masking model → driven by a separate Fast
Fourier transform (FFT); although the filter bank
could be used for masking, a separate FFT
provides better results.
⚫ The masking model chooses the scale factors for
the sub-bands, which can change along with the
audio stream.
⚫ Multiplexer → the output of the encoder passes along
all the required data.
MPEG Layer 1 data frame format
⚫ A frame carries the basic MPEG data, error correction codes, and additional information.
MPEG Layer 1 decoder
•After disassembling the data frame, the data are un-
scaled and inverse quantized to produce sample
streams for the sub-band.
• An inverse filter bank then reassembles the sub-bands
into the uncompressed signal.
User interface → the MP3 player's user interface is kept
simple to minimize both the physical size and power
consumption of the device. Many players provide only a
simple display and a few buttons.
File system → the player generally must be compatible
with PCs. CD/MP3 players use compact discs that have
been created on PCs.
Requirements
Specification
⚫ The File ID class is an abstraction of a file in the flash file system.
⚫ The controller class provides the method that operates the player.
State diagram for file display and selection
⚫ This specification assumes that all files are in the root directory and that all files are
playable audio.
State diagram for Audio Playback
⚫ It refers to sending the samples to the audio system.
⚫ Playback and reading the next data frame must be overlapped to ensure continuous operation.
⚫ The details of playback depend on the hardware platform selected, but will probably involve a
DMA transfer.
System architecture
⚫ The audio controller includes two processors.
⚫ The 32-bit RISC processor is used to perform system control and audio decoding.
⚫ The 16-bit DSP is used to perform audio effects such as equalization.
⚫ The memory controller can be interfaced to several different types of memory.
⚫ Flash memory can be used for data or code storage.
⚫ DRAM can be used to handle temporary disruptions of the CD data stream.
⚫ The audio interface unit puts out audio in formats that can be used by D/A converters.
⚫ General- purpose I/O pins can be used to decode buttons, run displays.
Component design and testing
⚫ The audio output system should be tested separately from the compression system.
⚫ Testing of audio decompression requires sample audio files.
⚫ The file system can be implemented either as a standard DOS FAT file system or as a new
file system.
⚫ While a non-standard file system may be easier to implement on the device, it also
requires software to create the file system.
⚫ The file system and user interface can be tested independently.
System integration and debugging
⚫ It ensures that audio plays smoothly and without interruption.
⚫ Any file access and audio output that operate concurrently should be separately tested,
ideally using an easily recognizable test signal.
Engine Control Unit
⚫ This unit controls the operation of a
fuel-injected engine based on
several measurements taken from
the running engine.
Operation and
Requirements
⚫ The throttle is the command input.
⚫ The engine measures throttle,
RPM, intake air volume, and other
variables.
⚫ The engine controller computes
injector pulse width and spark.
Requirements
Specification
⚫ The engine controller must deal with processes at different rates
⚫ ΔNE and ΔT represent the change in RPM and throttle position, respectively.
⚫ The controller computes two output signals: injector pulse width PW and spark advance
angle S.
⚫ S = k2 × ΔNE - k3 × VS
⚫ The controller then applies corrections to these initial values
⚫ If intake air temperature (THA) increases during engine warm-up, the controller reduces
the injection duration.
⚫ If the throttle opens, the controller temporarily increases the injection frequency.
⚫ Controller adjusts duration up or down based upon readings from the exhaust oxygen
sensor (OX).
System architecture
⚫ The two major processes, pulse-width
and advance-angle, compute the
control parameters for the spark plugs
and injectors.
⚫ Control parameters rely on
changes in some of the input
signals.
⚫ Physical sensor classes used to
compute these values.
⚫ Each change must be updated at
the variable’s sampling rate.
State diagram for throttle position sensing
⚫ Throttle sensing, which saves both the
current value and change in value of
the throttle.
State diagram for injector pulsewidth
⚫ In each case, the value is computed
in two stages, first an initial value
followed by a correction.
State diagram for spark advance angle
Component design and testing
⚫ Various tasks must be coded to satisfy the requirements of RTOS processes.
⚫ Variables that are maintained across task execution, such as the change-of-state
variables, must be allocated and saved in appropriate memory locations.
⚫ Some of the output variables depend on changes in state, these tasks should be tested
with multiple input variable sequences to ensure that both the basic and adjustment
calculations are performed correctly.
System integration and testing
⚫ Engines generate huge amounts of electrical noise that can cripple digital electronics.
⚫ They also operate over very wide temperature ranges.
1. hot during engine operation,
2. potentially very cold before the engine is started.
⚫ Any testing performed on an actual engine must be conducted using an engine
controller that has been designed to withstand the harsh environment of the engine
compartment.
Video Accelerator
⚫ It consists of hardware circuits on a display adapter that speed up full-motion video.
⚫ A primary video accelerator function is color space conversion, which converts YUV to RGB.
⚫ Hardware scaling is used to enlarge the image to full screen and double buffering which
moves the frames into the frame buffer faster.
Video compression
• MPEG-2 forms the basis for U.S. HDTV
broadcasting.
• This compression uses several
component algorithms together in a
feedback loop.
• Discrete cosine transform (DCT) used in
JPEG and MPEG-2.
• DCT operates on a block of pixels, which is
quantized for lossy compression.
• Variable-length coder → assigns the number of
bits required to represent the block.
Block motion Estimation
⚫ MPEG uses motion to encode one frame in
terms of another.
⚫ Block motion estimation → some frames are
sent as modified forms of other frames.
⚫ During encoding, the frame is divided into
macro blocks.
⚫ Encoder uses the encoding information to
recreate the lossily-encoded picture, compares
it to the original frame, and generates an error
signal.
⚫ The decoder keeps recently decoded frames in
memory so that it can retrieve the pixel values
of macro blocks.
Concept of block motion estimation
⚫ To find the best match between regions in the two frames.
⚫ Divide the current frame into 16 x 16 macro blocks.
⚫ For every macro block in the frame, to find the region in the previous frame that most
closely matches the macro block.
⚫ Measure similarity using the following sum-of-differences measure:
⚫ SAD(ox, oy) = Σ (for 1 ≤ i, j ≤ N) |M(i, j) − S(i − ox, j − oy)|
⚫ M(i, j) → intensity of the macro block at pixel i, j
⚫ S(i, j) → intensity of the search region
⚫ N → size of the macro block in one dimension
⚫ <ox, oy> → offset between the macro block and search region
⚫ We choose the macro block position relative to the search area that gives us the smallest
value for this metric.
⚫ The offset at this chosen position describes a vector from the search area center to the
macro block's center that is called the motion vector.
Algorithm and requirements
⚫ C code for a single search, which assumes that the search region does not extend past the
boundary of the frame.
⚫ The arithmetic on each pixel is simple, but we have to process a lot of pixels.
⚫ If MBSIZE is 16 and SEARCHSIZE is 8, and remembering that the search distance in each
dimension is 8 + 1 + 8, then we must perform (16 × 16) × (17 × 17) = 73,984 difference
calculations per search.
Requirements
Specification
⚫ Specification for the system is relatively straightforward because the algorithm is simple.
⚫ The following classes are used to describe the basic data types in the system: motion
vector, macro block, search area.
Sequence Diagram
⚫ The accelerator provides a behavior
compute-mv() that performs the block
motion estimation algorithm.
⚫ After initiating the behavior, the accelerator
reads the search area and macro block from
the PC, after computing the motion vector,
it returns it to the PC.
Architecture
⚫ The macro block has 16 × 16 = 256 pixels.
⚫ The search area has (8 + 8 + 1 + 8 + 8)² =
1,089 pixels.
⚫ An FPGA probably will not have enough
memory to hold 1,089 8-bit values.
⚫ The machine has two memories, one for
the macro block and another for the
search memories.
⚫ It has 16 processing elements that
perform the difference calculation on a
pair of pixels.
⚫ Comparator sums them up and selects
the best value to find the motion vector.
System testing
⚫ Testing video algorithms requires a large amount of data.
⚫ Since we are designing only a motion estimation accelerator and not a complete video
compressor, it is probably easiest to use images, not video, for test data.
⚫ Use standard video tools to extract a few frames from a digitized video and store them in
JPEG format.
⚫ Open-source JPEG encoders and decoders are available.
⚫ These programs can be modified to read JPEG images and put out pixels in the format
required by your accelerator.