Embedded
Bernard Boigelot
E-mail : [email protected]
WWW : http://www.montefiore.ulg.ac.be/boigelot/
http://www.montefiore.ulg.ac.be/boigelot/courses/embedded/
References :
Chapter 1
Introduction
Embedded systems
Typical applications:
home appliances;
computer peripherals;
elevators, intrusion-detection devices, domotic systems;
...
Advantages
simpler,
cheaper to build,
more powerful.
Software components can easily be updated during the lifetime of a product, as well as
reused in other projects.
Developing embedded systems: Main difficulties
Real-time constraints.
High level of quality expected: Reliability, robustness and efficiency are critical.
Chapter 2
Hardware
Main components of an embedded system
Memory:
Timers,
Converters,
Communication controllers,
...
Communication buses.
Interfaces with the circuit environment:
Auxiliary components:
Power supply,
Clock generator,
Bus controllers,
...
Example of embedded microcontroller: Microchip PIC16F716
Main features:
Harvard memory model: 2048 words of (FLASH) program memory, 128 bytes of
(RAM) data memory, both internal.
Low power consumption: About 120 µA (at 2 V) at 1 MHz, 14 µA at 32 kHz, 100 nA
in standby.
Pinout (left-hand pins | right-hand pins, top to bottom):
RA2  | RA1
RA3  | RA0
RA4  | OSC1
MCLR | OSC2
VSS  | VDD
RB0  | RB7
RB1  | RB6
RB2  | RB5
RB3  | RB4
Description:
Example of application: Temperature alarm
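[Figure: schematic of the temperature alarm. Labeled parts: 1N4001 diode, 7805 regulator, 10 kΩ resistor, 10 µF + 100 nF capacitors on VDD, piezo buzzer, two 33 pF capacitors.]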
Example of embedded bus
Problem: Managing data transfers between several devices (CPU, memory, sensors,
peripherals, . . . ) using communication hardware that is as simple as possible.
Requirements:
Bus topology.
Flexible configuration.
Solution: I2C bus
Principles:
The bus consists of a pair of two-way lines: SDA (Serial DAta) and SCL (Serial Clock).
Each device connected to the bus can read the value of SDA and SCL, but is only able
to force them down, i.e., to write a low value.
[Figure: the SDA and SCL lines, referenced to VCC, are shared by Device 1 and Device 2.]
I2C: Transactions
signaling the beginning (Start, S) and the end (Stop, P) of the transaction. The signals
S and P correspond to the two possible transitions of SDA when SCL is high.
When a transaction is in progress (i.e., between S and P), transitions of SDA are only
allowed when SCL is low.
Illustration: [Figure: timing diagram of SDA and SCL showing the Start (S) and Stop (P) signals.]
I2C: Data transfers
During a transaction, the sender of data can either be the master or the slave.
The value of each bit of data sent on the bus corresponds to the value of SDA during a
low-to-high transition of SCL.
Data is exchanged in 8-bit groups, the most significant bit (MSB) being sent first.
If a group of bits is not acknowledged, then the master immediately aborts the
transaction, and the slave stops sending or receiving data.
I2C: Addressing
When a transaction is initiated, the master has to specify which device is the other
participant.
Principles:
The first 8 bits exchanged in a transaction are always sent by the master.
The first 7 bits of this group correspond to the address of the intended slave.
The 8th bit then specifies the direction of the following data transfer:
Remark: The first group of 8 bits must thus be acknowledged by the addressed slave,
regardless of the data transfer direction.
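The layout of the first group of 8 bits can be sketched in C as follows. This is only an illustration: the helper name i2c_address_byte and the constants I2C_WRITE and I2C_READ are hypothetical, and the direction encoding (0 for master-to-slave, 1 for slave-to-master) is the one defined by the I2C specification.

```c
#include <assert.h>
#include <stdint.h>

/* Direction bit values, as defined by the I2C specification. */
enum { I2C_WRITE = 0, I2C_READ = 1 };

/* Build the first byte of a transaction: 7 address bits (sent first,
   MSB first) followed by the direction bit. Hypothetical helper. */
static uint8_t i2c_address_byte(uint8_t addr7, int direction)
{
    assert(addr7 < 0x80);   /* only 7-bit addresses are handled here */
    return (uint8_t)((addr7 << 1) | (direction & 1));
}
```

For instance, reading from the slave at address 0x48 starts with the byte 0x91, and writing to it starts with 0x90.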
I2C: Arbitration
It is possible to have several devices attempting to initiate transactions at the same time, by
generating simultaneous Start signals.
For detecting potential conflicts, each master constantly monitors the value of SDA when it
sends data. If the observed value differs from the sent one, then the master performing this
observation immediately and silently withdraws from the transaction.
Remarks:
A conflict can only be detected by the device that sends a high value.
Transmitting simultaneously two exactly identical frames does not lead to a conflict!
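The arbitration rule can be checked with a small simulation. The sketch below assumes two masters that each send one 8-bit frame, MSB first, on a line behaving as a wired AND (a device can only force the line low). The function name i2c_arbitrate is hypothetical and the code models the bus only, not real hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the index (0 or 1) of the master that keeps the bus, or -1 if
   both frames are identical, in which case no conflict is ever detected. */
static int i2c_arbitrate(uint8_t frame0, uint8_t frame1)
{
    for (int bit = 7; bit >= 0; bit--) {   /* MSB first */
        int sent0 = (frame0 >> bit) & 1;
        int sent1 = (frame1 >> bit) & 1;
        int sda = sent0 & sent1;           /* wired AND: a low value wins */
        if (sent0 != sda)                  /* master 0 sent high, observed low */
            return 1;                      /* master 0 withdraws */
        if (sent1 != sda)                  /* master 1 sent high, observed low */
            return 0;                      /* master 1 withdraws */
    }
    return -1;
}
```

In this model the master that loses is always the one that sent a high value at the first differing bit, which matches the first remark above.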
I2C: Flow control
In some cases, the frequency of the clock signal generated is too high to be followed by the
slave.
In such situations, the slave can request the master to permanently or temporarily slow
down the clock. This can be done by stretching the low value of SCL until the slave is ready
again to send or receive data.
When the master releases SCL while the clock is stretched, it detects that the value of SCL
stays low, and pauses its operations until this line is released by the slave.
Illustration: [Figure: timing diagram of SCL, marking the instant at which the line is released by the master.]
Multiplexed input/output pins
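[Figure: possible configurations of a multiplexed pin: digital input, digital input with pull-up, three-state output, open-drain output, Schmitt trigger, analog output (D/A), analog input (A/D); the figure also shows the VCC and GND rails.]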
This feature makes it possible to build simple circuits in which the processor can interact
with a large number of peripherals.
Example: Digital multimeter
Solution:
The screen and the keyboard are scanned: At a given time, one can only display a
single digit, or read a single column of keys.
4 pins are each associated with both an input channel and a screen digit. They are
alternately configured as analog inputs (when reading channels) and digital outputs
(when displaying a digit or reading the keyboard).
The 8 remaining pins drive the screen segments during display and channel reading
phases (8 digital outputs), and are also used to scan the keyboard (4 digital outputs
+ 4 digital inputs with pull-up).
Schematics: [Figure: multimeter circuit built around the MCU.]
Chapter 3
Interrupts
Introduction
Advantages:
or by the processor itself:
timer expiration,
...
The interrupt mechanism
Upon receiving and accepting to service an interrupt request, the processor performs the
following operations:
3. The address of the interrupt routine is read from the appropriate interrupt vector
(according to the source of the interrupt request).
5. At the end of the interrupt routine, the processor resumes program execution, at the
address retrieved from the stack.
Interrupt control
Some processors allow specific priorities to be assigned to interrupts originating from different
sources. Such architectures generally provide a mechanism for disabling the interrupts
whose priority is lower than some specified threshold. Interrupt priorities are also used for
resolving simultaneous interrupt requests.
Notes:
When an interrupt is triggered, some processors automatically disable all interrupts of
lower or equal priority. They have to be explicitly reenabled in the interrupt routine if
needed.
When an interrupt request is received, the processor sets interrupt flags, in order to
trigger the interrupt as soon as it becomes enabled. Interrupt flags have to be cleared
explicitly by the interrupt routine.
Saving and restoring context
The correct operation of a program must not be influenced by interrupts triggered during its
execution.
It is thus mandatory for interrupt routines to leave the processor state unchanged: values of
registers and flags, interface configuration, status of peripherals, . . . , must not be modified.
This is achieved by saving the context at the beginning of interrupt routines, and restoring it
at the end.
Notes:
The context is either saved on the execution stack or in a specific memory area.
Some processors automatically save the context (either totally or in part) when an
interrupt is triggered.
Context save and restore operations can sometimes be simplified by using dedicated
instructions.
Programming interrupts
With some compilers, such mechanisms automatically insert context save and restore
instructions to interrupt routines, and take care of setting interrupt vectors.
Enabling and disabling interrupts is performed with the help of macros or specific
compilation directives (e.g., enable()/disable(), critical keyword).
It is sometimes necessary to inform the compiler that the value of a variable can be
modified by interrupt routines, in order to prevent incorrect optimizations (e.g.,
volatile keyword in C).
Communicating with interrupt routines
Interrupt requests are by nature unpredictable. This complicates data exchange operations
between interrupt routines and the main code.
Example: Industrial controller. The alarm must sound if two temperature measurements
made by an interrupt routine differ.
Wrong solution:
void controller(void)
{
int temp0, temp1;
for (;;)
{
temp0 = temp[0];
temp1 = temp[1];
if (temp0 != temp1) !! sound the alarm;
}
}
Notes:
Carrying out the comparison between the two measurements in a single C instruction
does not solve the problem:
...
void controller(void)
{
for (;;)
if (temp[0] != temp[1]) !! sound the alarm;
}
...
This only happens with specific instructions, often repeatedly performing a simpler
operation (e.g., block copy instructions).
Correct solution:
The instructions that read the measurements sent by the interrupt routine to the controller
form a critical section, the execution of which cannot be interrupted.
void controller(void)
{
    int temp0, temp1;
    for (;;)
    {
        disable(); /* Disable interrupts */
        temp0 = temp[0];
        temp1 = temp[1];
        enable(); /* Reenable interrupts */
        if (temp0 != temp1) !! sound the alarm;
    }
}
Other solution:
void controller(void)
{
for (;; controller_uses_b = !controller_uses_b)
if (controller_uses_b)
{
if (temp_b[0] != temp_b[1]) !! sound the alarm;
}
else
if (temp_a[0] != temp_a[1]) !! sound the alarm;
}
Notes:
The main code must sometimes perform one useless iteration before sounding the
alarm.
Improved solution:
#define MAX_FIFO 10 /* Must be even ! */
static volatile int temp_fifo[MAX_FIFO];
static volatile int first = 0;
static int last = 0;
void controller(void)
{
int temp0, temp1;
for (;;)
if (first != last) /* If the buffer is not empty */
{
temp0 = temp_fifo[last];
temp1 = temp_fifo[last + 1];
last += 2;
if (last == MAX_FIFO)
last = 0;
if (temp0 != temp1) !! sound the alarm;
}
}
Note: For this solution to be correct, it is necessary that the instruction last += 2
executes atomically.
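The slide only shows the consumer side of the buffer. A possible producer side is sketched below under stated assumptions: the function name store_measurements, the policy of dropping a pair when the buffer is full, and the helper read_measurements (one iteration of the controller loop, rewritten as a function so that it can be exercised) are all hypothetical.

```c
#include <assert.h>

#define MAX_FIFO 10               /* Must be even, as in the slide */

static volatile int temp_fifo[MAX_FIFO];
static volatile int first = 0;    /* next free slot, written only by the interrupt routine */
static int last = 0;              /* next pair to read, written only by the main code */

/* Hypothetical body of the interrupt routine: stores one pair of
   measurements. The pair is dropped when the buffer is full, so that
   the routine never blocks. */
static void store_measurements(int t0, int t1)
{
    int next = first + 2;
    if (next == MAX_FIFO)
        next = 0;
    if (next == last)             /* full: one pair of slots stays unused */
        return;
    temp_fifo[first] = t0;
    temp_fifo[first + 1] = t1;
    first = next;                 /* publish the pair once both values are written */
}

/* One iteration of the controller loop: returns 1 if a pair was read. */
static int read_measurements(int *t0, int *t1)
{
    if (first == last)            /* buffer empty */
        return 0;
    *t0 = temp_fifo[last];
    *t1 = temp_fifo[last + 1];
    last = (last + 2 == MAX_FIFO) ? 0 : last + 2;   /* single store to last */
    return 1;
}
```

Each index is written by one side only, so no critical section is needed as long as the stores to first and last are atomic on the target processor.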
Interrupt latency
The delay between an interrupt request I and the end of execution of urgent operations in
an interrupt routine RI is called the response time, or latency of the interrupt.
1. The longest interval during which interrupts of priority higher than or equal to that of I are disabled.
2. The time needed for executing the interrupt routines with a higher priority than RI.
3. The maximum delay between an interrupt trigger and the branch to the corresponding
interrupt routine.
A good strategy is therefore to
make the interrupt routines quick and efficient (parameters 2 and 4).
Example
A system implements the following interrupt routines, sharing the same priority.
The main program disables interrupts for 200 µs and 250 µs, respectively, when exchanging
data with I1 and I2.
The time needed for triggering I3 and executing the corresponding urgent operations is
equal to 100 µs.
Answer:
It is sufficient to study the system during an interval of length equal to 1000 µs. The highest
possible latency is obtained with the following delays:
Executing I1 : 2 × 100 µs.
Executing I2 : 200 µs.
Total: 750 µs.
Notes:
Only the largest interval in which interrupts are disabled has to be taken into account!
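The total can be checked term by term. The sum below assumes that the two contributions not listed in the answer are the longest interval during which interrupts are disabled (250 microseconds) and the triggering and urgent operations of I3 (100 microseconds), both taken from the problem statement.

```latex
\[
L_{\max}(I_3)
  = \underbrace{250}_{\text{interrupts disabled}}
  + \underbrace{2 \times 100}_{I_1}
  + \underbrace{200}_{I_2}
  + \underbrace{100}_{I_3}
  = 750\ \mu\mathrm{s}
\]
```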
Example of scenario in which the maximum latency is reached:
[Figure: timeline from 0 to 1000 µs showing the main program (disable(), enable()), the requests IRQ1, IRQ2 and IRQ3, the routines I1, I2 and I3, and the instant at which the urgent operations are completed.]
Chapter 4
Software architectures
The round-robin architecture
Principles:
Illustration:
void main(void)
{
for (;;)
{
if (!! task 1 is ready )
{
!! operations of task 1;
}
if (!! task 2 is ready )
{
!! operations of task 2 ;
}
...
if (!! task n is ready )
{
!! operations of task n;
}
}
}
Advantages:
Drawbacks:
The worst-case latency of an external request is equal to the execution time of the
entire main loop.
Example (multimeter):
#include "types.h"
#include "multimeter.h"
static UINT1 phase = 0; /* 0-3: display, 4: keyboard, 5: channels */
static UINT1 display_content[4];
static SINT4 measures[4];
static keyboard_state keys;
static multimeter_state parameters;
void main(void)
{
!! initialize global data ;
for (;;)
{
switch (phase)
{
case 4:
handle_keyboard();
if (keys.new_keypress)
{
keypress_action();
keys.new_keypress= 0;
}
break;
case 5:
handle_channels();
update_display_content();
break;
default:
handle_display();
}
if (++phase > 5)
phase = 0;
}
}
void handle_display(void)
{
UINT1 digit, segments;
delay(DISPLAY_DELAY);
}
void handle_channels(void)
{
    !! PORTA: 4 analog inputs ;
    !! PORTB: 8 digital outputs ;
    out(PORTB, 0);
    delay(CHANNELS_DELAY);
}
void handle_keyboard(void)
{
    static UINT1 column = 0;
    UINT1 row;
    out(PORTA, 0);
    out(PORTB, 1 << column);
    row = in(PORTB) >> 4;
    if (++column >= 4)
        column = 0;
}
void keypress_action()
{
!! update parameters according to the key that has
!! been pressed (specified in keys) ;
}
void update_display_content()
{
!! update display_content according to the values in
!! measures and parameters;
}
The round-robin with interrupts architecture
Principles: Tasks are invoked in round-robin fashion, but interrupt routines take care of
urgent operations.
Illustration:
volatile BOOL ready1 = 0, ready2 = 0, ...,
readyn = 0;
void main(void)
{
for (;;)
{
if (ready1)
{
!! non-urgent operations of task 1;
ready1 = 0;
}
if (ready2)
{
!! non-urgent operations of task 2 ;
ready2 = 0;
}
...
if (readyn)
{
!! non-urgent operations of task n;
readyn = 0;
}
}
}
Advantage: The urgent operations take priority over the non-urgent ones.
[Figure: priority scale ranking Urgent 1, Urgent 2, ..., Urgent n.]
Drawbacks:
The non-urgent tasks share the same effective priority. This yields high latencies when
at least one task has a large execution time (e.g., raster generation in laser printers).
Indeed,
Data exchange operations between interrupt routines and tasks have to be correctly
implemented (cf. Chapter 3).
Example: Serial filter
Principles:
Incoming bytes are signaled by interrupt requests, which must be answered as soon
as possible (before the next received byte).
When a UART is ready to send a byte on its output line, it requests an interrupt. The
processor is then free to wait for an arbitrarily long time before providing this byte.
Solution:
#include "types.h"
#include "fifo.h"
#include "filter.h"
void main(void)
{
!! initialize global data ;
!! initialize interrupt vectors ;
enable();
for (;;)
{
if (fifo_content_size(rx1) >= FILTER_THRESHOLD)
{
!! remove data from rx1;
!! filter ;
!! add the result to tx2;
}
if (uart1_ready && !fifo_is_empty(tx1))
{
char byte;
byte = fifo_get(tx1);
disable();
!! send byte to UART1;
uart1_ready = 0;
enable();
}
if (uart2_ready && !fifo_is_empty(tx2))
{
char byte;
byte = fifo_get(tx2);
disable();
!! send byte to UART2 ;
uart2_ready = 0;
enable();
}
}
}
Notes:
Attempting to add data to a saturated FIFO buffer cannot be a blocking operation (i.e.,
it must instead discard data).
The functions for handling FIFO buffers must execute correctly both in the interrupt
routines and in the main code.
Example of implementation:
...
intr_enabled = disable();
!! critical section;
if (intr_enabled)
enable();
...
}
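A fuller version of this pattern is sketched below. It is only an illustration: the interrupt-enable flag is simulated by a variable so that the code can run on a host, and the type fifo and the function fifo_put are hypothetical names (the course's fifo.h is not shown). The insertion is non-blocking and discards the byte when the buffer is saturated, as required by the first note.

```c
#include <assert.h>

/* Simulated interrupt-enable flag. On a real target, disable() and
   enable() would be compiler- or processor-specific primitives. */
static int intr_flag = 1;

static int disable(void)          /* returns the previous state */
{
    int was_enabled = intr_flag;
    intr_flag = 0;
    return was_enabled;
}

static void enable(void)
{
    intr_flag = 1;
}

#define FIFO_SIZE 8

typedef struct {
    char data[FIFO_SIZE];
    int head, count;
} fifo;

/* Non-blocking insertion: returns 0 and discards the byte if the buffer
   is full. Safe to call with interrupts either enabled or disabled. */
static int fifo_put(fifo *f, char byte)
{
    int ok = 0;
    int intr_enabled = disable();
    if (f->count < FIFO_SIZE) {               /* critical section */
        f->data[(f->head + f->count) % FIFO_SIZE] = byte;
        f->count++;
        ok = 1;
    }
    if (intr_enabled)                         /* restore, never force-enable */
        enable();
    return ok;
}
```

Restoring the previous state instead of calling enable() unconditionally is what lets the same function be called from an interrupt routine, where interrupts may have to stay disabled.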
The waiting-queue architecture
Principles:
In the same way as the round-robin with interrupts architecture, the operations are
partitioned into urgent and non-urgent tasks.
Interrupt routines perform urgent operations, and then place in a waiting queue
requests for executing non-urgent tasks.
The main program retrieves execution requests from the queue and calls the
corresponding functions. These requests are not necessarily processed in FIFO order.
(For instance, different selection priorities can be assigned to non-urgent tasks.)
Illustration:
#include "queue.h"
void main(void)
{
!! initialize waiting_queue with an empty content ;
for (;;)
{
while (!queue_is_empty(waiting_queue))
{
!! extract a function from waiting_queue;
!! execute this function;
}
}
}
void task1(void)
{
!! non-urgent operations of task 1;
}
void task2(void)
{
!! non-urgent operations of task 2 ;
}
...
void taskn(void)
{
!! non-urgent operations of task n;
}
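One possible shape for the waiting queue itself is sketched below, assuming a small fixed-size array of function pointers with a selection priority attached to each request. The names request, queue_add and queue_extract are hypothetical (the course's queue.h is not shown), and the two empty tasks only serve as placeholders.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*task_fn)(void);

#define QUEUE_MAX 8

typedef struct {
    task_fn fn;
    int priority;                 /* larger value = selected first (assumption) */
} request;

static request waiting_queue[QUEUE_MAX];
static int queue_len = 0;

/* Called by interrupt routines: returns 0 if the queue is full. */
static int queue_add(task_fn fn, int priority)
{
    if (queue_len == QUEUE_MAX)
        return 0;
    waiting_queue[queue_len].fn = fn;
    waiting_queue[queue_len].priority = priority;
    queue_len++;
    return 1;
}

/* Called by the main loop, with interrupts disabled around the call:
   removes and returns the pending request of highest priority, or NULL. */
static task_fn queue_extract(void)
{
    int best = 0;
    task_fn fn;
    if (queue_len == 0)
        return NULL;
    for (int i = 1; i < queue_len; i++)
        if (waiting_queue[i].priority > waiting_queue[best].priority)
            best = i;
    fn = waiting_queue[best].fn;
    waiting_queue[best] = waiting_queue[--queue_len];   /* fill the hole */
    return fn;
}

static void task1(void) { /* non-urgent operations of task 1 */ }
static void task2(void) { /* non-urgent operations of task 2 */ }
```

With this sketch, requests of equal priority are not guaranteed to be served in FIFO order, which is consistent with the remark that requests are not necessarily processed in FIFO order.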
Advantage: The latency of a non-urgent high-priority task can become smaller than the
execution time of all the non-urgent operations.
Drawbacks:
The maximum latency of a non-urgent task is still at least as large as the execution
time of the slowest task.
With the queue architecture, it is possible to ensure that the values produced by critical
sensors are always taken into account, even in the case of data saturation caused by a
malfunctioning low-priority sensor.
The real-time operating system architecture
Principles:
Urgent operations are performed by interrupt routines. Those are able to signal to
other tasks that non-urgent operations are ready to be carried out.
The non-urgent tasks are invoked dynamically rather than in a predefined order. The
responsibility of calling tasks is assigned to the operating system, implemented as an
additional software component.
The operating system is able to suspend the execution of a task before its completion,
in order to transfer the processor to another task.
The signals exchanged between tasks are handled by the operating system, instead of
being implemented with shared variables.
Illustration:
#include "signal.h"
void task1(void)
{
!! wait for signal 1;
!! non-urgent operations of task 1;
}
void task2(void)
{
!! wait for signal 2 ;
!! non-urgent operations of task 2 ;
}
...
void main(void)
{
!! initialize the operating system;
!! create and enable tasks;
!! start task sequencing ;
}
Advantages:
One can easily combine low-latency operations together with long computations.
[Figure: priority scale with Urgent 1, Urgent 2, ..., Urgent n and Task 1, Task 2, ..., Task n.]
The system is efficient: When a non-urgent task is waiting for a signal, the processor
remains available for other computations.
The structure of the system is robust: New features can easily be added without
affecting the latencies of urgent operations or of high-priority tasks.
Drawbacks:
The system is complex (but this complexity is mainly located in the operating system,
which can be reused over many projects).
The operating system consumes some amount of system resources (a typical figure is
2 to 4 % of the instructions executed by the processor).
Summary
Robustness and simplicity:
Chapter 5
Introduction
Real-time operating systems (RTOS) are operating systems specifically suited for
embedded applications:
The following features are precisely documented:
Execution levels
At a given time, the instruction currently executed by the processor can either be
The processes
Dormant: The task is not scheduled (e.g., because it is not yet known to the OS).
Executable: The task is ready to execute instructions, but is not currently running.
Active: The instructions of the task are now being executed by the processor.
Blocked: The execution of the task is suspended while waiting for a signal, or for a
resource to become available.
Possible transitions between the states of a process:
[Figure: state-transition diagram; surviving labels: Blocked, Interrupted.]
The scheduler
The scheduler is the kernel component responsible for managing the state of the
processes, i.e., for assigning the processor to the processes.
Principles:
The scheduler always assigns the processor to the non-dormant and non-blocked task
that has the highest priority.
If several tasks share the highest priority, then the conflict can be solved in several ways:
The time slicing approach consists in assigning the processor in turn to each of these
tasks, in order to execute a bounded sequence of instructions.
One can alternatively assign the processor to a task that has been arbitrarily chosen.
Note: With the first two strategies, computing the deadline of a task becomes difficult. Most
real-time operating systems thus implement the third solution.
Preemption
If a task T2 has a higher priority than the active task T1 and switches from the blocked to
the executable state, then there are two possible scheduling strategies:
The task T2 remains suspended (in executable state) until completion of T1. The
scheduler is said to be non-preemptive.
[Figure: timeline of T1, the interrupt routine and T2 with a non-preemptive scheduler.]
Drawback: The latency of a task is influenced by the behavior of tasks with a lower
priority.
The scheduler makes the task T1 executable, and assigns the processor to T2. The
scheduler is said to be preemptive.
[Figure: timeline of T1, the interrupt routine and T2 with a preemptive scheduler; the resource expected by T2 becomes available, then preemption takes place.]
Context switching
The scheduler performs a context switch when it transfers the processor from one process to
another.
Principles:
The suspended task must be able to resume its execution later. The state of the
processor thus has to be saved when the task is suspended.
The kernel memory maintains for each non-dormant process a context storage area
for this purpose.
Illustration: [Figure: timeline (t) of the kernel switching the processor between T1 and T2, each with its own context storage area.]
The working data of the suspended task has to be preserved until its execution can be
resumed.
This data is located on the runtime stack of the task, which contains
the context (return address, stack register values) of the active function calls, and
the arguments and local variables of these function calls.
Example:
...
f(int a, int b)
{
    int c;
    ...
    c = g(a);
    ...
}
g(int d)
{
    int e, f;
    ...
    e = g(f);
    ...
}
Runtime stack during the nested calls (from base to top): f call context; a, b; c; g call context; d; e, f; g call context; d; e, f. SP points to the top of the stack.
Notes:
Since a task can be suspended at any time, it is necessary for each process to
manage its own stack.
In general the stack pointers (e.g., top of stack, base of current stack frame) are
particular processor registers. Those pointers are therefore saved, together with
the other registers, during a context switch.
The kernel also manages its own stack.
Reentrancy
With a preemptive scheduler, calling the same function in different tasks can be
problematic.
Example:
y2 3 y1 2
Definition: A function is said to be reentrant if it can be simultaneously called by several
tasks without possibility of conflict.
Examples:
Reentrant function:
void swap(int *p1, int *p2)
{
int aux;
aux = *p1;
*p1 = *p2;
*p2 = aux;
}
Non-reentrant function:
void display(int v)
{
if (is_new)
{
printf(" %d", v);
is_new = 0;
}
else
printf(" ---");
}
Note: The second function is non-reentrant for three reasons:
The test and assignment operations over the global variable is_new are performed
by different instructions.
The operations involving is_new are not necessarily atomic.
The function printf might not be reentrant.
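One way of removing the first two causes is to let each caller own the state. The sketch below is only an illustration: format_value is a hypothetical name, the flag is passed by pointer and is assumed to be private to the calling task, and the text is written to a caller-supplied buffer instead of a shared output stream. It remains reentrant only if snprintf itself is reentrant in the C library being used.

```c
#include <assert.h>
#include <stdio.h>

/* Reentrant variant of display(): no global variable is read or written.
   The flag and the output buffer both belong to the caller. */
static void format_value(int *is_new, int v, char *out, size_t size)
{
    assert(is_new != NULL && out != NULL && size > 0);
    if (*is_new) {
        snprintf(out, size, " %d", v);
        *is_new = 0;
    } else
        snprintf(out, size, " ---");
}
```

Two tasks calling format_value with their own flag and buffer cannot interfere with each other, even if one of them is preempted between the test and the assignment.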
Communication between tasks
Organizing data transfers between processes is more difficult than between tasks and
interrupt routines:
Context switches can only be disabled in software, by modifying the scheduling policy.
The semaphores
wait(s):
signal(s):
if at least one task is suspended as the result of an operation wait(s), make one
of them become executable;
otherwise, v(s) ← v(s) + 1.
Notes:
The operations that test and modify the value of a semaphore must be implemented
atomically.
Binary semaphores are semaphores with a value restricted to the set {0, 1}.
There are several possible strategies for selecting a task blocked on a semaphore in
order to make it executable again: arbitrary choice, FIFO policy, priorities, . . .
Example: Mutual exclusion between two tasks (binary semaphore s initialized to 1).
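The bookkeeping behind the two operations can be modelled without a scheduler. The sketch below assumes the usual definition of wait(s) (decrement v(s) if it is positive, otherwise suspend the caller). The names semaphore, sem_wait and sem_signal are hypothetical. The return values only tell what a kernel would have to do next, since this model does not suspend or resume any task.

```c
#include <assert.h>

typedef struct {
    int value;        /* v(s) */
    int blocked;      /* number of tasks suspended by wait(s) */
} semaphore;

/* wait(s): returns 1 if the caller may proceed, 0 if it has to be
   suspended. A kernel would run this with interrupts disabled. */
static int sem_wait(semaphore *s)
{
    if (s->value > 0) {
        s->value--;
        return 1;
    }
    s->blocked++;
    return 0;
}

/* signal(s): returns 1 if one suspended task must be made executable,
   0 if the value was incremented instead. */
static int sem_signal(semaphore *s)
{
    if (s->blocked > 0) {
        s->blocked--;
        return 1;
    }
    s->value++;
    return 0;
}
```

For mutual exclusion between two tasks, with s initialized to 1, each task calls sem_wait before its critical section and sem_signal after it: the second task to call sem_wait is suspended until the first one calls sem_signal.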
The message queues
Principles:
The maximum capacity of a queue (i.e., the maximum number of messages that have
been written and not yet read) and the size of each message are fixed.
A task that is waiting to receive data from a queue is suspended by the scheduler (in
blocked state).
Variants:
Several data access policies are possible: FIFO order, arbitrary selection, priority
mechanism.
Sending data to a saturated message queue can either discard the new message,
block the sender, block the sender during a bounded amount of time, . . .
When a task is blocked waiting for data from an empty queue, a maximum suspension
delay (i.e., a timeout) can be specified.
Programming with interrupts
The scheduler and the interrupt mechanism are both able to move the control point from
one location in the program code to another. One must take care of avoiding conflicts
between those mechanisms.
First rule:
Indeed, if this rule is not respected, then an interrupt routine can get suspended, which
amounts to assigning to this interrupt routine an effective priority smaller than the one
of a task.
Example: [Figure: timeline of T1, T2, T3 and the interrupt routine, with the events: T1 is suspended, T1 becomes active, T1 is resumed, end of interrupt.]
Moreover, the interrupt routine might get called again before its completion. If this
routine is not reentrant, then erroneous behaviors are possible (e.g., overwriting a
saved processor context).
[Figure: timeline (t) of T1, T2 and the interrupt routine, with the events: T1 is suspended, reentrant call, end of interrupt (twice), T1 is resumed.]
Second rule:
If an interrupt routine calls an OS service that can lead to a context switch, then
the scheduler must be informed that this service call is performed inside an
interrupt routine.
If this rule is not respected, then the scheduler can suspend the execution of an interrupt
routine.
Example: [Figure: timeline of T1, T2 and the interrupt routine, with the events: T1 is preempted, end of interrupt.]
Solution: At the beginning and at the end of each interrupt routine programmed by the
user, special OS services must be called in order to inform the kernel that the processor is
currently executing an interrupt routine.
Notes:
In the case of many levels (i.e., priorities) of interrupts, those services must handle
correctly nested interrupt routine calls.
Interrupt latencies are increased by the time needed for executing the notification
services.
Example: [Figure: timeline (t) of T1, T2, the kernel and the interrupt routine, with the events: interrupt request, service call, possible preemption, end of interrupt, context switch, T1 is preempted.]
Note: An interrupt routine containing explicit instructions for saving the processor state
must perform the corresponding restore operations before informing the kernel that the
interrupt routine is about to end.
Time-oriented services
The real-time operating systems offer timed services, for instance for suspending a task for
a predefined amount of time.
Principles:
Note: The precision is limited. Asking to suspend the task for k ticks only ensures that
the suspension delay is greater than or equal to k − 1 times the clock period.
Example: [Figure: timeline of the periodic interrupt, the higher-priority tasks and a timed task calling delay(1) twice.]
Chapter 6
Overview of the main problems
The following operations need to be efficient (i.e., ideally, to have a maximum execution
time that is independent from the number of tasks managed by the operating system):
Identifying the executable process with the highest priority in order to make it
active.
Performing a context switch.
Selecting the process that has to be unblocked following an operation over a
communication object.
For some applications, the real-time operating system has to share the processor with
another operating system.
Task control blocks
A Task Control Block (TCB) is a data structure that represents a non-dormant process
inside the kernel memory. This structure contains:
Note: When the task priorities are fixed and unique, they can also be used as process
identifiers.
The context of the task, i.e., the state of the processor, saved when the task was last
suspended.
Notes:
This context contains, in particular, a pointer to the runtime stack of the process.
Some operating systems (e.g., µC/OS-III) save the bulk of the context on this stack.
Information for managing a potential timeout.
Pointers linking the TCB to the global data structures of the kernel.
Auxiliary data aimed at speeding up some operations (e.g., values derived from the
task priority).
Additional information managed by the user (e.g., configuration data for a peripheral
controlled by the task).
Global data structures of the kernel
Those sets are organized as simply or doubly-linked lists, or by hash tables, in order to
be able to manipulate them in constant time.
A structure for identifying efficiently the executable process with the highest priority.
An index for accessing directly a task control block from its corresponding process
identifier.
Example: µC/OS-III
an array OSPrioTb[] of bit fields (with a width suited for the processor
architecture). Each set bit corresponds to a priority for which there exists at least
one executable (or active) process.
an array OSRdyList[] associating to each priority level a doubly-linked list of TCB
of executable processes.
Note: µC/OS-III allows several processes to share the same priority. Such
processes are then scheduled by time slicing.
A pointer OSTCBCurPtr to the TCB of the currently active task.
The set of processes suspended for some delay is represented by a hash table
OSCfg_TickWheel[], indexed by their deadline.
Keeping an index of all non-dormant tasks is not necessary, because process
identifiers are defined as pointers to the corresponding TCB.
Managing a list of free TCB is avoided by letting the user code allocate TCB when
tasks are created.
A global counter OSIntNestingCtr keeps track of the current interrupt nesting level.
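The role of the array of bit fields can be sketched as follows. This is a simplified model, not the kernel's actual code: prio_tbl stands in for OSPrioTb[], the function names are hypothetical, priority p is mapped to bit p % 32 of word p / 32 (an assumed bit order), and a smaller number denotes a higher priority.

```c
#include <assert.h>
#include <stdint.h>

#define PRIO_WORDS 2                       /* 64 priority levels, 32-bit words */

static uint32_t prio_tbl[PRIO_WORDS];      /* stands in for OSPrioTb[] */

/* Mark a priority level as having at least one executable process. */
static void prio_insert(unsigned prio)
{
    assert(prio < 32 * PRIO_WORDS);
    prio_tbl[prio / 32] |= UINT32_C(1) << (prio % 32);
}

/* Mark a priority level as having no executable process left. */
static void prio_remove(unsigned prio)
{
    assert(prio < 32 * PRIO_WORDS);
    prio_tbl[prio / 32] &= ~(UINT32_C(1) << (prio % 32));
}

/* Returns the smallest priority number whose bit is set, or -1 if none.
   The cost depends on the table size only, not on the number of tasks. */
static int prio_get_highest(void)
{
    for (unsigned w = 0; w < PRIO_WORDS; w++) {
        uint32_t bits = prio_tbl[w];
        if (bits != 0) {
            unsigned b = 0;
            while ((bits & 1) == 0) {      /* scan for the lowest set bit */
                bits >>= 1;
                b++;
            }
            return (int)(w * 32 + b);
        }
    }
    return -1;
}
```

On processors that provide a bit-scan or count-leading-zeros instruction, the inner loop is typically replaced by that single instruction.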
Illustration (32-bit processor)
[Figure: OSPrioTb[] shown as two 32-bit words (bits 31 to 0) with one bit set in each, for priorities 14 and 42; OSRdyList[] indexed by priority (0, 1, 2, ..., 14, ..., 42), with two TCBs of priority 14 and one TCB of priority 42; OSTCBCurPtr points to the TCB of the active task.]
Note: Identifying quickly the executable process with the highest priority is achieved by
The scheduler
The scheduler is implemented by a kernel function called after each operation that can
potentially influence the state of processes:
This function must be kept simple and efficient, and only performs the following operations:
Note: The possibility of enabling or disabling preemption is offered because
preempting the current task should be prevented inside interrupt routines (cf.
Chapter 5).
increases the latency of the tasks (by the duration of the longest interval during
which preemption is disabled), and
affects all the tasks of the system (not only those that need to be coordinated).
Context switching
The main issue for implementing context switching is to be able to save and restore all the
processor registers.
For many processors, these operations are automatically performed (totally or in part)
during interrupts:
When an interrupt routine is called: The current value of the registers is saved on the
runtime stack of the interrupted task.
When an interrupt routine returns: The values extracted from the current stack are
loaded into the processor registers.
The operations performed by this interrupt routine thus amount to
1. Saving the stack pointer of the suspended task into its associated TCB.
2. Loading the stack pointer of the task that becomes active from its associated TCB.
Notes:
This approach avoids the need to store the entire state of the processor into a TCB.
Preserving the state of the processor between a kernel service call and the
subsequent context switch can be tricky.
The user can sometimes define a hook function that will be called at every context
switch (a typical application is to put a peripheral in sleep mode).
Task creation and destruction
Notes:
The runtime stack of a new process is allocated by the task that asks this process to
be created.
The initial processor context of a new task, including its entry point, is built when its
stack is initialized.
One must take care of removing references to a task that is destroyed from the
structures managing communication objects.
The idle task(s)
Some operating systems systematically create one or many internal tasks, with a lower
priority than the processes instantiated by the user.
The scheduler can be more efficient, since it does not have to check whether there
exists at least one executable task.
In the case of a mobile system, an idle task can put the processor and some
peripherals in sleep mode in order to conserve energy.
Time management
One computes the set of suspended tasks that must become executable again.
Notes:
The maximum execution time of the clock interrupt routine depends on the mechanism
used for waking up tasks.
With the help of suitable data structures, this time can become proportional to the number of
tasks becoming executable.
It is often necessary to configure and calibrate the clock interrupt source during
initialization.
119
Example: µC/OS-III
The time management operations are not directly performed in the clock interrupt
routine, but in an internal task OS_TickTask() woken up by this routine.
Advantage: Some user-defined tasks can have a higher priority than the time
management operations.
The deadline of each task suspended with a timeout is expressed with respect to a global
clock counter.
A hash table OSCfg_TickWheel[] stores pointers to the TCBs of those tasks, indexed
by their deadline. Tasks sharing the same table entry are sorted in increasing
deadline order.
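The idea of a table indexed by deadline can be sketched as follows. This is not the µC/OS-III code: the class `TickWheel`, its methods and the table size are hypothetical, and tasks are represented by plain strings.

```python
import bisect


class TickWheel:
    """Hypothetical tick wheel: entry (deadline mod size) holds a sorted list."""

    def __init__(self, size=8):
        self.size = size
        self.now = 0                                  # global clock counter
        self.spokes = [[] for _ in range(size)]

    def delay(self, task, ticks):
        """Suspend `task` until `ticks` clock ticks have elapsed (ticks >= 1)."""
        if ticks < 1:
            raise ValueError("ticks must be at least 1")
        deadline = self.now + ticks
        # Tasks sharing an entry are kept in increasing deadline order.
        bisect.insort(self.spokes[deadline % self.size], (deadline, task))

    def tick(self):
        """Advance the clock by one tick and return the tasks to wake up."""
        self.now += 1
        spoke = self.spokes[self.now % self.size]
        woken = []
        # Only the head of one entry is inspected; later deadlines stay queued.
        while spoke and spoke[0][0] == self.now:
            woken.append(spoke.pop(0)[1])
        return woken


wheel = TickWheel(size=4)
wheel.delay("A", 2)
wheel.delay("B", 6)      # shares entry 2 with A and C
wheel.delay("C", 2)
history = [wheel.tick() for _ in range(6)]
```

With this layout, each tick only touches the tasks whose deadline has actually been reached, plus one comparison with the next queued deadline.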
120
Communication and synchronization objects
Data representing the state of the object (e.g., for a semaphore, an integer counter).
Finally, the TCB of each task waiting for an object contains a pointer to the structure
managing this object.
Note: Implementing kernel services that update the state of objects does not require
specific instructions such as test and set, since it is sufficient to disable interrupts during
non-atomic operations.
121
Combining a real-time and a non-real-time operating system
The operations of the host operating system are suspended when the real-time OS is
started, and get the processor back when the RTOS terminates (e.g., µC/OS-III).
The host operating system is seen as a special task that has a lower priority than all the
tasks managed by the real-time OS (e.g., RTAI).
For implementing this approach, it is necessary to ensure that the host OS can never
disable the interrupts managed by the kernel or the real-time tasks.
122
Chapter 7
Scheduling problems
123
Priority inversion
Priority inversion happens when a task is suspended waiting for a resource controlled by
another task with a lower priority.
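Example:
[Figure: timeline of the tasks T1, T2 and T3. Events labeled in the figure: T1 executes wait(s); T1 is preempted; T2 is preempted; T3 executes wait(s) and is suspended; T2 terminates; T1 executes signal(s); T3 is resumed.]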
124
Problem:
Solution:
The priority of T1 can be momentarily increased (becoming equal to that of T3) during all
the time that T3 is suspended waiting for the semaphore acquired by T1.
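This priority increase can be sketched with a small model. The classes `Task` and `InheritingSemaphore` are hypothetical, the semaphore is assumed to be binary (a single owner), and a larger number is assumed to mean a higher priority.

```python
class Task:
    def __init__(self, name, priority):
        self.name = name
        self.base_priority = priority   # priority assigned by the user
        self.priority = priority        # current (possibly inherited) priority


class InheritingSemaphore:
    """Hypothetical binary semaphore with priority inheritance."""

    def __init__(self):
        self.owner = None
        self.waiters = []

    def wait(self, task):
        """Return True if `task` acquires the semaphore, False if it is suspended."""
        if self.owner is None:
            self.owner = task
            return True
        self.waiters.append(task)
        # The owner inherits the priority of a higher-priority waiting task.
        if task.priority > self.owner.priority:
            self.owner.priority = task.priority
        return False

    def signal(self):
        """Release the semaphore and return the task that acquires it next, if any."""
        self.owner.priority = self.owner.base_priority   # drop the inherited priority
        if self.waiters:
            self.owner = max(self.waiters, key=lambda t: t.priority)
            self.waiters.remove(self.owner)
        else:
            self.owner = None
        return self.owner


t1, t3 = Task("T1", 1), Task("T3", 3)
s = InheritingSemaphore()
acquired_by_t1 = s.wait(t1)        # T1 acquires s
acquired_by_t3 = s.wait(t3)        # T3 is suspended; T1 now runs with priority 3
boosted = t1.priority
next_owner = s.signal()            # T1 releases s and returns to priority 1
```

While T1 holds the inherited priority, a task of intermediate priority such as T2 can no longer preempt it, which bounds the time during which T3 stays suspended.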
125
Illustration:
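[Figure: timeline of the tasks T1, T2 and T3 with the solution applied: T1 executes wait(s), T3 executes wait(s), and T1 runs with priority 3 until it executes signal(s).]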
126
Periodic tasks scheduling
The execution requests for each task occur periodically, i.e., with a constant delay
between two successive requests.
In particular, the timing of execution requests for a task cannot depend on operations
performed by other tasks.
127
The following real-time constraint must be satisfied:
Each execution of a task must finish before or at the same time as the next
request for executing this task.
128
Critical instants and critical zones
Definitions:
The response time of an execution request for $\tau_i$ is the delay between this request and
the end of the corresponding execution of this task.
A critical instant for the task $\tau_i$ is an occurrence of an execution request for $\tau_i$ that
leads to the largest possible response time for this task.
A critical zone for $\tau_i$ is an interval of duration $T_i$ that starts at a critical instant (for $\tau_i$).
129
Theorem 1: A critical instant for $\tau_i$ occurs when an execution request for this task coincides
with requests for executing all the tasks that have a higher priority than $\tau_i$.
Proof: Assume that an execution request for $\tau_i$ occurs at $t = t_1$, and that an execution
request for a higher-priority task $\tau_j$ is received at $t = t_2$.
[Figure: time axis marking $t_1$, $t_2$, $t_2 + C_j$, $t_2 + T_j$ and $t_1 + T_i$, with an interval of duration $T_i$.]
Advancing the request for $\tau_j$ from $t_2$ to $t_1$ can never decrease the response time of $\tau_i$.
The same reasoning can be applied to all the tasks that have a higher priority than $\tau_i$.
130
Schedulable tasks
[Figure: timeline of $\tau_2$ over $t = 0, \ldots, 5$, with the critical zone marked.]
131
The tasks are schedulable, and remain schedulable even if the execution time of $\tau_2$ is
increased by one time unit ($C_2 = 2$):
[Figures: timelines of $\tau_1$ and $\tau_2$ over $t = 0, \ldots, 5$, with the critical zones marked.]
Note: In this case, the execution times of $\tau_1$ and $\tau_2$ cannot be increased any further.
132
Rate-Monotonic Scheduling
In the previous example, the best strategy was to assign the highest priority to the task that
has the smallest period.
$T_i < T_j \Rightarrow P_i > P_j$.
Theorem 2: If a set of tasks is schedulable with respect to some priority assignment, then
it is schedulable as well with respect to the priorities defined by the RMS strategy.
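The RMS rule ($T_i < T_j \Rightarrow P_i > P_j$) can be sketched as a short function. The name `rms_priorities` is hypothetical, priorities are assumed to be integers where a larger value means a higher priority, and tasks with equal periods are ordered arbitrarily (here by their position in the list).

```python
def rms_priorities(periods):
    """Return one priority per task: the shorter the period, the higher the priority."""
    # Indices sorted from the longest period to the shortest one.
    order = sorted(range(len(periods)), key=lambda i: periods[i], reverse=True)
    priorities = [0] * len(periods)
    for rank, index in enumerate(order, start=1):
        priorities[index] = rank      # longest period gets 1, shortest gets n
    return priorities


example = rms_priorities([7, 5, 20])
```

In this example, the task with period 5 receives the highest priority and the task with period 20 the lowest.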
133
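Proof: Consider a set of tasks $\tau_1, \tau_2, \ldots, \tau_n$ for which there exists a priority assignment
$P_1, P_2, \ldots, P_n$ that makes them schedulable.
Let $\tau_i$ and $\tau_j$ be two tasks with adjacent priorities $P_i$ and $P_j$, such that $P_i > P_j$.
[Figure: timelines of $\tau_j$ and $\tau_i$, with the time axis marking 0, $T_j$ and $T_i$.]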
The processor load factor U corresponding to this set of tasks represents the relative
amount of CPU time needed for executing them:
$$U = \sum_{i=1}^{n} \frac{C_i}{T_i}.$$
any increase of the execution time of a task (and hence of the processor load factor)
yields a set of tasks that is not schedulable anymore.
135
Notes:
A set of tasks that has a processor load factor less than 1 is not necessarily
schedulable:
Example:
$\tau_1$: $T_1 = 5$, $C_1 = 2$
$\tau_2$: $T_2 = 7$, $C_2 = 4$
$$U = \frac{2}{5} + \frac{4}{7} \approx 97\%$$
[Figure: timelines of $\tau_1$ and $\tau_2$ over $t = 0, \ldots, 7$.]
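The example can be checked by simulating the schedule from a critical instant. The sketch below assumes integer periods and execution times, preemptive scheduling with RMS priorities, and a deadline equal to the period; the function names are hypothetical. Starting all requests at $t = 0$ relies on Theorem 1.

```python
from fractions import Fraction


def load_factor(tasks):
    """Processor load factor U for a list of (period, execution_time) pairs."""
    return sum(Fraction(cost, period) for period, cost in tasks)


def rms_schedulable(tasks):
    """Simulate one critical zone of the lowest-priority task, one time unit at a time."""
    tasks = sorted(tasks)                  # shortest period first: RMS priority order
    horizon = tasks[-1][0]                 # longest period
    remaining = [0] * len(tasks)           # execution time still owed to each task
    for now in range(horizon + 1):
        for i, (period, cost) in enumerate(tasks):
            if now % period == 0:          # new request, and deadline of the previous one
                if remaining[i] > 0:
                    return False           # previous execution did not finish in time
                remaining[i] = cost
        for i, left in enumerate(remaining):
            if left > 0:                   # run the highest-priority ready task
                remaining[i] -= 1
                break
    return True


example = [(5, 2), (7, 4)]
u = load_factor(example)                   # Fraction(34, 35), about 97%
ok = rms_schedulable(example)              # False: tau_2 gets only 3 units before t = 7
```

Reducing $C_2$ to 3 makes the same simulation succeed, which illustrates that the load factor alone does not decide schedulability.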
136
Classifying sets of tasks
The schedulable sets of tasks that do not fully use the processor.
[Figure: axis of processor load factors from 0% to 100%, with a region labeled "Non schedulable sets of tasks".]
137
The best lower bound $U_L$ on the processor load factor of the sets of tasks that fully use the
processor is such that:
If the processor load factor of a set of tasks is less than or equal to $U_L$, then this set of
tasks is schedulable (regardless of the periods and execution times of the tasks!).
If the processor load factor of a set of tasks is greater than $U_L$, then this set of tasks
may or may not be schedulable, depending on the details of the tasks.
138
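$U_L$: Case of two tasks
Let $\tau_1$ and $\tau_2$ be two tasks with respective periods and execution times $T_1$, $T_2$ and $C_1$, $C_2$.
We assume $T_1 < T_2$. According to the RMS strategy, we assign a higher priority to $\tau_1$.
During a critical zone of $\tau_2$, the number of execution requests for $\tau_1$ is equal to $\left\lceil \frac{T_2}{T_1} \right\rceil$.
[Figure: timelines of $\tau_1$ and $\tau_2$ over a critical zone of $\tau_2$, marking 0, $T_1$, $T_2$ and the execution time $C_2$.]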
139
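For a given value of $C_1$, the largest possible value of $C_2$ is given by
$$C_2 = T_2 - C_1 \left\lceil \frac{T_2}{T_1} \right\rceil.$$
[Figure: axis of $C_1$ values marking 0 and $T_2 - T_1 \left\lfloor \frac{T_2}{T_1} \right\rfloor$.]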
140
If an execution of $\tau_1$ is still unfinished at $t = T_2$.
[Figure: timelines of $\tau_1$ and $\tau_2$ marking 0, $T_1$, $T_2$ and the execution times $C_1$ and $C_2$.]
141
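For given values of $T_1$ and $T_2$, this expression increases with $C_1$, since
$$\frac{1}{T_1} - \frac{1}{T_2} \left\lfloor \frac{T_2}{T_1} \right\rfloor \ge 0.$$
[Figure: plot of $U$ as a function of $C_1$, labeled $\frac{T_1}{T_2} \left\lfloor \frac{T_2}{T_1} \right\rfloor + C_1 \left( \frac{1}{T_1} - \frac{1}{T_2} \left\lfloor \frac{T_2}{T_1} \right\rfloor \right)$, with the abscissas 0, $T_2 - T_1 \left\lfloor \frac{T_2}{T_1} \right\rfloor$ and $T_1$.]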
142
Summary:
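[Figure: plot of $U$ as a function of $C_1$ covering both cases, with the abscissas 0, $T_2 - T_1 \left\lfloor \frac{T_2}{T_1} \right\rfloor$ and $T_1$.]
The smallest value of $U$ corresponds to the boundary between the two cases, where we
have
$$C_1 = T_2 - T_1 \left\lfloor \frac{T_2}{T_1} \right\rfloor.$$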
T1
143
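Let us define $I = \left\lfloor \frac{T_2}{T_1} \right\rfloor$ and $f = \frac{T_2}{T_1} - \left\lfloor \frac{T_2}{T_1} \right\rfloor$.
$$U = \frac{I}{I+f} + (I+f) - 2I + \frac{I^2}{I+f} = 1 - f\,\frac{1-f}{I+f}.$$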
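The smallest value that this expression can take over all $I$ and $f$ can be worked out as follows. This is a sketch of the missing step, assuming $I \ge 1$ (which follows from $T_1 < T_2$) and treating $f \in [0, 1)$ as a continuous variable.

```latex
\begin{align*}
U(I, f) &= 1 - \frac{f(1-f)}{I+f}
  && \text{increases with } I \text{, so the minimum is reached for } I = 1, \\
U(1, f) &= \frac{1 + f^2}{1 + f}, \\
\frac{dU(1, f)}{df} &= \frac{f^2 + 2f - 1}{(1+f)^2} = 0
  && \Longrightarrow\ f = \sqrt{2} - 1, \\
U(1, \sqrt{2} - 1) &= \frac{4 - 2\sqrt{2}}{\sqrt{2}} = 2(\sqrt{2} - 1) \approx 0.83 .
\end{align*}
```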
144
Case of two tasks: Conclusions
Theorem 3: If a set of two periodic tasks has a processor load factor that is less than or
equal to $2(\sqrt{2} - 1)$, then this set of tasks is schedulable.
Notes:
This sufficient criterion is independent of the periods and execution times of the
tasks.
$U_L = 1$.
All pairs of tasks satisfying this condition (and such that $\frac{C_1}{T_1} + \frac{C_2}{T_2} \le 1$!) are thus
schedulable.
145
$U_L$: Case of n tasks
Lemma 1: Let $\tau_1, \tau_2, \ldots, \tau_n$ be periodic tasks with the respective periods and execution
times $T_1, T_2, \ldots, T_n$ and $C_1, C_2, \ldots, C_n$, such that
The processor load factor of this set of tasks is minimum among all sets of tasks that
fully use the processor.
146
In this case, one has
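$$\begin{aligned}
C_1 &= T_2 - T_1, \\
C_2 &= T_3 - T_2, \\
&\ \ \vdots \\
C_{n-1} &= T_n - T_{n-1}, \\
C_n &= T_n - 2(C_1 + C_2 + \cdots + C_{n-1}) \\
    &= 2T_1 - T_n.
\end{aligned}$$
[Figure: time axis marking 0, $T_1$, $T_2$, $T_3$, ..., $T_{n-1}$ and $T_n$, showing the executions $C_1, C_2, \ldots, C_{n-1}, C_n$ followed by $C_1, C_2, \ldots, C_{n-1}$.]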
147
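Proof: By contradiction, let us show that we must have $C_1 = T_2 - T_1$.
If $C_1 = T_2 - T_1 + \Delta$, with $\Delta > 0$.
$$\begin{aligned}
C_1' &= C_1 - \Delta, \\
C_2' &= C_2 + \Delta, \\
C_3' &= C_3, \\
&\ \ \vdots \\
C_{n-1}' &= C_{n-1}, \\
C_n' &= C_n.
\end{aligned}$$
[Figure: time axis marking 0, $T_1$, $T_2$ and $T_2 + \Delta$, comparing the executions $C_1$, $C_2$ with $C_1'$, $C_2'$.]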
148
After the modification, the new set of tasks still fully uses the processor. However, the
processor load factor now becomes
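$$U' = U - \frac{\Delta}{T_1} + \frac{\Delta}{T_2} < U,$$
which contradicts the hypothesis that $U$ is minimum.
If $C_1 = T_2 - T_1 - \Delta$, with $\Delta > 0$.
$$\begin{aligned}
C_1'' &= C_1 + \Delta, \\
C_2'' &= C_2, \\
C_3'' &= C_3, \\
&\ \ \vdots \\
C_{n-1}'' &= C_{n-1}, \\
C_n'' &= C_n - 2\Delta.
\end{aligned}$$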
149
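[Figure: time axis marking 0, $T_1$ and $T_2$, comparing the original executions with $C_1''$ and $C_2''$.]
The resulting set of tasks fully uses the processor. The processor load factor becomes
$$U' = U + \frac{\Delta}{T_1} - \frac{2\Delta}{T_n}.$$
$$\begin{aligned}
C_2 &= T_3 - T_2, \\
C_3 &= T_4 - T_3, \\
&\ \ \vdots \\
C_{n-1} &= T_n - T_{n-1}.
\end{aligned}$$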
150
Since the processor is fully used, one finally gets
$$C_n = T_n - 2(C_1 + C_2 + \cdots + C_{n-1}).$$
Corollary: For each set of tasks that satisfies the hypotheses of Lemma 1, the processor
load factor is equal to
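$$\begin{aligned}
U &= \frac{T_2 - T_1}{T_1} + \frac{T_3 - T_2}{T_2} + \cdots + \frac{T_n - T_{n-1}}{T_{n-1}} + \frac{2T_1 - T_n}{T_n} \\
  &= \frac{T_2}{T_1} + \frac{T_3}{T_2} + \cdots + \frac{T_n}{T_{n-1}} + 2\,\frac{T_1}{T_n} - n.
\end{aligned}$$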
151
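For each $i = 1, 2, \ldots, n-1$, let us define $q_i = \frac{T_{i+1}}{T_i}$. We then have
$$U = q_1 + q_2 + \cdots + q_{n-1} + \frac{2}{q_1 q_2 \cdots q_{n-1}} - n,$$
and thus for each $i$,
$$\frac{\partial U}{\partial q_i} = 1 - 2\,\frac{q_1 q_2 \cdots q_{i-1} q_{i+1} \cdots q_{n-1}}{(q_1 q_2 \cdots q_{n-1})^2}.$$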
152
For each $i$, one has
$$q_i = \frac{2}{q_1 q_2 \cdots q_{n-1}},$$
hence
$$q_1 = q_2 = \cdots = q_{n-1} = 2^{1/n}.$$
Theorem 5: If a set of $n$ periodic tasks has a processor load factor that is less than or equal
to $n(2^{1/n} - 1)$, then this set of tasks is schedulable.
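The bound $n(2^{1/n} - 1)$ is easy to evaluate numerically. The sketch below uses hypothetical function names and represents tasks as (period, execution time) pairs; it only implements the sufficient criterion, so a `False` result does not mean that the tasks are not schedulable.

```python
import math


def rms_bound(n):
    """Load factor bound n(2^(1/n) - 1) for n periodic tasks."""
    return n * (2 ** (1 / n) - 1)


def guaranteed_schedulable(tasks):
    """Sufficient test only: True means schedulable, False means 'not decided'."""
    load = sum(cost / period for period, cost in tasks)
    return load <= rms_bound(len(tasks))


bounds = [round(rms_bound(n), 3) for n in (1, 2, 3, 10)]
light = guaranteed_schedulable([(5, 1), (7, 2)])     # U is about 0.49
heavy = guaranteed_schedulable([(5, 2), (7, 4)])     # U is about 0.97
```

The bound decreases as the number of tasks grows and stays above $\ln 2$.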
We replace $\tau_i$ by $\tau_i'$, with the period $T_i' = qT_i$ and the execution time $C_i' = C_i$.
We replace $\tau_n$ by $\tau_n'$, with the period $T_n' = T_n$ and an execution time $C_n'$ chosen so as to
fully use the processor.
154
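[Figure: time axis marking 0, $T_i$, $2T_i$, ..., $(q-1)T_i$, $qT_i$ and $T_n$, showing the executions of $\tau_i$ (each of duration $C_i$) and the execution of $\tau_i'$, with $C_i' = C_i$ and period $T_i'$.]
In the critical zone of $\tau_n$, the amount of execution time used by $\tau_i$ and left unused by $\tau_i'$
is at most equal to $(q-1)C_i$. Therefore, one has
$$C_n' - C_n \le (q-1)C_i.$$
After modifying the set of tasks, the processor load factor $U'$ satisfies
$$U' \le U + \frac{C_i'}{T_i'} - \frac{C_i}{T_i} + \frac{(q-1)C_i}{T_n},$$
where $U$ is the processor load factor of the initial set of tasks.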
155
Since we have $qT_i \le T_n$, this leads to
$$\frac{1}{qT_i} - \frac{1}{T_i} + \frac{q-1}{T_n} \le \frac{1}{qT_i} - \frac{1}{T_i} + \frac{q-1}{qT_i} = 0.$$
As a consequence, we have $U' \le U$. This implies that our modification of the set of tasks
did not increase the processor load factor.
156
The limit processor load factor
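For all $x > 0$, we have $e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$, hence
$$\begin{aligned}
(1-x)e^x &= (1-x) + (1-x)x + (1-x)\frac{x^2}{2!} + (1-x)\frac{x^3}{3!} + \cdots \\
         &= 1 - \left(1 - \frac{1}{2!}\right)x^2 - \left(\frac{1}{2!} - \frac{1}{3!}\right)x^3 - \left(\frac{1}{3!} - \frac{1}{4!}\right)x^4 - \cdots \\
         &< 1.
\end{aligned}$$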
157
For an asymptotically large number of tasks, we obtain
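$$\begin{aligned}
\lim_{n \to \infty} U_L(n) &= \lim_{n \to \infty} n(2^{1/n} - 1) \\
  &= \lim_{n \to \infty} \frac{2^{1/n} - 1}{\frac{1}{n}} \\
  &= \lim_{n \to \infty} \frac{-\frac{\ln 2}{n^2}\, 2^{1/n}}{-\frac{1}{n^2}} \\
  &= \ln 2 \\
  &\approx 0.69
\end{aligned}$$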
158
In summary, we have the following result:
Theorem 6: If a set of periodic tasks has a processor load factor that is less than or equal
to ln 2, then this set of tasks is schedulable.
Conclusion: The following algorithm can be used for checking efficiently whether a set of n
periodic tasks with a processor load factor equal to U is schedulable or not:
159
Notes
In situations where $U \approx 69\%$ for the periodic tasks, the processor does not have to
remain unused during 31% of the time! One can instead run low-priority tasks that are
not bound by real-time constraints.
For some specific classes of sets of tasks, one can obtain $U_L = 100\%$, which guarantees
that every set of tasks for which $U \le 100\%$ is schedulable.
160
Let us show that this set of tasks is schedulable.
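The critical zone of $\tau_2$ contains $\frac{T_2}{T_1}$ complete executions of $\tau_1$:
[Figure: timeline of $\tau_1$ marking 0, $T_1$, $2T_1$, ..., $(k-1)T_1$, $kT_1$ and $T_2$.]
The condition that must be satisfied in order to finish the execution of $\tau_j$ before the end
of its critical zone is thus
$$C_j \le T_j - \frac{T_j}{T_1} C_1 - \frac{T_j}{T_2} C_2 - \cdots - \frac{T_j}{T_{j-1}} C_{j-1}.$$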
161
After simplification, this condition becomes
$$\frac{C_1}{T_1} + \frac{C_2}{T_2} + \cdots + \frac{C_j}{T_j} \le 1,$$
162
Chapter 8
163
Introduction
In order to analyze the properties of a complex system, it is not always sufficient to study
the individual behavior of its components.
Two sensors located on the tracks 1000 meters before and 100 meters after the
crossing, aimed at detecting (respectively) that a train approaches or has passed the
crossing.
A receiver that processes the signals emitted by the sensors, and sends orders to
open or close the gate.
[Figure: railroad crossing with the receiver and the two sensors, located 1000 m before and 100 m after the crossing.]
164
The following information is known:
The speed of the approaching trains is between 48 and 52 m/s. Then, after reaching
the first sensor, their speed is reduced to a value between 40 and 52 m/s.
After it receives a signal from a sensor, the receiver waits for at most 5 seconds before
sending an order to close or to open the gate. During this delay, the receiver ignores
incoming signals.
The gate is closed (resp. open) when its angle is equal to 0 (resp. 90) deg. The gate is
able to move at the rate of 20 deg/s.
Question: Is the gate always closed when a train passes the crossing?
165
Modeling a system
In order to analyze the properties of a system, the first step consists in building a model,
i.e., an abstract representation of the system that describes its relevant properties without
any ambiguity.
166
Hybrid systems
Syntax:
a finite number $n$ of variables $x_1, x_2, \ldots, x_n$, grouped together into a vector $\vec{x} \in \mathbb{R}^n$,
167
Each control location $v \in V_i$ is associated with:
An activity dif(v), expressed as a conjunction of linear constraints over the variables
$x_1, x_2, \ldots, x_n$ and their first temporal derivatives $\dot{x}_1, \dot{x}_2, \ldots, \dot{x}_n$.
An invariant inv(v), expressed as a conjunction of linear constraints over the variables
$x_1, x_2, \ldots, x_n$.
A guard guard(e), that represents a condition that must be satisfied in order to enable
this transition.
An action act(e), composed of constraints that specify how the values of the variables
are modified when this transition is followed.
In practice, the guard and the action can be combined into a conjunction of constraints
over the values of the variables before ($x_1, x_2, x_3, \ldots$) and after ($x_1', x_2', x_3', \ldots$)
following the transition.
168
An optional label sync(e) $\in L$ that makes it possible to synchronize this transition with
a transition belonging to another process.
Finally, one defines an initial control location for each process, and assigns a set of
possible initial values for each variable, specified as a conjunction of linear constraints.
169
Example: Process modeling the behavior of a train and the two sensors.
The distance between the train and the crossing is represented by a variable x1.
The signals emitted by the sensors are modeled by two synchronization labels app
and exit.
170
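[Figure: hybrid automaton of the train process; the description below is reconstructed from the figure labels.]
Initial condition: $x_1 \ge 1500$.
Location [1]: invariant $x_1 \ge 1000$; activity $-52 \le \dot{x}_1 \le -48$.
Location [2]: invariant $0 \le x_1 \le 1000$; activity $-52 \le \dot{x}_1 \le -40$.
Location [3]: invariant $x_1 \le 100$; activity $40 \le \dot{x}_1 \le 52$.
Transition from [1] to [2]: guard $x_1 = 1000$, label app.
Transition from [2] to [3]: guard $x_1 = 0$.
Transition from [3] to [1]: guard $x_1 = 100$, label exit, action $x_1' \ge 1500$.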
171
Process modeling the receiver:
The delay between receiving a sensor signal and sending an order to the gate is
represented by a variable x2.
The labels raise and lower model the orders sent to the gate.
172
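[Figure: hybrid automaton of the receiver process; the description below is reconstructed from the figure labels.]
Location [1]: initial condition $x_2 = 0$; activity $\dot{x}_2 = 0$.
Location [2]: invariant $0 \le x_2 \le 5$; activity $\dot{x}_2 = 1$; self-loops labeled app and exit.
Location [3]: invariant $0 \le x_2 \le 5$; activity $\dot{x}_2 = 1$; self-loops labeled app and exit.
Transition from [1] to [2]: label app, action $x_2' = 0$.
Transition from [1] to [3]: label exit, action $x_2' = 0$.
Transition from [2] to [1]: label lower.
Transition from [3] to [1]: label raise.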
173
Process modeling the gate:
174
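[Figure: hybrid automaton of the gate process; the description below is reconstructed from the figure labels.]
Location [1]: invariant $0 \le x_3 \le 90$; activity $\dot{x}_3 = 20$; self-loop labeled raise.
Location [2]: initial condition $x_3 = 90$; invariant $x_3 = 90$; activity $\dot{x}_3 = 0$; self-loop labeled raise.
Location [3]: invariant $0 \le x_3 \le 90$; activity $\dot{x}_3 = -20$; self-loop labeled lower.
Location [4]: invariant $x_3 = 0$; activity $\dot{x}_3 = 0$; self-loop labeled lower.
Transition from [1] to [2]: guard $x_3 = 90$.
Transition from [3] to [4]: guard $x_3 = 0$.
Transitions from [2] to [3] and from [1] to [3]: label lower.
Transitions from [4] to [1] and from [3] to [1]: label raise.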
175
Semantics:
By letting time elapse (time steps). The control locations of processes stay
unchanged, and the values of the variables evolve according to the invariants and
activities associated with these locations.
176
In both cases, a transition can only be followed provided that its guard is satisfied by
the current variable values.
When a transition is followed, the variable values are modified according to the action
associated with the transition. The invariant of the destination location must be satisfied
by the new variable values (otherwise, the transition cannot be followed).
A state s2 is reachable from a state s1 if there exists a finite sequence of time steps and
transition steps that lead from s1 to s2.
177
Example: The state ([2], [2], [2], 800, 4, 90) of the railroad crossing controller model
corresponds to
the respective values 800, 4 and 90 for the variables x1, x2 and x3.
$\stackrel{\delta}{\Longrightarrow}$ denotes a time step with a delay equal to $\delta$,
$\stackrel{\ell}{\longrightarrow}$ corresponds to following a pair of transitions sharing the synchronization label $\ell$.
178
Executions of a hybrid system
An execution of a hybrid system is an infinite sequence s1, s2, s3, . . . of states such that:
For each $i$, the state $s_{i+1}$ is reachable from the state $s_i$ in a time $\delta_i \ge 0$.
The time spent at a control location may not be precisely constrained by the invariant.
A control location can have several outgoing transitions enabled at a given time.
An execution s1, s2, s3, . . . beginning at time t = 0 is said to be divergent if for every T > 0,
there exists i such that the state si is reached later than time t = T .
179
Zeno hybrid systems
A hybrid system is said to have the Zeno property if it admits an execution in which at least
one finite prefix is not a prefix of a divergent execution.
In other words, in a Zeno hybrid system, there exists a reachable state from which no
execution is able to get past some time bound.
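[Figure: hybrid automaton with the initial values $x_1 = 10$ and $x_2 = 0$, the invariant $x_1 \ge 0$, and the activities $\dot{x}_1 = x_2$ and $\dot{x}_2 = -g$.]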
180
Remarks:
Such models are inconsistent with physical reality and must be avoided!
For some restricted classes of hybrid systems, automatic methods have been
developed for transforming any given model into another one that does not have the
Zeno property, and admits the same divergent executions.
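The way an execution can contain infinitely many transition steps within a bounded time can be illustrated numerically on a bouncing-ball reading of the example above ($x_1$ as a height, $x_2$ as a speed). This is a sketch under assumptions that do not appear in the model as extracted: at each impact the speed is assumed to be reversed and multiplied by a hypothetical restitution coefficient `c` with $0 < c < 1$, and `g` is taken as 9.81.

```python
import math


def impact_times(h0=10.0, g=9.81, c=0.5, impacts=20):
    """Times of the first `impacts` impacts of a ball dropped from height h0."""
    t = math.sqrt(2 * h0 / g)          # duration of the first fall
    v = math.sqrt(2 * g * h0)          # speed at the first impact
    times = [t]
    for _ in range(impacts - 1):
        v *= c                         # assumed action of the transition: speed scaled by c
        t += 2 * v / g                 # flight time until the next impact
        times.append(t)
    return times


def zeno_time(h0=10.0, g=9.81, c=0.5):
    """Limit of the impact times (sum of the geometric series of flight times)."""
    return math.sqrt(2 * h0 / g) * (1 + c) / (1 - c)


times = impact_times()
limit = zeno_time()
```

The impact times increase but never exceed `limit`: no execution of such a model gets past that time bound.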
181
State-space exploration
This computation can be carried out by building, from every initial state, a tree in which
each node q represents a reachable state s(q), and the children of q correspond to the
states that are reachable from s(q) by
a time step, or
a transition step.
Problems:
The time spent at a control location may take an infinite number of possible values,
which leads to trees of infinite degree.
Since executions are infinite, the trees also have an infinite depth.
182
Solutions:
Sets of states sharing the same control locations and differing only in the elapsed time
in those locations can be grouped into regions. A tree can be built in which the nodes
are associated with regions instead of individual states.
At each exploration step, a first operation saturates the current region by letting time
elapse during all possible delays. Then, the enabled transitions are individually
followed, creating new branches.
The branches of the exploration tree that only contain already visited states can be
pruned.
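The exploration procedure described above can be sketched as a worklist algorithm. Everything in this block is an assumption made for illustration: regions are represented as (location, lowest clock value, highest clock value) triples for a toy model with one clock, pruning is done by exact equality of regions (actual tools compare regions by inclusion), and the functions `saturate` and `successors` stand for the two operations of an exploration step.

```python
from collections import deque


def explore(initial_regions, saturate, successors):
    """Return the set of saturated regions reachable from the initial regions."""
    visited = set()
    worklist = deque(saturate(r) for r in initial_regions)
    while worklist:
        region = worklist.popleft()
        if region in visited:
            continue                        # branch with only visited states: pruned
        visited.add(region)
        for nxt in successors(region):      # follow each enabled transition
            worklist.append(saturate(nxt))  # then let time elapse
    return visited


# Toy model: location A has invariant x <= 2, location B has invariant x <= 3.
INVARIANT_MAX = {"A": 2, "B": 3}


def saturate(region):
    """Let time elapse for all possible delays allowed by the invariant."""
    location, low, _high = region
    return (location, low, INVARIANT_MAX[location])


def successors(region):
    """Transitions: A to B if x >= 1, B to A if x = 3; both reset x to 0."""
    location, _low, high = region
    if location == "A" and high >= 1:
        return [("B", 0, 0)]
    if location == "B" and high >= 3:
        return [("A", 0, 0)]
    return []


reachable = explore([("A", 0, 0)], saturate, successors)
```

On this toy model the exploration stops after two regions, since following the transition out of B leads back to a region that has already been visited.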
Notes:
183
For general hybrid systems, the region tree can still be infinite. It is however possible
to define restricted classes of models, for which a finite region tree can always be
computed.
Some tools are available for exploring automatically the state space of hybrid systems
(e.g., HyTech) or timed automata (e.g., Uppaal).
184
Example: Railroad crossing
185
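$\xrightarrow{x_3 = 0}$ ([2], [1], [4]) : $x_1 \ge 766 - 52\delta$, $x_1 \le 820 - 40\delta$, $x_2 = \delta$, $x_3 = 0$, $0 \le \delta \le 5$.
$\Longrightarrow$ ([2], [1], [4]) : $0 \le x_1 \le 820 - 40\delta$, $x_2 = \delta$, $x_3 = 0$, $0 \le \delta \le 5$.
$\xrightarrow{x_1 = 0}$ ([3], [1], [4]) : $x_1 = 0$, $x_2 = \delta$, $x_3 = 0$, $0 \le \delta \le 5$.
$\stackrel{5/2}{\Longrightarrow}$ ([3], [1], [4]) : $0 \le x_1 \le 100$, $x_2 = \delta$, $x_3 = 0$, $0 \le \delta \le 5$.
$\xrightarrow{\text{exit}}$ ([1], [3], [4]) : $x_1 \ge 1500$, $x_2 = 0$, $x_3 = 0$.
$\stackrel{5}{\Longrightarrow}$ ([1], [3], [4]) : $x_1 \ge 1500 - 52\delta$, $x_2 = \delta$, $x_3 = 0$, $0 \le \delta \le 5$.
$\xrightarrow{\text{raise}}$ ([1], [1], [1]) : $x_1 \ge 1500 - 52\delta$, $x_2 = \delta$, $x_3 = 0$, $0 \le \delta \le 5$.
186
$\stackrel{9/2}{\Longrightarrow}$ ([1], [1], [1]) : $x_1 \ge 1500 - 52(\delta + \delta')$, $x_2 = \delta$, $x_3 = 20\delta'$, $0 \le \delta \le 5$, $0 \le \delta' \le 9/2$.
$\xrightarrow{x_3 = 90}$ ([1], [1], [2]) : $x_1 \ge 1266 - 52\delta$, $x_2 = \delta$, $x_3 = 90$, $0 \le \delta \le 5$.
$\Longrightarrow$ ([1], [1], [2]) : $x_1 \ge 1000$, $x_2 = \delta$, $x_3 = 90$, $0 \le \delta \le 5$.
$\xrightarrow{\text{app}}$ ([2], [2], [2]) : $x_1 = 1000$, $x_2 = 0$, $x_3 = 90$ (already obtained).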
187
Notes:
In this example, the regions correspond to the sets of states obtained after each
time-step operation (denoted by $\Longrightarrow$).
Checking whether the gate is always closed when a train reaches the crossing
amounts to verifying that in each reachable region, x1 = 0 implies x3 = 0.
This particular system shows a very deterministic behavior: In each reachable state,
there is at most one transition (or a pair of synchronized transitions) that is enabled.
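The timing margin of the approach phase can be cross-checked by hand from the figures given in the problem statement. The sketch below is only such a check, not a substitute for the state-space exploration: it assumes that the gate is fully open when the first sensor is reached and that the receiver is not busy with an earlier signal. The constant and function names are hypothetical.

```python
from fractions import Fraction

SENSOR_DISTANCE_M = 1000        # first sensor to crossing
MAX_TRAIN_SPEED_M_S = 52        # fastest train after the first sensor
MAX_RECEIVER_DELAY_S = 5        # longest delay before the order is sent
GATE_OPEN_ANGLE_DEG = 90
GATE_RATE_DEG_S = 20


def latest_gate_closed_time():
    """Worst-case time between the sensor signal and the gate being closed."""
    return MAX_RECEIVER_DELAY_S + Fraction(GATE_OPEN_ANGLE_DEG, GATE_RATE_DEG_S)


def earliest_train_arrival_time():
    """Best-case time for the train to travel from the sensor to the crossing."""
    return Fraction(SENSOR_DISTANCE_M, MAX_TRAIN_SPEED_M_S)


def min_distance_when_gate_closed():
    """Smallest possible distance to the crossing when the gate finishes closing."""
    return SENSOR_DISTANCE_M - MAX_TRAIN_SPEED_M_S * latest_gate_closed_time()


closed_at = latest_gate_closed_time()          # 19/2 s
arrival = earliest_train_arrival_time()        # 250/13 s, about 19.2 s
margin_m = min_distance_when_gate_closed()     # 506 m
```

The value of 506 m agrees with the region constraint $x_1 \ge 766 - 52\delta$ evaluated at $\delta = 5$.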
188