
Lecture7 Embedded Software

The document outlines the curriculum for the Embedded Systems course at VietNam National University, covering topics such as embedded software, ARM Cortex-M architecture, real-time operating systems, and interfacing with real-world applications. It introduces key concepts, components, and development environments for embedded systems, along with practical examples like a seat belt controller project. The document also discusses program representation models, assembly, and linking processes essential for embedded software development.
VietNam National University

University of Engineering and Technology

EMBEDDED SYSTEM FUNDAMENTALS


(ELT3240, NHẬP MÔN HỆ THỐNG NHÚNG)

Dr. Nguyễn Kiêm Hùng


Email: [email protected]

Laboratory for Smart Integrated Systems


Curriculum Path
• Introduction to Embedded Systems: Week 1-2
• Introduction to C Language: Week 3
• CPU: ARM Cortex-M: Week 4
• Memory and Interfaces: Week 5-6
• ARM-based Embedded System: Week 7
• Embedded Software: Week 8-9
• Real-time Operating Systems: Week 10-12
• Interfacing Embedded With Real-World: Week 13-14
• Project: Week 15
Objectives
In this lecture you will be introduced to:
– Definition of the embedded software
– Some useful structures for embedded software
development.
– Models of programs, such as data flow and
control flow graphs.
– An introduction to compilation methods.
– Analyzing and optimizing programs for
performance, size, and power consumption.

3
Outline
• Embedded Software
• Embedded Software components
• Representations of programs
• Assembly and linking
• Compilation flow

• Summary

4
Embedded Software
Definition
“Embedded software is computer software, written to control
machines or devices that are not typically thought of as
computers. It is typically specialized for the particular hardware
that it runs on and has time and memory constraints. This term
is sometimes used interchangeably with firmware.”

“Firmware” means that:


-The embedded software only repeatedly carries out particular
designated functions,
-The software is stored in ROM and executes on reset.
5
Embedded Software Overview
Common Components:
• Interrupt Vector Table
– Reset vector
• Startup code:
– initialize the system—both the hardware (e.g. stack pointer) and the
software system (e.g. global variables)
• Application code
• Libraries
• Interrupt/Exception Handler
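
The software side of the startup code can be sketched in C. This is a host-compilable illustration, not a real startup file: on a target the regions would come from linker-script symbols, while here they are plain arrays. All names (`rom_data_image`, `ram_data`, `ram_bss`, `startup_init`) are hypothetical.

```c
#include <string.h>

/* Hypothetical stand-ins for linker-provided memory regions. */
#define DATA_WORDS 2
#define BSS_WORDS  3

static const int rom_data_image[DATA_WORDS] = { 8, 42 }; /* initial values kept in ROM */
static int ram_data[DATA_WORDS];  /* initialized global variables */
static int ram_bss[BSS_WORDS];    /* global variables without an initializer */

/* Software-side initialization performed before main() is called. */
void startup_init(void)
{
    /* Copy the initial values of initialized globals from ROM to RAM. */
    memcpy(ram_data, rom_data_image, sizeof ram_data);
    /* Clear globals that have no explicit initializer. */
    memset(ram_bss, 0, sizeof ram_bss);
}
```

On a real microcontroller the reset handler would typically set the stack pointer first, run a step like `startup_init()`, and then call `main()`.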

6
Embedded Software Overview
How does a firmware execute:
[Figure sequence: execution starts from the reset interrupt vector; later panels are labelled "int i = 0;" and "int x = 8"]

main() is typically implemented as an endless loop which is either interrupt-driven or uses polling for controlling external interaction or internal events!

10
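
A polling-style main loop can be sketched as follows. This is an illustration under assumptions: `poll_input`, `handle_event`, and `run_firmware` are hypothetical names, the input is simulated by a counter instead of a hardware register, and the loop is bounded only so that the sketch terminates on a host (on a target it would be `for (;;)`).

```c
#include <stdbool.h>

/* Hypothetical input source: a real target would read a peripheral register. */
static int pending_events = 3;

static bool poll_input(void)
{
    if (pending_events > 0) { pending_events--; return true; }
    return false;
}

static int handled = 0;
static void handle_event(void) { handled++; }

/* One pass of the main loop body: poll, then react. */
void main_loop_iteration(void)
{
    if (poll_input())
        handle_event();
}

/* Firmware-style loop. max_iterations exists only so the sketch can stop;
   real firmware would loop forever. */
void run_firmware(unsigned max_iterations)
{
    for (unsigned i = 0; i < max_iterations; i++)
        main_loop_iteration();
}
```

In an interrupt-driven design the polling call would be replaced by flags set in interrupt handlers and checked in the loop.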
Development Environment
• Host: a computer running programming tools
for development
• Target: the HW on which code will run
• After program is written, compiled, assembled
and linked, it is transferred to the target

[Figure: host system (x86) connected to target system (MSP430)]

11
SW Development Environment
[Figure: Keil™ uVision® tool flow. Source code written in the editor is built (Build Target, F7) into object code. A debug session then runs either on a simulated microcontroller or, after download, on a real microcontroller. Each microcontroller consists of processor, memory, and I/O.]

Source code:
Start
    ; direction register
    LDR R1,=GPIO_PORTD_DIR_R
    LDR R0,[R1]
    ORR R0,R0,#0x0F
    ; make PD3-0 output
    STR R0,[R1]

Object code:
Address      Data
0x00000142   4912
0x00000144   6808
0x00000146   F040000F
0x0000014A   6008
12
What If Real HW Not Available?
• Development board:
– Before real hardware is built, software can be developed and tested using development boards
– Development boards usually have the same CPU as the end product and provide many IO peripherals for the developed software to use as if it were running on the real end product
• Tools for program development
– Integrated Development Environment
(IDE): cross compiler, linker, loader, …
– OS and related libraries and packages

13
Cross Compiler
• Runs on the host but generates code for the target
– The target usually has a different architecture from the host. Hence the compiler on the host has to produce binary instructions that the target understands

14
Outline
• Embedded Software
• Embedded Software components
• Representations of programs
• Assembly and linking
• Compilation flow
• Summary

15
Software state machine

• A state machine keeps its internal state in a variable and changes state based on the inputs and the current state of the system.
• Used for:
– control-dominated code;
– reactive systems: inputs appear intermittently rather than as periodic samples

16
State machine example
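Seat belt controller:
- Two inputs: seat sensor and belt sensor
- One timer
- One output: buzzer
- Four states: idle, seated, belted, buzzer

Transitions (input/output, "-" means no action):
idle → seated: seat/timer on
seated → idle: no seat/-
seated → seated: no belt and no timer/-
seated → belted: belt/timer off
seated → buzzer: timer over/buzzer on
buzzer → belted: belt/buzzer off
buzzer → idle: no seat/buzzer off
belted → seated: no belt/timer on
belted → idle: no seat/-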

17
State machine example
#define IDLE 0
#define SEATED 1
#define BELTED 2
#define BUZZER 3
switch (state) { /* check the current state */
case IDLE:
    if (seat) { state = SEATED; timer_on = TRUE; }
    /* default case is self-loop */
    break;
case SEATED:
    if (belt) state = BELTED; /* won't hear the buzzer */
    else if (timer_over) state = BUZZER; /* didn't put on belt in time */
    /* default case is self-loop */
    break;
case BELTED:
    if (!seat) state = IDLE; /* person left */
    else if (!belt) state = SEATED; /* person still in seat */
    break;
case BUZZER:
    if (belt) state = BELTED; /* belt is on, turn off buzzer */
    else if (!seat) state = IDLE; /* no one in seat, turn off buzzer */
    break;
}
18
Project
Seat belt controller
• Hardware Requirement:
– 2 user LEDs
– 2 user push buttons
– Timer
– Option: 4-digit 7-segment LCD module

19
Project
Seat belt controller
• Description: The controller turns on a buzzer if a person sits in
a seat and does not fasten the seat belt within a fixed amount
of time.
• This system has two inputs and two outputs.
– SW1 represents the sensor that detects when a person has sat down,
– SW2 represents the seat belt sensor that tells when the belt is fastened,
– Red LED is the output that represents the buzzer.
– Blue LED is the output that represents the belted state.
– LCD (option): displays current state of the system
• A timer sets the required time interval before the buzzer is turned on
20
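
The controller described above can be sketched as a complete step function. This is one possible arrangement, not the reference solution for the project: the transitions mirror the switch statement from the state machine example, the output updates follow the labels of the state diagram, and the names `belt_ctrl_t` and `belt_step` are hypothetical.

```c
#include <stdbool.h>

typedef enum { IDLE, SEATED, BELTED, BUZZER } belt_state_t;

typedef struct {
    belt_state_t state;
    bool timer_on;   /* timer is running */
    bool buzzer_on;  /* represented by the Red LED in the project */
} belt_ctrl_t;

/* Advance the controller by one step from the sampled inputs:
   seat = SW1, belt = SW2, timer_over = timer has expired. */
void belt_step(belt_ctrl_t *c, bool seat, bool belt, bool timer_over)
{
    switch (c->state) {
    case IDLE:
        if (seat) { c->state = SEATED; c->timer_on = true; }
        break;
    case SEATED:
        if (belt)            { c->state = BELTED; c->timer_on = false; }
        else if (timer_over) { c->state = BUZZER; c->timer_on = false; c->buzzer_on = true; }
        break;
    case BELTED:
        if (!seat)      { c->state = IDLE; }
        else if (!belt) { c->state = SEATED; c->timer_on = true; }
        break;
    case BUZZER:
        if (belt)       { c->state = BELTED; c->buzzer_on = false; }
        else if (!seat) { c->state = IDLE;   c->buzzer_on = false; }
        break;
    }
}
```

A main loop would sample the two push buttons and the timer, call `belt_step()`, and then drive the LEDs from `buzzer_on` and `state == BELTED`.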
Signal processing and circular buffer
• Commonly used in signal processing:
– new data constantly arrives and must be processed in
real time;
– each datum has a limited lifetime.
[Figure: data stream d1 d2 d3 d4 d5 d6 d7, marked at time t and at time t+1]

• In an embedded system we must not only emit outputs in real time, but we must also do so using a minimum amount of memory
– Use a circular buffer to hold the data stream.
21
Circular buffer

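[Figure: data stream x1 x2 ... x10 over times t1 ... t6, stored in a four-slot circular buffer. Slot 1 holds x1, then x5, then x9; slot 2 holds x2, then x6; slot 3 holds x3, then x7; slot 4 holds x4, then x8.]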

22
Circular buffers
• Indexes locate the currently used data and the current input position:

Slot   time t1       time t1+1
1      d1 (input)    d5 (use)
2      d2            d2 (input)
3      d3            d3
4      d4 (use)      d4

23
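
A circular buffer that keeps only the most recent samples can be sketched as below. The four-slot size follows the figure; the names (`cbuf_t`, `cbuf_put`, `cbuf_get`, `cbuf_sum`) and the use of a running window sum as the "processing" step are assumptions made for illustration.

```c
#include <stddef.h>

#define CBUF_SIZE 4  /* window length, as in the four-slot figure */

typedef struct {
    int    data[CBUF_SIZE];
    size_t pos;    /* slot that receives the next input sample */
    size_t count;  /* number of valid samples, at most CBUF_SIZE */
} cbuf_t;

void cbuf_init(cbuf_t *b) { b->pos = 0; b->count = 0; }

/* Store a new sample, overwriting the oldest one once the buffer is full. */
void cbuf_put(cbuf_t *b, int sample)
{
    b->data[b->pos] = sample;
    b->pos = (b->pos + 1) % CBUF_SIZE;
    if (b->count < CBUF_SIZE) b->count++;
}

/* Return the sample written 'age' puts ago (0 = newest).
   The caller must ensure age < b->count. */
int cbuf_get(const cbuf_t *b, size_t age)
{
    return b->data[(b->pos + CBUF_SIZE - 1 - age) % CBUF_SIZE];
}

/* Sum of the samples currently held in the window. */
long cbuf_sum(const cbuf_t *b)
{
    long s = 0;
    for (size_t i = 0; i < b->count; i++)
        s += cbuf_get(b, i);
    return s;
}
```

Memory use stays fixed at `CBUF_SIZE` samples no matter how long the stream runs, which is the point made on the slide.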
Queues
• Are used whenever data may arrive and depart at somewhat unpredictable times
• Implementation:
– Declare the array
– The variables head and tail keep track of the two ends of the queue
– Two error conditions: Reading from an empty queue and writing to a full queue.
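
The implementation steps above can be sketched as an array-based queue. The names (`queue_t`, `queue_put`, `queue_get`), the capacity of four, and the extra `count` field used to tell full from empty are assumptions of this sketch; other designs leave one slot unused instead.

```c
#include <stdbool.h>
#include <stddef.h>

#define Q_SIZE 4

typedef struct {
    int    buf[Q_SIZE];
    size_t head;   /* index of the next element to read */
    size_t tail;   /* index of the next free slot to write */
    size_t count;  /* elements currently stored */
} queue_t;

void queue_init(queue_t *q) { q->head = q->tail = q->count = 0; }

/* Returns false on the "writing to a full queue" error. */
bool queue_put(queue_t *q, int v)
{
    if (q->count == Q_SIZE) return false;
    q->buf[q->tail] = v;
    q->tail = (q->tail + 1) % Q_SIZE;
    q->count++;
    return true;
}

/* Returns false on the "reading from an empty queue" error. */
bool queue_get(queue_t *q, int *out)
{
    if (q->count == 0) return false;
    *out = q->buf[q->head];
    q->head = (q->head + 1) % Q_SIZE;
    q->count--;
    return true;
}
```

Unlike the sliding-window circular buffer, a queue never overwrites unread data: a write to a full queue is reported to the caller.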

[Figure: positions of Head and Tail in the queue array at time t0, time t1, and time t1+1, as data d1 ... d5 arrive and depart]


24
Outline
• Embedded Software
• Software components
• Representations of programs
• Assembly and linking
• Compilation flow
• Summary

25
Models of programs

• Source code is not a good representation for programs:
– many different types of source code,
– clumsy.
• A single model is required:
– to describe all types of source code,
– and to perform many useful analyses on the model more easily
• Compilers derive intermediate representations to manipulate and optimize the program.

26
Data flow graph

• DFG: data flow graph.
• Models a basic block, a section of the program:
– with only one entry and exit point.
– with only data operations (e.g. arithmetic and other computations) and no control operations (conditionals).

27
Single assignment form

original basic block      single assignment form
x = a + b;                x = a + b;
y = c - d;                y = c - d;
z = x * y;                z = x * y;
y = b + d;                y1 = b + d;

28
Data flow graph

A basic block in single-assignment form:
x = a + b;
y = c - d;
z = x * y;
y1 = b + d;

DFG (inputs → operation → output):
a, b → + → x
c, d → - → y
x, y → * → z
b, d → + → y1

29
DFGs and partial orders
- The DFG shows the order in which the operations are performed in the program
- Partial order: [a+b, c-d]; [b+d, x*y]
  The operations in each pair can be done in any order.
- The DFG helps us to determine feasible reorderings of the operations, which can reduce pipeline or cache conflicts
[Figure: DFG with inputs a, b, c, d, operations +, -, *, +, and outputs x, y, z, y1]
30
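
The partial order can be illustrated by evaluating the same basic block in two different orders that both respect the DFG edges. The function and type names are hypothetical; the statements are the ones from the single-assignment example.

```c
typedef struct { int x, y, z, y1; } block_result_t;

/* Order as written in the source: a+b, c-d, x*y, b+d. */
block_result_t eval_program_order(int a, int b, int c, int d)
{
    block_result_t r;
    r.x  = a + b;
    r.y  = c - d;
    r.z  = r.x * r.y;
    r.y1 = b + d;
    return r;
}

/* A different order that still computes x and y before z. */
block_result_t eval_reordered(int a, int b, int c, int d)
{
    block_result_t r;
    r.y1 = b + d;
    r.y  = c - d;
    r.x  = a + b;
    r.z  = r.x * r.y;
    return r;
}
```

Both functions return the same values for the same inputs, because the only ordering constraint in the DFG is that x*y comes after a+b and c-d.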
Control-data flow graph
• CDFG: models both data operations
(arithmetic and other computations) and
control operations (conditionals).
• Uses data flow graphs as components, adding
constructs to describe control.
• Two types of nodes:
– Decision node: describe all the types of control in
a sequential program;
– Data flow node: represent a basic block.

31
Data flow node

Encapsulates a data flow graph to represent a basic block:

x = a + b;
y = c + d;

32
Control Nodes

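[Figure: equivalent forms of a decision node: a node testing cond with T and F branches, and a node testing value with branches v1, v2, v3, v4]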

33
CDFG example
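if (cond1) bb1();
else bb2();
bb3();
switch (test1) {
case c1: bb4(); break;
case c2: bb5(); break;
case c3: bb6(); break;
}

[Figure: CDFG. Decision node cond1 leads to bb1() on T and to bb2() on F; both continue to bb3(). Decision node test1 then selects bb4() for c1, bb5() for c2, or bb6() for c3.]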

34
for loop

for loop:
for (i=0; i<N; i++)
    loop_body();

equivalent:
i=0;
while (i<N) {
    loop_body();
    i++;
}

[Figure: CDFG. i=0, then decision node i<N. On T, loop_body(); i++; and back to the test. On F, End.]

35
Outline
• Embedded Software
• Software components
• Representations of programs
• Assembly and linking
• Compilation flow
• Summary

36
Assembly and linking
• Last steps in compilation:
[Figure annotation: describes the instructions and data in binary format.]

37
Multiple-module programs

• Programs may be composed from several files.
• Addresses become more specific during
processing:
– relative addresses are measured relative to the
start of a module;
– absolute addresses are measured relative to the
start of the CPU address space.

38
Assemblers

• Major tasks:
– generate binary representation for symbolic
assembly instructions;
– translate labels into addresses;
– handle pseudo-ops (data, etc.).
• Generally one-to-one translation.
• Assembly labels: represent locations of instructions and data
       ORG 100       ; pseudo-op
label1 ADR r4,c
39
Two-pass assembly

• Pass 1:
– generate symbol table:
• determining the address of each label
• Pass 2:
– generate binary instructions
• using the label values computed in the first pass

40
Symbol table
assembly code:
      ORG 0x7
      ADD r0,r1,r2
xx    ADD r3,r4,r5
      CMP r0,r3
yy    SUB r5,r6,r7

symbol table (built in the first pass):
xx    0xB
yy    0x13
41
Symbol table generation

• Use the program location counter (PLC) to determine the address of each location.
• Scan program, keeping count of PLC.
• Addresses are generated at assembly time, not
execution time.

42
Symbol table example
               ORG 0x7
PLC=0x7        ADD r0,r1,r2
PLC=0xB   xx   ADD r3,r4,r5
PLC=0xF        CMP r0,r3
PLC=0x13  yy   SUB r5,r6,r7

symbol table:
xx    0xB
yy    0x13

43
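
The first pass can be sketched in C as a walk over the program that advances the PLC and records label addresses. This is a simplified model, not a real assembler: it assumes every instruction occupies 4 bytes, as in the example, and the types and function names (`asm_line_t`, `symbol_t`, `build_symbol_table`, `lookup_symbol`) are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define INSTR_BYTES 4   /* assumption: fixed 4-byte instructions, as in the example */
#define MAX_SYMS    16

typedef struct { const char *label; const char *text; } asm_line_t; /* label may be NULL */
typedef struct { const char *name; unsigned addr; } symbol_t;

/* Pass 1: scan the program, keeping count of the PLC and recording the
   address of each label. Returns the number of symbols stored in 'table'. */
size_t build_symbol_table(const asm_line_t *prog, size_t n, unsigned org,
                          symbol_t *table, size_t max_syms)
{
    unsigned plc = org;
    size_t nsyms = 0;
    for (size_t i = 0; i < n; i++) {
        if (prog[i].label != NULL && nsyms < max_syms) {
            table[nsyms].name = prog[i].label;
            table[nsyms].addr = plc;
            nsyms++;
        }
        plc += INSTR_BYTES;
    }
    return nsyms;
}

/* Pass 2 would call this while generating binary instructions.
   Returns 1 and sets *addr if the label is known, 0 otherwise. */
int lookup_symbol(const symbol_t *table, size_t nsyms, const char *name, unsigned *addr)
{
    for (size_t i = 0; i < nsyms; i++) {
        if (strcmp(table[i].name, name) == 0) { *addr = table[i].addr; return 1; }
    }
    return 0;
}
```

Run on the four-instruction example with ORG 0x7, this yields xx at 0xB and yy at 0x13, matching the symbol table on the slide. A failed lookup corresponds to an external label that must be left for the linker.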
Relative address generation

• Some label values may not be known at assembly time.
• Labels within the module may be kept in relative form.
• External labels must be tracked: the assembler cannot generate full binary for instructions that use external labels.

44
Linking
• Combines several object modules into a
single executable module.
– An assembly language program is usually written as several smaller pieces and uses library routines, rather than being a single large file.
• Jobs:
– put modules in order;
– resolve labels across modules.

45
External References and entry points
Module 1:
xxx  ADD r1,r2,r3    ; xxx is an entry point
     B a             ; a is an external reference
yyy  %1

Module 2:
a    ADR r4,yyy
     ADD r3,r4,r5

46
Module ordering
• Code modules must be placed in absolute positions in the
memory space.
• Load map or linker flags control the order of modules.
– Some data structures and instructions must be put at precise
memory locations
– Many different types of memory may be installed at different
address ranges.
module1

module2

module3
47
Dynamic linking

• Some operating systems link modules dynamically at run time:
– shares one copy of library among all executing
programs;
– saves storage space;
– allows programs to be updated with new versions
of libraries.

48
Reentrancy

• A function is reentrant if interrupting the program with another call to the function does not change the results.
– Changing global variables compromises reentrancy.
• Non-reentrant code:
int foo = 1;
int task1() {
foo = foo + 1;
return foo;
}
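
For contrast, a reentrant variant keeps all state in its parameters. The sketch below repeats the global-variable pattern next to a version that does not touch shared state; the names `counter`, `next_global`, and `next_value` are hypothetical.

```c
/* Non-reentrant: the result depends on, and changes, a shared global. */
static int counter = 1;

int next_global(void)
{
    counter = counter + 1;
    return counter;
}

/* Reentrant variant: all state is supplied by the caller, so an
   interrupting call cannot change the result of the interrupted one. */
int next_value(int current)
{
    return current + 1;
}
```

Two calls to `next_global()` return different values, while `next_value(1)` returns the same value every time it is called.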

49
Outline
• Embedded Software
• Software components.
• Representations of programs.
• Assembly and linking.
• Compilation flow.
• Summary

50
Compilation
• Why do we need to understand the compilation process?
– To know how a high-level language program is translated into instructions: interrupt handling instructions, placement of data and instructions in memory, etc.
– Knowing how code is generated can help you meet your performance goals:
• either by writing high-level code that gets compiled into the instructions you want
• or by recognizing when you must write your own assembly code.

51
Compilation
• Compilation process:
– compilation = translation + optimization
• The high-level language program is translated into the
lower-level form of instructions;
• optimizations try to generate better instruction
sequences
• Compiler determines quality of code:
– use of CPU resources;
– memory access scheduling;
– code size.
52
Basic compilation phases

High-level Language Code

Parsing, symbol table generation

Machine-independent
optimizations

Machine-dependent
optimizations

Assembly Code

53
Statement translation and optimization

• Source code is translated into an intermediate form such as a CDFG.
• CDFG is transformed/optimized.
• CDFG is translated into instructions with
optimization decisions.
• Instructions are further optimized.

54
Arithmetic expressions

Expression: x = a*b + 5*(c-d)

[Figure: DFG of the expression, with inputs a, b, c, d and the constant 5 feeding the * and - operations]
55
Arithmetic expressions, cont’d.
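DFG nodes: 1 = a*b, 2 = c-d, 3 = 5*(c-d), 4 = sum

Assembly code:
ADR r4,a      ; get address of a
LDR r1,[r4]   ; load a
ADR r4,b
LDR r2,[r4]   ; load b
MUL r3,r1,r2  ; node 1: a*b
ADR r4,c
LDR r1,[r4]   ; load c
ADR r4,d
LDR r5,[r4]   ; load d
SUB r6,r1,r5  ; node 2: c-d
MUL r7,r6,#5  ; node 3: 5*(c-d)
ADD r8,r7,r3  ; node 4: a*b + 5*(c-d)
ADR r1,x
STR r8,[r1]   ; store x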

56
Compiled code for arithmetic expressions

ldr r2, [fp, #-16]
ldr r3, [fp, #-20]
mul r1, r3, r2 ; multiply
ldr r2, [fp, #-24]
ldr r3, [fp, #-28]
rsb r2, r3, r2 ; subtract
mov r3, r2
mov r3, r3, asl #2
add r3, r3, r2 ; add
add r3, r1, r3 ; add
str r3, [fp, #-32] ; assign

code generated by the ARM gcc compiler


57
Control code generation

if (a+b > 0)
    x = 5;
else
    x = 7;

[Figure: CDFG. Decision node a+b>0 leads to x=5 on T and to x=7 on F]

58
Control code generation, cont’d.
CDFG blocks: 1 = test a+b>0, 2 = x=5 (T), 3 = x=7 (F)

        ADR r5,a
        LDR r1,[r5]
        ADR r5,b
        LDR r2,[r5]
        ADD r3,r1,r2
        CMP r3,#0       ; block 1: test a+b>0
        BLE label3
        MOV r3,#5       ; block 2: x=5
        ADR r5,x
        STR r3,[r5]
        B stmtend
label3  MOV r3,#7       ; block 3: x=7
        ADR r5,x
        STR r3,[r5]
stmtend ...
59
Compiled code for control

ldr r2, [fp, #-16]
ldr r3, [fp, #-20]
add r3, r2, r3
cmp r3, #0 ; test the branch condition
ble .L3 ; branch to false block if <=
mov r3, #5 ; true block
str r3, [fp, #-32]
b .L4 ; go to end of if statement
.L3: ; false block
mov r3, #7
str r3, [fp, #-32]
.L4:

code generated by the ARM gcc compiler


60
Procedure linkage
• A procedure needs code to:
– call and return;
– pass parameters and results.
• The linkage mechanism provides:
– a way for the program to pass parameters into the procedure and for the procedure to return a value.
– help in restoring the values of registers that the procedure has modified.
• Parameters and return values are passed on the stack.
– Procedures with few parameters may use registers.

61
Procedure stacks

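proc1(int a) {
    proc2(5);
}

[Figure: procedure stack with the growth direction and the high address marked; frames for proc1 and proc2; FP (frame pointer) and SP (stack pointer); the argument 5 is accessed relative to SP]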

62
ARM procedure linkage

• APCS (ARM Procedure Call Standard):
– r0-r3 pass parameters into the procedure. Extra parameters are put on the stack frame.
– r0 holds the return value.
– r4-r7 hold register variables.
– r11 is the frame pointer, r13 is the stack pointer.
– r10 holds the limiting address on stack size to check for stack overflows.

63
ARM procedure linkage
• APCS (ARM Procedure Call Standard):

64
Compiled procedure call code
ldr r3, [fp, #-32] ; get e
str r3, [sp, #0] ; put into p1()’s stack frame
ldr r0, [fp, #-16] ; put a into r0
ldr r1, [fp, #-20] ; put b into r1
ldr r2, [fp, #-24] ; put c into r2
ldr r3, [fp, #-28] ; put d into r3
bl p1 ; call p1()
mov r3, r0 ; move return value into r3
str r3, [fp, #-36] ; store into y in stack frame

65
Data structure transformations

• The compiler must also translate references to data structures into references to raw memories.
• Different types of data structures use different
data layouts.
• Some offsets into data structure can be
computed at compile time, others must be
computed at run time.

66
One-dimensional arrays

• C array name points to 0th element:

a → a[0]
    a[1] = *(a + 1)
    a[2]

67
Two-dimensional arrays
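• Row-major layout (the layout used by C) for an array with N rows and M columns:

a[0,0]
a[0,1]
...
a[1,0]
a[1,1]
...

Element a[i,j] is stored at offset i*M+j from the start of the array.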

68
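
The offset formula can be checked against the addresses the compiler actually uses. The array dimensions and the function names (`flat_index`, `measured_offset`) are chosen only for this sketch.

```c
#include <stddef.h>

#define N_ROWS 3
#define M_COLS 4

/* Row-major element offset of a[i][j] in an array with M_COLS columns. */
size_t flat_index(size_t i, size_t j)
{
    return i * M_COLS + j;
}

/* Byte distance of a[i][j] from the start of the array, taken from
   the real addresses of the elements. */
size_t measured_offset(int a[N_ROWS][M_COLS], size_t i, size_t j)
{
    return (size_t)((const char *)&a[i][j] - (const char *)&a[0][0]);
}
```

For any valid i and j, `measured_offset(a, i, j)` equals `flat_index(i, j) * sizeof(int)`. When i and j are compile-time constants the compiler can fold this offset; otherwise it is computed at run time.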
Structures
• A structure is implemented as a contiguous block of memory
• Fields within structures are static offsets:
– Fields in the structure can be accessed using constant offsets to the
base address of the structure

struct mystruct {
    int field1;    /* 4 bytes, starts at aptr */
    char field2;   /* at byte offset 4: *(aptr+4) */
};

struct mystruct a, *aptr = &a;

69
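
The constant-offset access can be written out explicitly with the standard `offsetof` macro. The helper name `read_field2` is hypothetical, and the byte offset of 4 on the slide assumes a 4-byte int; the sketch uses `offsetof` so that it does not depend on that size.

```c
#include <stddef.h>

struct mystruct {
    int  field1;
    char field2;
};

/* Read field2 by adding its compile-time byte offset to the base address,
   which is what the compiled code for aptr->field2 does. */
char read_field2(const struct mystruct *aptr)
{
    const char *base = (const char *)aptr;
    return *(base + offsetof(struct mystruct, field2));
}
```

Note that the slide's `*(aptr+4)` counts bytes. In C source, arithmetic on a `struct mystruct *` is scaled by the structure size, which is why the sketch converts to a byte pointer first.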
Compiler Optimizations

70
Expression simplification

• Constant folding:
– N+1 = 8+1 = 9 (N has been declared as a constant)
• Algebraic:
– a*b + a*c = a*(b+c) (Why?)
• Strength reduction:
– a*2 = a<<1
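
The three simplifications can be written as pairs of equivalent C functions. The function names are hypothetical, and unsigned operands are used so that both forms agree even when the arithmetic wraps around.

```c
#define N 8

/* Constant folding: N + 1 is evaluated at compile time. */
enum { N_PLUS_1 = N + 1 };

/* Algebraic simplification: two multiplies and one add ... */
unsigned expr_original(unsigned a, unsigned b, unsigned c)   { return a * b + a * c; }
/* ... become one add and one multiply. */
unsigned expr_simplified(unsigned a, unsigned b, unsigned c) { return a * (b + c); }

/* Strength reduction: a multiply by 2 replaced by a left shift. */
unsigned times2_mul(unsigned a)   { return a * 2; }
unsigned times2_shift(unsigned a) { return a << 1; }
```

A compiler applies these rewrites itself when optimization is enabled; writing them by hand is mainly useful for understanding the generated code.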

71
Dead code elimination

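• Dead code:
#define DEBUG 0
if (DEBUG) dbg(p1);
• Can be eliminated by analysis of control flow, constant folding.
[Figure: CDFG with a decision node testing the constant 0; the branch labelled 1 leads to dbg(p1); and is never taken]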

72
Procedure inlining

• Eliminates procedure linkage overhead:

int foo(int a, int b, int c) { return a + b + c; }

z = foo(w,x,y);

becomes

z = w + x + y;

73
Register allocation

• Goals:
– choose register to hold each variable;
– determine lifespan of variable in the register.
• Basic case: within basic block.

74
Register lifetime graph

w = a + b;   /* t=1 */
x = c + w;   /* t=2 */
y = c + d;   /* t=3 */

[Figure: lifetime graph with one row per variable (a, b, c, d, w, x, y) plotted against time 1, 2, 3]

75
QUIZ

How can we handle a section of code that requires more registers than are available?

Original code:
w = a + b; /* statement 1 */
x = c + d; /* statement 2 */
y = x + w; /* statement 3 */
z = a - b; /* statement 4 */

[Figure: lifetime graph of the original code]

76
QUIZ

How can we handle a section of code that requires more registers than are available?

Original code:
w = a + b; /* statement 1 */
x = c + d; /* statement 2 */
y = x + w; /* statement 3 */
z = a - b; /* statement 4 */

Modified code:
w = a + b; /* statement 1 */
z = a - b; /* statement 4 */
x = c + d; /* statement 2 */
y = x + w; /* statement 3 */

77
QUIZ

How can we handle a section of code that requires more registers than are available?

Modified code:
w = a + b; /* statement 1 */
z = a - b; /* statement 4 */
x = c + d; /* statement 2 */
y = x + w; /* statement 3 */

[Figure: lifetime graph of the modified code]

78
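
The effect of the reordering can be estimated with a small lifetime calculation. The sketch uses a deliberately simplified model, which is an assumption and not the exact rule behind the lifetime graphs: a variable is live from the first statement that mentions it through the last statement that mentions it. Variable names must be single lowercase letters, and `stmt_t` and `max_live` are hypothetical names.

```c
#include <stddef.h>

typedef struct { char dst, src1, src2; } stmt_t;  /* dst = src1 op src2 */

/* Largest number of variables whose lifetimes overlap at any statement,
   under the simplified first-mention to last-mention model. */
int max_live(const stmt_t *code, size_t n)
{
    int first[26], last[26];
    for (int v = 0; v < 26; v++) { first[v] = -1; last[v] = -1; }

    /* Record the first and last statement that mentions each variable. */
    for (size_t t = 0; t < n; t++) {
        const char names[3] = { code[t].dst, code[t].src1, code[t].src2 };
        for (int k = 0; k < 3; k++) {
            int v = names[k] - 'a';
            if (first[v] < 0) first[v] = (int)t;
            last[v] = (int)t;
        }
    }

    /* Count the live variables at each statement and keep the maximum. */
    int best = 0;
    for (size_t t = 0; t < n; t++) {
        int live = 0;
        for (int v = 0; v < 26; v++)
            if (first[v] >= 0 && first[v] <= (int)t && (int)t <= last[v])
                live++;
        if (live > best) best = live;
    }
    return best;
}
```

Under this model the original statement order peaks at six live variables and the modified order at four, because moving statement 4 up ends the lifetimes of a and b before c, d, and x become live.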
Instruction scheduling
• When is an instruction executed, and which resources does it use?

• In pipelined machines, the execution time of one instruction depends on the nearby instructions: opcode, operands.

79
Reservation table

• A reservation table relates instructions/execution time slots to CPU resources.

Time/instr A B
instr1 X
instr2 X X
instr3 X
instr4 X

80
Instruction selection

• There may be several ways to implement an operation or sequence of operations.
• Represent operations as graphs, match possible instruction sequences onto graph.

[Figure: an expression graph built from * and + nodes, and the instruction templates MUL, ADD, and MADD that can be matched onto it]
81
Summary

82
Quiz
Q1: State machine example, circular buffer, Queues?
Q2: For each basic block given below, rewrite it in single-assignment
form, and then draw the data flow graph for that form.

a).
x = a + b;
y = c + d;
z = x + e;
b).
r = a + b - c;
s = 2 * r;
t = b - d;
r = d + e;
c).
a = q - r;
b = a + t;
a = r + s;
c = t - u;
d).
w = a - b + c;
x = w - d;
y = x - 2;
w = a + b - c;
z = y + d;
y = b * c;

83
Quiz
Q3: Draw the CDFG for the following code fragments:
a).
if (y == 2) { r = a + b; s = c - d; }
else r = a - c;
b).
x = 1;
if (y == 2) { r = a + b; s = c - d; }
else { r = a - c; }
c).
x = 2;
while (x < 40) {
    x = foo[x];
}
d).
for (i = 0; i < N; i++)
    x[i] = a[i]*b[i];
e).
for (i = 0; i < N; i++) {
    if (a[i] == 0)
        x[i] = 5;
    else
        x[i] = a[i]*b[i];
}

Q4: What are common components of a firmware? How does a firmware execute?

84
Quiz
Q5: Show the contents of the assembler’s symbol table at the end of code
generation for each line of the following programs:
a.
    ORG 200
p1: ADR r4,a
    LDR r0,[r4]
    ADR r4,e
    LDR r1,[r4]
    ADD r0,r0,r1
    CMP r0,r1
    BNE q1
p2: ADR r4,e
b.
    ORG 100
p1: CMP r0,r1
    BEQ x1
p2: CMP r0,r2
    BEQ x2
p3: CMP r0,r3
    BEQ x3
c.
    ORG 200
S1: ADR r2,a
    LDR r0,[r2]
S2: ADR r2,b
    LDR r2,a
    ADD r1,r1,r2
85
Quiz
Q6: Draw the CDFG for the following C code before and after applying
dead code elimination to the if statement:
#define DEBUG 0
proc1();
if (DEBUG) debug_stuff();
switch (foo) {
case A: a_case();
case B: b_case();
default: default_case();
}
Q7: Unroll the loop below:
for (i = 0; i < 32; i++)
x[i] = a[i] * c[i];
a. two times
b. three times

86
Quiz
Q8: Apply loop fusion or loop distribution to these code fragments as
appropriate. Identify the technique you use and write the modified code.
a.
for (i=0; i<N; i++)
    z[i] = a[i] + b[i];
for (i=0; i<N; i++)
    w[i] = a[i] - b[i];
b.
for (i=0; i<N; i++) {
    x[i] = c[i]*d[i];
    y[i] = x[i] * e[i];
}
c.
for (i=0; i<N; i++) {
    for (j=0; j<M; j++) {
        c[i][j] = a[i][j] + b[i][j];
        x[j] = x[j] * c[i][j];
    }
    y[i] = a[i] + x[j];
}
Q9: What is software pipelining? Give an example of software pipelining.

87
