Lecture4_Embedded_Software

The document outlines the curriculum for the Embedded Systems course at VietNam National University, covering topics such as embedded software, CPU architecture, real-time operating systems, and interfacing with real-world applications. It introduces essential concepts including the definition of embedded software, program representations, and development environments. Additionally, it discusses practical applications through projects like a seat belt controller and the use of data structures like circular buffers and queues.

Uploaded by

ngdlong1809

VietNam National University

University of Engineering and Technology

EMBEDDED SYSTEM FUNDAMENTALS


(ELT3240, NHẬP MÔN HỆ THỐNG NHÚNG)

Dr. Nguyễn Kiêm Hùng


Email: [email protected]

Laboratory for Smart Integrated Systems


Curriculum Path
• Introduction to Embedded Systems: Week 1-2
• Introduction to C Language: Week 3
• CPU: ARM Cortex-M: Week 4
• Memory and Interfaces: Week 5-6
• ARM-based Embedded System: Week 7
• Embedded Software: Week 8-9
• Real-time Operating Systems: Week 10-12
• Interfacing Embedded With Real-World: Week 13-14
• Project: Week 15

Objectives
In this lecture you will be introduced to:
– Definition of the embedded software
– Some useful structures for embedded software
development.
– Models of programs, such as data flow and
control flow graphs.
– An introduction to compilation methods.
– Analyzing and optimizing programs for
performance, size, and power consumption.

3
Outline
• Embedded Software
• Embedded Software components
• Representations of programs
• Assembly and linking
• Compilation flow

• Summary

4
Embedded Software
Definition
“Embedded software is computer software, written to control
machines or devices that are not typically thought of as
computers. It is typically specialized for the particular hardware
that it runs on and has time and memory constraints. This term
is sometimes used interchangeably with firmware.”

“Firmware” means that:
- The embedded software only repeatedly carries out particular designated functions,
- The software is stored in ROM and executes on reset.
5
Embedded Software Overview
Common Components:
• Interrupt Vector Table
– Reset vector
• Startup code:
– initialize the system—both the hardware (e.g. stack pointer) and the
software system (e.g. global variables)
• Application code
• Libraries
• Interrupt/Exception Handler
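
As a rough illustration of what startup code does for the software system, the sketch below copies initialized globals from their ROM image into RAM and zeroes the uninitialized ones. The function names and the idea of passing the regions as plain arrays are assumptions made so the code can run on a host; on a real target the region boundaries would come from the linker.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy initialized global variables from their load image (in ROM) to RAM. */
void startup_copy_data(const uint32_t *rom_image, uint32_t *data, size_t words)
{
    for (size_t i = 0; i < words; i++)
        data[i] = rom_image[i];
}

/* Clear the RAM region that holds uninitialized global variables. */
void startup_zero_bss(uint32_t *bss, size_t words)
{
    for (size_t i = 0; i < words; i++)
        bss[i] = 0u;
}

/* Hypothetical memory setup performed after reset, before the application runs. */
void reset_init(const uint32_t *rom_image, uint32_t *data, size_t data_words,
                uint32_t *bss, size_t bss_words)
{
    startup_copy_data(rom_image, data, data_words);
    startup_zero_bss(bss, bss_words);
}
```

After reset_init returns, the startup code would normally call the application's main().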

6
Embedded Software Overview
How does a firmware execute:
Reset interrupt vector

7
Embedded Software Overview
How does a firmware execute:

int i = 0;

8
Embedded Software Overview
How does a firmware execute:

int x = 8;
MOV r0,#8 ; generate value for x
ADR r4,x ; get address for x
STR r0,[r4] ; store x

9
Embedded Software Overview
How does a firmware execute:

main() is typically implemented as an endless loop, which is either interrupt-driven or uses polling to control external interaction or internal events.
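
A minimal sketch of such an endless loop, assuming a single event flag that an interrupt handler sets and the main loop polls. The names example_isr, main_loop_step and firmware_main are hypothetical and not taken from the lecture.

```c
#include <stdbool.h>
#include <stdint.h>

/* Set by the (hypothetical) interrupt handler, cleared by the main loop. */
volatile bool event_pending = false;
uint32_t events_handled = 0;

/* Interrupt handler: keep it short, only record that something happened. */
void example_isr(void)
{
    event_pending = true;
}

/* One pass of the loop body; returns true if an event was handled. */
bool main_loop_step(void)
{
    if (!event_pending)
        return false;          /* nothing to do on this pass */
    event_pending = false;     /* clear first so a new event is not lost */
    events_handled++;          /* stand-in for the real event processing */
    return true;
}

/* The endless loop itself: never returns. */
void firmware_main(void)
{
    for (;;)
        (void)main_loop_step();
}
```

Splitting the loop body into main_loop_step is only a design choice that makes one iteration easy to test on a host.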

10
Development Environment
• Host: a computer running programming tools
for development
• Target: the HW on which code will run
• After program is written, compiled, assembled
and linked, it is transferred to the target

[Figure: the host system (x86) connected to the target system (KL46Z).]

11
SW Development Environment
[Figure: Keil™ uVision® tool flow. The editor produces source code; Build Target (F7) produces object code. The object code is either run in a debug session on a simulated microcontroller (processor, memory, I/O) or downloaded to the real microcontroller (processor, memory, I/O), where a debug session is started.]

Source code:
Start
        ; direction register
        LDR R1,=GPIO_PORTD_DIR_R
        LDR R0,[R1]
        ORR R0,R0,#0x0F
        ; make PD3-0 output
        STR R0,[R1]

Object code:
Address     Data
0x00000142  4912
0x00000144  6808
0x00000146  F040000F
0x0000014A  6008

12
What If Real HW Not Available?
• Development board:
– Before real hardware is built, software can be developed
and tested using development boards
– Development boards usually have the same CPU as the end product and provide many IO peripherals for the developed software to use as if it were running on the real end product
• Tools for program development
– Integrated Development Environment
(IDE): cross compiler, linker, loader, …
– OS and related libraries and packages

13
Cross Compiler
• Runs on host but generates code for target
– The target usually has a different architecture from the host, so the compiler on the host has to produce binary instructions that the target understands

14
Outline
• Embedded Software
• Embedded Software components
• Representations of programs
• Assembly and linking
• Compilation flow
• Summary

15
Software state machine

• A state machine keeps its internal state as a variable and changes state based on the inputs and the current state of the system.
• Used for:
– control-dominated code;
– reactive systems: inputs appear intermittently rather than as periodic samples

16
State machine example
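Seat belt controller:
- Two inputs: seat sensor and belt sensor
- One timer
- One output: buzzer
State transitions (input / output):
  idle   -> idle:   no seat / -
  idle   -> seated: seat / timer on
  seated -> seated: no belt and no timer / -
  seated -> belted: belt / timer off
  seated -> buzzer: no belt and timer over / buzzer on
  buzzer -> belted: belt / buzzer off
  buzzer -> idle:   no seat / buzzer off
  belted -> seated: no belt / timer on
  belted -> idle:   no seat / -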

17
State machine example
#define IDLE 0
#define SEATED 1
#define BELTED 2
#define BUZZER 3
switch (state) { /* check the current state */
    case IDLE:
        if (seat) { state = SEATED; timer_on = TRUE; }
        /* default case is self-loop */
        break;
    case SEATED:
        if (belt) state = BELTED; /* won't hear the buzzer */
        else if (timer_over) state = BUZZER; /* didn't put on belt in time */
        /* default case is self-loop */
        break;
    case BELTED:
        if (!seat) state = IDLE; /* person left */
        else if (!belt) state = SEATED; /* person still in seat */
        break;
    case BUZZER:
        if (belt) state = BELTED; /* belt is on: turn off buzzer */
        else if (!seat) state = IDLE; /* no one in seat: turn off buzzer */
        break;
}
18
Project
Seat belt controller
• Hardware Requirement:
– 2 user LEDs
– 2 user push buttons
– Timer
– Option: 4-digit 7-segment LCD module

19
Project
Seat belt controller
• Description: The controller turns on a buzzer if a person sits in
a seat and does not fasten the seat belt within a fixed amount
of time.
• This system has two inputs and two outputs.
– SW1 represents the sensor that detects when a person has sat down,
– SW2 represents the seat belt sensor that tells when the belt is fastened,
– Red LED is the output that represents the buzzer.
– Blue LED is the output that represents the belted state.
– LCD (option): displays current state of the system
• A timer sets the time interval allowed before the buzzer is turned on
20
Signal processing and circular buffer
• Commonly used in signal processing:
– new data constantly arrives and must be processed in
real time;
– each datum has a limited lifetime.
[Figure: a data stream d1 ... d7 with the window of current data shown at time t and at time t+1.]

• In an embedded system we must not only emit outputs in real time, but we must also do that using a minimum amount of memory
– Use a circular buffer to hold the data stream.
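
A minimal sketch of a circular buffer that keeps only the most recent samples of a stream. The four-sample window (CBUF_SIZE) and all names are assumptions for illustration, not the lecture's own code.

```c
#include <stdint.h>

#define CBUF_SIZE 4u   /* assumed window length */

typedef struct {
    int32_t  data[CBUF_SIZE];
    uint32_t pos;      /* slot that will be overwritten next (the oldest sample) */
} cbuf_t;

void cbuf_init(cbuf_t *cb)
{
    for (uint32_t i = 0; i < CBUF_SIZE; i++)
        cb->data[i] = 0;
    cb->pos = 0;
}

/* Store a new sample over the oldest one and advance, wrapping at the end. */
void cbuf_put(cbuf_t *cb, int32_t sample)
{
    cb->data[cb->pos] = sample;
    cb->pos = (cb->pos + 1u) % CBUF_SIZE;
}

/* Return the sample that arrived `age` steps ago (0 = newest).
   The caller must keep age below CBUF_SIZE. */
int32_t cbuf_get(const cbuf_t *cb, uint32_t age)
{
    uint32_t idx = (cb->pos + CBUF_SIZE - 1u - age) % CBUF_SIZE;
    return cb->data[idx];
}
```

Because old samples are overwritten in place, memory use stays fixed at CBUF_SIZE entries no matter how long the stream runs.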
21
Circular buffer

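[Figure: a data stream x1 ... x10 arriving at times t1 ... t6 is stored in a four-slot circular buffer. The first slot holds x1, then x5, then x9; the second holds x2, then x6; the third holds x3, then x7; the fourth holds x4, then x8.]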

22
Queues
• Are used whenever data may arrive and depart at somewhat unpredictable times
• Implementation:
– Declare the array
– The variables head and tail keep track of the two ends of the queue
– Two error conditions: Reading from an empty queue and writing to a full queue.
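
The implementation steps above can be sketched as an array-based queue with head and tail indices. The capacity Q_SIZE, the extra count field and the function names are assumptions; both error conditions are reported through the return value.

```c
#include <stdbool.h>
#include <stdint.h>

#define Q_SIZE 4u   /* assumed capacity */

typedef struct {
    int32_t  buf[Q_SIZE];
    uint32_t head;    /* index of the next element to read */
    uint32_t tail;    /* index of the next free slot to write */
    uint32_t count;   /* number of elements currently stored */
} queue_t;

void q_init(queue_t *q)
{
    q->head = 0;
    q->tail = 0;
    q->count = 0;
}

/* Returns false on the "writing to a full queue" error. */
bool q_enqueue(queue_t *q, int32_t value)
{
    if (q->count == Q_SIZE)
        return false;
    q->buf[q->tail] = value;
    q->tail = (q->tail + 1u) % Q_SIZE;
    q->count++;
    return true;
}

/* Returns false on the "reading from an empty queue" error. */
bool q_dequeue(queue_t *q, int32_t *out)
{
    if (q->count == 0u)
        return false;
    *out = q->buf[q->head];
    q->head = (q->head + 1u) % Q_SIZE;
    q->count--;
    return true;
}
```

Keeping an explicit count is one way to tell a full queue from an empty one; another common choice is to leave one slot unused.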

[Figure: queue contents with the head and tail pointers at time t0, time t1, and time t1+1, as elements d1 ... d5 arrive and depart.]


24
Outline
• Embedded Software
• Software components
• Representations of programs
• Assembly and linking
• Compilation flow
• Summary

25
Models of programs

• Source code is not a good representation for programs:
– there are many different types of source code,
– it is clumsy to analyze.
• We require a single model:
– to describe all types of source code,
– and to perform many useful analyses on the model more easily
• Compilers derive intermediate representations to
manipulate and optimize the program.

26
Data flow graph

• DFG: data flow graph.
• Models a basic block, a piece of program:
– with only one entry and exit point.
– with only data operations (e.g. arithmetic and other computations) and no control operations (conditionals).

27
Data flow graph

x = a + b;
y = c - d;
z = x * y;
y1 = b + d;

A basic block in single-assignment form

[Figure: DFG for the basic block. Inputs a, b, c, d; a + b produces x; c - d produces y; x * y produces z; b + d produces y1.]

29
DFGs and partial orders
- The DFG shows the order in which the operations are performed in the program.
- Partial order: [a+b, c-d]; [b+d, x*y]. The operations in each pair can be done in any order.
- The DFG helps us determine feasible reorderings of the operations, which can reduce pipeline or cache conflicts.

[Figure: the DFG with inputs a, b, c, d; a + b produces x; c - d produces y; x * y produces z; b + d produces y1.]
30
Control-data flow graph
• CDFG: model both data operations
(arithmetic and other computations) and
control operations (conditionals).
• Uses data flow graphs as components, adding
constructs to describe control.
• Two types of nodes:
– Decision node: describe all the types of control in
a sequential program;
– Data flow node: represent a basic block.

31
Data flow node

Encapsulates a data flow graph to represent a basic block:

x = a + b;
y = c + d;

32
Control Nodes

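[Figure: equivalent forms of a decision node. A two-way node tests cond and branches on T or F; a multiway node tests value and branches on v1, v2, v3, or v4.]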

33
CDFG example
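if (cond1) bb1();
else bb2();
bb3();
switch (test1) {
    case c1: bb4(); break;
    case c2: bb5(); break;
    case c3: bb6(); break;
}

[Figure: CDFG. Decision node cond1 leads to bb1() on T and bb2() on F; both paths continue to bb3(); decision node test1 then leads to bb4() on c1, bb5() on c2, and bb6() on c3.]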

34
for loop

for loop:
for (i=0; i<N; i++)
    loop_body();

equivalent while loop:
i=0;
while (i<N) {
    loop_body();
    i++;
}

[Figure: CDFG for the loop. i=0; decision node i<N; on T, loop_body(); i++; then back to the test; on F, End.]

35
Outline
• Embedded Software
• Software components
• Representations of programs
• Assembly and linking
• Compilation flow
• Summary

36
Assembly and linking

37
Assembly and linking
• Last steps in compilation:
[Figure: tool flow for the last steps in compilation; one callout reads "Describes the instructions and data in binary format."]

38
Multiple-module programs

• Programs may be composed from several files.
• Addresses become more specific during
processing:
– relative addresses are measured relative to the
start of a module;
– absolute addresses are measured relative to the
start of the CPU address space.
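
The relationship between the two kinds of address can be sketched as follows: once a module has been given a base address, an absolute address is that base plus the relative address. The module_t type, the back-to-back placement policy and the function names are assumptions for illustration.

```c
#include <stdint.h>

typedef struct {
    const char *name;
    uint32_t    base;   /* absolute address where the module is placed */
    uint32_t    size;   /* module size in bytes */
} module_t;

/* Place modules back to back, starting at `origin` (an assumed policy). */
void place_modules(module_t *mods, int n, uint32_t origin)
{
    uint32_t next = origin;
    for (int i = 0; i < n; i++) {
        mods[i].base = next;
        next += mods[i].size;
    }
}

/* Convert an address relative to the start of a module into an absolute one. */
uint32_t absolute_address(const module_t *m, uint32_t relative)
{
    return m->base + relative;
}
```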

39
Assemblers
• Major tasks:
– generate binary representation for symbolic
assembly instructions;
– translate labels into addresses;
– handle pseudo-ops (data, etc.).
• Generally one-to-one translation.
• Assembly labels: represent locations of
instructions and data
ORG 100 ; pseudo-ops
label1 ADR r4,c ;
40
Two-pass assembly

• Pass 1:
– generate symbol table:
• determining the address of each label
• Pass 2:
– generate binary instructions
• using the label values computed in the first pass

41
Symbol table
Assembly code:
        ORG 0x7
        ADD r0,r1,r2
xx      ADD r3,r4,r5
        CMP r0,r3
yy      SUB r5,r6,r7

Symbol table (result of the first pass):
xx      0xB
yy      0x13
42
Symbol table generation

• Use program location counter (PLC) to determine address of each location.
• Scan program, keeping count of PLC.
• Addresses are generated at assembly time, not
execution time.
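
A minimal sketch of the first pass, assuming every instruction occupies four bytes (as in the lecture's example, where consecutive instructions are four addresses apart). The asm_line_t and symbol_t types and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define INSTR_BYTES 4u   /* assumed fixed instruction size */
#define MAX_SYMS    16

typedef struct {
    const char *label;   /* NULL if the line has no label */
    const char *text;    /* instruction text, not decoded in this sketch */
} asm_line_t;

typedef struct {
    const char *label;
    uint32_t    addr;
} symbol_t;

/* Pass 1: scan the program, advance the PLC, and record each label's address.
   Returns the number of symbols stored in `table`. */
int build_symbol_table(const asm_line_t *prog, int n, uint32_t org, symbol_t *table)
{
    uint32_t plc = org;   /* program location counter starts at the ORG value */
    int count = 0;
    for (int i = 0; i < n; i++) {
        if (prog[i].label != NULL && count < MAX_SYMS) {
            table[count].label = prog[i].label;
            table[count].addr = plc;
            count++;
        }
        plc += INSTR_BYTES;
    }
    return count;
}

/* Lookup used by a second pass; returns 1 and sets *addr if the label exists. */
int lookup_symbol(const symbol_t *table, int count, const char *label, uint32_t *addr)
{
    for (int i = 0; i < count; i++) {
        if (strcmp(table[i].label, label) == 0) {
            *addr = table[i].addr;
            return 1;
        }
    }
    return 0;
}
```

Run on a four-instruction program starting at 0x7 with labels on the second and fourth lines, this yields 0xB and 0x13 for those labels.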

43
Symbol table example
PLC     Label   Instruction
                ORG 0x7
0x7             ADD r0,r1,r2
0xB     xx      ADD r3,r4,r5
0xF             CMP r0,r3
0x13    yy      SUB r5,r6,r7

Symbol table:
xx      0xB
yy      0x13

44
Relative address generation

• Some label values may not be known at assembly time.
• Labels within the module may be kept in relative form.
• Must keep track of external labels: the assembler can't generate full binary for instructions that use external labels.

45
Linking
• Combines several object modules into a
single executable module.
– An assembly language program is usually written as several smaller pieces and uses library routines, rather than being written as a single large file.
• Jobs:
– put modules in order;
– resolve labels across modules.

46
External References and entry points
Module 1:
xxx     ADD r1,r2,r3    ; xxx is an entry point
        B a             ; a is an external reference
yyy     %1

Module 2:
a       ADR r4,yyy      ; a is an entry point; yyy is an external reference
        ADD r3,r4,r5

47
Module ordering
• Code modules must be placed in absolute positions in the
memory space.
• Load map or linker flags control the order of modules.
– Some data structures and instructions must be put at precise
memory locations
– Many different types of memory may be installed at different
address ranges.
[Figure: memory map with module1, module2, and module3 placed one after another.]
48
Outline
• Embedded Software
• Software components.
• Representations of programs.
• Assembly and linking.
• Compilation flow.
• Summary

51
Compilation
• Why do we need to understand the compilation process?
– knowing how a high-level language program is translated into instructions (interrupt handling instructions, placement of data and instructions in memory, etc.) helps you understand what the program does on the target
– how code is generated can help you meet your
performance goals:
• either by writing high-level code that gets compiled into
the instructions you want
• or by recognizing when you must write your own
assembly code.

52
Compilation
• Compilation process:
– compilation = translation + optimization
• The high-level language program is translated into the
lower-level form of instructions;
• optimizations try to generate better instruction
sequences
• Compiler determines quality of code:
– use of CPU resources;
– memory access scheduling;
– code size.
53
Basic compilation phases

High-level Language Code

Parsing, symbol table generation

Machine-independent
optimizations

Machine-dependent
optimizations

Assembly Code

54
Statement translation and optimization

• Source code is translated into intermediate form such as CDFG.
• CDFG is transformed/optimized.
• CDFG is translated into instructions with
optimization decisions.
• Instructions are further optimized.

55
Arithmetic expressions

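Expression:
x = a*b + 5*(c-d)

[Figure: DFG for the expression. a * b and c - d are computed first; c - d is multiplied by 5; the two products are added.]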
56
Arithmetic expressions, cont’d.
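[Figure: DFG for the expression with operations numbered in code-generation order: 1 is a*b, 2 is c-d, 3 is 5*(c-d), 4 is the final addition.]

Assembly code:
    ADR r4,a        ; get address for a
    LDR r1,[r4]     ; load a
    ADR r4,b
    LDR r2,[r4]     ; load b
    MUL r3,r1,r2    ; 1: a*b
    ADR r4,c
    LDR r1,[r4]     ; load c
    ADR r4,d
    LDR r5,[r4]     ; load d
    SUB r6,r1,r5    ; 2: c-d
    MOV r0,#5       ; MUL takes register operands only
    MUL r7,r6,r0    ; 3: 5*(c-d)
    ADD r8,r7,r3    ; 4: a*b + 5*(c-d)
    ADR r1,x
    STR r8,[r1]     ; store x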

57
Compiled code for arithmetic expressions

    ldr r2, [fp, #-16]
    ldr r3, [fp, #-20]
    mul r1, r3, r2        ; multiply
    ldr r2, [fp, #-24]
    ldr r3, [fp, #-28]
    rsb r2, r3, r2        ; subtract
    mov r3, r2
    mov r3, r3, asl #2
    add r3, r3, r2        ; add
    add r3, r1, r3        ; add
    str r3, [fp, #-32]    ; assign

code generated by the ARM gcc compiler


58
Control code generation

Expression:
if (a+b > 0)
    x = 5;
else
    x = 7;

[Figure: CDFG for the expression. Decision node a+b>0 leads to x=5 on T and x=7 on F.]

59
Control code generation, cont’d.
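[Figure: the CDFG with blocks numbered: 1 is the test a+b>0, 2 is x=5 (T), 3 is x=7 (F).]

        ADR r5,a
        LDR r1,[r5]     ; load a
        ADR r5,b
        LDR r2,[r5]     ; load b
        ADD r3,r1,r2
        CMP r3,#0       ; block 1: test a+b > 0
        BLE label3      ; branch to false block
        MOV r3,#5       ; block 2: true block
        ADR r5,x
        STR r3,[r5]     ; x = 5
        B stmtend
label3  MOV r3,#7       ; block 3: false block
        ADR r5,x
        STR r3,[r5]     ; x = 7
stmtend ...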
60
Compiled code for control

    ldr r2, [fp, #-16]
    ldr r3, [fp, #-20]
    add r3, r2, r3
    cmp r3, #0            ; test the branch condition
    ble .L3               ; branch to false block if <=
    mov r3, #5            ; true block
    str r3, [fp, #-32]
    b .L4                 ; go to end of if statement
.L3:                      ; false block
    mov r3, #7
    str r3, [fp, #-32]
.L4:

code generated by the ARM gcc compiler


61
Data structure transformations

• The compiler must also translate references to data structures into references to raw memories.
• Different types of data structures use different
data layouts.
• Some offsets into data structure can be
computed at compile time, others must be
computed at run time.
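
As a hedged illustration of the last point: the offset of a field inside a structure is a compile-time constant, while the offset of an array element depends on an index that may only be known at run time. The sample_t type and the function names are made up for this sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint8_t  id;
    uint32_t value;
    uint16_t flags;
} sample_t;

/* Field offset: a constant the compiler knows at compile time. */
size_t value_offset(void)
{
    return offsetof(sample_t, value);
}

/* Element offset: depends on i, so it is computed at run time. */
size_t element_offset(size_t i)
{
    return i * sizeof(sample_t);
}

/* arr[i].value rewritten as a raw memory reference: base + offsets. */
uint32_t read_value_raw(const sample_t *arr, size_t i)
{
    const unsigned char *base = (const unsigned char *)arr;
    uint32_t v;
    memcpy(&v, base + element_offset(i) + value_offset(), sizeof v);
    return v;
}
```

The exact value of the field offset depends on the compiler's padding rules, so the sketch never hard-codes it.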

67
One-dimensional arrays

• C array name points to the 0th element:

[Figure: the pointer a points to a[0], which is followed in memory by a[1] and a[2]; a[1] = *(a + 1).]

68
Two-dimensional arrays
• Row-major layout (used by C), for an array with N rows and M columns:

a[0][0], a[0][1], ..., a[1][0], a[1][1], ...

• Element a[i][j] is at flat index i*M + j.
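
A small sketch of the index arithmetic, assuming the matrix is stored row by row in a flat array with M_COLS elements per row. The dimensions and function names are illustrative assumptions.

```c
#include <stddef.h>

#define N_ROWS 3   /* assumed number of rows */
#define M_COLS 4   /* assumed number of columns (elements per row) */

/* Flat index of element (i, j) when rows are stored one after another. */
size_t flat_index(size_t i, size_t j)
{
    return i * M_COLS + j;
}

/* Write element (i, j) of a matrix kept in a flat array. */
void set2d(int *flat, size_t i, size_t j, int value)
{
    flat[flat_index(i, j)] = value;
}

/* Read element (i, j) of a matrix kept in a flat array. */
int get2d(const int *flat, size_t i, size_t j)
{
    return flat[flat_index(i, j)];
}
```

The multiplication by M_COLS is the part a compiler must generate at run time whenever i is not a constant.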

69
Compiler Optimizations

71
Expression simplification

• Constant folding:
– N+1 = 8+1 = 9 (N has been declared as a constant)
• Algebraic:
– a*b + a*c = a*(b+c) (Why?)
• Strength reduction:
– a*2 = a<<1
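
The three simplifications can be written out as before/after pairs. Unsigned 32-bit arithmetic is assumed here so that each pair gives identical results for all inputs; the function names are illustrative.

```c
#include <stdint.h>

#define N 8u   /* a constant known at compile time */

/* Constant folding: the compiler can replace N + 1 with 9. */
uint32_t folded(void)
{
    return N + 1u;
}

/* Algebraic simplification: two multiplications become one. */
uint32_t algebraic_before(uint32_t a, uint32_t b, uint32_t c)
{
    return a * b + a * c;
}

uint32_t algebraic_after(uint32_t a, uint32_t b, uint32_t c)
{
    return a * (b + c);
}

/* Strength reduction: a multiply by 2 becomes a cheaper shift. */
uint32_t strength_before(uint32_t a)
{
    return a * 2u;
}

uint32_t strength_after(uint32_t a)
{
    return a << 1;
}
```

In practice the compiler applies these rewrites itself; writing them by hand is mainly useful for understanding the generated code.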

72
Dead code elimination

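• Dead code:
#define DEBUG 0
if (DEBUG) dbg(p1);
• Can be eliminated by analysis of control flow, constant folding.

[Figure: a decision node on the constant 0 with branches 0 and 1; the branch for 1, which contains dbg(p1);, is never taken.]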

73
Procedure inlining

• Eliminates procedure linkage overhead:

int foo(int a, int b, int c) { return a + b + c; }

Before inlining:
z = foo(w,x,y);

After inlining:
z = w + x + y;

74
Register allocation

• Goals:
– choose register to hold each variable;
– determine lifespan of variable in the register.
• Basic case: within basic block.

75
Register lifetime graph

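w = a + b; /* t=1 */
x = c + w; /* t=2 */
y = c + d; /* t=3 */

[Figure: register lifetime graph with one bar per variable (a, b, c, d, w, x, y) plotted against time 1, 2, 3.]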

76
QUIZ

How to solve the situation where a section of code requires more registers than are available?

Original code:
w = a + b; /* statement 1 */
x = c + d; /* statement 2 */
y = x + w; /* statement 3 */
z = a - b; /* statement 4 */

[Figure: lifetime graph for the original code.]

77
QUIZ

How to solve the situation where a section of code requires more registers than are available?

Original code:
w = a + b; /* statement 1 */
x = c + d; /* statement 2 */
y = x + w; /* statement 3 */
z = a - b; /* statement 4 */

Modified code:
w = a + b; /* statement 1 */
z = a - b; /* statement 4 */
x = c + d; /* statement 2 */
y = x + w; /* statement 3 */

78
QUIZ

How to solve the situation where a section of code requires more registers than are available?

Modified code:
w = a + b; /* statement 1 */
z = a - b; /* statement 4 */
x = c + d; /* statement 2 */
y = x + w; /* statement 3 */

[Figure: lifetime graph for the modified code.]
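
The effect of the reordering can be checked with a small sketch that counts how many variables are live at the same time. It uses a deliberately simple model, which is an assumption and not necessarily the one behind the lecture's lifetime graphs: a variable occupies a register from the first statement that mentions it through the last statement that mentions it. Variable names are single lowercase letters, and the types and function name are hypothetical.

```c
/* One statement of the form dst = src1 op src2. */
typedef struct {
    char dst;
    char src1;
    char src2;
} stmt_t;

/* Return the largest number of simultaneously live variables,
   under the simple first-mention to last-mention lifetime model. */
int max_live(const stmt_t *code, int n)
{
    int first[26], last[26];
    for (int v = 0; v < 26; v++) {
        first[v] = -1;
        last[v] = -1;
    }
    /* Record the first and last statement that mentions each variable. */
    for (int t = 0; t < n; t++) {
        const char names[3] = { code[t].dst, code[t].src1, code[t].src2 };
        for (int k = 0; k < 3; k++) {
            int v = names[k] - 'a';
            if (first[v] < 0)
                first[v] = t;
            last[v] = t;
        }
    }
    /* Count the variables whose lifetime covers each statement. */
    int peak = 0;
    for (int t = 0; t < n; t++) {
        int live = 0;
        for (int v = 0; v < 26; v++)
            if (first[v] >= 0 && first[v] <= t && t <= last[v])
                live++;
        if (live > peak)
            peak = live;
    }
    return peak;
}
```

Under this model the original ordering peaks at six live variables and the modified ordering at four, because moving z = a - b earlier lets a and b die before c and d are needed. A real register allocator uses a finer model, so the absolute numbers may differ.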

79
Instruction scheduling
• When is an instruction executed, and which resources does it use?

• In pipelined machines, the execution time of one instruction depends on the nearby instructions: their opcodes and operands.

80
Summary

83
Quiz
Q1: Explain the state machine example, the circular buffer, and queues.
Q2: For each basic block given below, rewrite it in single-assignment form, and then draw the data flow graph for that form.

a). x = a + b;
    y = c + d;
    z = x + e;
b). r = a + b - c;
    s = 2 * r;
    t = b - d;
    r = d + e;
c). a = q - r;
    b = a + t;
    a = r + s;
    c = t - u;
d). w = a - b + c;
    x = w - d;
    y = x - 2;
    w = a + b - c;
    z = y + d;
    y = b * c;

84
Quiz
Q3: Draw the CDFG for the following code fragments:
a).
if (y == 2) {r = a + b; s = c - d;}
else r = a - c
b).
x = 1;
if (y == 2) { r = a + b; s = c - d; }
else { r = a - c; }
c).
x = 2;
while (x < 40) {
    x = foo[x];
}
d).
for (i = 0; i < N; i++)
    x[i] = a[i]*b[i];
e).
for (i = 0; i < N; i++) {
    if (a[i] == 0)
        x[i] = 5;
    else
        x[i] = a[i]*b[i];
}

Q4: What are common components of a firmware? How does a firmware execute?

85
Quiz
Q5: Show the contents of the assembler's symbol table at the end of code generation for each line of the following programs:
a.
        ORG 200
p1:     ADR r4,a
        LDR r0,[r4]
        ADR r4,e
        LDR r1,[r4]
        ADD r0,r0,r1
        CMP r0,r1
        BNE q1
p2:     ADR r4,e
b.
        ORG 100
p1:     CMP r0,r1
        BEQ x1
p2:     CMP r0,r2
        BEQ x2
p3:     CMP r0,r3
        BEQ x3
c.
        ORG 200
S1:     ADR r2,a
        LDR r0,[r2]
S2:     ADR r2,b
        LDR r2,a
        ADD r1,r1,r2
86
Quiz
Q6: Draw the CDFG for the following C code before and after applying
dead code elimination to the if statement:
#define DEBUG 0
proc1();
if (DEBUG) debug_stuff();
switch (foo) {
case A: a_case();
case B: b_case();
default: default_case();
}
Q7: Unroll the loop below:
for (i = 0; i < 32; i++)
x[i] = a[i] * c[i];
a. two times
b. three times

87
Quiz
Q8: Apply loop fusion or loop distribution to these code fragments as appropriate. Identify the technique you use and write the modified code.
a.
for (i=0; i<N; i++)
    z[i] = a[i] + b[i];
for (i=0; i<N; i++)
    w[i] = a[i] - b[i];
b.
for (i=0; i<N; i++) {
    x[i] = c[i]*d[i];
    y[i] = x[i] * e[i];
}
c.
for (i=0; i<N; i++) {
    for (j=0; j<M; j++) {
        c[i][j] = a[i][j] + b[i][j];
        x[j] = x[j] * c[i][j];
    }
    y[i] = a[i] + x[j];
}
Q9: What is software pipelining? Give an example of software pipelining.

88
