LEC12-Optimization and New Trends

The document discusses different approaches to optimizing programs including code motion, sharing common subexpressions, reducing procedure calls, and reducing memory accesses. It also introduces concepts like embedded computers, the internet of things, big data, cloud computing, artificial intelligence, and edge computing.

Lecture 12

Optimization and New Trends
COMP1411: Introduction to Computer Systems

Dr. Xianjin XIA


Department of Computing,
The Hong Kong Polytechnic University
Spring 2024

Acknowledgement: These slides are based on the textbook (Computer Systems: A Programmer’s Perspective) and its slides.
These slides are intended for internal use only. Do not publish them anywhere without permission.

Levels of optimization for a single program

[Figure: two paths from source to executable.
 Source Code -> Compiler -> Executable Code
 Optimized Source Code -> Compiler -> Optimized Executable Code]

2
Contents
• Introducing several program optimization approaches
• Code motion
• Share common subexpressions
• Reducing procedure calls
• Reducing memory accesses

• Introducing popular concepts of computing

3
Code motion
• Reduce the frequency with which a computation is performed
• If it will always produce the same result
• Especially by moving code out of loops

Original:

void set_row(double *a, double *b,
             long i, long n)
{
    long j;
    for (j = 0; j < n; j++)
        a[n*i+j] = b[j];
}

After code motion (body of set_row):

    long j;
    long ni = n*i;
    for (j = 0; j < n; j++)
        a[ni+j] = b[j];

4
Compiler-Generated Code Motion
Original:

void set_row(double *a, double *b,
             long i, long n)
{
    long j;
    for (j = 0; j < n; j++)
        a[n*i+j] = b[j];
}

What the compiler effectively generates (body of set_row):

    long j;
    long ni = n*i;
    double *rowp = a+ni;
    for (j = 0; j < n; j++)
        *rowp++ = b[j];

Registers: %rdi = a, %rsi = b, %rdx = i, %rcx = n

set_row:
        testq   %rcx, %rcx              # Test n
        jle     .L4                     # If n <= 0, goto done
        movq    %rcx, %rax              # rax = n
        imulq   %rdx, %rax              # rax *= i
        leaq    (%rdi,%rax,8), %rdx     # rowp = a + n*i*8
        movl    $0, %r8d                # j = 0
.L3:                                    # loop:
        movq    (%rsi,%r8,8), %rax      # t = b[j]
        movq    %rax, (%rdx)            # *rowp = t
        addq    $1, %r8                 # j++
        addq    $8, %rdx                # rowp++
        cmpq    %r8, %rcx               # Compare n:j
        jg      .L3                     # If >, goto loop
.L4:                                    # done:
        rep ; ret

5
Share Common Subexpressions
• Reuse portions of expressions

Original (3 multiplications: i*n, (i-1)*n, (i+1)*n):

    /* Sum neighbors of i,j */
    up    = val[(i-1)*n + j  ];
    down  = val[(i+1)*n + j  ];
    left  = val[i*n     + j-1];
    right = val[i*n     + j+1];
    sum = up + down + left + right;

    leaq   1(%rsi), %rax       # i+1
    leaq   -1(%rsi), %r8       # i-1
    imulq  %rcx, %rsi          # i*n
    imulq  %rcx, %rax          # (i+1)*n
    imulq  %rcx, %r8           # (i-1)*n
    addq   %rdx, %rsi          # i*n+j
    addq   %rdx, %rax          # (i+1)*n+j
    addq   %rdx, %r8           # (i-1)*n+j

With the common subexpression shared (1 multiplication: i*n):

    long inj = i*n + j;
    up    = val[inj - n];
    down  = val[inj + n];
    left  = val[inj - 1];
    right = val[inj + 1];
    sum = up + down + left + right;

    imulq  %rcx, %rsi          # i*n
    addq   %rdx, %rsi          # i*n+j
    movq   %rsi, %rax          # i*n+j
    subq   %rcx, %rax          # i*n+j-n
    leaq   (%rsi,%rcx), %rcx   # i*n+j+n
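
The fragments above can be wrapped into complete functions to check that both forms read the same four cells. This is a minimal sketch, assuming val is an n x n grid of double stored row-major and that (i, j) is an interior cell; the function names sum_neighbors_naive and sum_neighbors_shared are hypothetical and do not come from the slides.

```c
/* Assumption: val holds an n x n grid in row-major order and
   0 < i < n-1, 0 < j < n-1, so all four neighbors exist. */

/* Index of each neighbor is computed from scratch: i*n, (i-1)*n, (i+1)*n. */
double sum_neighbors_naive(const double *val, long i, long j, long n)
{
    double up    = val[(i-1)*n + j];
    double down  = val[(i+1)*n + j];
    double left  = val[i*n + j-1];
    double right = val[i*n + j+1];
    return up + down + left + right;
}

/* The shared subexpression i*n + j is computed once and reused. */
double sum_neighbors_shared(const double *val, long i, long j, long n)
{
    long inj = i*n + j;
    double up    = val[inj - n];
    double down  = val[inj + n];
    double left  = val[inj - 1];
    double right = val[inj + 1];
    return up + down + left + right;
}
```

On a 3 x 3 grid holding 0 through 8, both functions return 16 for the center cell (1 + 7 + 3 + 5).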

6
Procedure calls
• Procedure to convert a string to lower case

void lower(char *s)
{
    int i;
    for (i = 0; i < strlen(s); i++)
        if (s[i] >= 'A' && s[i] <= 'Z')
            s[i] -= ('A' - 'a');
}

[Figure: CPU seconds (0 to 200) versus string length (0 to 500,000) for lower]

7
Procedure calls
• Why?

/* My version of strlen */
size_t strlen(const char *s)
{
    size_t length = 0;
    while (*s != '\0') {
        s++;
        length++;
    }
    return length;
}

• Overall performance, string of length N
• N calls to strlen, one per evaluation of the loop test
• Each call scans the whole string, so the total time is proportional to N x N
• Overall O(N^2) performance

8
Procedure calls
• Improving performance
• Move call to strlen outside of loop
• Since result does not change from one iteration to another

void lower(char *s)
{
    int i;
    int len = strlen(s);
    for (i = 0; i < len; i++)
        if (s[i] >= 'A' && s[i] <= 'Z')
            s[i] -= ('A' - 'a');
}
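
To make the cost difference between the two versions of lower concrete, the sketch below counts how many characters strlen examines instead of measuring time. The names counted_strlen, lower_work_in_loop, and lower_work_hoisted are hypothetical instrumented variants written for illustration; they are not part of the slides or of the C library.

```c
#include <stddef.h>

/* Hypothetical instrumented strlen: adds the number of characters it
   examines (not counting the terminator) to *work. */
static size_t counted_strlen(const char *s, unsigned long *work)
{
    size_t length = 0;
    while (*s != '\0') {
        s++;
        length++;
        (*work)++;
    }
    return length;
}

/* Length is recomputed in every loop test.
   Returns the total number of characters scanned by counted_strlen. */
unsigned long lower_work_in_loop(char *s)
{
    unsigned long work = 0;
    for (size_t i = 0; i < counted_strlen(s, &work); i++)
        if (s[i] >= 'A' && s[i] <= 'Z')
            s[i] -= ('A' - 'a');
    return work;
}

/* Length is computed once, before the loop.
   Returns the total number of characters scanned by counted_strlen. */
unsigned long lower_work_hoisted(char *s)
{
    unsigned long work = 0;
    size_t len = counted_strlen(s, &work);
    for (size_t i = 0; i < len; i++)
        if (s[i] >= 'A' && s[i] <= 'Z')
            s[i] -= ('A' - 'a');
    return work;
}
```

For a string of length N, the loop test in the first version runs N + 1 times and each run scans N characters, so the count is N(N + 1); the hoisted version scans N characters once. For the 4-character string "AbCd" that is 20 versus 4.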

9
Procedure calls
[Figure: CPU seconds (0 to 200) versus string length (0 to 500,000)]

10
Memory matters
/* Sum rows of n X n matrix a
   and store in vector b */
void sum_rows1(double *a, double *b, long n) {
    long i, j;
    for (i = 0; i < n; i++) {
        b[i] = 0;
        for (j = 0; j < n; j++)
            b[i] += a[i*n + j];
    }
}

double A[9] =
  {  0,  1,   2,
     4,  8,  16,
    32, 64, 128 };

double *B = A+3;

sum_rows1(A, B, 3);

Value of B:
  init:  [4, 8, 16]
  i = 0: [3, 8, 16]
  i = 1: [3, 22, 16]
  i = 2: [3, 22, 224]

11
Memory matters
/* Sum rows of n X n matrix a
   and store in vector b */
void sum_rows1(double *a, double *b, long n) {
    long i, j;
    for (i = 0; i < n; i++) {
        b[i] = 0;
        for (j = 0; j < n; j++)
            b[i] += a[i*n + j];
    }
}

# sum_rows1 inner loop
.L53:
        addsd   (%rcx), %xmm0           # FP add
        addq    $8, %rcx
        decq    %rax
        movsd   %xmm0, (%rsi,%r8,8)     # FP store
        jne     .L53

In each inner-loop iteration there is a memory write, which is time consuming.
12
Memory matters
/* Sum rows of n X n matrix a
   and store in vector b */
void sum_rows2(double *a, double *b, long n) {
    long i, j;
    for (i = 0; i < n; i++) {
        double val = 0;
        for (j = 0; j < n; j++)
            val += a[i*n + j];
        b[i] = val;
    }
}

# sum_rows2 inner loop
.L66:
        addsd   (%rcx), %xmm0           # FP add
        addq    $8, %rcx
        decq    %rax
        jne     .L66

Removing memory writes in the inner loop
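
One reason a compiler may not make this change on its own is that a and b could overlap in memory, in which case the two functions are not equivalent. The sketch below repeats sum_rows1 and sum_rows2 from the slides so that it is self-contained; calling each with b pointing into a, as in the earlier B = A+3 example, is the assumed test scenario.

```c
/* Accumulates directly into b[i]: every inner iteration reads and
   writes memory, so writes to b are visible through a if they overlap. */
void sum_rows1(double *a, double *b, long n) {
    long i, j;
    for (i = 0; i < n; i++) {
        b[i] = 0;
        for (j = 0; j < n; j++)
            b[i] += a[i*n + j];
    }
}

/* Accumulates in a local variable and stores once per row: the row of a
   is read in full before b[i] is written. */
void sum_rows2(double *a, double *b, long n) {
    long i, j;
    for (i = 0; i < n; i++) {
        double val = 0;
        for (j = 0; j < n; j++)
            val += a[i*n + j];
        b[i] = val;
    }
}
```

With A = {0, 1, 2, 4, 8, 16, 32, 64, 128} and b = A+3, sum_rows1 leaves b as [3, 22, 224], while sum_rows2 leaves it as [3, 27, 224], because sum_rows1 reads back its own partial sums through a. When a and b do not overlap, both functions produce the same result.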

13
Embedded Computers
• Computers are becoming smaller and smaller …

14
Internet-of-Things
• Things are getting smarter

15
Internet-of-Things
• The vision: connecting everything

16
Internet-of-Things
• Application domains

17
Big Data and Cloud

18
Knowledge management
• Turning data into wisdom, learn from data

19
Artificial Intelligence
• Learning and making decisions from big data

20
Deep Neural Networks

21
Edge Computing

22
Thank You

23
