EE234 - Lec - 04


AVR XMEGA Microcontroller

Lecture 4

Assembly Language Programming

By Dr. Han-Way Huang
Minnesota State University, Mankato

02/25/2025 1
Shift and Rotate Instructions
• Useful in bit field manipulation
• Useful for bit field extraction

lsl rd ; bit 7 of rd is shifted to the C flag, 0 is shifted into bit 0
lsr rd ; 0 is shifted into bit 7, bit 0 is shifted to the C flag
rol rd ; C flag is shifted into bit 0, bit 7 is shifted to the C flag
ror rd ; C flag is shifted into bit 7, bit 0 is shifted to the C flag
asr rd ; arithmetic shift right: bit 7 keeps its value and is copied into bit 6, bit 0 is shifted to the C flag
swap rd ; upper 4 bits and lower 4 bits are swapped

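Example 4.1 Let [r0] = 0x9B and C = 0. What will be the contents of
r0 and C after the execution of each of the following instructions?
(a) lsl r0 (b) lsr r0 (c) ror r0 (d) rol r0 (e) asr r0 (f) swap r0
Solution: 0x9B = 1001 1011 in binary. Applying each instruction to these initial values gives
(a) lsl r0: r0 = 0x36, C = 1
(b) lsr r0: r0 = 0x4D, C = 1
(c) ror r0: r0 = 0x4D, C = 1
(d) rol r0: r0 = 0x36, C = 1
(e) asr r0: r0 = 0xCD, C = 1
(f) swap r0: r0 = 0xB9, C = 0 (swap does not affect the C flag)
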
Multiple-Byte Shift
• For a k-byte value stored at loc, loc+1, …, loc+k-1, the byte
at loc is the least significant byte, whereas the byte at loc+k-1
is the most significant byte.
• A right-shift operation should start at the most significant
byte (loc+k-1).
• A left-shift operation should start at the least significant byte
(loc).

To Shift Right
Step 1
Shift the byte at loc+k-1 to the right (lsr, or asr for a signed value).
Step 2
Rotate the byte at loc+k-2 to the right (ror), so that the bit shifted out of the previous byte enters through the C flag.
Step 3
Repeat Step 2 for the bytes at loc+k-3 down to loc.

To Shift Left
Step 1
Shift the byte at loc to the left (lsl).
Step 2
Rotate the byte at loc+1 to the left (rol).
Step 3
Rotate the remaining bytes to the left, one at a time, until reaching the byte at loc+k-1.

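The shift steps can be sketched for a 32-bit value as follows. This is an illustration rather than code from the lecture. It assumes the value is already loaded into r3:r2:r1:r0 with r0 as the least significant byte, which is the register layout Example 4.2 uses.

```asm
; Assumed layout (illustrative): r3:r2:r1:r0 holds a 32-bit value, r0 = least significant byte

; Shift left one place: start at the least significant byte
lsl r0 ; bit 7 of r0 goes to C, 0 enters bit 0
rol r1 ; C enters bit 0 of r1, bit 7 of r1 goes to C
rol r2 ; "
rol r3 ; bit 7 of the most significant byte ends up in C

; Signed shift right one place: start at the most significant byte
asr r3 ; sign bit is preserved, bit 0 of r3 goes to C
ror r2 ; C enters bit 7 of r2, bit 0 of r2 goes to C
ror r1 ; "
ror r0 ; bit 0 of the least significant byte ends up in C
```

Using asr on the first byte keeps the sign of a two's complement number; replacing it with lsr gives the unsigned shift.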
Example 4.2 Write a program to shift the 32-bit number stored at
0x2000~0x2003 in data memory to the right four places.
Solution:
.include <atxmega128A1def.inc>
.cseg
.def lpCnt = r19
.org 0x00
jmp start
.org 0xF6
start: ldi lpCnt, 4 ; shift four places
loop: lds r0, 0x2000 ; least significant byte
lds r1, 0x2001
lds r2, 0x2002
lds r3, 0x2003 ; most significant byte
sloop: lsr r3 ; most significant byte
ror r2 ; second most significant byte
ror r1 ; second least significant byte
ror r0 ; least significant byte
dec lpCnt
brne sloop
sts 0x2000, r0
sts 0x2001, r1
sts 0x2002, r2
sts 0x2003, r3
here: rjmp here

Boolean Instructions
Table 3.8 A summary of AVR Boolean instructions
Mnemonics     Description                  Operation
and Rd, Rr    Logical AND                  Rd ← [Rd] AND [Rr]
andi Rd, k    Logical AND with immediate   Rd ← [Rd] AND k
or Rd, Rr     Logical OR                   Rd ← [Rd] OR [Rr]
ori Rd, k     Logical OR with immediate    Rd ← [Rd] OR k
eor Rd, Rr    Exclusive OR                 Rd ← [Rd] XOR [Rr]
com Rd        One's complement             Rd ← 0xFF – [Rd]
neg Rd        Two's complement             Rd ← 0x00 – [Rd]
sbr Rd, k     Set bit(s) in register       Rd ← [Rd] OR k
cbr Rd, k     Clear bit(s) in register     Rd ← [Rd] AND (0xFF – k)
tst Rd        Test for zero or minus       Rd ← [Rd] AND [Rd]
clr Rd        Clear register               Rd ← [Rd] XOR [Rd]
ser Rd        Set register                 Rd ← 0xFF

Applications of Boolean Instructions
Clear a few bits of a register
andi r16, 0xF0 ; clear the lower 4 bits
Set a few bits of a register to 1
ori r16, 0x44 ; set bits 6 and 2 of r16 to 1
Toggle a few bits of a register
ldi r17, 0xCC
eor r16, r17 ; toggle bits 7, 6, 3, & 2 in r16
Find the One’s Complement of a Register
com r16
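Bit field extraction combines these instructions with swap. The sketch below is illustrative only; the choice of field (bits 6..4) and of r17 as the destination are assumptions made for the example.

```asm
; Illustrative only: extract bits 6..4 of r16 into r17 as a 3-bit number (0..7)
mov r17, r16 ; work on a copy so that r16 is preserved
swap r17 ; bits 7..4 move to bits 3..0, so the field is now in bits 2..0
andi r17, 0x07 ; keep the 3-bit field and clear all other bits
```

For example, r16 = 0xB5 (1011 0101) gives r17 = 0x03.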

Create Time Delay Using Program Loops
• Instruction execution takes time.
• A time delay can be created by executing an appropriate number
of instructions.
Method:
Step 1
Select a sequence of instructions that takes a known number of
CPU clock cycles to execute.
Step 2
Repeat the chosen instruction sequence an appropriate
number of times.
ldi r21, 250
loop0: push r0 ; 2 CPU clock cycles
pop r0 ; 2 CPU cycles
push r0
pop r0
push r0
pop r0
push r0
pop r0
push r0
pop r0
push r0
pop r0
push r0
pop r0
nop ; 1 CPU clock cycle
dec r21 ; 1 CPU clock cycle
brne loop0 ; 2 (1) cycle when branch is taken (not taken)

The instruction sequence on the previous page can be shortened to
ldi r21, 250
loop0: ldi r20, 4 ; 1 CPU clock cycle
loopi: push r0 ; 2 CPU clock cycles
pop r0 ; 2 CPU clock cycles
dec r20 ; 1 CPU clock cycle
brne loopi ; 2 (1) cycles when branch is taken (not taken)
dec r21 ; 1 cycle
brne loop0 ; 2 (1) cycles when branch is taken (not taken)

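With 250 loaded into r21, the loop above creates a delay of approximately 0.25 ms,
assuming that the CPU clock is 32 MHz. With the cycle counts annotated above, each pass through the outer loop takes 4 × 7 – 1 + 4 = 31 cycles, close to the 32 cycles (1 µs) per pass of the longer sequence.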
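If a pass of exactly 32 cycles is wanted, one nop can be added to the outer loop. The variant below is a sketch, not part of the original lecture. It takes the cycle counts from the slide annotations, which should be checked against the instruction set manual for the target device before the timing is relied upon.

```asm
; Sketch only: the shortened loop with one nop added so that each outer pass takes 32 cycles
; (cycle counts are the ones annotated on the slides)
ldi r21, 250 ; 1 cycle, executed once
loop0: ldi r20, 4 ; 1 cycle
loopi: push r0 ; 2 cycles
pop r0 ; 2 cycles
dec r20 ; 1 cycle
brne loopi ; 2 cycles taken, 1 not taken: inner loop = 4*7 - 1 = 27 cycles
nop ; 1 cycle (added padding)
dec r21 ; 1 cycle
brne loop0 ; 2 cycles taken: 1 + 27 + 1 + 1 + 2 = 32 cycles per pass
; 250 passes * 32 cycles = 8000 cycles = 250 us at 32 MHz
; (the final pass is 1 cycle shorter, which offsets the initial ldi)
```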

Instruction Sequence that Creates a 50-ms Delay:
ldi r17, 200 ; 200 × 0.25 ms = 50 ms
loop1: ldi r21, 250
loop0: ldi r20, 4 ; 1 CPU clock cycle
loopi: push r0 ; 2 CPU clock cycles
pop r0 ; 2 CPU clock cycles
dec r20 ; 1 CPU clock cycle
brne loopi ; 2 (1) cycles when branch is taken (not taken)
dec r21 ; 1 cycle
brne loop0 ; 2 (1) cycles when branch is taken (not taken)
dec r17
brne loop1

Creating Longer Delay
• Use multi-layer program loops.
• An instruction sequence that creates a delay of 1 s (20 × 50 ms) is as follows:
ldi r18,20
loop2: ldi r17, 200
loop1: ldi r21, 250
loop0: ldi r20, 4
loopi: push r0
pop r0
dec r20
brne loopi
dec r21
brne loop0
dec r17
brne loop1
dec r18
brne loop2

Stack Data Structure
• Elements can only be accessed from the top.
• Add a new element to the stack by pushing.
• Remove an element from the stack by pulling (or popping).
• Has a pointer that points either to the top element or to the
location above the top element (the next free byte; this is the AVR convention).
Set Up the Stack Pointer
ldi r16, low(RAMEND)
out CPU_SPL, r16
ldi r16, high(RAMEND)
out CPU_SPH, r16

Instructions for Stack Operation
pop rd ; SP ← [SP] + 1; rd ← mem([SP])
push rd ; mem([SP]) ← [rd]; SP ← [SP] – 1
Use of Stack
• Store the return address of a subroutine call or an interrupt service.
• Temporary storage
• Store local variables for subroutine execution
• Holding place for the return values of a subroutine call

A Simple Subroutine
; --------------------------------------------------------------------------------------------
; This subroutine swaps the contents of r16 & r17
; --------------------------------------------------------------------------------------------
swapRegs: push r16
mov r16, r17
pop r17
ret

Subroutine to Generate a Delay of 250 µs

delay250us:
ldi r21, 250
loopo: ldi r20, 4
loopi: push r0
pop r0
dec r20
brne loopi
dec r21
brne loopo
ret

Flexibility can be added to this subroutine to make it more useful: the number of 250-µs units can be passed in as a parameter.

; Creates a delay of r16 × 250 µs; the multiple is passed in r16
delayby250us:
ldi r21, 250
loopo: ldi r20, 4
loopi: push r0
pop r0
dec r20
brne loopi
dec r21
brne loopo
dec r16
brne delayby250us
ret
How to Call (to create 50-ms delay)
ldi r16, 200
call delayby250us
; Creates a delay of r16 × 50 ms; the multiple is passed in r16
delayby50ms:
ldi r17, 200
loop3: ldi r21, 250
loop2: ldi r20, 4
loop1: push r0
pop r0
dec r20
brne loop1
dec r21
brne loop2
dec r17
brne loop3
dec r16
brne delayby50ms
ret

Issues Related to Subroutine Call
• Parameter passing
• Local variable allocation and deallocation
• Result returning

Parameter Passing
• Using CPU registers (r0~r31)
• Using the stack
• Using global memory

Local Variable Allocation & Deallocation
• Temporary variables and results are needed for the execution
of a subroutine.
• Temporary variables and results are useful only during the
execution of the subroutine.
• They should be allocated in the stack for easy allocation and
deallocation.
• When its local variables are allocated in the stack, a subroutine can be made
reentrant (it can call itself).
• For the AVR, they are best allocated in CPU registers. Why?

Allocating k Bytes in Stack for Local Variables
in YL, CPU_SPL ; transfer SP to Y
in YH, CPU_SPH ; "
sbiw YL, k ; k must be in the range 0~63
out CPU_SPL, YL
out CPU_SPH, YH
Made into a Macro
.macro allocStk
in YL, CPU_SPL
in YH, CPU_SPH
sbiw YL, @0
out CPU_SPL, YL
out CPU_SPH, YH
.endmacro
How to Call?
allocStk k ; allocate k bytes in the stack

Macro for Deallocating Local Variables in Stack
.macro deallocStk
in YL, CPU_SPL
in YH, CPU_SPH
adiw YL, @0
out CPU_SPL, YL
out CPU_SPH, YH
.endmacro

How to Invoke?
deallocStk k ; call to deallocate k bytes in stack

AVR Stack Frame

How to Return Results?
• Returned in registers, stack, or global memory
• Best returned in registers
• If results are to be returned in the stack, the stack slot that holds the
result should be allocated by the caller.

Accessing Local Variables in Stack

To Read locVark:
in YL, CPU_SPL
in YH, CPU_SPH
ldd rj, Y+k

To Write into locVark:
in YL, CPU_SPL
in YH, CPU_SPH
std Y+k, rj

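Allocation, access, and deallocation can be put together in one subroutine. The sketch below is a hypothetical example: the name useLocal and the offsets locLo and locHi are made up for illustration, and the parameter (r17:r16) and result (r23:r22) registers are chosen only to show one write and one read of a stack-resident local.

```asm
; Hypothetical subroutine (names are illustrative): keeps a 16-bit local variable on the stack
.equ locLo = 1 ; Y+1 is the first allocated byte because SP points to the next free byte
.equ locHi = 2
useLocal: push YH ; save Y, which is used as the frame pointer
push YL
in YL, CPU_SPL ; allocate 2 bytes: SP <- SP - 2
in YH, CPU_SPH
sbiw YL, 2
out CPU_SPL, YL
out CPU_SPH, YH ; Y now equals SP
std Y+locLo, r16 ; write the local variable
std Y+locHi, r17
ldd r22, Y+locLo ; read it back as the return value
ldd r23, Y+locHi
adiw YL, 2 ; deallocate: SP <- SP + 2
out CPU_SPL, YL
out CPU_SPH, YH
pop YL ; restore Y
pop YH
ret
```

Because Y is loaded from SP once, right after allocation, the local variables keep fixed offsets from Y even if the subroutine pushes more data later.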
Register Usage Convention
• Both the subroutine and the caller need to use registers, so
interference might exist.
• Interference must be avoided to ensure the correct execution of the
program.
• A recommendation for the use of AVR registers is given in Table 5.1.
Table 5.1 Recommendation for register usage
Name                        Usage
r4~r7, r12~r15, r28, r29    Callee saved
r0~r3, r8~r11, r20, r21     Caller saved
r16~r19, r30, r31           Parameter passing
r22~r27                     Result returning

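A subroutine that follows Table 5.1 saves any callee-saved register before using it. The example below is a made-up illustration: the name sum3 and its behavior (adding three byte parameters) are assumptions chosen only to show the push and pop pair around r4.

```asm
; Hypothetical subroutine: returns r16 + r17 + r18 (modulo 256) in r22, using r4 as scratch
sum3: push r4 ; r4 is callee saved, so preserve the caller's value
mov r4, r16 ; parameters arrive in r16~r18
add r4, r17
add r4, r18
mov r22, r4 ; the result is returned in r22
pop r4 ; restore r4 before returning
ret
```

A caller-saved register such as r20 could be used as scratch without the push and pop, at the cost of the caller having to save it when its value is still needed.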
Instructions for Making Subroutine Call

Table 5.2 Subroutine call instructions

Instruction   Operation
call k        PC ← k; stack ← PC + 2;
              SP ← SP – 2 (devices with 16-bit PC); SP ← SP – 3 (devices with 22-bit PC)
eicall        PC(15:0) ← Z(15:0); PC(21:16) ← EIND; stack ← PC + 1; SP ← SP – 3
icall         PC(15:0) ← Z(15:0); PC(21:16) ← 0 (devices with 22-bit PC); stack ← PC + 1;
              SP ← SP – 2 (devices with 16-bit PC); SP ← SP – 3 (devices with 22-bit PC)
rcall k       PC ← PC + k + 1; stack ← PC + 1;
              SP ← SP – 2 (devices with 16-bit PC); SP ← SP – 3 (devices with 22-bit PC)
ret           PC(15:0) ← stack (devices with 16-bit PC); SP ← SP + 2
              PC(21:0) ← stack (devices with 22-bit PC); SP ← SP + 3

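An indirect call through the Z pointer might look like the following. This is a sketch only; it reuses the swapRegs subroutine shown earlier as the target and assumes the target lies within the first 64K words of program memory.

```asm
; Illustrative: call the subroutine whose address is held in Z
ldi ZL, low(swapRegs) ; program labels are word addresses, which is what icall expects
ldi ZH, high(swapRegs) ; "
icall ; PC(15:0) <- Z; the return address is pushed onto the stack
```

Indirect calls are useful when the target is selected at run time, for example from a table of subroutine addresses.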
A Few Examples of Subroutines
; Creates a fixed delay of 50 ms
delay50ms:
ldi r19, 200
loop3: ldi r21, 250
loop2: ldi r20, 4
loop1: push r0
pop r0
dec r20
brne loop1
dec r21
brne loop2
dec r19
brne loop3
ret

; Creates a delay of r16 × 50 ms
delayby50ms:
ldi r19, 200
loop3: ldi r21, 250
loop2: ldi r20, 4
loop1: push r0
pop r0
dec r20
brne loop1
dec r21
brne loop2
dec r19
brne loop3
dec r16
brne delayby50ms
ret

How to Call (to create a 1-s delay)
ldi r16, 20
call delayby50ms ; create 1 s delay

Subroutine to Multiply two 16-bit Unsigned Integers
P and Q are two 16-bit unsigned integers with high bytes P_H, Q_H and low bytes P_L, Q_L:
$P = P_H \times 2^8 + P_L$
$Q = Q_H \times 2^8 + Q_L$

$P \times Q = P_H Q_H \times 2^{16} + (P_H Q_L + Q_H P_L) \times 2^8 + P_L Q_L$

[Figure 5.4 16-bit by 16-bit multiplication: the partial product PLQL is added at PR~PR+1, PHQL and PLQH at PR+1~PR+2, and PHQH at PR+2~PR+3 to form the final product P × Q (msb at PR+3, lsb at PR).]

Subroutine to Multiply two 16-bit Unsigned Integers
Incoming arguments: r17:r16 & r19:r18
Result returned in: r22~r25 (lsb to msb)

; ------------------------------------------------------------------
; The first and second numbers are passed in r17:r16 & r19:r18.
; The product is returned in r25..r22.
; ------------------------------------------------------------------
.def p1 = r22 ; lsb of the product
.def p2 = r23
.def p3 = r24
.def p4 = r25 ; msb of the product
.def PH = r17 ; high byte of multiplicand
.def PL = r16 ; low byte of multiplicand
.def QH = r19 ; high byte of multiplier
.def QL = r18 ; low byte of multiplier
.def zero = r2 ; holds 0 (register choice assumed; the definition is not shown on the slide)
mul16U: mul PL, QL
movw p1, r0 ; (p2:p1) <-- QL x PL
mul PH, QH
movw p3, r0 ; (p4:p3) <-- QH x PH
mul PL, QH ; compute PL x QH
add p2, r0 ; add partial product to p3:p2
adc p3, r1 ; "
clr zero ; clr does not change the C flag
adc p4, zero ; add carry to p4
mul PH, QL ; compute PH x QL
add p2, r0 ; add partial product to p3:p2
adc p3, r1 ; "
adc p4, zero ; add carry to p4
ret

Write an instruction sequence to multiply the 16-bit numbers stored
at data memory 0x2000~0x2001 & 0x2010~0x2011 and store the
product at data memory 0x2020~0x2023.
Solution:
lds r16, 0x2000
lds r17, 0x2001
lds r18, 0x2010
lds r19, 0x2011
call mul16U
sts 0x2020, r22
sts 0x2021, r23
sts 0x2022, r24
sts 0x2023, r25

Algorithm for Multiplying two Signed 16-bit Numbers
Step 1
Multiply the two operands as unsigned numbers, disregarding the sign.
Step 2
If op1 is negative, subtract op2 from the upper half (upper 16 bits) of the
product.
Step 3
If op2 is negative, subtract op1 from the upper half of the
product.

Signed 16-bit Multiplication Subroutine
• Incoming arguments
  ◦ Multiplicand: r17:r16
  ◦ Multiplier: r19:r18
• Result
  ◦ Product: r22~r25

mul16s: call mul16U
sbrs r17, 7 ; check the sign of the first number
rjmp chk2 ; first number is not negative, check 2nd number
sub r24, r18 ; subtract the second number from the upper half of the product
sbc r25, r19 ; "
chk2: sbrs r19, 7 ; check the sign of the second number
rjmp doneMs ; second number is not negative, prepare to return
sub r24, r16 ; subtract the first number from the upper half of the product
sbc r25, r17 ; "
doneMs: ret
.include "mul16U.asm"

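The correction steps can be traced with a small case. The calling sequence below is illustrative; the operand values (-2 and 3) are chosen only for the trace, and the comments follow the three steps of the signed multiplication algorithm.

```asm
; Illustrative call: (-2) x 3 using mul16s
ldi r16, low(0xFFFE) ; multiplicand = -2 (0xFFFE in two's complement)
ldi r17, high(0xFFFE) ; "
ldi r18, low(3) ; multiplier = 3
ldi r19, high(3) ; "
call mul16s
; Step 1: unsigned product 0xFFFE x 0x0003 = 0x0002FFFA
; Step 2: first operand is negative, so upper half = 0x0002 - 0x0003 = 0xFFFF
; Step 3: second operand is not negative, so no further correction
; result in r25..r22 = 0xFFFFFFFA = -6
```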
Writing Subroutine to Perform Unsigned Division
• The shift-and-subtract method is often used to carry out the division.

[Figure A16. The shift-and-subtract divider hardware: the R register and Q register (msb to lsb) shift left as a pair; a controller and an ALU subtract the P register from R, write R, and set the LSB of Q.]

16-bit Unsigned Division Algorithm
Step 1
Load the divisor, the dividend, and 0 into the P, Q, and R registers, respectively. lpCnt ← 16.
Step 2
Shift R:Q to the left one place.
Step 3
Subtract P from R and place the difference back in R if the
difference is non-negative.
Step 4
Set bit 0 of Q to 1 if the difference computed in Step 3 is non-negative.
Otherwise, set it to 0.
Step 5
lpCnt ← lpCnt – 1; if (lpCnt > 0) go to Step 2; else stop.

; ------------------------------------------------------------------
; Dividend and divisor are passed in r17:r16 & r19:r18, respectively.
; Quotient is returned in r25:r24; remainder is returned in r23:r22.
; ------------------------------------------------------------------
.def lpCnt = r20
.def RL = r22
.def RH = r23
.def QL = r24
.def QH = r25
.def PL = r18
.def PH = r19
div16U: ldi lpCnt, 16 ; set up loop count
movw QL, r16 ; load dividend into Q register
clr RL ; initialize R register to 0
clr RH ; "
dvloop: lsl QL ; shift R:Q to left one place
rol QH ; "
rol RL ; "
rol RH ; "
cp RL, PL ; compare R with P (unsigned)
cpc RH, PH ; "
brlo nxtb ; R < P: leave bit 0 of Q at 0
ori QL, 0x01 ; set bit 0 of Q to 1
sub RL, PL ; put difference in R
sbc RH, PH ; "
nxtb: dec lpCnt
brne dvloop
ret

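A call to div16U might look like the following. The operand values are chosen only for illustration; the register assignments follow the subroutine header.

```asm
; Illustrative call: 1000 / 7 using div16U
ldi r16, low(1000) ; dividend in r17:r16
ldi r17, high(1000) ; "
ldi r18, low(7) ; divisor in r19:r18
ldi r19, high(7) ; "
call div16U
; 1000 = 142 x 7 + 6
; quotient r25:r24 = 142 (0x008E), remainder r23:r22 = 6
```

Note that div16U leaves the dividend in r17:r16 unchanged but overwrites r20 (lpCnt), so a caller should not keep a live value in r20 across the call.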
Example. Write a program to find all elements in an array of 16-bit numbers that are divisible
by 5 and save them in data memory starting from 0x2000. The array has 30 elements.
.include <atxmega128A1def.inc>
.def aCnt = r21 ; array loop count (r20 is used as lpCnt inside div16U)
.dseg
.org 0x2000
result: .byte 60 ; room for up to 30 16-bit results
.cseg
.org 0x00
jmp start
.org 0xF6
start: ldi r16, low(RAMEND)
out CPU_SPL, r16
ldi r16, high(RAMEND)
out CPU_SPH, r16
call setCPUClkto32Mwith32MIntOsc
ldi ZL, low(array<<1)
ldi ZH, high(array<<1)
ldi YL, low(result)
ldi YH, high(result)
ldi aCnt, 30
loop: lpm r16, Z+
lpm r17, Z+
ldi r18, 5
ldi r19, 0
call div16U
cpi r22, 0 ; is the remainder 0? (the remainder is less than 5, so r23 is 0)
brne false
st Y+, r16
st Y+, r17
false: dec aCnt
brne loop
here: jmp here
.include "sysclk_xmega.asm"
.include "div16U.asm"
array: .dw 1234, 2345, 3456, 4567, 5678
.dw 1122, 2233, 3344, 4455, 5566 ; only 10 of the 30 elements are listed here

Convert an Internal Binary Number into a BCD String
xx: number to be converted
quo: quotient of a division
rem: remainder of a division
ptr: pointer to the buffer to store the resultant string

Step 1
Push 0 (the NULL terminator) into the stack.
Step 2
quo ← xx / 10; rem ← xx mod 10;
Step 3
Stack ← rem + 0x30; (push the ASCII code of the digit)
Step 4
If quo ≠ 0, xx ← quo and go to Step 2; else, go to the next step.
Step 5
Pop the string out of the stack and store it in the buffer.

Example 6.4 Write a subroutine that converts a 16-bit signed binary
number held in r17:r16 into an ASCII BCD string and stores the string
in a buffer pointed to by the Z pointer.
Solution:
Parameters Passed:
• 16-bit number to be converted: in r17:r16
• Pointer to the buffer that holds the converted string: in Z

.def sign = r12
.equ NULL = 0
bin2BCD: clr sign
push sign ; push NULL to mark the end of the string
sbrs r17,7 ; check the sign of the number
rjmp normal
inc sign ; indicate sign is negative
com r16 ; find the magnitude of the given number
com r17 ; "
movw r24,r16 ; "
adiw r24,1 ; "
movw r16,r24 ; "
normal: ldi r18,10 ; set divisor to 10
clr r19 ; "
call div16U
ldi r26,0x30
add r22,r26 ; convert remainder digit to ASCII code
push r22
sbiw r24,0 ; is quotient 0?
breq popStk ; quotient is 0, pop string out of stack
movw r16,r24 ; quotient is not 0, continue to divide
rjmp normal ; loop
popStk: sbrs sign,0 ; check the sign of the number
rjmp popLp ; do nothing if positive
ldi r17,'-' ; push a minus sign into stack
push r17
popLp: pop r17
cpi r17,NULL ; is it a NULL character?
breq exit
st Z+,r17 ; store the character in the buffer
rjmp popLp
exit: st Z,r17 ; terminate the string with a NULL character
ret

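A possible calling sequence for bin2BCD is sketched below. The buffer name strBuf is hypothetical; it is assumed to be a 7-byte area reserved in data memory (for example with strBuf: .byte 7), which is enough for a sign, five digits, and the NULL terminator.

```asm
; Illustrative call: convert -1234 to the string "-1234"
; (strBuf is a hypothetical 7-byte buffer in data memory)
ldi r16, low(0xFB2E) ; -1234 = 0xFB2E in 16-bit two's complement
ldi r17, high(0xFB2E) ; "
ldi ZL, low(strBuf) ; Z points to the buffer
ldi ZH, high(strBuf) ; "
call bin2BCD
; digits are pushed as '4', '3', '2', '1', then '-', and popped in reverse order
; strBuf now holds '-', '1', '2', '3', '4', NULL
```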
Example 6.5 Write a program to find all 4-digit decimal numbers
that have the following property:
The sum of the square of the upper half and the square of the lower half of the
given number equals the original number. For example, 1233 = 12^2 + 33^2.

.include <atxmega128A1def.inc>
.def kL = r4 ; test number (start from 1000)
.def kH = r5 ; "
.def TOPL = r6 ; register to hold upper bound
.def TOPH = r7 ; (10000)
.dseg
.org 0x2000
buf: .byte 20
.cseg
.org 0x00
jmp start
.org 0xF6

start: ldi r16,low(RAMEND) ; set up stack pointer
out CPU_SPL,r16 ; "
ldi r17,high(RAMEND) ; "
out CPU_SPH,r17 ; "
call setCPUClkto32Mwith32MIntOsc
ldi r16,low(1000)
ldi r17,high(1000)
movw kL,r16 ; transfer 1000 to kL:kH
ldi r16,low(10000)
ldi r17,high(10000)
movw TOPL,r16 ; place 10000 in TOPL:TOPH
ldi ZL,0 ; use Z as buffer pointer
ldi ZH,0x20 ; "
floop: movw r16,kL ; pass number to be tested in r16:r17
call test
cpi r22,1 ; does test subroutine return 1?
brne next ; no, check next numbers
st Z+,kL ; save the number
st Z+,kH ; "
next: movw r28,kL ; increment kL:kH by 1
adiw r28,1 ; "
movw kL,r28 ; "
cp kL,TOPL
brne floop
cp kH,TOPH
brne floop
done: jmp done

test: ldi r18,100
clr r19
call div16U
mul r22, r22 ; compute rem * rem
movw r22, r0 ; transfer back to r22:r23
mul r24, r24 ; compute quo * quo
add r0, r22 ; compute rem*rem + quo*quo
adc r1, r23 ; "
cp r0, r16 ; compare with r16:r17
brne false ; branch if low bytes are not equal
cp r1, r17 ; compare high bytes
brne false
ldi r22, 1 ; returns 1 if the given number has the property
ret
false: clr r22
ret
.include "div16U.asm"
.include "sysClock_xmega.asm"

Finding the Square Root (Successive Approximation Method)
SAR: successive approximation register
mask: mask to set a bit in an 8-bit register to 1
lpCnt: loop count
temp: temporary variable

[Figure 5.5 Successive-approximation method for finding square root: start with SAR[n-1, ..., 0] ← 0 and i ← n-1; set SAR[i] ← 1; if SAR * SAR > num, clear SAR[i] ← 0; if i = 0, stop; otherwise i ← i-1 and repeat.]

Algorithm
Step 1
sar ← 0; mask ← 0x80; lpCnt ← 8; temp ← 0
Step 2
temp ← sar OR mask // guess bit i is 1
Step 3
If ((temp * temp) ≤ num) sar ← temp;
Step 4
mask ← mask SRL 1 (shift right logically one place);
Step 5
lpCnt ← lpCnt – 1
Step 6
If (lpCnt == 0) stop; else go to Step 2.

Drawback of the Successive Approximation Algorithm
• Too pessimistic: the computed square root is never larger than the true square root, so it tends to be too small.
• The better approximation may be [SAR] + 1 instead of [SAR].
• Need to compare ([SAR]+1)^2 – num and num – [SAR]^2.

Example 6.6 Write a subroutine that finds the square root of a
16-bit number. The 16-bit number whose square root is to be
found is passed in r17:r16 and the square root is returned in r22.

.def mask = r20
.def sar = r22
.def tmp = r1
.def lpcnt = r21
.def qL = r16
.def qH = r17
SqRoot16: ldi mask, 0x80
clr sar
ldi lpcnt, 8
sqLoop1: mov tmp, sar ; make a guess of a sar bit
or tmp, mask ; "
mul tmp, tmp ; compute sar * sar
cp qL, r0 ; compare sar * sar with q
cpc qH, r1 ; "
brlo nextb
or sar, mask ; keep the guess
nextb: lsr mask
dec lpcnt
brne sqLoop1

; ------------------------------------------------------------------
; Find out whether sar or sar+1 is closer to the true square root by checking
; whether q – sar*sar or (sar+1)^2 – q is smaller.
; ------------------------------------------------------------------
mul sar, sar ; compute D1 = q - sar * sar
movw r8, qL ; "
sub r8, r0 ; "
sbc r9, r1 ; "
mov r0, sar ; compute D2 = (sar+1)*(sar+1) - q
inc r0 ; "
mul r0, r0 ; "
sub r0, qL ; "
sbc r1, qH ; "
cp r8, r0 ; compare D1 with D2
cpc r9, r1 ; "
brlo selSar ; choose sar if D1 < D2
inc sar ; otherwise choose sar + 1
selSar: ret

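The rounding step can be followed with a concrete value. The call below is illustrative; the input 5000 is one of the values in the array of Example 6.7, and the trace in the comments assumes the subroutine returns right after the final adjustment of sar.

```asm
; Illustrative call: square root of 5000 (the true value is about 70.71)
ldi r16, low(5000) ; number in r17:r16
ldi r17, high(5000) ; "
call SqRoot16
; the successive-approximation loop ends with sar = 70 (70*70 = 4900 <= 5000 < 71*71 = 5041)
; D1 = 5000 - 4900 = 100, D2 = 5041 - 5000 = 41
; D1 is not lower than D2, so sar is incremented
; result: r22 = 71
```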
Example 6.7 Write a program to find the square roots of the elements of an array of 16-bit
numbers.
Solution:
.include <atxmega128A1Udef.inc>
.def llCnt = r25
.cseg
.org 0x00
jmp start
.org 0xF6
start: ldi r16, low(RAMEND)
out CPU_SPL, r16
ldi r16, high(RAMEND)
out CPU_SPH, r16
call setCPUClkto32Mwith32MIntOsc
ldi llCnt, 12
ldi YL,0 ; Y points to SRAM (0x2000)
ldi YH,0x20 ; "
ldi ZL,low(array<<1) ; Z points to array
ldi ZH,high(array<<1) ; "
mloop: lpm r16,Z+ ; fetch the next 16-bit number
lpm r17,Z+ ; "
call SqRoot16
st Y+, r22
dec llCnt
brne mloop
again: jmp again
array: .dw 1234,2345,3456,4567,5678,6789
.dw 1601,2026,2506,3509,3600,5000

Bubble Sort
• Go through the array for up to N – 1 iterations, where N is the
number of array or file elements.
• In each iteration, compare each adjacent pair of elements
from the lowest toward the highest array index. Swap the
adjacent elements if they are not in order.
• The sorting efficiency can be improved by keeping track of
whether any swapping has been done in an iteration. If no
swapping has been done, the array is sorted and the algorithm should stop.

[Figure 5.9 Logic flow of bubble sort: iteration ← N - 1; at the start of each iteration, sorted ← 1, inner ← iteration, i ← 0; if array[i] > array[i+1], swap array[i] & array[i+1] and set sorted ← 0; then inner ← inner - 1 and i ← i + 1, repeating until inner = 0; stop if sorted = 1; otherwise iteration ← iteration - 1 and repeat until iteration = 0.]

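Stack frame of bubble sort subroutine

[Figure 5.10 Stack frame for bubble sort subroutine: SP points to the next free byte; above it are the three local variables sorted, iCnt, and eCnt (offsets 1 to 3 from SP), then the saved r28 and r29, then ret_addr.]
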
.include <atxmega128A1Udef.inc>
.equ SPL = CPU_SPL ; comment out for MEGA devices
.equ SPH = CPU_SPH ; "
.macro allocStk ; this macro allocates space for local variables
in r28,SPL
in r29,SPH
sbiw r28,@0
out SPL,r28
out SPH,r29
.endmacro
.macro deallocStk ; this macro deallocates space used by local variables
in r28,SPL
in r29,SPH
adiw r28,@0
out SPL,r28
out SPH,r29
.endmacro

.equ NN = 30
.equ sorted = 1 ; offsets of the local variables from Y (values assumed from Figure 5.10)
.equ iCnt = 2 ; "
.equ eCnt = 3 ; "
.equ true = 1 ; value of the sorted flag when the array is in order
.def lpcnt = r21
.dseg
.org 0x2000
array: .byte 40
.cseg
.org 0x00
rjmp start
.org 0xF6
start: ldi r20,low(RAMEND) ; initialize stack pointer
out SPL,r20 ; "
ldi r20,high(RAMEND) ; "
out SPH,r20 ; "
ldi ZL,low(xarr<<1) ; set up pointer to the array in program memory
ldi ZH,high(xarr<<1) ; "
ldi XL,low(array) ; set up pointer to the buffer array in data memory
ldi XH,high(array) ; "
ldi lpcnt,NN ; set up loop count
cLoop: lpm r0,Z+ ; copy the array from program memory to data memory
st X+,r0 ; so that it can be sorted
dec lpcnt ; "
brne cLoop ; "
ldi r16,low(array) ; pass array pointer
ldi r17,high(array) ; "
ldi r18,NN ; pass array count
call bubble
again: rjmp again
; ------------------------------------------------------------------
; The next subroutine uses the bubble sort algorithm to sort an array in data memory.
; The array count is passed in r18 and the pointer to the array is passed in r17:r16.
; All array elements are nonnegative (unsigned 8-bit values).
; ------------------------------------------------------------------
bubble: push YH
push YL
allocStk 3 ; allocate 3 bytes for local variables
in YL, SPL ; set Y point to the byte above the top of stack
in YH, SPH ; "
dec r18 ; initialize iteration count to NN - 1
std Y+eCnt, r18 ; "
eLoop: ldd r18,Y+eCnt ; set up inner loop count
std Y+iCnt, r18 ; "
movw ZL, r16 ; place array base address in Z
ldi r20, 1 ; set flag to indicate array is sorted
std Y+sorted, r20 ; "
iloop: ld r8, z ; fetch element array[i]
ldd r9, z+1 ; fetch element array[i+1]
cp r8, r9 ; compare array[i] with array[i+1]
brlo next

st Z, r9 ; swap array[i] with array[i+1]
std Z+1, r8 ; "
clr r20 ; indicate array not sorted
std Y+sorted, r20 ; "
next: adiw ZL, 1 ; increment array pointer by 1
ldd r20,Y+iCnt ; decrement inner loop count
dec r20 ; "
std Y+iCnt, r20 ; "
brne iloop ; continue if inner loop count is not 0
; at the end of an iteration
ldd r20,Y+sorted ; check array sorted flag
cpi r20,true ; "
breq done ; stop if sorted flag is true (1)
ldd r20,Y+eCnt ; decrement iteration loop count
dec r20 ; "
std Y+eCnt,r20 ; "
brne eLoop

done: deallocStk 3
pop YL
pop YH
ret
xarr: .db 12,91,20,33,45,72,24,19,17,101
.db 11,92,21,34,44,71,25,18,16,131
.db 41,43,49,50,99,79,89,98,37,59
// End of program
