How To Write A Bootloader From Scratch
Previously, we wrote a startup file to bootstrap our C environment, and a linker script
to get the right data at the right addresses. These two will allow us to write a
monolithic firmware which we can load and run on our microcontrollers.
In practice, this is not how most firmware is structured. Digging through vendor SDKs,
you’ll notice that they all recommend using a bootloader to load your applications. A
bootloader is a small program which is responsible for loading and starting your
application.
In this post, we will explain why you may want a bootloader, how to implement one,
and cover a few advanced techniques you may use to make your bootloader more
useful.
If you’d rather listen to me present this information and see some demos in action,
watch this webinar recording.
Table of Contents
Why you may need a bootloader
A minimal bootloader
Setting the stage
Deciding on a memory map
Implementing the bootloader itself
Making our app bootloadable
Putting it all together
Beyond the MVP
Message passing to catch reboot loops
Relocating our app from flash to RAM
Locking the bootloader with the MPU
Closing
Why you may need a bootloader
Bootloaders serve many purposes, ranging from security to software architecture.
Most commonly, you may need a bootloader to load your software. Some
microcontrollers like Dialog’s DA14580 have little to no onboard flash and instead
rely on an external device to store firmware code. In that case, it is the bootloader’s
job to copy code from non-executable storage, such as a SPI flash, to an area of
memory that can be executed from, such as RAM.
Bootloaders also allow you to decouple parts of the program that are mission critical,
or that have security implications, from application code which changes regularly. For
example, your bootloader may contain firmware update logic so your device can
recover no matter how bad a bug ships in your application firmware.
Last but certainly not least, bootloaders are an essential component of a trusted
boot architecture. Your bootloader can, for example, verify a cryptographic signature
to make sure the application has not been replaced or tampered with.
A minimal bootloader
Let’s build a simple bootloader together. To start, our bootloader must do two things:
1. Execute when the MCU boots
2. Jump to our application code
Another important factor is your flash sector size: you want to make sure you can
erase app sectors without erasing bootloader data, or vice versa. Consequently, your
bootloader region must end on a flash sector boundary (typically 4kB).
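As a quick sanity check of that constraint, here is a minimal sketch. The helper names and the 4 kB sector size are assumptions for illustration; use your part's real erase granularity.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical erase granularity; check your MCU's datasheet. */
#define FLASH_SECTOR_SIZE 0x1000u

/* True when addr sits exactly on a flash sector boundary. */
bool is_sector_aligned(uint32_t addr, uint32_t sector_size) {
  return sector_size != 0 && (addr % sector_size) == 0;
}

/* Round addr up to the next sector boundary (sector_size must be non-zero). */
uint32_t round_up_to_sector(uint32_t addr, uint32_t sector_size) {
  uint32_t remainder = addr % sector_size;
  return remainder == 0 ? addr : addr + (sector_size - remainder);
}
```

For a bootloader region ending at 0x4000 with 4 kB sectors, the check passes, so erasing application sectors never touches bootloader data.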
        0x0 +---------------------+
            |                     |
            |     Bootloader      |
            |                     |
     0x4000 +---------------------+
            |                     |
            |                     |
            |     Application     |
            |                     |
            |                     |
    0x40000 +---------------------+
We can transcribe that memory map into a linker script:
/* memory_map.ld */
MEMORY
{
  bootrom  (rx)  : ORIGIN = 0x00000000, LENGTH = 0x00004000
  approm   (rx)  : ORIGIN = 0x00004000, LENGTH = 0x0003C000
  ram      (rwx) : ORIGIN = 0x20000000, LENGTH = 0x00008000
}

__bootrom_start__ = ORIGIN(bootrom);
__bootrom_size__ = LENGTH(bootrom);
__approm_start__ = ORIGIN(approm);
__approm_size__ = LENGTH(approm);
Since linker scripts are composable, we will be able to include that memory map into
the linker scripts we write for our bootloader and our application.
You’ll notice that the linker script above declares some variables. We’ll need those
for our bootloader to know where to find the application. To make them accessible in
C code, we declare them in a header file:
/* memory_map.h */
#pragma once
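A minimal sketch of what those declarations could look like. The `extern int` type is an assumption (any object type works); what matters is that a linker symbol carries its value in its address, so C code reads it with `&`. The `linker_symbol_value` helper is hypothetical and only there to make that explicit.

```c
/* memory_map.h, continued (sketch) */
#include <stdint.h>

/* Defined in memory_map.ld; the declared type is arbitrary (assumption). */
extern int __bootrom_start__;
extern int __bootrom_size__;
extern int __approm_start__;
extern int __approm_size__;

/* Hypothetical helper: a linker symbol's value is its address, not its contents. */
static inline uintptr_t linker_symbol_value(const void *symbol) {
  return (uintptr_t)symbol;
}
```

On target, `linker_symbol_value(&__approm_start__)` would evaluate to 0x4000 with the memory map above, and `linker_symbol_value(&__approm_size__)` to 0x3C000.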
Implementing the bootloader itself
We know how to do the first part from our previous post: we need a valid stack
pointer at address 0x0, and the address of a valid Reset_Handler function setting up
our environment at address 0x4. We can reuse our previous startup file and linker
script, with one change: we use memory_map.ld rather than define our own MEMORY
section.
We also need to put our code in the bootrom region from our memory map rather than
the rom region in our previous post.
/* bootloader.ld */
INCLUDE memory_map.ld

/* Section Definitions */
SECTIONS
{
    .text :
    {
        KEEP(*(.vectors .vectors.*))
        *(.text*)
        *(.rodata*)
        _etext = .;
    } > bootrom
    ...
}
To jump into our application, we need to know where the Reset_Handler of the app
is, and what stack pointer to load. Again, we know from our previous post that those
should be the first two 32-bit words in our binary, so we just need to dereference
those addresses using the __approm_start__ variable from our memory map.
/* bootloader.c */
#include <inttypes.h>
#include "memory_map.h"

int main(void) {
  /* A linker symbol's value is its address, hence the & */
  uint32_t *app_code = (uint32_t *)&__approm_start__;
  uint32_t app_sp = app_code[0];
  uint32_t app_start = app_code[1];
  /* TODO: Start app */
  /* Not Reached */
  while (1) {}
}
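Before jumping, it can be worth checking that those two words look plausible, since erased flash typically reads back as 0xFFFFFFFF. The following is a minimal sketch and not part of the original bootloader: the function name is made up, and the region bounds are passed as parameters so the logic can be exercised off-target.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when the first two vector table words look like a real app:
 *  - the initial stack pointer is word-aligned and lies inside RAM (the
 *    address just past the end of RAM is allowed, the stack grows down)
 *  - the reset vector has its Thumb bit set and points into the app region
 */
bool app_vectors_look_valid(const uint32_t *vectors,
                            uint32_t ram_start, uint32_t ram_size,
                            uint32_t app_start, uint32_t app_size) {
  uint32_t sp = vectors[0];
  uint32_t reset = vectors[1];

  bool sp_ok = (sp % 4u == 0) && sp > ram_start && sp <= ram_start + ram_size;
  bool thumb_ok = (reset & 1u) != 0;
  uint32_t reset_addr = reset & ~1u;
  bool reset_ok = reset_addr >= app_start && reset_addr < app_start + app_size;

  return sp_ok && thumb_ok && reset_ok;
}
```

A bootloader could call this with the RAM and approm bounds from the memory map and stay in the bootloader when it returns false.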
Next we must load that stack pointer and jump to the code. This will require a bit of
assembly code.
ARM MCUs use the msr instruction to load register data into special registers, in
this case the MSP register or “Main Stack Pointer”. The bx instruction then branches
to the address held in a register, here the app’s Reset_Handler.
We wrap those two into a start_app function which accepts our pc and sp as
arguments, and get our minimal bootloader:
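/* bootloader.c */
#include <inttypes.h>
#include "memory_map.h"

/* Per the ARM calling convention, pc arrives in r0 and sp in r1 */
static void __attribute__((naked)) start_app(uint32_t pc, uint32_t sp) {
  __asm("msr msp, r1 \n" /* load sp into the Main Stack Pointer */
        "bx r0       \n" /* branch to the app's Reset_Handler */
  );
}

int main(void) {
  uint32_t *app_code = (uint32_t *)&__approm_start__;
  uint32_t app_sp = app_code[0];
  uint32_t app_start = app_code[1];
  start_app(app_start, app_sp);
  /* Not Reached */
  while (1) {}
}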
Note: hardware resources initialized in the bootloader must be de-initialized before
control is transferred to the app. Otherwise, you risk breaking assumptions the app
code is making about the state of the system.
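Making our app bootloadable
Our app needs the same treatment: its linker script includes memory_map.ld and places
its code in the approm region.
/* app.ld */
INCLUDE memory_map.ld

/* Section Definitions */
SECTIONS
{
    .text :
    {
        KEEP(*(.vectors .vectors.*))
        *(.text*)
        *(.rodata*)
        _etext = .;
    } > approm
    ...
}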
We also need to update the vector table used by the microcontroller. The vector
table contains the address of every exception and interrupt handler in our system.
When an interrupt signal comes in, the ARM core will call the address at the
corresponding offset in the vector table.
For example, the offset for the Hard fault handler is 0xc, so when a hard fault is hit,
the ARM core will jump to the address contained in the table at that offset.
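The lookup the core performs can be written out as a small sketch. The enum and function names are illustrative, not taken from a vendor header; the numbering follows the ARM exception model, where entry 0 of the table holds the initial stack pointer.

```c
#include <stdint.h>

/* Illustrative names for the first few vector table entries. */
enum vector_index {
  VEC_INITIAL_SP = 0, /* not a handler: the initial Main Stack Pointer */
  VEC_RESET = 1,
  VEC_NMI = 2,
  VEC_HARD_FAULT = 3,
};

/* Each entry is one 32-bit word, so the byte offset is the index times 4. */
uint32_t vector_offset(uint32_t index) {
  return index * 4u;
}

/* Read the handler address stored for a given entry of a vector table. */
uint32_t handler_address(const uint32_t *vector_table, uint32_t index) {
  return vector_table[index];
}
```

`vector_offset(VEC_HARD_FAULT)` gives 0xC, matching the offset quoted in the text.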
By default, the vector table is at address 0x0, which means that when our chip
powers up, only the bootloader can handle exceptions or interrupts! Fortunately,
ARM provides the Vector Table Offset Register to dynamically change the address of
the vector table. The register is at address 0xE000ED08 and has a simple layout:
 31                                7              0
+-----------------------------------+--------------+
|                                   |              |
|              TBLOFF               |   Reserved   |
|                                   |              |
+-----------------------------------+--------------+
Where TBLOFF is the address of the vector table. In our case, that’s the start of our
text section, or _stext. To set it in our app, we add the following to our
Reset_Handler:
/* startup_samd21.c */
/* Set the vector table base address */
uint32_t *vector_table = (uint32_t *) &_stext;
volatile uint32_t *vtor = (volatile uint32_t *)0xE000ED08;
/* Bits [6:0] of the register are reserved, so mask them off */
*vtor = ((uint32_t) vector_table & 0xFFFFFF80);
One quirk of the ARMv7-M architecture is the alignment requirement for the vector
table, as specified in section B1.5.3 of the reference manual:
“The Vector table must be naturally aligned to a power of two whose alignment value
is greater than or equal to (Number of Exceptions supported x 4), with a minimum
alignment of 128 bytes. The entry at offset 0 is used to initialize the value for
SP_main, see The SP registers on page B1-8. All other entries must have bit [0] set,
as the bit is used to define the EPSR T-bit on exception entry (see Reset behavior on
page B1-20 and Exception entry behavior on page B1-21 for details).”
Our SAMD21 MCU has 28 interrupts on top of the 16 system reserved exceptions, for
a total of 44 entries in the table. Multiply that by 4 and you get 176. The next power
of 2 is 256, so our vector table must be 256-byte aligned.
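The same arithmetic as a minimal sketch, with hypothetical function names. It starts from the 128-byte architectural minimum and doubles until the whole table fits.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Smallest power-of-two alignment satisfying the rule quoted above.
 * num_entries counts every table entry, including the 16 system ones;
 * it is expected to be small (a few hundred at most).
 */
uint32_t vector_table_alignment(uint32_t num_entries) {
  uint32_t table_bytes = num_entries * 4u;
  uint32_t alignment = 128u; /* architectural minimum */
  while (alignment < table_bytes) {
    alignment <<= 1;
  }
  return alignment;
}

/* True when table_addr is a legal location for a table of num_entries. */
bool vector_table_is_aligned(uint32_t table_addr, uint32_t num_entries) {
  return (table_addr & (vector_table_alignment(num_entries) - 1u)) == 0;
}
```

With 44 entries the function returns 256, so an app whose text section starts at 0x4000 satisfies the requirement.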
Putting it all together
Because it is hard to witness the bootloader execute, we add a print line to each of
our programs:
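/* bootloader.c */
#include <inttypes.h>
#include <stdio.h>
#include "memory_map.h"

int main(void) {
  serial_init();
  printf("Bootloader!\n");
  serial_deinit();
  uint32_t *app_code = (uint32_t *)&__approm_start__;
  uint32_t app_sp = app_code[0];
  uint32_t app_start = app_code[1];
  start_app(app_start, app_sp);
  /* Not Reached */
  while (1) {}
}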
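/* app.c */
#include <stdbool.h>
#include <stdio.h>

int main(void) {
  serial_init();
  set_output(LED_0_PIN);
  printf("App!\n");
  while (true) {
    port_pin_toggle_output_level(LED_0_PIN);
    /* volatile keeps the compiler from optimizing the delay loop away */
    for (volatile int i = 0; i < 100000; ++i) {}
  }
}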
Note that the bootloader must deinitialize the serial peripheral before starting the
app, or you’ll have a hard time trying to initialize it again.
You can compile both these programs and load the resulting ELF files with gdb, which
will put them at the correct addresses. However, the more convenient thing to do is
to build a single binary which contains both programs.
This takes one objcopy invocation per program. We implement those as rules in our
Makefile, to avoid having to type them out each time:
# Makefile
$(BUILD_DIR)/$(PROJECT)-app.bin: $(BUILD_DIR)/$(PROJECT)-app.elf
	$(OCPY) $< $@ -O binary
	$(SZ) $<

$(BUILD_DIR)/$(PROJECT)-boot.bin: $(BUILD_DIR)/$(PROJECT)-boot.elf
	$(OCPY) --pad-to=0x4000 --gap-fill=0xFF -O binary $< $@
	$(SZ) $<
Last but not least, we need to concatenate our two binaries. As funny as that may
sound, this is best achieved with cat:
# Makefile
$(BUILD_DIR)/$(PROJECT).bin: $(BUILD_DIR)/$(PROJECT)-boot.bin $(BUILD_DIR)/$(PROJECT)-app.bin
	cat $^ > $@
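To make the layout of the resulting file concrete, here is a small host-side sketch of what the padding plus the concatenation produce. It is not part of the build, and the function name and parameters are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Builds a combined image in `out`: the bootloader, 0xFF padding up to
 * `app_offset`, then the app. Returns the total image size, or 0 if the
 * bootloader does not fit in its region or `out` is too small.
 */
size_t build_combined_image(const uint8_t *boot, size_t boot_len,
                            const uint8_t *app, size_t app_len,
                            size_t app_offset,
                            uint8_t *out, size_t out_cap) {
  if (boot_len > app_offset || app_len > out_cap ||
      app_offset > out_cap - app_len) {
    return 0;
  }
  memcpy(out, boot, boot_len);
  /* Same effect as --pad-to with --gap-fill=0xFF: erased-flash filler */
  memset(out + boot_len, 0xFF, app_offset - boot_len);
  memcpy(out + app_offset, app, app_len);
  return app_offset + app_len;
}
```

With `app_offset` set to 0x4000, the first byte of the app lands exactly where the memory map says approm begins.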
Beyond the MVP
Our bootloader isn’t too useful so far: it only loads our application. We could do
just as well without it. In the following sections, I will go through a few useful
things you can do with a bootloader.
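Message passing to catch reboot loops
To catch reboot loops, the app and the bootloader need a way to pass messages to each
other. More often than not, a region of RAM is enough for this: as long as the system
remains powered, the RAM will keep its state even if the device reboots.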
First, we carve some RAM for shared data in our memory map:
/* memory_map.ld */
MEMORY
{
  bootrom  (rx)  : ORIGIN = 0x00000000, LENGTH = 0x00004000
  approm   (rx)  : ORIGIN = 0x00004000, LENGTH = 0x0003C000
  shared   (rwx) : ORIGIN = 0x20000000, LENGTH = 0x1000
  ram      (rwx) : ORIGIN = 0x20001000, LENGTH = 0x00007000
}
We can then create a data structure and assign it to this section, with getters to read
it:
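/* shared.h */
#pragma once
#include <inttypes.h>

uint8_t shared_data_get_boot_count(void);
void shared_data_increment_boot_count(void);
void shared_data_reset_boot_count(void);

/* shared.c */
#include "shared.h"

/* Assumption: memory_map.ld exports __shared_start__ = ORIGIN(shared) */
extern int __shared_start__;

/* Assumption: the shared data holds a single boot counter */
struct shared_data {
  uint8_t boot_count;
};
static struct shared_data *const sd = (struct shared_data *)&__shared_start__;

uint8_t shared_data_get_boot_count(void) {
  return sd->boot_count;
}
void shared_data_increment_boot_count(void) {
  sd->boot_count++;
}
void shared_data_reset_boot_count(void) {
  sd->boot_count = 0;
}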
We compile the shared module into both our app and our bootloader, and can read
the boot count in both programs.
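One way to use the counter to catch reboot loops, sketched under some assumptions: the bootloader increments the count on every boot and refuses to start the app once a threshold is reached, while the app resets the count when it considers itself stable. The threshold, the enum, and the function names are hypothetical, and the shared struct is passed in as a pointer so the logic can be run off-target.

```c
#include <stdint.h>

#define MAX_BOOT_ATTEMPTS 3u /* hypothetical threshold */

struct shared_data {
  uint8_t boot_count;
};

enum boot_action { BOOT_APP, BOOT_RECOVERY };

/* Called by the bootloader on every boot, before starting the app. */
enum boot_action bootloader_decide(struct shared_data *sd) {
  if (sd->boot_count >= MAX_BOOT_ATTEMPTS) {
    return BOOT_RECOVERY; /* the app keeps crashing: stay in the bootloader */
  }
  sd->boot_count++;
  return BOOT_APP;
}

/* Called by the app once it has run long enough to be considered healthy. */
void app_mark_stable(struct shared_data *sd) {
  sd->boot_count = 0;
}
```

Since RAM contents are undefined after a power cycle, a real implementation would also want a way to tell valid shared data from garbage, for example a magic number stored next to the counter.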
Relocating our app from flash to RAM
We start from a memory map that sets aside a region of RAM to execute code from:
/* memory_map.ld */
MEMORY
{
  bootrom  (rx)  : ORIGIN = 0x00000000, LENGTH = 0x00010000
  approm   (rx)  : ORIGIN = 0x00010000, LENGTH = 0x00004000
  ram      (rwx) : ORIGIN = 0x20000000, LENGTH = 0x00004000
  eram     (rwx) : ORIGIN = 0x20004000, LENGTH = 0x00004000
}

__bootrom_start__ = ORIGIN(bootrom);
__bootrom_size__ = LENGTH(bootrom);
__approm_start__ = ORIGIN(approm);
__approm_size__ = LENGTH(approm);
__eram_start__ = ORIGIN(eram);
__eram_size__ = LENGTH(eram);
In this case, approm is our app storage and eram is our executable RAM, where we
want to copy our program. Our bootloader needs to copy the code from approm to
eram before executing it.
We know from our previous blog post that executable code typically ends up in
the .text section, so we must tell the linker that this section is stored in approm
but executed from eram so our program can execute correctly.
This is similar to our .data section, which is stored in rom but lives in ram while
the program is running. We use the AT linker command to specify the storage (load)
region and the > operator to specify the region the code runs from. This is the
resulting linker script section:
/* app.ld */
SECTIONS {
    .text :
    {
        KEEP(*(.vectors .vectors.*))
        *(.text*)
        *(.rodata*)
    } > eram AT > approm
    ...
}
We then update our bootloader to copy our code from one to the other before
starting the app:
/* bootloader.c */
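A minimal sketch of that copy step, written so it can be exercised off-target. The `relocate_app` function and the `app_entry` struct are hypothetical names; on the device, the source, destination, and size would come from the `__approm_start__`, `__eram_start__`, and `__approm_size__` linker symbols.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct app_entry {
  uint32_t sp; /* initial stack pointer, first word of the vector table */
  uint32_t pc; /* Reset_Handler address, second word */
};

/*
 * Copies the app image from its storage region to executable RAM and
 * returns the entry information read from the relocated vector table.
 * size_bytes must cover at least the first two words.
 */
struct app_entry relocate_app(uint32_t *exec_ram, const uint32_t *storage,
                              size_t size_bytes) {
  memcpy(exec_ram, storage, size_bytes);
  struct app_entry entry = { .sp = exec_ram[0], .pc = exec_ram[1] };
  return entry;
}
```

On target, and assuming the symbols are declared as extern objects, the call could look like `relocate_app((uint32_t *)&__eram_start__, (const uint32_t *)&__approm_start__, (size_t)&__approm_size__)`, followed by `start_app(entry.pc, entry.sp)`. Because the app was linked to run from eram, the reset vector already points into RAM.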
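Locking the bootloader with the MPU
We can use the MPU to make the bootloader region read-only before handing control
to the app. If you do not know about the MPU, check out Chris’s excellent blog post
from a few weeks ago.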
Remember that our MPU regions must be power-of-2 sized. Thankfully, our
bootloader already is! 0x4000 is 2^14 bytes.
/* bootloader.c */
int main(void) {
  /* ... */
  base_addr = 0x0;
  // VALID=1 (bit 4) applies this write to the region in the low bits: region 1
  *mpu_rbar = (base_addr | 1 << 4 | 1);
  // AP=0b110 to make the region read-only regardless of privilege
  // TEXSCB=0b000010 because the Code is in "Flash memory"
  // SIZE=13 because a region spans 2^(SIZE+1) bytes and we want to cover 16 KiB
  // ENABLE=1
  *mpu_rasr = (0b110 << 24) | (0b000010 << 16) | (13 << 1) | 0x1;
  start_app(app_start, app_sp);
  /* Not reached */
  while (1) {}
}
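The magic numbers in that snippet follow directly from the register layout. Here is a minimal sketch that derives them, with hypothetical helper names; it only computes register values and does not touch the hardware.

```c
#include <stdint.h>

/*
 * SIZE field for a region of size_bytes: the region spans 2^(SIZE+1) bytes.
 * Returns 0 (not a valid SIZE encoding) unless size_bytes is a power of two
 * of at least 32 bytes, the ARMv7-M minimum; some cores require more.
 */
uint32_t mpu_size_field(uint32_t size_bytes) {
  if (size_bytes < 32u || (size_bytes & (size_bytes - 1u)) != 0) {
    return 0;
  }
  uint32_t exponent = 0;
  while ((1u << exponent) < size_bytes) {
    exponent++;
  }
  return exponent - 1u;
}

/* RASR value: AP in bits [26:24], TEX/S/C/B in [21:16], SIZE in [5:1], ENABLE in bit 0.
 * Callers should check mpu_size_field() first for an invalid size. */
uint32_t mpu_rasr_value(uint32_t ap, uint32_t texscb, uint32_t size_bytes) {
  return ((ap & 0x7u) << 24) | ((texscb & 0x3Fu) << 16) |
         (mpu_size_field(size_bytes) << 1) | 0x1u;
}

/* RBAR value: base address, VALID (bit 4), region number in bits [3:0]. */
uint32_t mpu_rbar_value(uint32_t base_addr, uint32_t region) {
  return (base_addr & ~0x1Fu) | (1u << 4) | (region & 0xFu);
}
```

For the 0x4000-byte bootloader region, `mpu_size_field(0x4000)` yields 13 and `mpu_rasr_value(0x6, 0x2, 0x4000)` reproduces the value written to the register in the snippet.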
Closing
We hope reading this post has given you a good idea of how bootloaders work, and
what you can do with them. As with previous posts, code examples are available on
GitHub in the zero-to-main repository.
What cool things does your bootloader do? Tell us all about it in the comments, or at
[email protected].
Next time in the series, we’ll talk about bootstrapping the C library!
Interested in learning more device firmware update best practices? Watch this
webinar recording.
François Baldassari has worked on the embedded software teams at Sun, Pebble, and
Oculus. He is currently the CEO of Memfault.