CO Lab Manual
Edition 2020-2021
Revision 20201014T1431
Acknowledgements
The CSE1400 lab course assignments were originally developed by Sidney Cadot for the PowerPC
architecture. The accompanying documents, two versions of the assignments and a tutorial on
PowerPC assembler, were also written by Sidney. Jonne Zutt maintained these documents over
the years after Sidney left the university. In 2004, the decision was made to change the target
architecture of the lab course from the somewhat obscure PowerPC to the ubiquitous Intel x86
platform. The lab course environment had to change accordingly: the carefully tweaked commer-
cial IDE/PowerPC emulator running on Microsoft Windows was abandoned in favour of the GNU
assembler and debugger, which are available on nearly every Linux distribution. Because of these
changes, the assignments and the accompanying reference material were rewritten by Denis de
Leeuw Duarte.
In 2005 a new curriculum was started, which transformed the old lab into a more elaborate
project for Software Technology students, while Media and Knowledge Technology students only
had to do a trimmed down version. The manual was modified to match the new requirements by
Bas van der Doorn and Sander Koning.
In 2011 and 2012 the manual was further updated by Mihai Capotă and Alexandru Iosup and
in 2013 and 2014 the manual received a refresh and expansion by Otto Visser. In 2014, the decision
was made to switch from the x86 to the x86-64 architecture and the manual was edited by Elvan
Kula. Maarten Sijm restructured the manual in 2019 and during the Corona (no, not the beer)
summer of 2020 further restructuring and editing was done by Sára Juhošová, Taico Aerts and
Otto Visser.
Rules & Regulations
This preface states the rules and regulations to which you need to adhere in order to have your
assignments approved. These rules effectively apply to all lab courses, but to avoid any confusion,
they are stated here explicitly for this course.
We stress that your work will not be considered fit for approval until you meet all of these criteria.
The details on what we consider to be correct in this context will be defined in the subsequent
paragraphs.
Correct Specifications
If an assignment so asks, you will need to hand in a correct specification for the specified part of
the exercise. Specifications must be written in pseudocode. Pseudocode is code that resembles
actual high level (non-assembler) program code closely. It is called pseudocode because it does
not necessarily have to be real, compilable code. A good, clear example of pseudocode is provided
in Section 3.1. We expect your pseudocode to be similar in clarity, quality and level of detail.
Specifically, this means that your pseudocode must have:
• proper comments
• a good, clean layout
We expect that you have your specifications checked before you start on your implementation work.
It is highly likely that the teaching assistants will ask you to make changes to your specifications,
so we strongly advise you not to start programming before having your specs approved. If you
choose to ignore this advice, we do not accept responsibility for any wasted effort on your part.
Algorithmically Correct Code
The programs that you hand in should be algorithmically correct. The correct algorithm is, by
definition, the algorithm that was conveyed to you by the teaching assistants during the approval
of your specifications. This implies that a program cannot be algorithmically correct if you did not
have your specifications approved by a teaching assistant. We explicitly state here that you are
not free to implement the desired functionality at your personal leisure and that solutions which
are merely functionally correct are not sufficient for approval.
Deadlines
In addition to the rules listed here, there may be active deadlines. Please check the Brightspace
pages for the most up-to-date information regarding active deadlines. Please note that teaching
assistants are not allowed to consider any work after the expiration of a deadline. Any complaints
should thus be addressed to the course coordinators.
Anti-Fraud Policy
Our anti-fraud policy is very simple: zero-tolerance, within the limits set by TU Delft. We will
pursue each case of potential fraud, and will use the means provided by TU Delft to punish
(attempted) fraud.
The following are some of the cases that are considered fraud:
• Sending your code to other groups. The motivation of “I sent it for them to find some
inspiration” does not work.
• Copying somebody else’s code. Changing the names of variables in someone else’s code and
submitting the results is still considered fraud.
• Receiving help from someone, when the help amounts to letting that someone write your
code.
• Renting the services of a programmer, for example from Rent-a-Coder.ro, to solve the
assignments for you.
Contents
Acknowledgements
1 Introduction
1.1 Lab Course Rules and Etiquette
1.1.1 Verifying Your Work
1.2 Assumed Prior Knowledge
1.3 Schedule
1.4 Why Assembly (Still) Matters
2 Getting Started
2.1 Setting Up Your Environment
2.1.1 Direct GCC Support
2.1.2 WSL: Windows Subsystem for Linux
2.1.3 Virtual Machine
2.2 Editors & Syntax Highlighting
2.3 Building, Running and Debugging Programs
2.3.1 Building & Running
2.3.2 Debugging with GDB
3 Mandatory Content
3.1 Designing a Program
3.2 Assembler Directives
3.3 x86-64 Assembly Language
3.3.1 About the AT&T Syntax
3.3.2 Instructions and Operands
3.3.3 Instruction Set
3.4 Registers & Variables
3.5 The Stack
3.5.1 The Stack Pointer
3.5.2 Cleaning up the Stack
3.5.3 The Base Pointer
3.5.4 Subroutine Prologue and Epilogue
3.5.5 Accessing arguments passed via the stack
3.6 Subroutines
3.6.1 Calling Conventions
3.6.2 Recursive Subroutines
3.7 I/O
3.7.1 Printing to the Terminal
3.7.2 Reading from the Terminal
3.8 The End of the Program
3.9 Programming Constructs
3.9.1 Conditional Branching
3.9.2 if-then-else Statements
3.9.3 Loops
3.10 ASSIGNMENT 1: Powers
3.10.1 Part A: Getting Started
3.10.2 Part B: Your First Assembly Algorithm
3.11 ASSIGNMENT 2: Recursion
3.12 Memory
3.13 Bit Shifting
3.14 ASSIGNMENT 3: Memory
4 Bonus Content
4.1 ANSI Escape Codes
4.2 ASSIGNMENT 4: A Colourful Discovery (max 500 points)
4.3 Advanced Programming Constructs
4.3.1 Switch-Case Statements
4.3.2 Lookup Tables
4.4 Doing I/O without the C library
4.5 ASSIGNMENT 5 (500 points)
4.5.1 Option A: Implement “diff” in Assembly
4.5.2 Option B: Implement a Simplified printf Function
4.5.3 Option C: Implement a Hashing Function
4.6 ASSIGNMENT 6: Hide Data in a Bitmap (750 points)
4.7 ASSIGNMENT 7: Implement an Interpreter for Brainfuck (500–800 points)
4.8 Mixing Assembly Language with C/C++
4.9 ASSIGNMENT 8: Implement a Game (1,000 points)
4.9.1 Notes and Hints about Solving this Assignment
4.9.2 Last But Not Least
Chapter 1
Introduction
Since this course is taught in the first quarter of the first year, it is quite likely that you
have no programming experience at all. A reasonably large group amongst you, however,
does have some programming experience. To that group, a computer was usually “sketched”
as a machine that executes program statements in a line-by-line fashion, manipulating variables
and performing calculations as it goes along. This simple model allowed you to write your first
computer programs in Java, Python, or some other language. Of course, the underlying mechanics
of executing programs are a bit more complicated. It is quite likely that you have already en-
countered the limitations of this naive model in the form of incomprehensible error messages and
seemingly inexplicable program failures. Apparently, it is necessary to acquire a more thoroughly
detailed mental picture of the machine in order to solve many common programming problems.
The goal of this lab course is to fill in these details. Much of this knowledge is to be acquired
through practical assignments, but careful reading is also an important part of the educational
process.
You should read the remainder of this introductory section as well as all of Section 3 be-
fore starting on the first assignment in Section 3.10. Take special notice of Section 1.2, which
summarises and refreshes the prior knowledge that we assume on your part.
Keep in mind that the work you hand in during the lab is subject to certain rules and quality
guidelines. The rules and guidelines are listed explicitly for this lab course in the Preface. You
are expected to check your work prior to handing it in for review. The lab course assistants will
only consider work that is in full compliance with the stated rules and guidelines.
1.1.1 Verifying Your Work
To get credit for a lab exercise, you need to have your work
checked by a teaching assistant. To do so, submit your work to Submit [1] and then
enqueue in the digital Queue [2]. Please enable notifications on this page. Once one of our assistants
picks up your request, you will receive a notification. The page will then show a link to start a video call
(Jitsi) with the TA. In this video call you have the opportunity to explain your work.
1.2 Assumed Prior Knowledge
Background Knowledge
We assume that you will follow and understand the lectures and study the accompanying book and
lecture notes in the necessary detail. In particular, we assume that you know and understand what
opcodes, instructions, subroutines, stacks, registers and program counters are and that you have
a general idea of what occurs during the compilation and linking stage of an executable program.
We further assume that you know the difference between bits, nibbles and bytes and that you
can convert numbers between different number representation systems (hexadecimal, binary, etc.)
and we assume that you understand the concept of endianness. These topics will all be treated
during the lectures and instructions.
In this lab course you will learn how to program in the x86-64 assembly language. You should
know that “the x86-64 architecture” is the generic name for the architecture of CPUs found in
garden-variety personal computers [3]. We assume that you have studied the x86-64 architecture
in your lecture notes. You should know that x86-64 has fourteen 64-bit general purpose registers
(RAX, RBX, RCX, RDX, RDI, RSI, and R8-R15) which you can use freely when writing programs.
It also has a 64-bit stack pointer register (RSP) which contains the memory address of the top
of the current program stack and a base pointer register (RBP), which is used during subroutine
execution. The purpose of these registers will be explained to you during the assignments.
Finally, you should know what we mean by the Von Neumann architecture, i.e., you should
know that a computer roughly consists of three subsystems: a CPU, a random access main
memory and an IO subsystem, which in turn consists of IO devices such as mice, keyboards,
sound cards, hard disks, etc. You should know that the CPU is capable of executing instructions
which reside in the main memory and that instructions are simply binary codes of varying lengths.
In the next paragraph we refresh your knowledge of some important definitions regarding computer
languages.
Essential Concepts
It is likely that you do not yet have a very clear image in your mind as to what goes on inside the
bowels of your computer when it is executing a program. The main goal of this lab course is to
remedy this situation. To start off, we will eliminate any romantic preconceptions on your part
regarding machine, assembly, and high-level programming languages by carefully restating their
definitions and their respective purposes. Much of this should already be known to you but it is
essential that you read the following carefully.
[1] https://submit.tudelft.nl
[2] https://queue.tudelft.nl
[3] In the literature you may encounter the terms AMD64 and x64, which are synonyms for x86-64.
Computer memory stores programs as sequences of instructions and data in binary format. To
a computer, there is no essential difference between instructions and data. Here is a real example
of part of a computer program, as it would look in the main memory of a computer:
0 01001000
1 11000111
2 11000000
3 00000001
4 00000000
5 00000000
6 00000000
7 01001000
8 11000111
9 11000001
10 00000001
11 00000000
12 00000000
13 00000000
14 01001000
15 00000001
16 11000001
The line numbers are not part of the program but you could think of them as memory addresses,
as each byte has its own memory address. This type of “zeros and ones” program representation
is called the machine language representation of a program. The machine language representation
is, of course, the only representation that a computer can understand. Any higher level language
representation of a program (such as a Java or C program) needs to be translated into machine
language at some point, before it can be executed by the hardware. Machine languages are specific
to the CPU architecture, e.g., there is a PowerPC machine language, an x86-64 machine language,
etc.
The little code snippet above is actual x86-64 machine code written in binary notation. As you
can see, programs take up a large amount of space when written in binary, so we will commonly
write such code in hexadecimal format. As you know, each nibble is represented by one hex digit
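so we can rewrite these 17 bytes of code in 34 hex digits:
48 c7 c0 01 00 00 00 # Move the value 1 into the RAX register
48 c7 c1 01 00 00 00 # Move the value 1 into the RCX register
48 01 c1 # Add the contents of RAX to RCX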
We have regrouped the bytes in such a way that one instruction fits on one line and we have
added some explanatory comments, which are denoted by the ‘#’ character. Do not be afraid at
this point if you do not understand the code fragment completely, but please do take a close look.
In this piece of code you can see three x86-64 instructions with some operand data. The first
instruction is the so called mov instruction. It moves a value to a register. In this case, the value
is the number 1 and the register is the x86-64’s RAX register. On the second line you see another
mov instruction, again with operand 1, but this time it moves the value to the RCX register. The
third line shows us the add instruction which sums the contents of RAX and RCX and stores the
result in RCX. As you can see, not all of the instructions in the x86-64 architecture are of the
same length. Usually the actual instruction is one to two bytes long. The mov instruction is only
two bytes long (48 c7) and the add instruction is three bytes long (48 01 c1). The operand data
of the mov instructions is four bytes long. Since x86-64 is a little-endian machine, the four byte
integer 1 is encoded as 01 00 00 00 as you can see. One final thing to note is that the operand
registers are encoded as c0 and c1.
In ancient times [4], programmers had to enter programs into the computer’s memory by means
of punch cards - small pieces of cardboard which contained zeros and ones encoded in the presence
or absence of holes in the cardboard. Obviously, writing computer programs in zeros and ones or
hexadecimal numbers like this was a very cumbersome, error-prone task and in modern times it
would be next to impossible. Modern computer programs easily contain around 10 MB of machine
code, which would amount to 83 886 080 zeros and ones or 20 971 520 hex digits. You could imagine
the horror of having to type them by hand. Worse, you could imagine the horror of finding bugs
in such programs! For these reasons, people switched to assemblers in the 1950s. Assemblers are
special computer programs that translate text from a more humanly readable symbolic assembly
language (“assembly” for short) into machine code. In the assembly language, each instruction
code has a short mnemonic or nickname associated with it and each number can be represented
in decimal or hex, instead of in bits. Just as each architecture has its own machine code, each
machine code has its own assembly language. Below, we see the same program snippet as above,
but this time it is written in x86-64 assembly language:
movq $1, %rax # Move the number 1 into the RAX register
movq $1, %rcx # Move the number 1 into the RCX register
addq %rax, %rcx # Add the contents of RAX to RCX
As you can see, our cryptic piece of binary machine code is as simple as 1 + 1! Well, almost. The
most significant property of assembly language is that it closely resembles the raw machine code in
structure. If you have a table of opcodes and some knowledge of the fields in an x86-64 instruction
you could easily translate this assembly code directly into machine code - even by hand.
Unfortunately, writing large programs in assembly language still has its drawbacks. As
programmers, we still have to deal with millions upon millions of instructions and we still have to
concern ourselves with registers, stacks and memory locations, while we would rather like to focus
on solving our problems. If we want to add two numbers, we would like to tell the computer
something like y = a + b rather than having to explain the operation in register-level detail. In
short, we would like to program computers in a language that translates easily to mathematics or
English rather than machine language. This desire prompted the development of so-called 3GLs [5].
You have probably heard of a lot of these 3GLs and in due time you will learn many of them:
C, C++, Java, Pascal, Prolog, Haskell, etc. In order to run a program that is written in a 3GL,
we need to translate it to assembly language first [6]. The tools that do this are called compilers.
Unlike assemblers, compilers are amongst the most complex of computer programs in existence.
Compiler technology continues to evolve as it has done for over sixty years since admiral Grace
Hopper wrote the first compiler in assembly language [7]. It is often difficult to predict exactly what
instructions a compiler will generate when given a particular snippet of 3GL code. Nonetheless,
below is a line of C/Java code that might cause a compiler to spit out something that looks like
our example fragment:
int x = 1 + 1;
And there it is! As an aside, it is, in fact, highly unlikely that a compiler will ever generate our
little snippet of assembly code due to optimisations like constant folding. Luckily for you, that is
entirely outside the scope of this lab course.
[4] In computing terms, “ancient” is roughly between 1920 and 1950.
[5] This stands for “third generation language”, with machine language being the first and assembly being the
second generation.
[6] We could also use an interpreter, but we ignore that option here.
[7] Hopper is also renowned for the discovery of the first “bug”: a dead moth in one of the relay switches of the
Mark II calculator.
1.3 Schedule
During the lab course you will learn the basics of writing programs in assembly language. You will
learn how to call functions and what recursion looks like. The following table gives an overview
of the schedule for the lab.
Week 1: Read the lab manual and set up the environment
Week 2: Assignment 1 (4 hours)
Week 3: Assignment 2 (6 hours)
Week 4: Assignment 3 (6 hours)
Week 5: Midterm Exam
Week 6-8: Bonus Assignments (optional)
Week 10: Endterm Exam
You will need your time for this lab course; it is not easy. Many students underestimate the lab,
do not start when they should and find out that they can no longer complete it. Do not let this
happen to you and make sure you attend every session, so you can discuss your questions with the
assistants.
1.4 Why Assembly (Still) Matters
Expressing basic methods like algorithms for sorting and searching in machine language
makes it possible to carry out meaningful studies of the effects of cache and RAM size
and other hardware characteristics (memory speed, pipelining, multiple issue, look
aside buffers, the size of cache blocks, etc.) when comparing different schemes.
The point Knuth makes here is that you cannot ever expect to develop proper computer programs
if you do not have a basic understanding of how computers work on the lowest level and of how
programs are represented there. A point that Knuth even fails to mention is that we live in
an online world today and that malicious attackers use their knowledge of assembly language
to exploit the programs that you may one day write. Thus, learning something about assembly
language is a lesson that will be of essential value to you whether you aspire to be a kernel hacker,
a systems analyst, a game programmer, a web developer, or a theoretical computer scientist. In
fact, here is another priceless quote from Knuth that says it all:
People who are more than casually interested in computers should have at least some
idea of what the underlying hardware is like. Otherwise the programs they write will
be pretty weird.
You should now be mentally prepared to start the assignments. Good luck!
Chapter 2
Getting Started
This chapter contains all the information you will need to get started on this lab including how to
set up your environment, tips on which applications to use, and instructions on how to compile,
run, and debug your code.
[Figure: overview of the recommended setup per operating system: Mac OS, Linux, Windows 10, and other Windows versions.]
If your system is listed under direct GCC support, you can simply install the required software
on your system as explained below.
Linux
Depending on your Linux distribution the installation will be different. Use the package manager
of your Linux distribution to install the gcc and gdb packages. We support GCC v4 or above but
we will use v8 when checking your work. For example, for Debian based distributions like Ubuntu,
simply write the following command in your terminal:
If you are running another Linux distribution we will assume you know what you are doing and
will be able to install gcc and gdb yourself (we recommend the package build-essential).
2.1.2 WSL: Windows Subsystem for Linux
Supported systems: Windows 10
Windows 10 includes the option to install and run Linux inside of Windows. It is simple to install
and set up. Please note however, if you want a graphical interface for the “Create a Game”
assembly bonus assignment, you have to use a virtual machine for it to display on Windows.
Installation
1. Right click on the Windows button or press Windows + X.
2. Select “Windows PowerShell (Admin)”.
3. Paste the following command into PowerShell and press enter:
dism.exe /online /enable-feature /featurename:Microsoft-Windows-Subsystem-Linux /all /norestart
4. Restart your computer.
5. Go to the Microsoft Store (open the start menu and type store).
6. Search for “Ubuntu 20.04” or go directly to https://www.microsoft.com/store/apps/9n6svws3rx71.
7. Click “Get” to install.
8. Wait for the installation to complete. If nothing happens for a while, try pressing enter.
You can now launch Ubuntu from your start menu. The first time you will be asked to set a
username and password for your Linux account (be sure to remember these or write them down).
Finally, run the following command to install the necessary build tools:
sudo apt update && sudo apt upgrade -y && sudo apt install build-essential gdb -y
For more instructions and troubleshooting, see https://docs.microsoft.com/en-us/windows/wsl/install-win10.
Finding files
When using WSL, it is recommended to put your files on a location in Windows, e.g. somewhere
in your Documents folder. You can access this location in WSL by using the cd command to
navigate to the correct folder, though you will have to translate your Windows path to a Linux
path. For example, if your files are at C:\Users\Student\Documents\CO\Lab, you should use the
following command to navigate to it in WSL (case sensitive!):
cd /mnt/c/Users/Student/Documents/CO/Lab
Should you want to access/edit the files created in a WSL-only location from Windows (the other
way around from what we recommend), you can do the following. In your Ubuntu terminal, enter
explorer.exe . (with the dot!) This will open the current folder in Windows Explorer. Please
note that this location of \\wsl$\Ubuntu-20.04\... is only accessible while WSL is running.
2.1.3 Virtual Machine
With a virtual machine you can run a virtual computer as a program on your computer. We provide
you with a virtual machine (VM) with Debian and the necessary tools already installed. We also
pre-installed useful editors and their assembly syntax highlighting support (see Section 2.2). You
can run this on your own machine using VirtualBox.
Installing and Starting
1. Download the virtual machine from Brightspace
(Content → Course Resources → Lab → Virtual Machine).
2. Download VirtualBox from https://www.virtualbox.org/wiki/Downloads and install it.
3. Open Oracle VirtualBox.
4. Click the “Import” button.
5. Select the virtual machine you downloaded in step 1 and click Next.
6. Leave all settings at the default settings and click Import.
The virtual machine is now ready to be used. Click the “Start” button (green arrow) to start it.
To log in use the following credentials:
username: student
password: pwd
Troubleshooting
The virtual machine is very slow.
You can increase the number of CPU cores available to the virtual machine which might alleviate
this problem. Select the virtual machine, click Settings → System → Processor and increase the
number of processors. Ensure that the amount selected falls within the green bar displayed by
VirtualBox.
• Go to your BIOS settings. This can usually be done by pressing F12 while booting. To enter
your BIOS settings from Windows:
– Press start.
– Click the power icon.
– Hold shift while clicking “Restart”.
– Wait for a bit until the Windows UEFI settings appear.
– Click “Troubleshoot”.
– Click “Advanced options”.
– Click “UEFI Firmware settings”. This will reboot your PC into the BIOS settings.
• Inside the BIOS Settings, go to the “Advanced” menu, then to “System”, and enable “Vir-
tualization options”.
• Make sure to “Save and Exit”, and not just “Exit” the BIOS Settings.
• Once your PC has rebooted, you should be able to run the virtual machine in VirtualBox.
If not, please ask the lab assistants for help.
For other laptops, the BIOS setting might have a different name or be under a different category.
Try googling for “enable virtualization on <model of your laptop>” to find instructions specific
to your laptop.
2.2 Editors & Syntax Highlighting
Sublime Text
You can add syntax support for x86 assembly as follows:
1. Go to Tools → “Install Package Control...” and wait for the installation to complete
2. Go to Tools → “Command Palette...” and enter “Package Control”
3. Select “Package Control: Install Package” (See Figure 2.1)
4. Enter x86, click on GAS-x86 (See Figure 2.2)
To enable the highlighting for a file, select View → Syntax → GAS/AT&T x86/x64. You can also
set it as the default for all files of the same extension by going to View → Syntax → Open all with
current extension as... and then selecting GAS/AT&T x86/x64.
Save your files with the .s extension and VS Code will automatically enable the syntax highlighting.
Figure 2.2: Installing x86 Syntax Highlighting for Sublime Text
gedit
To install x86 assembly syntax highlighting for gedit, open a terminal and paste the following
command:
wget https://raw.githubusercontent.com/calculuswhiz/gedit-GAS-x86_64-highlighter/master/GAS_x86-64.lang
sudo cp GAS_x86-64.lang /usr/share/gtksourceview-3.0/language-specs/
Save your file with the .s extension and gedit will automatically enable the syntax highlighting.
Otherwise, go to View, Highlight Mode..., and select “GAS (x86-64)”.
VIM
Ensure that git is installed, then execute the following:
Save your file with the .s extension and vim will automatically enable the syntax highlighting.
Emacs
Emacs has an asm-mode which adds some support in writing assembly. There is a GAS mode
for Emacs as well, which you can download at https://github.com/Taeir/emacs-gas/blob/master/gas-mode.el.
./nameofyourprogram
where nameofyourprogram is the name you want to give your executable file and nameofyoursource.s
is the file in which you wrote your piece of code.
The longer story: in order to create an executable program out of your assembly source code
you need to assemble it using a tool called gas, the GNU assembler. This creates a so-called
object file 1 . Since your actual program may consist of subroutines that are defined in different
object files and libraries, the resulting object file(s) need to be linked into an executable using the
tool ld, the linker. Unfortunately, it is also necessary to include a host of other files in the linking
process, such as the standard C library and the C runtime environment, to produce a proper Linux
executable. The exact files to link may differ from one Linux distribution to another and the list
may become rather long, which is why we do what every sane person does: we swallow our pride
and cheat by simply using gcc, the GNU compiler collection, to call gas and ld on our behalf. If
you are curious as to how bad the actual calls look, try using gcc in verbose mode with the -v
flag. You will see that we did not succumb without battle:
The -no-pie flag has little to do with pie; PIE stands for Position Independent Executable and is
an effort to make binaries less predictable and hacking harder. Unfortunately, it would make the
assignments a bit harder as well, so you will have to provide -no-pie for now to disable it. More
information can be found on the Internet 2 .
This will tell gcc to compile your program with debug flags enabled. In order to start the debugger,
type in the following command:
gdb ./nameofyourprogram
You will enter a gdb shell, where you can use some basic commands to instruct the debugger.
For example, break myfile.s:10 will set a breakpoint in the file ‘myfile.s’ on line 10 and run
will run your program. When execution is paused (either at a breakpoint or due to a segfault), you
can use info reg to inspect the current register contents or next to step to the next instruction.
Many other commands can be found (for example) in this cheatsheet4 .
Chapter 3
Mandatory Content
In this chapter you will study the mandatory content of this lab, which includes the development
process, the implementation of a simple, non-trivial example program and some basic programming
constructs you will need within this lab. This chapter also contains the mandatory assignments.
The manual is structured so that all the knowledge you need for an assignment is located
before it, so make sure to read everything in order to avoid confusion about what to do.
In the assignments you can borrow ideas from the example for your own programs but for now,
the most important thing is that you will learn:
Step 1: Description
The program we are going to write is quite simple: it will ask the user for an input number and
increment it by 1 if it is even, otherwise return the same number. For simplicity, we will assume
that the input will always be greater than or equal to 0. The following table illustrates some
possible inputs and their outputs:
input output
0 1
1 1
2 3
3 3
10 11
21 21
42 43
1041 1041
Step 2: Specification
Now that we are familiar with what our program is supposed to do, we will transform the de-
scription into a formal specification. Why do we have to do this? Well, because in the real world,
there are many small, practical details that need to be considered before we can actually imple-
ment an algorithm. One such problem is the problem of representation: how will we represent
the information in our program? Will we use a linked list1 structure, an array, a map? Is a
structure even needed? What implications will it have for the complexity of our program if we
choose one representation over another? Resolving such questions is part of the creative challenge
of programming, so usually you will have to decide on these matters for yourself. However, we
will always expect you to formalise these decisions in the form of a good specification, before you
start programming. As an example of what we consider to be a good specification, we present the
specification of our program below.
Since this program is very simple, it does not actually require any structure to hold our data
but simply a single variable on which we will perform the operations. Here is the pseudocode2
that describes it:
main() {
    // print the welcome string
    print("Welcome to our program!")
    // call the inout subroutine
    inout()
}

inout() {
    // ask for the input
    int NUMBER = read(keyboard input)
    // increment the number if it is even
    if (NUMBER is even) {
        NUMBER = NUMBER + 1
    }
    // print the outcome
    print(NUMBER)
}
The pseudocode above uses only simple operations on a simple data type (an integer) and an
if statement which is a basic programming construct. These constructs can easily be translated
to assembler programs, as we shall see in the next step.
Step 3: Implementation
The final step in our development process is to translate the specification into working assembler
code. Here, we present the complete implementation of the program in working x86-64 assembler.
In later exercises, you can use this program as a template for your own work. Try to read along
and understand what happens, using the comments in the code and the subsequent explanations
as a guide. Do not be intimidated if you do not understand all the details just yet; the following
sections contain a detailed explanation of the language.
# ****************************************************************************
# * Program: Oddifier                                                        *
# * Description: This program prints the closest odd number >= the input     *
# ****************************************************************************

.text
welcome: .asciz "\nWelcome to our program!\n"
prompt:  .asciz "\nPlease enter a positive number:\n"
input:   .asciz "%ld"
output:  .asciz "The result is: %ld.\n\n"

.global main
main:
    pushq %rbp                  # push the base pointer (this also aligns the stack)
    movq %rsp, %rbp             # initialize the base pointer

    movq $0, %rax               # no vector registers in use for printf
    movq $welcome, %rdi         # first parameter: welcome string
    call printf                 # call printf to print welcome

    call inout                  # call the subroutine inout

    movq $0, %rdi               # exit code 0
    call exit                   # exit the program

# ****************************************************************************
# * Subroutine: inout                                                        *
# * Description: this subroutine takes an integer as input from a user,      *
# *              increments it by 1 if it is even, and prints it out         *
# * Parameters: there are no parameters and no return value                  *
# ****************************************************************************
inout:
    # prologue
    pushq %rbp                  # push the base pointer
    movq %rsp, %rbp             # copy stack pointer value to base pointer

    movq $0, %rax               # no vector registers in use for printf
    movq $prompt, %rdi          # param1: prompt string
    call printf                 # call printf to print prompt

    subq $16, %rsp              # reserve stack space for the number
    movq $0, %rax               # no vector registers in use for scanf
    movq $input, %rdi           # param1: input format string
    leaq -8(%rbp), %rsi         # param2: address of the reserved space
    call scanf                  # call scanf to read the number

    movq -8(%rbp), %rsi         # load the number into RSI
    movq %rsi, %rax             # copy the number
    andq $1, %rax               # keep only the lowest bit
    cmpq $1, %rax               # is the lowest bit set?
    je odd                      # if so, the number is already odd
    incq %rsi                   # otherwise, increment the even number

odd:
    movq $0, %rax               # no vector registers in use for printf
    movq $output, %rdi          # param1: output string
                                # param2: number (in RSI)
    call printf                 # call printf to print the output

    # epilogue
    movq %rbp, %rsp             # clear local variables from stack
    popq %rbp                   # restore base pointer location
    ret                         # return from subroutine
1 http://en.wikipedia.org/wiki/Linked_list
2 https://en.wikipedia.org/wiki/Pseudocode
3.2 Assembler Directives
The commands that start with a period (e.g. .bss, .text, .global, .skip, .asciz, etc.) are
assembler directives. Assembler directives have special functions in an assembler program. For
instance, the .text directive at the beginning of the file tells the assembler to put all the subsequent
code in a specific section. Other assembler directives, like .global, make certain labels visible to
the outside world. The following is a description of the most commonly used assembler directives
or pseudo-instructions as they are sometimes called. The directives are grouped by functionality.
For a full reference, see the official documentation of the GNU assembler3 .
The memory space of a program is divided into three different sections. These directives tell the
assembler in which section it should put the subsequent code.
The .text segment is intended to hold all instructions. The .text segment is read-only. It is
perfectly fine to include constants and ASCII strings in this segment.
The .data segment is used for initialised variables (variables that receive an initial value at
the time you write your program, such as those created with the .word directive).
The .bss segment is intended to hold uninitialised variables (variables that receive a value
only at runtime). Therefore, this section is not part of the executable file after compilation, unlike
the other two sections.
The .equ directive can be used to define symbolic names for expressions, such as numeric constants.
An example of usage is given below:
.equ FOO, 1024
The .byte, .word, .long and .quad directives can be used to reserve and initialise memory
for variables and/or constants. Just as the assembler translates instructions into bits of memory
contents directly (as explained in subsection 1.2), these directives will be transformed into memory
contents as well, i.e. there is no special magic involved here. .byte reserves one byte of memory,
.word reserves two bytes of memory, .long reserves four bytes and .quad reserves eight bytes. Whether
these bytes will actually be writable depends on the section in which you define them (see above
for a description of the sections). Each directive allows you to define more than one value in a
comma separated list.
A few examples:
FOO: .byte 0xAA, 0xBB, 0xCC     # three bytes starting at address FOO
BAR: .word 2718, 2818           # a couple of words
BAZ: .long 0xDEADBEEF           # a single long
BAK: .quad 0xDEADBEEFBAADF00D   # a single quadword
3 https://sourceware.org/binutils/docs-2.25/as/Pseudo-Ops.html#Pseudo-Ops
Note that the x86-64 is a little-endian machine, which means that a value like .long 0x01234567
will actually end up in memory as 67 45 23 01. Of course, you normally do not notice this, since
the movl instruction applies the same byte order when it loads the long from memory back into
a register. Taking endianness into account, it should be clear that the following four statements
are completely equivalent:
FOO: .byte 0x0D, 0xF0, 0xAD, 0xBA, 0xEF, 0xBE, 0xAD, 0xDE
FOO: .word 0xF00D, 0xBAAD, 0xBEEF, 0xDEAD
FOO: .long 0xBAADF00D, 0xDEADBEEF
FOO: .quad 0xDEADBEEFBAADF00D
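To see the byte order for yourself, you can read a .long back one byte at a time. The program below is a minimal sketch and not part of the original lab material: the labels value and fmt are made up for this illustration, and we assume it is built with gcc and the -no-pie flag, like the other programs in this manual.

```asm
# Sketch (assumption: built with gcc -no-pie; labels are illustrative only).
# Prints the four bytes of a .long in the order in which they sit in memory.
.data
value:  .long 0x01234567            # stored in memory as 67 45 23 01

.text
fmt:    .asciz "%02lx %02lx %02lx %02lx\n"

.global main
main:
    pushq %rbp                      # prologue (this also aligns the stack)
    movq %rsp, %rbp

    movq $fmt, %rdi                 # param1: format string
    movzbq value, %rsi              # param2: byte at address value
    movzbq value+1, %rdx            # param3: byte at address value + 1
    movzbq value+2, %rcx            # param4: byte at address value + 2
    movzbq value+3, %r8             # param5: byte at address value + 3
    movq $0, %rax                   # no vector registers in use for printf
    call printf                     # prints "67 45 23 01"

    movq $0, %rdi                   # exit code 0
    call exit
```

The movzb instruction loads a single byte and zero-extends it, so each register holds exactly one byte of the stored value.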
Sometimes it is necessary to reserve memory in bigger chunks than bytes, words, longs or quads.
The .skip directive can be used to reserve blocks of memory of arbitrary size:
BUFFER: .skip 1024      # reserve 1024 bytes of memory
Placing this directive in the .data section will initialize all bytes with zero and store them in
the executable file. Placing it in .bss stores only the size of the block in the executable file; on
Linux, the operating system fills this memory with zeros when the program is loaded.
These directives can be used to reserve and initialise blocks of ASCII encoded characters. In
many higher-level programming languages, including C, strings are simply blocks of ASCII codes
terminated by a zero byte (0x00). The .asciz directive adds such a zero byte automatically. The
following two examples are thus equivalent:
WELCOME: .ascii "Hello!!"       # A string..
         .byte 0x00             # ..followed by a 0-byte.
WELCOME: .asciz "Hello!!"       # A string followed by a 0-byte.
This directive enters a label into the symbol table. The symbol table is a table of contents of sorts
which is contained in the binary assembled file. Publishing labels in the symbol table is useful if
you want other programs to have access to your labels, e.g. if you want the labels to be visible in
the debugger or if you want other programs to use your subroutines4 . One very important use of
the symbol table is to export the main label. This label must be exported because the operating
system needs to know where to start running your program.
.global main
4 Sharing subroutines is not part of this lab course, but if you are interested you can have a look at subsection
4.8
3.3 x86-64 Assembly Language
This subsection contains a short language reference for the x86-64 assembly language. Apart from
a list of commonly used instructions, there is a short rundown of the differences between the so
called “AT&T syntax” and the “Intel syntax”. This course uses the GNU assembler 5 . Because
this assembler uses the AT&T syntax by default, we use this syntax throughout the course. If you
are also interested in the Intel syntax, you can find it in the official Intel x86-64 platform manual.
stands for “long” (4 bytes) and q stands for “quadword” (8 bytes). As an example, take a look at
four different uses of the push instruction:
pushb $3    # Push one byte onto the stack (0x03)
pushw $3    # Push two bytes onto the stack (0x0003)
pushl $3    # Push four bytes onto the stack (0x00000003)
pushq $3    # Push eight bytes onto the stack (0x0000000000000003)
All four instructions in the example push the literal value ‘3’ onto the stack, but the actual size of
the operand is different in each situation. The example only illustrates the suffixes: in 64-bit
mode the assembler accepts just pushw and pushq, since there is no one-byte push and pushl is
only available in 32-bit mode. We will do assembly in 64-bit, so mostly you will have
to use the q suffix.
In our instruction set reference table in section 3.3.3, we do not list these suffixes explicitly.
The Intel manual does not list them either.
Partial Registers
The size suffix is especially important when you use partial registers. The x86-64 allows you to
address smaller parts of the 64-bit registers through special names. By replacing the initial R
with an E on the first eight registers, it is possible to access the lower 32 bits. If we use the RAX
register as an example, you can address the least significant 32 bits of this register by using the
name EAX. To access the lower 16 bits the initial R should be removed (AX for RAX). In a similar
fashion, the highest and lowest order bytes in this AX register can be addressed by the names AH
and AL respectively. Again, we present a few examples using the mov instruction:
movl %eax, %ebx     # Copy 32-bit values between registers
movq %rax, %rbx     # Copy 64-bit values between registers
movw %ax, %bx       # Copy only the lowest order 16 bits
movb %al, %bl       # Copy only the lowest order 8 bits
movb %ah, %al       # Copy 8 bits within a single register
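One detail is easy to miss when mixing register sizes: writing to an 8-bit or 16-bit partial register leaves the remaining bits of the 64-bit register untouched, whereas writing to a 32-bit partial register clears the upper 32 bits. The following sketch is our own illustration (the label fmt is hypothetical and we assume a build with gcc -no-pie); it records RAX after each kind of write and prints the three results.

```asm
# Sketch (assumption: built with gcc -no-pie; the label fmt is illustrative).
# Shows how writes to AL, AX and EAX affect the full RAX register.
.text
fmt:    .asciz "%lx\n%lx\n%lx\n"

.global main
main:
    pushq %rbp                       # prologue (this also aligns the stack)
    movq %rsp, %rbp

    movq $0x1122334455667788, %rax   # fill all 64 bits of RAX
    movb $0xFF, %al                  # 8-bit write: only the lowest byte changes
    movq %rax, %rsi                  # param2: 0x11223344556677ff

    movw $0, %ax                     # 16-bit write: only the lowest 16 bits change
    movq %rax, %rdx                  # param3: 0x1122334455660000

    movl $1, %eax                    # 32-bit write: the upper 32 bits become zero
    movq %rax, %rcx                  # param4: 0x1

    movq $fmt, %rdi                  # param1: format string
    movq $0, %rax                    # no vector registers in use for printf
    call printf                      # prints the three values in hexadecimal

    movq $0, %rdi                    # exit code 0
    call exit
```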
The following table shows how you can access the lower X bytes of each register:
Addressing Memory
There are several ways of addressing the memory in AT&T syntax:
• Immediate addressing:
– Directly using a label will yield the value at the address of the label. (e.g. label)
– Prefixing a label with a $ will yield the address of the label. (e.g. $label)
• Indirect addressing:
– Surrounding a register with parentheses will yield the value at the memory address
stored in the register. (e.g. (%RAX))
– Prefixing the left parenthesis with a displacement will yield the value at the memory
address stored in the register plus the displacement (e.g. -8(%RBP)).
– Advanced: Accessing memory with displacement(base, index, scale) will yield the
value at memory address “displacement + base + index × scale”. Here, displacement
is a constant expression (may include labels), base and index are registers, and scale is
either 1, 2, 4, or 8. (e.g. table(%RDI, %RCX, 8))
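The scaled form is most useful for walking through arrays. As a hedged sketch (the labels table, fmt and next are made up for this example, the base register is simply left out, and we assume a build with gcc -no-pie), the program below sums four quadwords by using RCX as the index and 8 as the scale:

```asm
# Sketch (assumption: built with gcc -no-pie; labels are illustrative only).
# Sums the elements of an array using displacement(, index, scale) addressing.
.data
table:  .quad 10, 20, 30, 40        # four 8-byte values

.text
fmt:    .asciz "sum = %ld\n"

.global main
main:
    pushq %rbp                      # prologue (this also aligns the stack)
    movq %rsp, %rbp

    movq $0, %rsi                   # RSI: running sum
    movq $0, %rcx                   # RCX: index into the table
next:
    addq table(, %rcx, 8), %rsi     # add the value at address table + RCX * 8
    incq %rcx                       # move to the next element
    cmpq $4, %rcx
    jl next                         # repeat while RCX < 4

    movq $fmt, %rdi                 # param1: format string, param2 is in RSI
    movq $0, %rax                   # no vector registers in use for printf
    call printf                     # prints "sum = 100"

    movq $0, %rdi                   # exit code 0
    call exit
```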
• Most of the instructions with two operands require at least one of their operands to be a
register. The other operand may be either a register or a memory location. This may differ
per instruction and is specified in detail in the official Intel manual 7 .
• The multiplication and division instructions require their operands to be in special registers8 .
The multiplication instruction will store the result in both %RDX and %RAX (denoted in the
table by %RDX:%RAX), while the division instructions will require the dividend to be in these
two registers. The higher-order bits should be in %RDX while the lower-order bits should be
in %RAX. Note that the division operator also stores the remainder of the division in %RDX.
6 https://sourceware.org/binutils/docs-2.25/as/i386_002dMemory.html#i386_002dMemory
7 http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html,
Mnemonic  Operands   Action                             Description
Data Transfer
mov.      SRC, DST   DST = SRC                          Copy.
pushq     SRC        %RSP -= 8, (%RSP) = SRC            Push a value onto the stack.
popq      DST        DST = (%RSP), %RSP += 8            Pop a value from the stack.
xchg.     A, B       TMP = A, A = B, B = TMP            Exchange two values.
movzb.    SRC, DST   DST = SRC (one byte only)          Move byte, zero extended.
movzw.    SRC, DST   DST = SRC (one word only)          Move word, zero extended.
Arithmetic
add.      SRC, DST   DST = DST + SRC                    Addition.
sub.      SRC, DST   DST = DST - SRC                    Subtraction.
inc.      DST        DST = DST + 1                      Increment by one.
dec.      DST        DST = DST - 1                      Decrement by one.
mul.      SRC        %RDX:%RAX = %RAX * SRC             Unsigned multiplication.
imul.     SRC        %RDX:%RAX = %RAX * SRC             Signed multiplication.
div.      SRC        %RAX = %RDX:%RAX / SRC             Unsigned division.
                     %RDX = %RDX:%RAX % SRC
idiv.     SRC        %RAX = %RDX:%RAX / SRC             Signed division.
                     %RDX = %RDX:%RAX % SRC
Branching
jmp       ADDRESS                                       Jump to address (or label).
je        ADDRESS                                       Jump if equal.
jne       ADDRESS                                       Jump if not equal.
jg        ADDRESS                                       Jump if greater than.
jge       ADDRESS                                       Jump if greater or equal.
jl        ADDRESS                                       Jump if less than.
jle       ADDRESS                                       Jump if less or equal.
call      ADDRESS                                       Jump and push return address.
ret                                                     Pop address and jump to it.
loop      ADDRESS                                       Decrement %RCX, jump if not zero.
Logic and Shifting
cmp.      A, B       B - A (only sets flags)            Compare and set condition flags.
xor.      SRC, DST   DST = SRC ^ DST                    Bitwise exclusive or.
or.       SRC, DST   DST = SRC | DST                    Bitwise inclusive or.
and.      SRC, DST   DST = SRC & DST                    Bitwise and.
shl.      A, DST     DST = DST << A                     Shift left by A bits.
shr.      A, DST     DST = DST >> A                     Shift right by A bits.
Other
lea.      A, DST     DST = &A                           Load effective address.
int       INT_NR                                        Software interrupt.
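The division instructions are the ones that most often trip people up, because the dividend is spread over two registers. The sketch below is our own example (the label fmt and the numbers are arbitrary, and we assume a build with gcc -no-pie). It divides 47 by 5 and prints both the quotient and the remainder; note that %RDX has to be cleared first, since it holds the upper half of the dividend.

```asm
# Sketch (assumption: built with gcc -no-pie; label and numbers are illustrative).
# Unsigned division of %RDX:%RAX by a 64-bit divisor.
.text
fmt:    .asciz "%ld / %ld = %ld, remainder %ld\n"

.global main
main:
    pushq %rbp                      # prologue (this also aligns the stack)
    movq %rsp, %rbp

    movq $0, %rdx                   # upper 64 bits of the dividend
    movq $47, %rax                  # lower 64 bits of the dividend
    movq $5, %rcx                   # divisor
    divq %rcx                       # RAX = 47 / 5 = 9, RDX = 47 % 5 = 2

    movq %rdx, %r8                  # param5: remainder
    movq %rcx, %rdx                 # param3: divisor
    movq %rax, %rcx                 # param4: quotient
    movq $47, %rsi                  # param2: dividend
    movq $fmt, %rdi                 # param1: format string
    movq $0, %rax                   # no vector registers in use for printf
    call printf                     # prints "47 / 5 = 9, remainder 2"

    movq $0, %rdi                   # exit code 0
    call exit
```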
3.4 Registers & Variables
The example program uses a variable to store data. We see in the comments that we are using
a variable called “number” which corresponds to the one used in the pseudocode. But where do
these variables live? Do they exist in the registers, on the stack or somewhere in main memory?
The answer is: all of the above. Sometimes, like in the case of number, we can simply keep our
variables in the registers. The registers are fast and easy to access, so if possible we like to keep
our variables there. The number of registers is limited however, so sometimes we may have to
temporarily push their values on the stack and later pop them back into a register when we need
to check the value.
Aside from register shortage, there is one very important reason to store variables on the stack
in some cases: on the x86-64 platform, most registers are caller saved by convention. This means that
if you call a subroutine from your program, like printf or one of your own subroutines, the
subroutine may and will likely overwrite some of your registers. In other words, if you need the
data in your registers to be consistent after you call a subroutine, you will need to save it on the
stack. We will delve into stack details in one of the exercises.
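As a small illustration of this rule, the sketch below keeps a value in RCX, stores it in a local stack slot before calling printf, and reloads it afterwards. This is our own example rather than part of the original material: the labels first and second are made up, we assume a build with gcc -no-pie, and we reserve 16 bytes instead of 8 so that the stack pointer stays a multiple of 16.

```asm
# Sketch (assumption: built with gcc -no-pie; labels are illustrative only).
# Saves a caller-saved register on the stack around a subroutine call.
.text
first:  .asciz "Calling printf...\n"
second: .asciz "Saved value: %ld\n"

.global main
main:
    pushq %rbp                      # prologue (this also aligns the stack)
    movq %rsp, %rbp
    subq $16, %rsp                  # reserve stack space (a multiple of 16 bytes)

    movq $42, %rcx                  # RCX holds a value we still need later
    movq %rcx, -8(%rbp)             # save it on the stack before the call

    movq $first, %rdi               # param1: first string
    movq $0, %rax                   # no vector registers in use for printf
    call printf                     # printf is free to overwrite RCX

    movq -8(%rbp), %rsi             # param2: reload the saved value
    movq $second, %rdi              # param1: second string
    movq $0, %rax                   # no vector registers in use for printf
    call printf                     # prints "Saved value: 42"

    movq $0, %rdi                   # exit code 0
    call exit
```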
low address
.text
.data
.bss
heap
free memory
RSP
stack
high address
The parts of the memory labeled .text, .data, and .bss contain all program instructions
and other data originating from assembler directives. More information on these memory sections
can be found in the assembler directive reference (section 3.2). The heap is used to store data
allocated using the C functions malloc and calloc. 9
3.5.2 Cleaning up the Stack
Of course, if the stack grows too large, it will eventually overwrite your program’s code and data.
This is called a stack overflow and it is usually indicative of a serious design flaw in your program.
To avoid this problem, the caller must clean up the stack after every function call by adding the
parameter-block size to the stack pointer directly. This is because (as you will read in section 3.6)
the stack is at times used to pass parameters to subroutines. Look at the simple example of a
printf call below. Cleaning the stack should not be more difficult than this:
pushq $42               # Push a magic number, the seventh argument
movq ..., %...          # Move arguments 2 through 6 to their registers
movq $formatstr, %rdi   # First argument: the format string
movq $0, %rax           # no vector arguments for printf
call printf             # Print the numbers
addq $8, %rsp           # Clean the stack (pop the magic number)
3.5.5 Accessing arguments passed via the stack
Take another look at the memory layout in the previous image. Now imagine that the current
subroutine is one that takes nine arguments. Knowing how the memory is laid out on the “border”
between two subroutines, we know that we can find the stack arguments two memory spaces (or,
16 bytes) below the current base pointer in the image (that is, at higher addresses, starting at
16(%RBP)). You can use indirect memory addressing (see paragraph 3.3.2) to retrieve these values.
3.6 Subroutines
A subroutine is simply a block of instructions which starts at some memory address (indicated
with a label). If we want to execute or call a subroutine, we simply need to jump10 to its first
instruction. After executing the subroutine we expect control to return to us, i.e. we expect the
program to return to the first instruction after our subroutine call. To make this possible, the
called subroutine should somehow be aware of the address of the next instruction after the call.
By convention, we simply push that address onto the stack right before making the jump. To our
ease and comfort, the kind people at Intel provided a single instruction that performs both these
steps in one fell swoop: the call instruction. Calling a simple subroutine is thus no more difficult
than this:
call somesub        # call the somesub subroutine
The label somesub in this example is associated with the starting address of the subroutine that
we want to call. The somesub-subroutine will now execute all of its instructions and finish off with
the ret instruction. This instruction will pop the return address (which the call instruction had
pushed) from the stack and jump to it, so execution will simply return to the first instruction after
the call instruction.
Passing Parameters
Usually, we will want to pass some parameters to a subroutine. To do this, we need to put them
somewhere where the subroutine can find them when it executes. We are more or less free to
choose between using the registers, the stack or some part of memory other than the stack to
store these parameters, as long as both the writer of the subroutine and the user agree on the
location. By convention11 we will use registers for this purpose. More specifically, we will place
the arguments in the following registers:
1. %RDI
2. %RSI
3. %RDX
4. %RCX
5. %R8
6. %R9
10 A “jump” is nothing more than loading a new memory address into the program counter, or RIP, as this register
is called on the x86-64.
11 This convention is the so called “C calling convention” and if we adhere to it our subroutines and calls will be
fully compatible with the system’s standard C library. You can find the full documentation of the calling conventions
here: https://web.archive.org/web/20160801075139/http://www.x86-64.org/documentation/abi.pdf
We will clarify this by an example. Let us assume that we have a subroutine called foo, that
takes three integer arguments, i.e. the signature of the subroutine is foo(int a, int b, int c).
Imagine that we want to call foo with the parameters 1, 5 and 2, i.e. foo(1, 5, 2) in pseudocode.
In assembler, we copy the arguments in the registers and execute the call instruction to call this
subroutine:
movq $2, %rdx       # third argument
movq $5, %rsi       # second argument
movq $1, %rdi       # first argument
call foo            # Call the subroutine
If the subroutine that you are calling needs more than six arguments, then the remaining
arguments need to be pushed to the stack in reverse order (first argument pushed last). Note that
the called subroutine will not remove the arguments from the stack, so you should pop them off
yourself after the call returns. If you are interested in writing your own subroutine that needs
more than six arguments, see paragraph 3.5.5.
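To make both sides of such a call concrete, here is a hedged sketch of a subroutine with seven arguments. The subroutine sum7 and the label fmt are hypothetical names chosen for this example, and we assume a build with gcc -no-pie. The caller passes the seventh argument on the stack (with 8 bytes of padding to keep the stack aligned), and the callee reads it at 16(%rbp), just above the saved base pointer and the return address. The result is returned in RAX, as is usual for the C calling convention.

```asm
# Sketch (assumption: built with gcc -no-pie; sum7 and fmt are illustrative names).
# Passes a seventh argument on the stack and reads it back in the callee.
.text
fmt:    .asciz "sum = %ld\n"

.global main
main:
    pushq %rbp                      # prologue (this also aligns the stack)
    movq %rsp, %rbp

    subq $8, %rsp                   # padding: keeps RSP a multiple of 16 at the call
    pushq $7                        # seventh argument goes on the stack
    movq $6, %r9                    # sixth argument
    movq $5, %r8                    # fifth argument
    movq $4, %rcx                   # fourth argument
    movq $3, %rdx                   # third argument
    movq $2, %rsi                   # second argument
    movq $1, %rdi                   # first argument
    call sum7                       # RAX = 1 + 2 + ... + 7
    addq $16, %rsp                  # remove the argument and the padding

    movq %rax, %rsi                 # param2: the returned sum
    movq $fmt, %rdi                 # param1: format string
    movq $0, %rax                   # no vector registers in use for printf
    call printf                     # prints "sum = 28"

    movq $0, %rdi                   # exit code 0
    call exit

sum7:
    pushq %rbp                      # prologue
    movq %rsp, %rbp

    movq %rdi, %rax                 # start with the first argument
    addq %rsi, %rax                 # add arguments two through six
    addq %rdx, %rax
    addq %rcx, %rax
    addq %r8, %rax
    addq %r9, %rax
    addq 16(%rbp), %rax             # seventh argument: 0(%rbp) is the old RBP,
                                    # 8(%rbp) is the return address

    movq %rbp, %rsp                 # epilogue
    popq %rbp
    ret
```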
Stack Alignment
If you are going to use the stack in your code, you need to make sure that the stack remains
16-byte aligned. This means that the %RSP register should always be a multiple of 16 when you
do a call.12
If you are not using the stack inside a subroutine, then this is easy: any call instruction
pushes an 8 byte return address and in the prologue of your function you push the old %RBP value.
These two pushes together are exactly 16 bytes, thus ensuring your stack remains aligned. For
more information on why you need to push the %RBP register to the stack, see paragraphs 3.5.3
and 3.5.4.
Note that the main routine will also be called with an aligned stack, but the return address
pushed by this call causes the stack to be unaligned again.
3.6.2 Recursive Subroutines
A recursive subroutine is a subroutine that calls itself during its execution. This enables the
subroutine to repeat itself for a number of times. Below is the pseudocode of a recursive example
function. For a given x less than or equal to 42, the function calculates and returns the sum of all
integers from x to 42. Pseudocode:
function example(x) {
    if (x == 42)
        return 42;
    else
        return (x + example(x + 1))
}
With recursive subroutines there is still an issue: when does the routine need to stop
calling itself? To prevent infinite recursion, you need to determine a recursive case and a base case
(or stop condition). With this example it would be logical to stop the recursion when the function
receives an input value of 42. This is done by checking for the condition at every invocation of
the function. If the condition holds, we can return a known correct value. If the condition does not
hold, the function will call itself with different parameters and use the result of that invocation to
compute the correct value.
Recursive functions are often used in computer science because they allow programmers to
express a solution in very compact code. However, recursion never terminates (and eventually
overflows the stack) when the stop condition is not written properly.
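One possible translation of the pseudocode into assembly is sketched below. It is only an illustration, not a reference solution: the labels base, done and fmt are our own, the result is returned in RAX, and we assume a build with gcc -no-pie. The important point is that x lives in a local stack slot, because the recursive call overwrites the registers.

```asm
# Sketch (assumption: built with gcc -no-pie; labels are illustrative only).
# Recursive subroutine: example(x) = x + example(x + 1), with example(42) = 42.
.text
fmt:    .asciz "example(40) = %ld\n"

.global main
main:
    pushq %rbp                      # prologue (this also aligns the stack)
    movq %rsp, %rbp

    movq $40, %rdi                  # first argument: x = 40
    call example                    # RAX = 40 + 41 + 42

    movq %rax, %rsi                 # param2: the result
    movq $fmt, %rdi                 # param1: format string
    movq $0, %rax                   # no vector registers in use for printf
    call printf                     # prints "example(40) = 123"

    movq $0, %rdi                   # exit code 0
    call exit

example:
    pushq %rbp                      # prologue
    movq %rsp, %rbp
    subq $16, %rsp                  # reserve stack space for x

    cmpq $42, %rdi                  # base case: x == 42?
    je base

    movq %rdi, -8(%rbp)             # recursive case: save x on the stack
    incq %rdi                       # argument for the recursive call: x + 1
    call example                    # RAX = example(x + 1)
    addq -8(%rbp), %rax             # RAX = x + example(x + 1)
    jmp done
base:
    movq $42, %rax                  # return 42
done:
    movq %rbp, %rsp                 # epilogue
    popq %rbp
    ret
```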
3.7 I/O
If you examine the pseudocode and the resulting assembly code of the example carefully, you see
that we have translated the print(number); statement into the following lines of assembler code:
Doing I/O in an assembler program can be quite tricky. First of all, normal processes do not
have permission to access the hardware I/O devices directly, so all input and output has to be
handled by the operating system. Since different operating systems have different ways of doing
things it isn’t very useful to teach you the specifics of one system13 . Instead, we will use the
operating system’s standard C library to do I/O for us. Calling functions in the C library is no
different from calling subroutines in your own programs. This has many benefits. First of all,
there is a standard C library available on most operating systems and second, it will do some
nice tricks for us such as ASCII-to-integer conversions and vice versa. In this subsection we will
discuss the printf and scanf subroutines from the standard C library. Both of these functions
take a variable number of arguments, also known as “varargs”. Such functions
take an extra (hidden) argument in RAX, defining the number of vector registers used in the call.
During this lab we will not be using these registers, so you should always load a zero into RAX.
Basic Example
In its simplest form, printf takes only one argument: the memory address of a string of ASCII
characters. We will now present a pseudocode example followed by an assembly example.
Pseudocode:
printf("Hello world!\n");
Assembly:
mystring: .asciz "Hello world!\n"
movq $0, %rax           # no vector registers in use for printf
movq $mystring, %rdi    # load address of a string
call printf             # Call the printf routine
As you can see there are two strange details in this example. First of all, we include the special
‘\n’-sequence inside the string. This is translated by the assembler to a single ‘newline’-character.
Second, we do not actually provide the entire string as an argument, but rather just the memory
address of the first character of the string, which is denoted by the mystring label14 . By conven-
tion, C functions know where a string ends by looking for a byte with the value 0x00. That byte
indicates the end of the string.
Printing Variables
In addition to simple printing, we can also use printf to print variables and other calculated
output to the terminal. We do this by embedding special character sequences in our string and by
passing extra values to printf. Note that these character sequences have no special meaning for
the assembler like ‘\n’, instead they are understood by the printf function (and related functions).
We give another example.
Pseudocode:
printf("I am %ld years old\n", 25);
Assembly:
mystring: .asciz "I am %ld years old\n"
movq $0, %rax           # no vector registers in use for printf
movq $25, %rsi          # load the value
movq $mystring, %rdi    # load the string address
call printf             # Call the printf routine
The printf function will automatically convert the integer value 25 to an ASCII representation
of the decimal number 25 and it will substitute the value into the string at the point where the
‘%ld’ sequence is encountered. The ‘%ld’ sequence simply tells printf that it may expect an
extra argument and that the argument must be interpreted as a long decimal number (64 bits)
for printing.
Specifier Usage
%d decimal number (32 bits)
%lx hexadecimal number (64 bits, using lowercase a-f)
%lX hexadecimal number (64 bits, using uppercase A-F)
%lu unsigned decimal number (64 bits)
%c character
%s string of characters (passed as a memory address)
%% the literal character ‘%’
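Several specifiers can be combined in one format string; each one consumes the next argument register in order. The sketch below is our own example (the labels fmt and name and all values are arbitrary, and we assume a build with gcc -no-pie). It uses %s, %lu, %lx, %c and %% in a single call.

```asm
# Sketch (assumption: built with gcc -no-pie; labels and values are illustrative).
# Uses several printf specifiers from the table in a single call.
.text
fmt:    .asciz "%s has %lu items, code 0x%lx, grade %c, 100%%\n"
name:   .asciz "list"

.global main
main:
    pushq %rbp                      # prologue (this also aligns the stack)
    movq %rsp, %rbp

    movq $fmt, %rdi                 # param1: format string
    movq $name, %rsi                # param2: address of a string, for %s
    movq $3, %rdx                   # param3: unsigned number, for %lu
    movq $255, %rcx                 # param4: printed as ff by %lx
    movq $65, %r8                   # param5: ASCII code of 'A', for %c
    movq $0, %rax                   # no vector registers in use for printf
    call printf                     # prints "list has 3 items, code 0xff, grade A, 100%"

    movq $0, %rdi                   # exit code 0
    call exit
```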
In assembly, the address of a variable depends on its location. If you want to store a value into
a global address you can simply use its label as the address. If you want to put a value in a
stack variable you could calculate the address using the base pointer. Fortunately, x86-64 offers a
lea instruction (“load effective address”) which makes this rather simple. We provide a complete
example of a scanf call which reads a decimal number from the keyboard and stores it in some
local stack variable:
formatstr: .asciz "%ld"
...
subq $16, %rsp          # Reserve stack space for variable (16 bytes keeps the stack aligned)
leaq -8(%rbp), %rsi     # Load address of stack var in rsi
movq $formatstr, %rdi   # load first argument of scanf
movq $0, %rax           # no vector registers for scanf
call scanf              # Call scanf
3.9.1 Conditional Branching
There are many so-called “jump” or “branch” instructions in the x86-64 instruction set which load
a new value into the program counter. These instructions come in two flavors. First, there are the
regular branch instructions such as jmp or call which cause program execution to continue at a
different memory address. Second are the conditional branch instructions, which will only jump
to the new target address if some condition holds. We can use these conditional jump instructions
to implement conditional constructs, such as if-statements and while-loops:
Pseudocode:
if (RAX > 1) {
    // IF-code
} else {
    // ELSE-code
}
Implementation:
    cmpq $1, %rax       # compare RAX to 1
    jg ifcode           # jump to IF-code if RAX > 1
    jmp elsecode        # jump to ELSE-code otherwise
ifcode:
    ...                 # IF-code
    jmp end
elsecode:
    ...                 # ELSE-code
end:
The cmp instruction on the first line compares the contents of RAX to the number 1. It stores
the result of this comparison (i.e. whether the contents of RAX were greater than, equal to
or less than 1) in the special RFLAGS register. The jg instruction (“jump if greater-than”) is
a conditional branch instruction. It tests the contents of the RFLAGS register and jumps to the
ifcode label if the flags indicate that the second operand of the cmp instruction (here RAX) was
greater than the first (here 1). For an overview of the various conditional branch instructions,
see the instruction set reference in paragraph 3.3.3. The subsequent paragraphs will demonstrate
other programming constructs based on the conditional branch instructions. Paragraph 3.9.2 will
give a more compact implementation of the if-statement.
Implementation:
    cmpq $1, %rax       # compare RAX to 1
    jg ifcode           # jump to IF-code if RAX > 1
elsecode:
    ...                 # ELSE-code
    jmp end
ifcode:
    ...                 # IF-code
end:
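The branch layout of this compact form can be mirrored in C with goto statements, which may make the flow of control easier to follow. This is only a sketch: the function name branch_demo and the return values 1 and 0 are placeholders invented for the example and stand in for the IF-code and ELSE-code.

```c
/* Returns 1 when the IF-code ran and 0 when the ELSE-code ran. */
int branch_demo(long rax)
{
    int result;

    if (rax > 1)        /* cmpq $1, %rax ; jg ifcode */
        goto ifcode;

    result = 0;         /* ELSE-code */
    goto end;           /* jmp end   */

ifcode:
    result = 1;         /* IF-code   */

end:
    return result;
}
```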
3.9.3 Loops
The do-while-loop
Pseudocode:
do {
    // loop code
} while (RAX > 1);
In this example we jump back to the beginning of the loop as long as the condition holds.
Implementation-wise, this is the simplest type of loop:
loop:
    ...                 # loop code
    cmpq $1, %rax       # repeat the loop
    jg loop             # if RAX > 1
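The same structure can be sketched in C with a label and a conditional goto. The loop body here (halving the value and counting iterations) is a placeholder chosen for this example, and the name do_while_demo is hypothetical.

```c
/* Halves rax until it is no longer greater than 1 and
 * returns how many times the loop body ran. */
int do_while_demo(long rax)
{
    int iterations = 0;

loop:
    rax /= 2;           /* loop code (placeholder work)  */
    iterations++;
    if (rax > 1)        /* cmpq $1, %rax ; jg loop       */
        goto loop;

    return iterations;
}
```

Because the condition is tested at the end, the body always runs at least once, even when the initial value is 0.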
In this example we will break the loop if the condition does not hold, i.e. we jump to the end if
RAX is less than or equal to 1:
loop:
    cmpq $1, %rax       # if RAX <= 1 jump to
    jle end             # the end of the loop
    ...                 # loop code
    jmp loop            # repeat the loop
end:
RAX++;
}
3.10.1 Part A: Getting Started
In this assignment you will be asked to write your first assembly program. You will have to use
the knowledge you acquired from the example program in order to complete this task, so make
sure you have a thorough understanding of it. Remember that you can always ask the lab course
assistants for help. For this program, you will not have to write any specifications, since there
is no significant algorithmic complexity involved. However, you are of course required to write
proper comments.
In order to complete this assignment you will need to call the printf subroutine. Paragraph
3.6.1 of the reference section explains the details of calling subroutines and paragraph 3.7.1 explains
how to use the printf subroutine. Paragraph 2.3.1 explains the commands that you will need to
enter on your shell in order to build and run your program.
Exercises:
1. Create a new text file, called “power.s”.
2. Implement a simple main routine that exits the program immediately with the proper exit
code and without crashing.
3. Build your program and run it.
4. Alter your main routine in such a way that it prints a message containing your names, netIDs
and the name of the assignment on the terminal.
You should not need more than one call to printf to display your message. After completing
the rest of the exercises in part B, you will need to have the source code of this program checked
by the teaching assistants, so make sure you keep all your files in order.
Exercises:
1. The following partial specification of the pow subroutine is given:
/**
 * The pow subroutine calculates powers
 * of non-negative bases and exponents.
 *
 * Arguments:
 *
 * base - the exponential base
 * exp - the exponent
 *
 * Return value: 'base' raised to the power of 'exp'.
 */
int pow(int base, int exp) {
    int total = 1;
    // ...
    return total;
}
Complete the specification of the pow subroutine. You should only use looping constructs
and simple arithmetic operations to compute the total.
It is not required to have this specification signed off, but you can ask a teaching assistant to
check that it is correct. Also, when you have questions about this assignment, the teaching
assistant will ask for your specification.
2. Create a new subroutine called pow, which will be the implementation of your pow subroutine.
3. Alter your main routine in such a way that it asks the user for a non-negative base and
exponent.
4. Alter your main routine in such a way that it calls pow with the numbers it reads and prints
the result of pow on the terminal.
Exercises:
1. Copy your “power” program to a new file called “factorial.s”. Create a new subroutine called
factorial. This new subroutine should take one parameter, n, and for now it should simply
do nothing and return n in the RAX register. Alter your main routine in such a way that it
calls factorial with the number it reads, instead of calling pow. main should print the
result of factorial on the terminal.
2. Write a pseudocode specification of your factorial subroutine. The subroutine accepts a
non-negative parameter n and it should return n!. Make sure your algorithm is recursive.
It should not need to be more than a few lines of pseudocode.
It is not required to have this specification signed off, but you can ask a teaching assistant to
check that it is correct. Also, when you have questions about this assignment, the teaching
assistant will ask for your specification.
3. Implement your factorial routine. Test your program thoroughly.
3.12 Memory
3.13 Bit Shifting
One of the powerful things about assembly language is that you can very easily manipulate your
values at the bit level. Bit shifting is a common way to take advantage of this.
The following is an example of a bit shift:
    movq $0, %rax       # 00000000 ... 00000000 00000000
    movb $142, %al      # 00000000 ... 00000000 10001110
    shr $3, %rax        # 00000000 ... 00000000 00010001
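It shifts the bits of the operand (here the register RAX) by the number of positions and in the
direction you specify (shl is left-shift, shr is right-shift). Bits that are shifted out are
discarded and the vacated positions are filled with zeros.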
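In C, the >> and << operators on unsigned values behave like shr and shl. The sketch below repeats the example with the value 142; the function names are made up for illustration, and the shift count is assumed to be smaller than 64.

```c
#include <stdint.h>

/* Logical right shift, like shr: vacated high bits become 0.
 * The count must be smaller than 64. */
uint64_t shift_right(uint64_t value, unsigned count)
{
    return value >> count;
}

/* Left shift, like shl: vacated low bits become 0.
 * The count must be smaller than 64. */
uint64_t shift_left(uint64_t value, unsigned count)
{
    return value << count;
}
```

Shifting 142 (10001110) right by three positions gives 17 (00010001). Shifting 17 left by three gives 136 (10001000), not 142, because the three low bits were lost in the first shift.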
3.14 ASSIGNMENT 3: Memory
Hundreds of years ago, archeologists discovered a treasure chest full of ancient scripts and drawings.
Sadly, the puny minds of the current civilisation could not comprehend the contents of these
precious documents and after decades of searching for a way to unlock their mysteries, people
were forced to admit defeat and locked them away until someone worthy came along and could
decode them.
Then, a few days ago, a strange machine was discovered. Our knowledge of technology is yet
too limited to be able to get it working, but the group of scientists working on it managed to glean
how the machine decodes these artifacts.
Now, they have tasked you with recreating it in the primitive ways of our technology. They
have found that the artifacts are encoded in 8-byte memory blocks. The bytes in a memory block
signify the following (from highest to lowest):
Byte 1 - 2   Unknown - this is still being worked on but they have already determined
             that it is not crucial knowledge for decoding the messages.
Byte 3 - 6   The next memory block to visit.
Byte 7       The number of times the character should be printed.
Byte 8       The ASCII [16] character which should be printed.
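One possible reading of this layout, sketched in C: if a block is loaded as a single 64-bit value in which byte 1 is the most significant byte and byte 8 the least significant, the fields can be separated by shifting and masking. The struct and function names are hypothetical and only serve this illustration.

```c
#include <stdint.h>

struct block_fields {
    uint32_t next;       /* bytes 3-6: the next memory block to visit */
    uint8_t  count;      /* byte 7: how often to print the character  */
    uint8_t  character;  /* byte 8: the ASCII character to print      */
};

/* Splits one 8-byte block into its fields. Byte 1 is assumed to be the
 * most significant byte of the 64-bit value and byte 8 the least. */
struct block_fields unpack_block(uint64_t block)
{
    struct block_fields fields;

    fields.character = (uint8_t)(block & 0xFF);
    fields.count     = (uint8_t)((block >> 8) & 0xFF);
    fields.next      = (uint32_t)((block >> 16) & 0xFFFFFFFFu);
    return fields;
}
```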
Exercises:
1. Download the files for this assignment from Brightspace.
The file you will be writing your code in is “decoder.s”. It currently includes the message from
“helloWorld.s” (the .include "helloWorld.s" line does this). If you want to change the
input message, simply change this line to include the file you want.
[16] http://www.asciitable.com/
2. Write a pseudocode specification of your decode subroutine. The subroutine accepts the
address of the message as its first parameter and has no return value. The following is an
outline for your specification:
/**
 * The decode subroutine decodes the messages.
 *
 * Arguments:
 *
 * address - the address of the message
 *           in memory
 *
 * Return value: none
 */
void decode(int address) {
    // ...
}
You may also write helper subroutines which will be called from decode.
It is not required to sign-off this specification but you can ask a teaching assistant to check
that it is correct. Also, when you have questions about this assignment, the teaching assistant
will ask for your specification.
3. Implement your decode subroutine. Test your program thoroughly. Provided are test files
“abc_sorted.s”, “helloWorld.s”, and “final.s”. We recommend starting with “abc_sorted.s”.
This should print two lines: 0 through 9 on the first line and a through z on the second line.
The memory is sorted, so this should work by each time just taking the next memory block.
For Hello World you will need to extract the index of the next memory block. As for the
contents of Final: the archeologists have not been able to figure this out yet. Can you?
This exercise wraps up the basic assembler programming assignments. You should go and have
your code checked by a lab course assistant. Well done!
Chapter 4
Bonus Content
This chapter contains only bonus content and assignments. For each assignment, you can receive
extra points towards your final grade (the amount is always listed with the assignment).
0 BLACK
2 GREEN
And so, here is what the message is actually supposed to look like:
Hello World !
If foreground and background colour are the same you would end up with unreadable text.
The scientists have not figured out what to do when this happens, so for now you should ignore the
colours if the foreground and background colours are the same.
Science is advancing rapidly here! The scientists have figured out what to do with memory blocks where
the foreground and background colour are the same. These are apparently used for special effects.
The scientists have not figured them all out yet, but we already know of the following:
0 reset to normal
37 stop blinking
42 bold
66 faint
105 conceal
153 reveal
182 blink
Note for WSL users: Support for these special effects depends on the terminal you are using.
To get them to work properly, we recommend using “Windows Terminal”, which is available
in the Microsoft Store:
https://www.microsoft.com/en-us/p/windows-terminal-preview/9n8g5rfz9xk3.
You will need the latest preview version for it to work.
switch (RAX) {
case 0:
    // case 0 code
    break;
case 1:
    // case 1 code
    break;
case 2:
    // case 2 code
    break;
}
To implement the switch-statement we have to create one small subroutine for each of the cases.
We then create a table containing the starting addresses of these subroutines and we use the value
of RAX to look up the proper subroutine address in the table. A table like this is called a jump
table:
# The jumptable:
jumptable:
    .quad case0sub
    .quad case1sub
    .quad case2sub
# The case subroutines:
case0sub:
    ...                         # case 0 code
    ret
case1sub:
    ...                         # case 1 code
    ret
case2sub:
    ...                         # case 2 code
    ret
# The actual switch statement:
    shlq $3, %rax               # multiply RAX by 8
    movq jumptable(%rax), %rax  # load the address from the table
    call *%rax                  # call the subroutine
There is some trickery going on in the last three instructions that deserves some attention: first
of all, we have to remember that the subroutine addresses in the jumptable are eight bytes long,
so we will have to multiply our RAX register by eight before we can use it as a table index. We
can of course accomplish this by shifting the operand left by three bits. Second, we have to use
the ‘*’ when calling a subroutine whose address is located in a register.
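A C analogue of the jump table is an array of function pointers. The sketch below is illustrative only: the case functions return arbitrary placeholder values, switch_demo is a made-up name, and the bounds check is an addition that the assembly listing does not perform.

```c
/* The case subroutines (placeholders returning distinct values). */
static int case0sub(void) { return 10; }
static int case1sub(void) { return 11; }
static int case2sub(void) { return 12; }

/* The jump table: one code address per case. */
static int (*const jumptable[])(void) = { case0sub, case1sub, case2sub };

/* Dispatches on rax. Returns -1 for values outside the table. */
int switch_demo(unsigned long rax)
{
    if (rax >= sizeof jumptable / sizeof jumptable[0])
        return -1;
    return jumptable[rax]();    /* scaling the index by 8 is implicit in C */
}
```

In C the compiler multiplies the index by the size of a pointer for you, which is the step that shlq $3 performs by hand in the assembly version.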
Computing the n-th Fibonacci number is a very computationally intensive task and the fibonacci
subroutine can be tricky to implement in assembler. By studying the example carefully, we observe
that the only values which are actually calculated are fibonacci(30) through fibonacci(39).
Of course, we can simply precompute these values at compile time without having to implement
the fibonacci() routine at all. The resulting program is both faster and easier to implement:
// A table containing the Fibonacci numbers from 30 to 39
int fibtable[] = {
    832040,
    1346269,
    2178309,
    3524578,
    5702887,
    9227465,
    14930352,
    24157817,
    39088169,
    63245986
};
// Print various Fibonacci numbers
for (int i = 0; i < 100000; i++) {
    print(fibtable[i % 10]);
}
In assembly, we can use the .byte, .word, .long and .quad directives to construct the lookup
table:
fibtable :
.long 832040
.long 1346269
.long 2178309
.long 3524578
.long 5702887
.long 9227465
.long 14930352
.long 24157817
.long 39088169
.long 63245986
number in %RAX defining which system call you want. For instance, an exit is performed by setting
%RAX to 60, putting the exit code in %RDI (normally 0) and then executing the syscall instruction.
# Perform the 'sys_exit' system call:
    movq $60, %rax      # system call 60 is sys_exit
    movq $0, %rdi       # normal exit: 0
    syscall
A complete list of available Linux system calls can be found in the kernel source code [2]. The
complete call looks a bit less friendly than printf:
# define the string and its length:
hello:
    .ascii "Hello!\n"           # .ascii adds no terminating zero byte
helloend:
    .equ length, helloend - hello
# Perform the 'sys_write' system call:
    movq $1, %rax               # system call 1 is sys_write
    movq $1, %rdi               # first argument is where to write; stdout is 1
    movq $hello, %rsi           # next argument: what to write
    movq $length, %rdx          # last argument: how many bytes to write
    syscall
The code we see here is actually very similar to the code that we find inside the printf function
itself. Many functions in the C library, including printf, use inline assembly code to perform
their actual function (see 4.8). Since compilers are not operating system specific, the authors of
printf have to resort to this technique.
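Seen from the C side, the POSIX write function is, on Linux, a thin wrapper around the same sys_write system call, and its three arguments correspond to the values placed in RDI, RSI and RDX. The following sketch assumes a Linux system with an open standard output; the function name write_hello is hypothetical.

```c
#include <string.h>
#include <unistd.h>

/* Writes "Hello!\n" to standard output without going through printf.
 * Returns the number of bytes written, or -1 on error. */
long write_hello(void)
{
    const char *hello = "Hello!\n";

    /* write(fd, buffer, count): where to write, what to write, how many bytes */
    return (long)write(STDOUT_FILENO, hello, strlen(hello));
}
```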
Now that your code does not depend on the C library anymore, you can even assemble and link
it yourself, without the gcc magic:
as -o hello.o hello.s
ld --entry main -o hello hello.o
Compare the size of your final program to the size of its C equivalent. Now that’s efficiency!
As you can see it tells us that the second line is different (2c2 means that line 2 in the original
has been changed to become line 2 in the new file). For more information on how the diff
command works, please check the diff manuals (type man diff in a terminal) and check Wikipedia
at http://en.wikipedia.org/wiki/Diff.
For this assignment your program will have to be able to do the following:
• Implement a line-by-line comparison version of diff. This means it is not required to have the
a and d outputs that the real diff offers. Only the changes will suffice, though we encourage
you to also try the detection of addition and deletion of lines.
• Implement at least the -i and -B options that diff offers (see the diff manual to learn what
these options do).
Note: It is not required that you read text from a file or the standard input. You are allowed
to hardcode the texts you are comparing in your source. If you do this, however, the student
assistants will change this hardcoded text to confirm that your program works for different texts as well.
Note: The -i and -B options should be read from the command line arguments, not hardcoded.
By convention you get your command line arguments the following way: first an integer that tells
you how many arguments were provided (always at least 1; the name of your own program) and
then the actual parameters. More information can be found online.
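In C, this convention appears as the argc and argv parameters of main; under the System V calling convention an assembly main receives the same two values in RDI and RSI. The sketch below shows one way the two options could be recognised. The struct, the function name parse_options and the choice to reject unknown arguments are assumptions made for this example.

```c
#include <string.h>

struct diff_options {
    int ignore_case;         /* set by -i */
    int ignore_blank_lines;  /* set by -B */
};

/* Scans argv[1] .. argv[argc-1]; argv[0] is the program name.
 * Returns 0 on success and -1 when an unknown argument is found. */
int parse_options(int argc, char *argv[], struct diff_options *options)
{
    options->ignore_case = 0;
    options->ignore_blank_lines = 0;

    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "-i") == 0)
            options->ignore_case = 1;
        else if (strcmp(argv[i], "-B") == 0)
            options->ignore_blank_lines = 1;
        else
            return -1;
    }
    return 0;
}
```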
Example:
Suppose you have the following format string:
My name is %s. I think I’ll get a %u for my exam. What does %r do? And %%?
Also suppose you have the additional arguments “Piet” and 10. Then your subroutine should
output:
My name is Piet. I think I’ll get a 10 for my exam. What does %r do? And %?
Hints
To get started you may divide the work into a number of steps. Note that these are just hints; you
do not have to follow these steps to finish this assignment.
1. Write a subroutine that prints a string using system calls.
2. Write a new subroutine to recognize format specifiers in the format string. Initially, you can
discard the format specifiers rather than process them. The rest of the string can be printed
using the subroutine from the previous hint.
3. Implement the various format specifiers. It may help to implement %u before %d. Again,
you may use the print function from the first hint if you implemented it.
4. It may help to store all input argument registers on the stack at the start of your function,
even if you don’t end up using them.
Your part of the code should not have a main function, but instead have a sha1_chunk function,
which will be called by our part of the code. This sha1_chunk function takes two parameters:
First, the address of h0 (h1, h2, etc. are stored directly after h0). (See Wikipedia’s pseudo code
for SHA-1 for these names.) Second, the address of the first 32-bit word of an array of 80 32-bit
words. The first sixteen words of this array are set to the sixteen 32-bit words the chunk contains
(which are called w[0] to w[15] on Wikipedia). Your function should modify h0 to h4 as described
in the pseudo code.
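As a small illustration of how the 80-word array is used, the sketch below fills in the remaining 64 words from the first sixteen, following the message schedule step of the standard SHA-1 pseudo code. It covers only that one step, not the full chunk function, and the helper names rotate_left and expand_schedule are hypothetical.

```c
#include <stdint.h>

/* Rotates a 32-bit word left by `count` bits (1 <= count <= 31). */
static uint32_t rotate_left(uint32_t word, unsigned count)
{
    return (word << count) | (word >> (32 - count));
}

/* Fills w[16..79] from w[0..15]: each new word is the XOR of four
 * earlier words, rotated left by one bit. */
void expand_schedule(uint32_t w[80])
{
    for (int i = 16; i < 80; i++)
        w[i] = rotate_left(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1);
}
```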
When you execute the combined program, our part of the code prints a lot of information on
what is happening, and when your function is called. It displays the result of your function, and
whether that is correct or not. You can of course print more debugging information from your
own function using printf.
[3] http://en.wikipedia.org/wiki/Sha1
[4] There is also a sha1_test32.so available for 32-bit compilation.
4.6 ASSIGNMENT 6: Hide Data in a Bitmap (750 points)
The scientists from Assignment 3 (3.14) suggested that our generation should also leave a message
for future generations. Frustrated at how long it took them to implement a decoder, the decision
was made that this new effort should also be an “interesting” puzzle.
You are tasked by the scientists to encrypt messages using assembly, into everyday barcodes.
This assembly code will then later be hacked into a popular barcode software package. In detail:
1. Make sure you understand the message, including the “lead” and “trail”.
2. Compress the message using the Run-Length Encoding (RLE) technique.
Data: Barcodes
Complexity: Easy
The message will be encoded in the first part of a barcode. For simplicity you can assume that
the barcode looks like this:
W W W W W W W W B B B B B B B B W W W W B B B B W W B B B W W
This pattern is 31 pixels long. In this pattern, W stands for a white pixel and B stands for
black.
Create a sequence by repeating this pattern, followed by a red pixel, 32 times. This will form
a 32 × 32 image, where each pixel is either white, black, or red. Note that this key represents an
RGB image, where each pixel is represented using three bytes.
Input       Output
x   y       XOR(x,y)
0   0       0
0   1       1
1   0       1
1   1       0
x ⊕ 0 = x                              (4.1a)
x ⊕ x = 0                              (4.1b)

(x ⊕ y) ⊕ y = x ⊕ (y ⊕ y)
            = x ⊕ 0                    (4.2)
            = x
Equation 4.2 means that we can use XOR to first encrypt (x ⊕ y) and then decrypt ((x ⊕ y) ⊕ y)
a one-bit message x with the encryption/decryption key y. It turns out that these two one-bit
operations can be extended to n-bit operations, that is, for an n-bit message M and an n-bit key
K:
(M ⊕ K) ⊕ K = M (4.3)
For example, if the message is TEST in ASCII (M = 01010100 01000101 01010011 01010100
in binary) and the key is TRY! in ASCII (K = 01010100 01010010 01011001 00100001), the
encrypted text is:
M ⊕ K = 00000000 00010111 00001010 01110101
Last, a good example of implementing the XOR encryption technique in C is the “C Tutorial -
XOR Encryption” by Shaden Smith, June 2009 [5].
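A minimal sketch of the n-bit case in C, independent of the tutorial mentioned above: each message byte is XORed with the key byte at the same position, and applying the same function a second time restores the original message. The function name xor_with_key is an assumption made for this example.

```c
#include <stddef.h>

/* XORs each message byte with the key byte at the same position.
 * Applying the function twice with the same key restores the message.
 * The key must be at least `length` bytes long. */
void xor_with_key(unsigned char *message, const unsigned char *key,
                  size_t length)
{
    for (size_t i = 0; i < length; i++)
        message[i] ^= key[i];
}
```

For the message TEST and the key TRY!, the encrypted bytes are 0x00, 0x17, 0x0A and 0x75.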
Your task is to encrypt the RLE-encoded message using the barcode image as key. Note that
the key consists of 32 × 32 × 3 bytes, while the message is much shorter. This means that only a
small part of the image will change.
[5] [Online] Available: http://forum.codecall.net/topic/48889-c-tutorial-xor-encryption/.
Data Representation: the BMP Format (Simplified)
Complexity: Difficult
Storing data as images requires complex data formats. One of the simplest is the bitmap (BMP)
format, which you must use. BMP files encode raster images, that is, images whose unit of
information is a pixel; raster images can be directly displayed on computer screens, as their pixel
information can be mapped one-to-one to the pixels displayed on the screen.
The BMP file format consists of a header, followed by a meta-description of the encoding used
for pixel data, followed sometimes by more details about the colours used in the image (look-
up table, see also the paragraph on barcodes). The BMP file format is versatile, that is, it can
accommodate a large variety of colour encodings, image sizes, etc. It is beyond the purpose of this
manual to provide a full description of the BMP format, which is provided elsewhere [6].
Luckily for you, of the many flavors of encodings, we opted to only accept one type. Thus, you
must use the following BMP format for this assignment:
1. File Header, encoded as signature (two bytes, BM in ASCII); file size (integer, four bytes);
reserved field (four bytes, 00 00 00 00 in hexadecimal encoding); offset of pixel data inside
the image (integer, four bytes). The file header size is 14 bytes (two bytes for the signature
and four bytes each for the file size, the reserved field, and the offset of the pixel data).
The file size is the sum of 14 (the file header size), 40 (the size of the bitmap header), and
the size of the pixel data.
2. Bitmap Header, encoded as [7]: header size (integer, four bytes, must have a value of 40); width
of image in pixels (integer, four bytes, set to 32, see paragraph on barcodes); height of
image in pixels (integer, four bytes, set to 32, see paragraph on barcodes); reserved field
(two bytes, integer, must be 1); the number of bits per pixel (two bytes, integer, set here
to 24); the compression method (four bytes, integer, set here to 0, meaning no compression);
size of pixel data (four bytes, integer); horizontal resolution of the image, in pixels per
meter (four bytes, integer, set to 2835); vertical resolution of the image, in pixels per meter
(four bytes, integer, set to 2835); colour palette information (four bytes, integer, set to 0);
number of important colours (four bytes, integer, set to 0).
3. Pixel Data, encoded as B G R triplets for each pixel, where B, G, and R are intensities of the
blue, green, and red channels, respectively, with values stored as one-byte unsigned integers
(0–255). The number of bytes per row must be a multiple of 4; to achieve this, append 0 to
3 bytes of padding with a value of zero (0) to each row of pixels.
The total size of the pixel data is Nrows × Srow, where Nrows is the number of rows in
the image (32, see paragraph on barcodes) and Srow is the size of a row in bytes: the number
of pixels per row (here, 32, see paragraph on barcodes) times 3 bytes per pixel (24 bits per
pixel, as specified in the field “number of bits per pixel”, see the Bitmap header description),
rounded up to the nearest multiple of 4. For this assignment, Srow = 32 × 3 = 96 bytes, so no
padding is needed.
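The size fields of the two headers can be worked out with a few lines of arithmetic. The sketch below assumes the format described in this list (a 14-byte file header, a 40-byte bitmap header and 24 bits per pixel) and applies the rule that each row of pixel bytes is padded to a multiple of 4; the constant and function names are hypothetical.

```c
#include <stdint.h>

enum { FILE_HEADER_SIZE = 14, BITMAP_HEADER_SIZE = 40, BYTES_PER_PIXEL = 3 };

/* Bytes in one stored row: pixel bytes rounded up to a multiple of 4. */
uint32_t bmp_row_size(uint32_t width)
{
    return (width * BYTES_PER_PIXEL + 3u) / 4u * 4u;
}

/* Size of the pixel data: one padded row per image row. */
uint32_t bmp_pixel_data_size(uint32_t width, uint32_t height)
{
    return bmp_row_size(width) * height;
}

/* Total file size: both headers plus the pixel data. */
uint32_t bmp_file_size(uint32_t width, uint32_t height)
{
    return FILE_HEADER_SIZE + BITMAP_HEADER_SIZE
         + bmp_pixel_data_size(width, height);
}
```

For the 32 × 32 image of this assignment this gives 96 bytes per row, 3072 bytes of pixel data and a file size of 3126 bytes, with the pixel data starting at offset 54.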
Note: the message will be considerably more visible in this example than it would be in
practice. As an example, the following image contains the message The quick brown fox jumps
over the lazy dog encrypted in the barcode pattern:
Note: Although Wikipedia is not a universally trustworthy source of information, many of its articles on technical
aspects, such as the “BMP file format” have been checked by tens to hundreds of domain experts.
[7] This encoding is BITMAPINFOHEADER, which is a typical encoding for Windows and Linux machines. Older
encodings, such as BITMAPCOREHEADER for OS/2, are obsolete. Newer versions, such as BITMAPV5HEADER exist, but
they are too complex for our scientists.
assembly programming. Not bad!
Exercise:
You can find a tarball containing assembly code and instructions for reading a file specified as a
command line argument on Brightspace. You should write a brainfuck subroutine that takes a
pointer to the Brainfuck code as argument and executes this program.
Examples:
• hello.b:
>+++++++++[<++++++++>-]<.>+++++++[<++++>-]<+.+++++++..+++.>>>++++++++
[<++++>-]<.>>>++++++++++[<+++++++++>-]<---.<<<<.+++.------.--------.>>+.
Now, executing your program as follows
./brainfuck hello.b
should make it give
Hello World!
as output.
• cat.b:
,[.,]
This program is the equivalent of the Linux command cat; running this program will copy
whatever you enter in the console. By sending the null character to the console, the loop
halts (this is done by pressing Ctrl+2 in the terminal).
• More complex programs to test your interpreter with can be found all over the internet. (For
example: http://esoteric.sange.fi/brainfuck/utils/mandelbrot/mandelbrot.b.)
[8] https://esolangs.org/wiki/Brainfuck
Bonus points:
Find ways to significantly improve the speed or memory usage of your interpreter. At the end of
this course, we will run all submitted Brainfuck interpreters and see which ones are the fastest.
The creators of the top 5 interpreters will receive 800 points, places 6–10 will receive 700 points,
and places 11–20 will receive 600 points. Everybody who at least submits a valid interpreter will
receive 500 points.
Of course, you should use the same subroutine name in your C program as you do in your assembler
file. After this prototype declaration, you can simply call the subroutine like you would call any
other C function. The final step is to compile both the assembler file and the C source file and to
link them together into a single binary. As usual, we offload all the hard work to gcc:
. . . and presto!
As an aside, it is also possible to include snippets of assembler code directly into your C source
code. This can be very useful in cases where specific tight loops in your programs take a lot of
time to execute. The details on how to inline assembler code, as this technique is called, are not
standardised and they may differ from one C compiler to another. Consult the manual of your
favourite compiler to find out how this mechanism works.
Specification”). We will check whether your idea is reasonable and sufficient. We will reply to
your email with details and approve your specification or request changes.
Your task is to implement a game, subject to the following requirements:
Requirements
1. Your implementation should correctly implement the rules of the game.
2. Your implementation should display the state of the game. If the rules of the game make
it possible, your implementation should display the current progress of the player toward
achieving the goal of the game.
3. Your implementation should permanently record the top scores, if the rules of the game
allow it. Your implementation should also have an option to display them. A bootable game
only needs to store top scores until the next reboot (i.e. they do not need to be written to
disk).
4. Your implementation should be able to receive input from at least one player. The input
has to come either from the keyboard or from the mouse.