Understanding Process Memory (Win32)
Software Security Assessment
Lecture 0x02
Keith Makan @k3170makan
Before we start...some toolage!
You will need some tools:
Immunity Debugger http://www.immunitysec.com/products-immdbg.shtml
IDA community Edition https://www.hex-rays.com/products/ida/index.shtml
Windows XP 32bit image (easy to obtain, will drop them at the campus this week)
Oracle VirtualBox (https://www.virtualbox.org/)
Some stuff you will probably need to read after this class:
http://rsquared.sdf.org/gdb/mlats.html
http://www.cs.nyu.edu/courses/fall04/V22.0201-003/ia32_chap_03.pdf
http://insecure.org/stf/smashstack.html
https://www.corelan.be/index.php/2009/07/19/exploit-writing-tutorial-part-1-stack-based-overflows/
Why is this important?
It's fun to know how processes really work
You will probably get a lot better at debugging your programs
Needed for successful and meaningful exploitation!
Just like for SQL injection you need to know how an SQL statement works, for
memory corruption you need to know how memory works.
Simple work cycle to becoming a memory corruption guru: learn a
memory mechanism -> figure out how to corrupt it -> figure out ways
to bend it to your will.
You are computer scientists, enough said!
Hang on, why Windows XP 32 bit?
1. Because it's ridiculously easy to exploit.
2. Because it's ridiculously easy to exploit.
3. Because it's ridiculously easy to exploit.
4. Because it's ridiculously easy to exploit.
5. Because it's ridiculously easy to exploit.
6. Because it's ridiculously easy to exploit.
7. Because it's ridiculously easy to exploit.
8. Because it's ridiculously easy to exploit.
9. Because it's ridiculously easy to exploit.
10. Because it's ridiculously easy to exploit.
Basic work cycle of an executable
1. Some idiot writes some code (.c, .cpp, etc.)
2. A compiler generates machine-dependent code (raw assembler)
3. A linker maps in libraries (.dll, .so)
4. A PE (Portable Executable) file is produced, or on Linux an ELF (Executable and Linkable Format) file
5. When the PE or ELF is executed, a memory loader maps the sections of the file into memory (.bss, .data, .text, .reloc, etc.)
   a. During this phase the OS makes space for the stack and heap memory
   b. Marks memory segments with the appropriate access rights
6. The operating system switches context to the .text section of the file.
This is a gross oversimplification but the important parts are mentioned. It starts with code, then a PE
(ELF) file is created and this is used to construct a memory image.
Some important things to take note of
Generally speaking, the compiler can only control certain attributes of an executable file's behaviour
Generally speaking, the operating system can only control attributes that influence the process during its execution
For example, the compiler cannot control where an executable is loaded into memory (it would need to know, to some extent, what is executing and what will execute), and an operating system cannot influence the contents of a process's code (it would need to be able to predict the outcome of the code without actually running it).
This difference in responsibility and functionality is important to understand, since it determines where certain security protections are enforced (at the compiler vs. the operating system)!
A little about process memory
Stores the information needed to run a process.
Memory is read like a file (with random access)
Some parts hold instructions for the CPU to execute, i.e. code and libraries (.text *mostly*)
Other parts hold data targeted by computation (variables, static values, dynamically assigned variables)
Memory segments (or areas of memory) are marked with access rights
READ
EXECUTE
WRITE
or any combination of these (see following screenshot) - immunity debugger*
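One place these access rights come from is the flags stored in each section header of the PE file, which the loader reads when it maps the section. The sketch below decodes the three memory flags defined by the PE format. The function name access_rights and the two example flag values are illustrative assumptions, not output taken from the lecture's sample binary.

```python
# Section characteristic flags defined by the PE/COFF format (winnt.h names).
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_READ = 0x40000000
IMAGE_SCN_MEM_WRITE = 0x80000000


def access_rights(characteristics):
    """Return the access rights encoded in a section's flags, e.g. 'R E'."""
    rights = []
    if characteristics & IMAGE_SCN_MEM_READ:
        rights.append("R")
    if characteristics & IMAGE_SCN_MEM_WRITE:
        rights.append("W")
    if characteristics & IMAGE_SCN_MEM_EXECUTE:
        rights.append("E")
    return " ".join(rights)


# Illustrative values only: flags commonly seen on a code and a data section.
EXAMPLE_SECTIONS = {
    ".text": 0x60000020,  # contains code | execute | read
    ".data": 0xC0000040,  # contains initialized data | read | write
}

for name, flags in EXAMPLE_SECTIONS.items():
    print(name, access_rights(flags))
```

With these example values the code section comes out as readable and executable but not writable, which is the "READ ONLY, well it's intended to be so" property discussed for .text.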
The practical picture
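[Screenshot: Immunity Debugger memory map (loaded, not executing), with the columns for virtual memory offset, size of section and access rights labelled.]
A sample memory map (loaded executing)
[Diagram: memory map with the stack, the program image and the DLLs (imported code) labelled.*]
*stolen from corelan.be - because corelan is awesome!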
.text?
Used to store the code that will dictate the process's behaviour
READ ONLY, well it's intended to be so
Corresponds to the .text area of the executable file (usually)
.text?
[Screenshot: disassembly of the .text section, with the memory offset, raw opcodes and instructions columns labelled.]
.data?
Holds variables that have non-zero initial values at compile time. Used for global variables and for static local variables initialized this way.
e.g. int one = 1; the variable one will have an address in the .data segment
e.g. char *string = "this exploit only works on my machine";
The .data section can hold both constant (immutable) values and non-constant variables
e.g. const int one = 1; this variable's value cannot be changed, but it is still initialized with a non-zero value, therefore it goes in .data
*some compilers prefer to use the .rdata section for immutable initialized data.
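.data?
[Screenshot: dump of the .data section, with variable offset addresses, literal values encoded in hex and string values initialized at compile time labelled.]
Example reference to .data
[Screenshot: debugger view of a reference to .data, with the data section contents, addresses, stack addresses, the values at those addresses and their interpretation labelled.]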
Some good reading sources...
I've left out a few segments; if you're interested in the full story as far as executable formats go, check these links out:
http://www.csn.ul.ie/~caolan/pub/winresdump/winresdump/doc/pefile2.html
https://evilzone.org/tutorials/(paper)-portable-executable-format-and-its-rsrc-section/
http://msdn.microsoft.com/en-us/magazine/cc301805.aspx
http://msdn.microsoft.com/en-us/library/ms809762.aspx
Linux
http://www.skyfree.org/linux/references/ELF_Format.pdf
http://wiki.osdev.org/ELF
http://www.linuxjournal.com/article/1059
.stack?
Used as a scratch pad for local variables and for switching execution between functions
Used to set up arguments to pass to called functions
Grows in size dynamically toward address 0x0
Works just like the stacks you learned about in Dodds course
Adopted from the stuff Turing wrote about: Turing machines only need one stack and a tape drive (memory) to be able to compute anything => modern computers still rely on this fundamental principle.
The stack must provide a way for functions to call other functions and functions
to return to those that called them!
Have a little think about how you would have this work...
Using the Stack
ESP (x86)/RSP (x64) register is used to point to the top of a function's stack
EBP (x86)/RBP (x64) register is used to point to the bottom of a function's stack
These registers are used as ways to dereference the addresses of variables on the stack.
Facilitates function nesting and recursion
Can also sometimes be used as a place to store dynamic variables (with runtime-dependent size), i.e. emulate a heap!
How the stack works
function A(){
    B();
}
function B(){
    C();
}
function C(){
    D();
}
main(){
    A();
}
[Diagram: the stacks of functions A, B, C and D, labelled with the Base Pointer (bottom), the Stack Pointer (top), and arrows for Growth and Destruction.]
From the code...
How functions setup their own stacks...
push ebp        #save the previous function's EBP
[Diagram labels: calling function's stack (calling ebp, calling esp), Return Address, calling EBP, called function's stack*]
*this stack is still to be set up; the diagram does not reflect its actual size but instead the space it will occupy
From the code
How functions setup their own stacks...
push ebp        #save the previous function's EBP
mov ebp, esp    #copy the current value of ESP into EBP
[Diagram labels: calling function's stack, Return Address, calling EBP (called ebp and called esp both point here), called function's stack*]
*esp and ebp are equal, so effectively the called function's stack occupies 0 space at the moment
From the code...
How functions setup their own stacks...
push ebp        #save the previous function's EBP
mov ebp, esp    #copy the current value of ESP into EBP
push ebx        #save the EBX value to the stack
[Diagram labels: calling function's stack, Return Address, calling EBP (called ebp), calling EBX (called esp), called function's stack*]
From the code...
How functions setup their own stacks...
sub esp, 0Ch    #create space on the stack by sub-ing 0xC = 12 bytes
[Diagram labels: Return Address, calling EBP (called ebp), calling EBX, called stack spanning 0xC addresses, called esp]
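The four setup instructions above can be traced with a small Python model of the registers and stack memory. This is only a sketch: the starting register values and the return address are hypothetical, and memory is modelled as a plain dictionary of 4-byte slots.

```python
def push(regs, mem, value):
    """PUSH: move ESP down 4 bytes (toward 0x0), then store the value there."""
    regs["esp"] -= 4
    mem[regs["esp"]] = value


def prologue(regs, mem, local_bytes=0xC):
    """Replay the setup sequence from the slides on the model."""
    push(regs, mem, regs["ebp"])   # push ebp      (save the caller's EBP)
    regs["ebp"] = regs["esp"]      # mov ebp, esp  (EBP now marks the bottom)
    push(regs, mem, regs["ebx"])   # push ebx      (preserve EBX)
    regs["esp"] -= local_bytes     # sub esp, 0Ch  (make room for locals)


# Hypothetical state right after CALL: ESP points at the return address.
regs = {"esp": 0x0012FF00, "ebp": 0x0012FF40, "ebx": 0x7FFDF000}
mem = {0x0012FF00: 0x00401025}
prologue(regs, mem)
print(hex(regs["ebp"]), hex(regs["esp"]))   # 0x12fefc 0x12feec
```

The saved caller's EBP ends up at the address the new EBP holds, with the return address 4 bytes above it.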
Some notes
The last diagram indicates a fully setup stack ready to rock!
Here the EBX on the stack is not important; it's merely saved/preserved as a convention of the specific function being called. For all intents and purposes it has nothing to do with setting up the stack in the classical sense and was clumped in as part of this example by pure happenstance.
We have left out a crucial part of this operation in order to simplify the explanation. If you've noticed this or have some questions, sit tight, we're not done ;)
From the code... (cont.)
How functions destroy their own stacks...
add esp, 0Ch    #add back the 12 bytes we allocated on the stack
[Diagram labels: Return Address, calling EBP (called ebp), calling EBX (called esp), called stack]
From the code... (cont.)
How functions destroy their own stacks...
pop ebx         #restore the ebx value we saved
[Diagram labels: Return Address, calling EBP (called ebp, called esp), calling EBX (value placed in EBX register), called stack]
From the code... (cont.)
How functions destroy their own stacks...
pop ebp         #restore the calling function's EBP value we saved
[Diagram labels: calling stack, Return Address (called esp), calling EBP, calling EBX, old called stack]
Saved EBP value lets us know where to place the bottom bound of the old stack
New stack restored to original state
From the code... (cont.)
How functions destroy their own stacks...
retn            #this instruction basically branches execution
[Diagram labels: calling stack (new stack restored to original state), Return Address, calling EBP, calling EBX, old called stack]
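The teardown sequence can be traced the same way. The Python sketch below starts from a hypothetical fully set up stack (all addresses and register values are made up for illustration) and replays add esp, pop ebx, pop ebp and retn on a dictionary-based model of memory.

```python
def pop(regs, mem, register):
    """POP: load the value at ESP into a register, then move ESP up 4 bytes."""
    regs[register] = mem[regs["esp"]]
    regs["esp"] += 4


def epilogue(regs, mem, local_bytes=0xC):
    """Replay the teardown sequence from the slides on the model."""
    regs["esp"] += local_bytes   # add esp, 0Ch  (release the locals)
    pop(regs, mem, "ebx")        # pop ebx       (restore the saved EBX)
    pop(regs, mem, "ebp")        # pop ebp       (restore the caller's EBP)
    pop(regs, mem, "eip")        # retn          (jump to the return address)


# Hypothetical state of a called function that is about to return.
regs = {"esp": 0x0012FEEC, "ebp": 0x0012FEFC, "ebx": 0, "eip": 0x00401130}
mem = {
    0x0012FEF8: 0x7FFDF000,  # saved EBX
    0x0012FEFC: 0x0012FF40,  # saved caller's EBP
    0x0012FF00: 0x00401025,  # return address
}
epilogue(regs, mem)
print(hex(regs["eip"]), hex(regs["ebp"]), hex(regs["esp"]))
```

After the sequence, EBP and EBX hold the caller's values again and EIP holds whatever was stored in the return address slot.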
The RETN instruction
Literally
pop eip
Used to branch execution after a function is done executing
Branches execution to whatever is on top of the stack! So this means it takes the value saved on top of the stack and places it inside eip.
The processor then expects to find instructions to execute at that address value, i.e. jump to *stack[0]: execute whatever stack+0 points to!
Hang on...
How does the processor know where to return to?
Where did this magic return address value come from? Who put it there?
What happens to the stack of the previous function when another one is
called?
All these details are hidden in the CALL opcode.
It's actually a shorthand for a bunch of operations (sort of)
Return is almost the perfect inverse of CALL.
The Call instruction explained...
1. Save the address of the next instruction (the current EIP plus the length of the call instruction) to the stack (so we know where to return to)
2. Load the called location into the EIP
3. Execute as normal...
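To see why return is almost the inverse of CALL, here is a minimal Python sketch of both instructions acting on a toy register set. The addresses are hypothetical, and the default instruction length of 5 bytes assumes a near relative call (opcode E8 followed by a 32-bit offset).

```python
def call(regs, mem, target, insn_len=5):
    """CALL: push the address of the next instruction, then jump to target."""
    return_address = regs["eip"] + insn_len
    regs["esp"] -= 4
    mem[regs["esp"]] = return_address
    regs["eip"] = target


def ret(regs, mem):
    """RETN: pop the value on top of the stack into EIP."""
    regs["eip"] = mem[regs["esp"]]
    regs["esp"] += 4


# Hypothetical addresses: a call located at 0x00401020 targeting 0x00401100.
regs = {"eip": 0x00401020, "esp": 0x0012FF04}
mem = {}
call(regs, mem, 0x00401100)
saved = mem[regs["esp"]]      # the return address CALL left on the stack
ret(regs, mem)
print(hex(saved), hex(regs["eip"]), hex(regs["esp"]))
```

Note that ret does not check where the value came from: it branches to whatever is in that stack slot at the time it executes.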
Stack after call and setup...
1. call instruction saves EIP (+1 instruction)
2. called function preserves the caller's EBP
3. called function makes space for its stack
Here we have an example where the function Immunity.004010BB has just been called, before the next function's prologue began executing.
[Screenshot: calling stack, with the Return Address (saved return pointer pushed onto the stack), calling EBP and called stack labelled.]
practical example (all together now)
Example here shows a function stack at a random snapshot during its execution.
[Screenshot: stack view with the following regions labelled.]
Called Stack - set up after the call was made.
Calling EBP value - the EBP of the function that called this one
Return Address
Some notes
I consider the return address, saved EBP and arguments part of the called function's stack
This doesn't really matter, but you could consider them part of the calling function's stack given that it is the calling function that loads these values
But it makes for an easier story to tell from the perspective of memory corruption, given that the called function corrupts this data.
You could also think of the calling function as preparing the stack by placing these values in the called stack.
What about function arguments?
What happens when a function has arguments?
How is this handled according to the mechanics of the stack?
Well
All arguments passed to a callee (the function being called) need to be accessible locally (via its stack).
The calling function will push these arguments onto its stack in REVERSE order before making the call and branching execution to the callee. *on some architectures the EDI and ESI registers are used to store pointers to arguments before the stack is used to store them when calling a function (optimization effort)*
And Then
The callee will reference these arguments outside of its stack by using registers (EBP, ESP)
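As a rough illustration of the argument passing described above, the Python sketch below pushes three arguments in reverse order, then the return address and the saved EBP, and reads the arguments back relative to EBP. It assumes 4-byte arguments passed entirely on the stack; the helper names and all addresses are hypothetical.

```python
def push(regs, mem, value):
    """PUSH: move ESP down 4 bytes, then store the value there."""
    regs["esp"] -= 4
    mem[regs["esp"]] = value


def call_with_args(regs, mem, args, return_address):
    for arg in reversed(args):        # caller: last argument is pushed first
        push(regs, mem, arg)
    push(regs, mem, return_address)   # done by the CALL instruction
    push(regs, mem, regs["ebp"])      # callee: push ebp
    regs["ebp"] = regs["esp"]         # callee: mov ebp, esp


def argument(regs, mem, index):
    """Read argument number index (0-based) the way the callee would."""
    # [EBP] holds the saved EBP and [EBP+4] the return address,
    # so the first argument sits at [EBP+8].
    return mem[regs["ebp"] + 8 + 4 * index]


regs = {"esp": 0x0012FF00, "ebp": 0x0012FF40}
mem = {}
call_with_args(regs, mem, [10, 20, 30], return_address=0x00401025)
print([argument(regs, mem, i) for i in range(3)])   # [10, 20, 30]
```

Pushing in reverse order is what places the first argument closest to the callee's EBP, so argument N is always found at a fixed offset regardless of how many arguments follow it.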
References and further reading
Make sure to soak these up before the next lecture ;)
Smashing the stack in 2010 http://www.mgraziano.info/docs/stsi2010.pdf
Smashing the stack for fun and profit http://insecure.org/stf/smashstack.html
Exploit Writing tutorial part 1 : Stack Overflows https://www.corelan.be/index.php/2009/07/19/exploit-writing-tutorial-part-1-stack-based-overflows/
Part 1 : Introduction to Exploit Development https://www.fuzzysecurity.com/tutorials/expDev/1.html