Before starting Linux binary analysis, it helps to have a few easy-to-use utilities at hand. Below are some tools for analysing ELF files and debugging on Linux, each with its help output for looking up options and some common usages.
Usage: readelf <option(s)> elf-file(s)
 Display information about the contents of ELF format files
 Options are:
  -a --all               Equivalent to: -h -l -S -s -r -d -V -A -I
  -h --file-header       Display the ELF file header
  -l --program-headers   Display the program headers
     --segments          An alias for --program-headers
  -S --section-headers   Display the section header
     --sections          An alias for --section-headers
  -g --section-groups    Display the section groups
  -t --section-details   Display the section details
  -e --headers           Equivalent to: -h -l -S
  -s --syms              Display the symbol table
     --symbols           An alias for --syms
  --dyn-syms             Display the dynamic symbol table
  -n --notes             Display the core notes (if present)
  -r --relocs            Display the relocations (if present)
  -u --unwind            Display the unwind info (if present)
  -d --dynamic           Display the dynamic section (if present)
  -V --version-info      Display the version sections (if present)
  -A --arch-specific     Display architecture specific information (if any)
  -c --archive-index     Display the symbol/file index in an archive
  -D --use-dynamic       Use the dynamic section info when displaying symbols
  -x --hex-dump=<number|name>
                         Dump the contents of section <number|name> as bytes
  -p --string-dump=<number|name>
                         Dump the contents of section <number|name> as strings
  -R --relocated-dump=<number|name>
                         Dump the contents of section <number|name> as relocated bytes
  -z --decompress        Decompress section before dumping it
  -w[lLiaprmfFsoRtUuTgAckK] or
  --debug-dump[=rawline,=decodedline,=info,=abbrev,=pubnames,=aranges,=macro,=frames,
               =frames-interp,=str,=loc,=Ranges,=pubtypes,
               =gdb_index,=trace_info,=trace_abbrev,=trace_aranges,
               =addr,=cu_index,=links,=follow-links]
                         Display the contents of DWARF debug sections
  --dwarf-depth=N        Do not display DIEs at depth N or greater
  --dwarf-start=N        Display DIEs starting with N, at the same depth
                         or deeper
  -I --histogram         Display histogram of bucket list lengths
  -W --wide              Allow output width to exceed 80 characters
  @<file>                Read options from <file>
  -H --help              Display this information
  -v --version           Display the version number of readelf
$ readelf -h /lib/x86_64-linux-gnu/libc.so.6
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 03 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - GNU
  ABI Version:                       0
  Type:                              DYN (Shared object file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x21ba0
  Start of program headers:          64 (bytes into file)
  Start of section headers:          1795704 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         10
  Size of section headers:           64 (bytes)
  Number of section headers:         71
  Section header string table index: 70
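To make the fields printed by readelf -h less opaque, here is a minimal Python sketch that decodes the same ELF64 header by hand. The function name parse_elf64_header and the dictionary keys are illustrative choices, not part of readelf; the sketch only handles 64-bit files and does not interpret the values.

```python
import struct

# Names for the ELF64 header fields that follow the 16-byte e_ident array
# (illustrative keys, loosely following the e_* names in the ELF specification).
_ELF64_FIELDS = (
    "type", "machine", "version", "entry", "phoff", "shoff",
    "flags", "ehsize", "phentsize", "phnum", "shentsize", "shnum", "shstrndx",
)
# 2 x uint16, uint32, 3 x uint64, uint32, 6 x uint16 = 48 bytes after e_ident.
_ELF64_FORMAT = "HHIQQQIHHHHHH"


def parse_elf64_header(data: bytes) -> dict:
    """Decode the first 64 bytes of an ELF64 file into a dict of header fields."""
    if len(data) < 64 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 2:
        raise ValueError("only ELF64 (EI_CLASS == 2) is handled in this sketch")
    # EI_DATA: 1 means little endian, 2 means big endian.
    order = "<" if data[5] == 1 else ">"
    values = struct.unpack_from(order + _ELF64_FORMAT, data, 16)
    return dict(zip(_ELF64_FIELDS, values))
```

Feeding it the first 64 bytes of a file, for example open(path, "rb").read(64), gives the raw numbers behind lines such as "Entry point address" and "Number of section headers".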
-l --program-headers Display the program headers
-S --section-headers Display the section headers
-s --syms Display the symbol table (useful for checking whether a symbol is present, not for locating where it is used)
$ readelf -s /lib/x86_64-linux-gnu/libc.so.6
Symbol table '.dynsym' contains 2335 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000000     0 OBJECT  GLOBAL DEFAULT  UND _rtld_global@GLIBC_PRIVATE (30)
     2: 0000000000000000     0 OBJECT  GLOBAL DEFAULT  UND __libc_enable_secure@GLIBC_PRIVATE (30)
     3: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __tls_get_addr@GLIBC_2.3 (31)
     4: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND _dl_exception_create@GLIBC_PRIVATE (30)
     5: 0000000000000000     0 OBJECT  GLOBAL DEFAULT  UND _rtld_global_ro@GLIBC_PRIVATE (30)
     6: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __tunable_get_val@GLIBC_PRIVATE (30)
......
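When the symbol table has thousands of entries, it can be easier to search the text output with a small script. The following is a rough Python sketch that parses rows shaped like the readelf -s output above; Symbol, parse_symbols and find_symbols are hypothetical helper names, and the column layout is assumed to match the sample shown here.

```python
from typing import List, NamedTuple


class Symbol(NamedTuple):
    num: int
    value: int
    size: int
    type: str
    bind: str
    vis: str
    ndx: str
    name: str


def parse_symbols(readelf_output: str) -> List[Symbol]:
    """Parse the symbol rows of `readelf -s` text; other lines are skipped."""
    symbols = []
    for line in readelf_output.splitlines():
        fields = line.split()
        # A symbol row starts with "<index>:" and has at least 7 columns.
        if len(fields) < 7 or not fields[0].endswith(":"):
            continue
        if not fields[0][:-1].isdigit():
            continue  # e.g. the "Num:" column header
        symbols.append(Symbol(
            num=int(fields[0][:-1]),
            value=int(fields[1], 16),
            size=int(fields[2], 0),  # base 0 accepts decimal and 0x-prefixed sizes
            type=fields[3],
            bind=fields[4],
            vis=fields[5],
            ndx=fields[6],
            name=fields[7] if len(fields) > 7 else "",
        ))
    return symbols


def find_symbols(readelf_output: str, needle: str) -> List[Symbol]:
    """Return the symbols whose name contains `needle`."""
    return [s for s in parse_symbols(readelf_output) if needle in s.name]
```

The text to parse could come from a saved file or from running readelf through subprocess; either way the result is a list that can be filtered by type, binding or section index.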
Usage: objdump <option(s)> <file(s)>
 Display information from object <file(s)>.
 At least one of the following switches must be given:
  -a, --archive-headers    Display archive header information
  -f, --file-headers       Display the contents of the overall file header
  -p, --private-headers    Display object format specific file header contents
  -P, --private=OPT,OPT... Display object format specific contents
  -h, --[section-]headers  Display the contents of the section headers
  -x, --all-headers        Display the contents of all headers
  -d, --disassemble        Display assembler contents of executable sections
  -D, --disassemble-all    Display assembler contents of all sections
  -S, --source             Intermix source code with disassembly
  -s, --full-contents      Display the full contents of all sections requested
  -g, --debugging          Display debug information in object file
  -e, --debugging-tags     Display debug information using ctags style
  -G, --stabs              Display (in raw form) any STABS info in the file
  -W[lLiaprmfFsoRtUuTgAckK] or
  --dwarf[=rawline,=decodedline,=info,=abbrev,=pubnames,=aranges,=macro,=frames,
          =frames-interp,=str,=loc,=Ranges,=pubtypes,
          =gdb_index,=trace_info,=trace_abbrev,=trace_aranges,
          =addr,=cu_index,=links,=follow-links]
                           Display DWARF info in the file
  -t, --syms               Display the contents of the symbol table(s)
  -T, --dynamic-syms       Display the contents of the dynamic symbol table
  -r, --reloc              Display the relocation entries in the file
  -R, --dynamic-reloc      Display the dynamic relocation entries in the file
  @<file>                  Read options from <file>
  -v, --version            Display this program's version number
  -i, --info               List object formats and architectures supported
  -H, --help               Display this information
Common Usages
-d, --disassemble Display assembler contents of executable sections
-f, --file-headers Display the contents of the overall file header
run/r, start: run the program. 'run' loads the program and runs it until it exits or hits a breakpoint, while 'start' stops at the beginning of the main function. Use 'show args' and 'set args' to view and change the program's arguments.
break/b: set a breakpoint
continue/c: resume the program until the next breakpoint or until it exits
next/n: execute the next line of source code without stepping into function calls. ni: execute the next instruction.
step/s: execute the next line of source code, stepping into function calls. si: step to the next instruction.
ni & si: execute the next machine instruction; they differ in the same way as next and step (ni steps over calls, si steps into them)
finish/fini: run until the current function (stack frame) returns
print/p: print the value of a variable or expression
list/l: show 10 lines of source code; use 'list m,n' to show lines m through n
backtrace/bt: show the call stack
display, undisplay: 'display' shows the value of an expression every time the program stops; 'undisplay' cancels it
info/i: show information about "args", "locals", "break", "all-registers" and so on
    info proc mappings: show the memory mappings of the process
    info all-registers: show all registers
set: set values
    set *addr = val
    set args ...
    set environment LD_PRELOAD=./libx.so
define hook-stop: run a list of commands automatically every time the program stops
$ gdb ./linux_server -q
Reading symbols from ./linux_server...
(No debugging symbols found in ./linux_server)
(gdb) define hook-stop
Type commands for definition of "hook-stop".
End with a line saying just "end".
>x/10i $pc
>end
(gdb) r
Starting program: /root/linux_server
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
IDA Linux 32-bit remote debug server(ST) v1.22. Hex-Rays (c) 2004-2017
Listening on 0.0.0.0:23946...
^C
Program received signal SIGINT, Interrupt.
=> 0xf7fd1ef9 <__kernel_vsyscall+9>:  pop    %ebp
   0xf7fd1efa <__kernel_vsyscall+10>: pop    %edx
   0xf7fd1efb <__kernel_vsyscall+11>: pop    %ecx
   0xf7fd1efc <__kernel_vsyscall+12>: ret
   0xf7fd1efd: nop
   0xf7fd1efe: nop
   0xf7fd1eff: nop
   0xf7fd1f00: nop
   0xf7fd1f01: lea    0x0(%esi,%eiz,1),%esi
   0xf7fd1f08: lea    0x0(%esi,%eiz,1),%esi
0xf7fd1ef9 in __kernel_vsyscall ()
(gdb)
shell: start a shell from inside gdb
(gdb) shell
$ cd /
$ ls
bin   dev  home  lib    lib64   lost+found  mnt  proc  run   srv  tmp  var
boot  etc  init  lib32  libx32  media       opt  root  sbin  sys  usr
$ exit
exit
(gdb)
set follow-fork-mode [parent|child]: choose which process gdb keeps debugging after the program calls `fork`.
set scheduler-locking [on|off|step]: control whether other threads may run while the thread being debugged is resumed; with 'on', only the current thread executes and the others stay blocked.
X Commands
To inspect memory, use the x command.
examine/x: show the contents of memory. Usage: x /nfu <addr>
n is the number of memory units to display
f is the display format, which can be:
    x hexadecimal
    d decimal
    u unsigned decimal
    o octal
    t binary
    a address
    i machine instruction
    c character
    f floating point
u is the size of each memory unit, which can be: b for a single byte, h for two bytes, w for four bytes, g for eight bytes
To sum up: format letters are o(octal), x(hex), d(decimal), u(unsigned decimal), t(binary), f(float), a(address), i(instruction), c(char) and s(string). Size letters are b(byte), h(halfword), w(word), g(giant, 8 bytes)
So what is commonly used?
x /100wx addr  -- show memory for x86
x /20gx addr   -- show memory for x64
x /10i addr    -- display memory as instructions
...
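To see how the count and size letters carve up the same bytes, the following Python sketch mimics the values printed by x /<n><u>x on an in-memory buffer. It is an illustration only: examine is a hypothetical helper, it covers just the hexadecimal format, and it does not talk to gdb.

```python
import struct

# Size letters of the x command mapped to struct codes and byte widths.
_UNIT = {"b": ("B", 1), "h": ("H", 2), "w": ("I", 4), "g": ("Q", 8)}


def examine(memory: bytes, count: int, unit: str = "w", little_endian: bool = True):
    """Return `count` units from the start of `memory` as zero-padded hex strings."""
    code, size = _UNIT[unit]
    order = "<" if little_endian else ">"
    # For example count=2, unit="w" gives the struct format "<2I".
    values = struct.unpack_from(f"{order}{count}{code}", memory, 0)
    return [f"0x{v:0{size * 2}x}" for v in values]
```

For the 16 bytes 00 01 02 ... 0f, examine(mem, 2, "w") yields 0x03020100 and 0x07060504, while examine(mem, 1, "g") yields 0x0706050403020100, which is why the w size is convenient on x86 and the g size on x64.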
Debug Running Process
Debug a local process dynamically
First, find the pid of xxx:
$ ps -ax | grep xxx
If this returns the pid nnn, use
$ gdb xxx nnn
or start gdb and run
(gdb) file /path/to/xxx
and
(gdb) attach nnn
to attach gdb to the process
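The pid lookup can also be scripted. Below is a small Python sketch that scans /proc for processes whose command name contains a given string, roughly what the ps and grep pipeline above does. find_pids and its proc_root parameter are hypothetical names introduced for this example, and it assumes a Linux /proc layout.

```python
import os
from typing import List


def find_pids(name: str, proc_root: str = "/proc") -> List[int]:
    """Return the pids whose command name (/proc/<pid>/comm) contains `name`."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                comm = f.read().strip()
        except OSError:
            continue  # the process exited or is not readable
        # Note: the kernel truncates comm to 15 characters.
        if name in comm:
            pids.append(int(entry))
    return sorted(pids)
```

A pid returned by find_pids("xxx") can then be passed to gdb on the command line or to the attach command.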
gdb+gdbserver Debug
Use gdb to connect to the stub, which gdbserver creates on the remote host.
First, build gdb and gdbserver for the target platform (taking ARM as an example):
$ cd gdb/
$ ./configure --target=arm-linux --program-prefix=arm-linux- --prefix=/opt/arm-linux-gdb/
$ make          # build gdb
$ make install  # install gdb
# if an error occurs, search for the error message to resolve it
$ cd gdb/gdbserver/
$ ./configure --target=arm-linux --host=arm-linux-gnueabi  # '--host' selects the cross-compilation toolchain
$ make          # build gdbserver
Build the program with debugging information
$ arm-linux-gnueabi-gcc -g test.c -o test # '-g' is essential
Copy gdbserver to the target machine and run it
$ gdbserver IP:PORT xxx [--attach PID] # this IP is your host IP
On your host
$ arm-linux-gdb xxx
......
(gdb) target remote IP:PORT
# this IP is your target IP
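Once target remote connects, gdb and the gdbserver stub exchange packets of the GDB Remote Serial Protocol, each framed as $<payload>#<checksum>, where the checksum is the sum of the payload bytes modulo 256 written as two hex digits. The Python sketch below shows only that framing; the function names are made up for illustration, and escaping, run-length encoding and acknowledgements are deliberately left out.

```python
def rsp_checksum(payload: str) -> int:
    """Sum of the payload bytes modulo 256."""
    return sum(payload.encode("ascii")) % 256


def rsp_frame(payload: str) -> str:
    """Wrap a payload as `$<payload>#<two hex digits>`."""
    return f"${payload}#{rsp_checksum(payload):02x}"


def rsp_unframe(packet: str) -> str:
    """Verify the checksum of a framed packet and return its payload."""
    if len(packet) < 4 or not packet.startswith("$") or packet[-3] != "#":
        raise ValueError("malformed packet")
    payload, received = packet[1:-3], int(packet[-2:], 16)
    if rsp_checksum(payload) != received:
        raise ValueError("checksum mismatch")
    return payload
```

Knowing the framing helps when reading the traffic that gdb prints after 'set debug remote 1', for instance when a connection to the stub fails.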
PEDA enhances the display of gdb: it colorizes and displays disassembly, registers and memory information during debugging. It also adds commands to support debugging and exploit development (for a full list of commands use peda help):
aslr -- Show/set ASLR setting of GDB
checksec -- Check for various security options of binary
dumpargs -- Display arguments passed to a function when stopped at a call instruction
dumprop -- Dump all ROP gadgets in specific memory range
elfheader -- Get headers information from debugged ELF file
elfsymbol -- Get non-debugging symbol information from an ELF file
lookup -- Search for all addresses/references to addresses which belong to a memory range
patch -- Patch memory start at an address with string/hexstring/int
pattern -- Generate, search, or write a cyclic pattern to memory
procinfo -- Display various info from /proc/pid/
pshow -- Show various PEDA options and other settings
pset -- Set various PEDA options and other settings
readelf -- Get headers information from an ELF file
ropgadget -- Get common ROP gadgets of binary or library
ropsearch -- Search for ROP gadgets in memory
searchmem|find -- Search for a pattern in memory; support regex search
shellcode -- Generate or download common shellcodes.
skeleton -- Generate python exploit code template
vmmap -- Get virtual mapping address ranges of section(s) in debugged process
xormem -- XOR a memory region with a key
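The idea behind a cyclic pattern, as used by the pattern command, is that every short window of the string is unique, so the bytes found in a crashed register reveal their offset in the input. The Python sketch below illustrates this with the simple "Aa0Aa1Aa2..." scheme; it is not PEDA's own generator, and pattern_create and pattern_offset are names chosen for this example.

```python
import string
from itertools import product

# 26 * 26 * 10 three-character chunks before the pattern would repeat.
MAX_LENGTH = 26 * 26 * 10 * 3


def pattern_create(length: int) -> str:
    """Build an 'Aa0Aa1Aa2...' style pattern of the requested length."""
    if length > MAX_LENGTH:
        raise ValueError("requested length exceeds the pattern's period")
    chunks = []
    for upper, lower, digit in product(string.ascii_uppercase,
                                       string.ascii_lowercase,
                                       string.digits):
        if len(chunks) * 3 >= length:
            break
        chunks.append(upper + lower + digit)
    return "".join(chunks)[:length]


def pattern_offset(fragment: str, length: int = MAX_LENGTH) -> int:
    """Return the offset of `fragment` in the pattern, or -1 if it is absent."""
    return pattern_create(length).find(fragment)
```

In practice the fragment comes from a register or stack slot after the crash; on a little-endian machine the bytes read as an integer have to be reversed before looking them up.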
Installation
$ git clone https://github.com/longld/peda.git ~/peda
$ echo "source ~/peda/peda.py" >> ~/.gdbinit  # if using pwndbg, gef or other plugins, just change the contents of ~/.gdbinit
$ echo "DONE! debug your program with gdb and enjoy"
Other plugins include pwndbg and gef; you can also write one yourself.
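Remove canary, NX and PIE. A stack boundary of 2 or 4 tells the compiler to keep the stack aligned to 2^2 = 4 bytes (dword) or 2^4 = 16 bytes respectively (GCC's -mpreferred-stack-boundary=N aligns the stack to 2^N bytes); a small value stops the compiler from inserting extra alignment padding into the stack layout.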
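Whether PIE was actually disabled can be checked from the e_type field of the ELF header: ET_EXEC (2) marks a fixed-address executable, while ET_DYN (3) marks a position-independent executable or a shared object. The Python sketch below reads only that field; the helper names are illustrative, and tools such as checksec look at more than this single value.

```python
import struct

ET_EXEC, ET_DYN = 2, 3  # e_type values from the ELF specification


def elf_type(header: bytes) -> int:
    """Return e_type from the first bytes of an ELF file."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    order = "<" if header[5] == 1 else ">"  # EI_DATA: 1 means little endian
    # e_type is the 16-bit field right after the 16-byte e_ident array.
    return struct.unpack_from(order + "H", header, 16)[0]


def looks_position_independent(header: bytes) -> bool:
    """True for ET_DYN (PIE executables and shared objects), False otherwise."""
    return elf_type(header) == ET_DYN
```

Passing the first 18 bytes of the compiled binary is enough; a binary built without PIE should report ET_EXEC.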