Once upon a time I wanted to join a CTF and solve some challenges. I settled in and chose a pwn challenge. I downloaded the binary, started GDB and lo and behold...
I had no clue how to proceed. GDB is barely usable out of the box and its command line interface is obscure at best. I had no idea what disassemblers or decompilers were. And what the hell is pwntools? This article series should provide some insight into the most basic setup for solving pwn challenges so you don't have to feel the same pain I once did.
There are three essentials you will need to solve any ordinary pwn challenge:
- Disassembler/Decompiler
- Debugger
- Scripting
For each category, we are going to look at some options, and I'm gonna explain my setup. This shouldn't be followed as the holy grail; it's meant to be a starting point for building your own workflow. In this first part, we will look into static analysis.
Example Challenge
To illustrate some tools and other things, I'm going to use the x64 ret2csu challenge from ROP Emporium. You can read more about this specific challenge here. It should serve as a nice example since it uses its own library to decrypt the flag and comes without any source code. This means we have to use LD_PRELOAD and do some reversing. Let's get started.
Static Analysis Tools
The first thing you want to do is some static analysis. This means disassembling, or even decompiling the binary.
While disassembling only translates the bytes of the binary into the instructions of the underlying architecture, a decompiler tries to reconstruct the original source code. However, because of compiler optimizations, this isn't an easy process: information gets lost or is represented in a confusing way. The disassembly, on the other hand, shows you what the processor is doing step by step. So it's usually recommended to combine both approaches. You should therefore get somewhat comfortable reading assembly code as well as weird pseudo-C code.
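To make the "bytes to instructions" idea a bit more concrete, here is a toy sketch in Python. It is not how a real disassembler works internally; it is just a lookup table that only knows a handful of byte patterns, all taken from the pwnme listing shown further down. The function name toy_disassemble is made up for this example.

```python
# Toy lookup-table "disassembler". It only knows the few byte patterns
# listed here (taken from the pwnme listing in this article); a real
# disassembler decodes prefixes, opcodes and operands properly.
KNOWN = {
    bytes.fromhex("55"): "push rbp",
    bytes.fromhex("4889e5"): "mov rbp, rsp",
    bytes.fromhex("4883ec20"): "sub rsp, 0x20",
    bytes.fromhex("90"): "nop",
    bytes.fromhex("c9"): "leave",
    bytes.fromhex("c3"): "ret",
}


def toy_disassemble(code):
    """Translate known byte patterns into text, one instruction per entry."""
    out = []
    pos = 0
    while pos < len(code):
        for pattern, text in KNOWN.items():
            if code.startswith(pattern, pos):
                out.append(text)
                pos += len(pattern)
                break
        else:
            # Unknown byte: emit it as raw data and move on.
            out.append("db 0x%02x" % code[pos])
            pos += 1
    return out


# Function prologue and epilogue of pwnme, glued together for the demo.
for line in toy_disassemble(bytes.fromhex("55 4889e5 4883ec20 90 c9 c3")):
    print(line)
```

A decompiler has to go much further than this: it must guess variables, types and control flow from such an instruction stream, which is why its output can be misleading.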
For this you have several options, free and paid.
Objdump
Kind of an honorable mention, objdump is part of the GNU Binary Utilities and probably installed by default on your system. It can provide disassembly as well as symbols and sections of your binary.
❯ objdump -T libret2csu.so
libret2csu.so: file format elf64-x86-64
[...]
0000000000000000 DF *UND* 0000000000000000 GLIBC_2.2.5 exit
0000000000000000 w D *UND* 0000000000000000 _ITM_registerTMCloneTable
0000000000000000 w DF *UND* 0000000000000000 GLIBC_2.2.5 __cxa_finalize
0000000000202078 g D .data 0000000000000000 Base _edata
0000000000202088 g D .bss 0000000000000000 Base _end
000000000000093a g DF .text 0000000000000099 Base pwnme
0000000000202078 g D .bss 0000000000000000 Base __bss_start
0000000000000778 g DF .init 0000000000000000 Base _init
00000000000009d3 g DF .text 00000000000002a7 Base ret2win
0000000000000c7c g DF .fini 0000000000000000 Base _fini
❯ objdump -M intel --disassemble=pwnme libret2csu.so
libret2csu.so: file format elf64-x86-64
[...]
000000000000093a <pwnme>:
93a: 55 push rbp
93b: 48 89 e5 mov rbp,rsp
93e: 48 83 ec 20 sub rsp,0x20
942: 48 8b 05 97 16 20 00 mov rax,QWORD PTR [rip+0x201697] # 201fe0 <stdout@GLIBC_2.2.5>
949: 48 8b 00 mov rax,QWORD PTR [rax]
94c: b9 00 00 00 00 mov ecx,0x0
951: ba 02 00 00 00 mov edx,0x2
956: be 00 00 00 00 mov esi,0x0
95b: 48 89 c7 mov rdi,rax
95e: e8 bd fe ff ff call 820 <setvbuf@plt>
963: 48 8d 3d 1e 03 00 00 lea rdi,[rip+0x31e] # c88 <_fini+0xc>
96a: e8 31 fe ff ff call 7a0 <puts@plt>
96f: 48 8d 3d 2a 03 00 00 lea rdi,[rip+0x32a] # ca0 <_fini+0x24>
976: e8 25 fe ff ff call 7a0 <puts@plt>
97b: 48 8d 45 e0 lea rax,[rbp-0x20]
97f: ba 20 00 00 00 mov edx,0x20
984: be 00 00 00 00 mov esi,0x0
989: 48 89 c7 mov rdi,rax
98c: e8 3f fe ff ff call 7d0 <memset@plt>
991: 48 8d 3d 10 03 00 00 lea rdi,[rip+0x310] # ca8 <_fini+0x2c>
998: e8 03 fe ff ff call 7a0 <puts@plt>
99d: 48 8d 3d 6e 03 00 00 lea rdi,[rip+0x36e] # d12 <_fini+0x96>
9a4: b8 00 00 00 00 mov eax,0x0
9a9: e8 12 fe ff ff call 7c0 <printf@plt>
9ae: 48 8d 45 e0 lea rax,[rbp-0x20]
9b2: ba 00 02 00 00 mov edx,0x200
9b7: 48 89 c6 mov rsi,rax
9ba: bf 00 00 00 00 mov edi,0x0
9bf: e8 2c fe ff ff call 7f0 <read@plt>
9c4: 48 8d 3d 4a 03 00 00 lea rdi,[rip+0x34a] # d15 <_fini+0x99>
9cb: e8 d0 fd ff ff call 7a0 <puts@plt>
9d0: 90 nop
9d1: c9 leave
9d2: c3 ret
As you can see, you can easily list the exported symbols and disassemble functions of your choice. Just don't forget the -M intel argument, or you will get AT&T syntax. Without describing the differences here, just remember that Intel syntax is the de facto standard in CTF write-ups, as it is much easier to read.
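If you wonder where the targets in the listing (for example 820 in the call, or the address behind the # comment) come from: call and RIP-relative instructions store a signed 32-bit displacement that is relative to the address of the next instruction. The following Python sketch redoes that arithmetic for three instructions from the listing above. The helper name rel32_target is made up, and it assumes the displacement is the last four bytes of the instruction, which holds for these three but not for every x86 instruction.

```python
import struct


def rel32_target(insn_addr, insn_bytes):
    """Resolve a rel32 displacement stored in the last 4 bytes of an instruction.

    Assumption: the displacement is the final 4 bytes, as in `call rel32`
    and the RIP-relative `lea`/`mov` forms used in the listing.
    """
    (disp,) = struct.unpack("<i", insn_bytes[-4:])  # signed, little-endian
    next_insn = insn_addr + len(insn_bytes)         # RIP points past the instruction
    return next_insn + disp


# 95e: e8 bd fe ff ff          call setvbuf@plt
print(hex(rel32_target(0x95E, bytes.fromhex("e8bdfeffff"))))
# 963: 48 8d 3d 1e 03 00 00    lea rdi,[rip+0x31e]
print(hex(rel32_target(0x963, bytes.fromhex("488d3d1e030000"))))
# 942: 48 8b 05 97 16 20 00    mov rax,QWORD PTR [rip+0x201697]
print(hex(rel32_target(0x942, bytes.fromhex("488b0597162000"))))
```

The three results match what objdump prints: 0x820, 0xc88 and 0x201fe0.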
Objdump is also useful if you just need to quickly grab the offset of a symbol, for example the offset of system in glibc:
❯ objdump -T /lib/x86_64-linux-gnu/libc.so.6 | grep system
000000000012d600 g DF .text 0000000000000063 GLIBC_2.2.5 svcerr_systemerr
0000000000048f20 g DF .text 000000000000002d GLIBC_PRIVATE __libc_system
0000000000048f20 w DF .text 000000000000002d GLIBC_2.2.5 system
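Once you script your exploits you will want those offsets in a variable instead of copying them by hand. Here is a minimal Python sketch that parses objdump -T style output into a dictionary. The sample text is the output shown above; the function name parse_dynsyms and the libc base address are made up for illustration, and the parsing is deliberately naive (first column is the address, last column is the name).

```python
SAMPLE = """\
000000000012d600 g DF .text 0000000000000063 GLIBC_2.2.5 svcerr_systemerr
0000000000048f20 g DF .text 000000000000002d GLIBC_PRIVATE __libc_system
0000000000048f20 w DF .text 000000000000002d GLIBC_2.2.5 system
"""


def parse_dynsyms(text):
    """Map symbol name -> offset for every defined symbol in `objdump -T` output."""
    symbols = {}
    for line in text.splitlines():
        fields = line.split()
        # Symbol lines start with a 16-digit hex address; skip headers etc.
        if len(fields) < 4 or len(fields[0]) != 16:
            continue
        if "*UND*" in fields:  # imported symbol, no offset in this file
            continue
        try:
            symbols[fields[-1]] = int(fields[0], 16)
        except ValueError:
            continue
    return symbols


offsets = parse_dynsyms(SAMPLE)
libc_base = 0x7F0000000000  # hypothetical base address, e.g. from a leak
print(hex(offsets["system"]))
print(hex(libc_base + offsets["system"]))
```

The offset only becomes a usable address once you add the base at which the library is loaded in the target process.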
Radare2
Another disassembly-only tool, but with a few more features and helpful visualizations to assist you during reversing. Radare2 is open source and free. You can start it using r2 libret2csu.so and run the analysis by typing aaa. After that you can start reversing by listing all functions using afl and printing a function's disassembly using pdf:
As you can see, you get some nice visual helpers to see where a jump goes. You also get references and fancy colors.
Additionally, radare offers a more GUI-like mode that you can enter with the v command.
You can use ? to get a help screen with the available commands.
Another feature of radare is the ability to use plugins. For example the retdec plugin will let you integrate the retdec decompiler into radare. Also, if you like radare, but want something more fancy looking than a terminal, take a look at Cutter. It gives you a relatively nice GUI and should ease the pain of learning how to use radare.
There are so many more features, and you could probably start your own blog just to go through them all. If you're actually interested, I recommend going through this great repo.
Ghidra
Ghidra is the first tool on our list that actually combines a disassembler and a decompiler. It was released in 2019 by the NSA. Yeah, that's right. THE NSA. But it has gained some traction and quite a few people use it, especially because it's free. To this day there are some conspiracy theories about possible backdoors, so you may want to use it in a VM.
As you can see, while not pretty, it gives you a pretty readable C-like code view to assist you in your debugging. The tool itself is based on Java, which is why I don't really like to use it. The UI feels heavy and unintuitive, but that's really just preference.
As always, there are many more features (and plugins). Since it's free, it is probably your most complete bet, but also a little overkill for small CTF-like binaries. But grab your copy on the official website and see if it suits you.
IDA Pro
IDA Pro is the "big player" and standard in the industry. However, it's expensive (well into four figures) and also completely overkill for CTFs. I have never used it due to its cost, so I can't really say anything other than "it exists".
Binary Ninja
Another paid option, but not as expensive as IDA. Luckily, Binary Ninja has a free cloud version, so you can try it out first. This is my tool of choice since it has a very usable UI and an incredibly well-working disassembly engine. With its recent introduction of the HLIL (High Level Intermediate Language) it also has some decompiling features.
The personal license costs $300 per year, but as stated earlier there is a cloud version available. The cloud version has most of the features of the licensed version (even the HLIL) and will be enough for CTF challenges. And best of all, it's free. With its support for plugins, the licensed version is easily worth the money and has become my favorite tool.
Loading plugins is dead simple using the integrated plugin manager. One plugin that will come in handy for CTFs is VulnFanatic. It helps to find simple bugs and highlights them for you.
Analyzing the Challenge
Now that you should have some kind of tool for static analysis running, let's take a look at the challenge binary:
main:
00400607 push rbp {__saved_rbp}
00400608 mov rbp, rsp {__saved_rbp}
0040060b call pwnme
00400610 mov eax, 0x0
00400615 pop rbp {__saved_rbp}
00400616 retn {__return_addr}
We can see that the main function of ret2csu just calls pwnme, which is in fact not part of the binary but imported as an external symbol. We can easily guess that it has to be part of the provided library, so let's look at that next using Binary Ninja's awesome HLIL.
0000095e setvbuf(fp: *stdout, buf: nullptr, mode: 2, size: 0)
0000096a puts(str: "ret2csu by ROP Emporium")
00000976 puts(str: "x86_64\n")
0000098c void var_28
0000098c memset(&var_28, 0, 0x20)
00000998 puts(str: "Check out https://ropemporium.co…")
000009a9 printf(format: data_d12)
000009bf read(fd: 0, buf: &var_28, nbytes: 0x200)
000009d2 return puts(str: "Thank you!")
In line 8 (the read call) we can see that 0x200 bytes are being read into var_28. In this HLIL view we can't see how much space was actually reserved for that variable, but switching to the disassembly shows us: 0000097b lea rax, [rbp-0x20 {var_28}]. So only 0x20 bytes were reserved on the stack, while read accepts up to 0x200, leading to an easy buffer overflow.
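From these numbers we can already work out how far the saved return address is from the start of the buffer: 0x20 bytes of buffer (it starts at rbp-0x20) plus 8 bytes for the saved rbp. Here is a small Python sketch of that arithmetic. The return address used in the demo is a placeholder, not a real gadget, and build_payload is a made-up helper name.

```python
import struct

BUF_OFFSET_FROM_RBP = 0x20  # from: lea rax, [rbp-0x20 {var_28}]
SAVED_RBP_SIZE = 8          # push rbp in the function prologue
READ_SIZE = 0x200           # nbytes argument of read()

# Bytes we have to write before we reach the saved return address.
offset_to_ret = BUF_OFFSET_FROM_RBP + SAVED_RBP_SIZE


def build_payload(return_address):
    """Fill the buffer, clobber the saved rbp, then place a new return address."""
    padding = b"A" * BUF_OFFSET_FROM_RBP
    saved_rbp = b"B" * SAVED_RBP_SIZE
    return padding + saved_rbp + struct.pack("<Q", return_address)


payload = build_payload(0x4141414141414141)  # placeholder address
assert len(payload) <= READ_SIZE             # read() accepts the whole payload
print(offset_to_ret, len(payload))
```

With 0x200 bytes accepted by read there is plenty of room left after the return address for a longer ROP chain.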
There is also this monster of a function:
Looking closer we can see that the lower part is basically just opening and decrypting our flag file. This is what we want, so we should take a closer look at the branches to get there. And to show off how awesome Binary Ninja is, we use the HLIL representation:
I mean, just compare those two images. Obviously this will save you a lot of time during a competition. To be fair, radare with retdec and Ghidra provide equally useful output on this binary. Anyway, back to analyzing.
As you can see, the function just checks whether RDI, RSI and RDX are set to some magic values. These registers hold the first three function arguments on x64 (System V calling convention). The check happens three times, so we can't just jump past it and get straight to the "print flag" section. We have to actually set those arguments and call the function.
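As a reminder of how arguments map to registers, here is a short Python sketch of the System V AMD64 convention for integer and pointer arguments. The three values are placeholders and not the real magic values from the challenge; registers_for_call is a made-up helper name.

```python
# Integer/pointer argument registers of the System V AMD64 calling
# convention, in order. Further arguments are passed on the stack.
SYSV_ARG_REGISTERS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]


def registers_for_call(*args):
    """Return which register has to hold which argument before the call."""
    if len(args) > len(SYSV_ARG_REGISTERS):
        raise ValueError("more than six arguments: the rest go on the stack")
    return dict(zip(SYSV_ARG_REGISTERS, args))


# Placeholder values standing in for the three magic values.
wanted = registers_for_call(
    0x1111111111111111,
    0x2222222222222222,
    0x3333333333333333,
)
for reg, value in wanted.items():
    print(f"{reg} = {value:#018x}")
```

For the exploit this means we need gadgets (or some other way) to control exactly these three registers before jumping to ret2win.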
Countermeasures
Now that we have a good grasp of what the binary does (calling a function in a library), what the vulnerability is (a simple buffer overflow) and what we have to do (set three registers and call ret2win), we should also look at which countermeasures are in place. Since everything happens in the library and not the binary itself, it's enough to check the libret2csu.so file:
❯ checksec libret2csu.so
[*] '/root/dev/ctf/ret2csu/libret2csu.so'
Arch: amd64-64-little
RELRO: Partial RELRO
Stack: No canary found
NX: NX enabled
PIE: PIE enabled
We can see it has no canary, so we can easily overflow into the saved return pointer. But NX and PIE are enabled: because of NX we can't just load some shellcode and execute it, and because of PIE we are limited to partial overwrites.
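If you are curious where two of these lines come from, the following Python sketch reads them straight from the ELF headers. It is a rough approximation under stated assumptions, not a reimplementation of checksec: it only handles little-endian ELF64, treats every ET_DYN file as position independent (which covers both PIE executables and shared libraries like ours), and calls NX enabled when a PT_GNU_STACK segment exists without the execute flag. The function name basic_checks is made up.

```python
import struct

ET_DYN = 3                 # shared object or PIE executable
PT_GNU_STACK = 0x6474E551  # program header describing stack permissions
PF_X = 0x1                 # "executable" segment flag


def basic_checks(elf):
    """Return (position_independent, nx) for a little-endian ELF64 image."""
    if elf[:4] != b"\x7fELF" or elf[4] != 2 or elf[5] != 1:
        raise ValueError("expected a little-endian ELF64 file")
    (e_type,) = struct.unpack_from("<H", elf, 16)
    (e_phoff,) = struct.unpack_from("<Q", elf, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 54)

    nx = False  # no PT_GNU_STACK header is treated as "not enabled" here
    for i in range(e_phnum):
        p_type, p_flags = struct.unpack_from("<II", elf, e_phoff + i * e_phentsize)
        if p_type == PT_GNU_STACK:
            nx = not (p_flags & PF_X)
    return e_type == ET_DYN, nx


# Usage, assuming the library is in the current directory:
# with open("libret2csu.so", "rb") as f:
#     print(basic_checks(f.read()))
```

Canary and RELRO detection need the symbol table and the dynamic section, so I left them out to keep the sketch short.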
Next steps
We are now at a point where we can start our exploit development. For this we will need some scripting and debugging capabilities. But since this post is already long enough, I will end it here, to be continued at a later time. I hope you got some insight into what tools are available for reversing a simple pwn challenge. There are many more articles from other great people comparing those tools and giving even more in-depth information. If you have any comments (or know other great tools I should check out), leave me a comment on Twitter.