Pwn Challenges Setup Part 1 - Reversing


Once upon a time I wanted to join a CTF and solve some challenges. I settled in and chose a pwn challenge. I downloaded the binary, started GDB and lo and behold...

I had no clue how to proceed. GDB is barely usable and its command line interface is obscure at best. I knew nothing about disassemblers or decompilers. And what the hell is pwntools? This article series should provide some insight into the most basic setup for solving pwn challenges, so you don't have to feel the same pain I once did.

There are three essentials you will need to solve any ordinary pwn challenge:

  • Disassembler/Decompiler
  • Debugger
  • Scripting

For each category, we are going to look at some options, and I'm gonna explain my setup. This shouldn't be followed as the holy grail; it's meant to be a starting point for building your own workflow. For the first part, we will look into static analysis.

Example Challenge

To illustrate some tools and other things, I'm going to use the x64 ret2csu challenge from ROP Emporium. You can read more about this specific challenge here. It should serve as a nice example since it uses its own library to decrypt the flag and comes without any source code. This means we have to use LD_PRELOAD and do some reversing. Let's get started.

Static Analysis Tools

The first thing you want to do is some static analysis. This means disassembling, or even decompiling the binary.

Disassembling vs. Decompiling

While disassembling only translates the bytes of the binary into the assembly instructions of the underlying architecture, a decompiler tries to reconstruct the original source code. However, because of compiler optimizations, this isn't an easy process. Sometimes information is lost or represented in a confusing way. In the disassembly, on the other hand, you can see what the processor is doing step by step. So it's usually recommended to combine both approaches. You should therefore get somewhat comfortable reading assembly code as well as weird pseudo-C code.
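To make the "bytes to instructions" idea concrete, here is a deliberately tiny sketch. It is not a real disassembler: the lookup table only knows a handful of byte sequences copied from the pwnme listing further down, and the name toy_disassemble is made up for this illustration. Real tools decode the full x86-64 instruction encoding instead of using a fixed table.

```python
# Toy lookup table. The byte sequences and mnemonics are copied from the
# objdump listing of pwnme in this article; everything else is unknown to it.
KNOWN = {
    bytes.fromhex("55"): "push rbp",
    bytes.fromhex("4889e5"): "mov rbp,rsp",
    bytes.fromhex("4883ec20"): "sub rsp,0x20",
    bytes.fromhex("90"): "nop",
    bytes.fromhex("c9"): "leave",
    bytes.fromhex("c3"): "ret",
}


def toy_disassemble(code: bytes) -> list[str]:
    """Translate raw bytes into mnemonics using the toy table above."""
    out = []
    pos = 0
    while pos < len(code):
        # Try the longest known encoding first (4 bytes in this table).
        for length in range(4, 0, -1):
            chunk = code[pos:pos + length]
            if chunk in KNOWN:
                out.append(f"{pos:#x}: {KNOWN[chunk]}")
                pos += len(chunk)
                break
        else:
            # Unknown byte: emit it raw, like a disassembler's ".byte" fallback.
            out.append(f"{pos:#x}: .byte {code[pos]:#04x}")
            pos += 1
    return out


if __name__ == "__main__":
    # Prologue and epilogue bytes of pwnme, glued together for the demo.
    for line in toy_disassemble(bytes.fromhex("554889e54883ec2090c9c3")):
        print(line)
```

A decompiler starts from output like this and then has to guess variables, types and control flow, which is where the lost information shows up.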

For this you have several options, free and paid.

Objdump

Kind of an honorable mention, objdump is part of the GNU Binary Utilities and probably installed by default on your system. It can provide disassembly as well as symbols and sections of your binary.

❯ objdump -T libret2csu.so

libret2csu.so:     file format elf64-x86-64

[...]
0000000000000000      DF *UND*      0000000000000000  GLIBC_2.2.5 exit
0000000000000000  w   D  *UND*      0000000000000000              _ITM_registerTMCloneTable
0000000000000000  w   DF *UND*      0000000000000000  GLIBC_2.2.5 __cxa_finalize
0000000000202078 g    D  .data      0000000000000000  Base        _edata
0000000000202088 g    D  .bss       0000000000000000  Base        _end
000000000000093a g    DF .text      0000000000000099  Base        pwnme
0000000000202078 g    D  .bss       0000000000000000  Base        __bss_start
0000000000000778 g    DF .init      0000000000000000  Base        _init
00000000000009d3 g    DF .text      00000000000002a7  Base        ret2win
0000000000000c7c g    DF .fini      0000000000000000  Base        _fini

❯ objdump -M intel --disassemble=pwnme libret2csu.so

libret2csu.so:     file format elf64-x86-64
[...]
000000000000093a <pwnme>:
93a:        55                      push   rbp
93b:        48 89 e5                mov    rbp,rsp
93e:        48 83 ec 20             sub    rsp,0x20
942:        48 8b 05 97 16 20 00    mov    rax,QWORD PTR [rip+0x201697]        # 201fe0 <stdout@GLIBC_2.2.5>
949:        48 8b 00                mov    rax,QWORD PTR [rax]
94c:        b9 00 00 00 00          mov    ecx,0x0
951:        ba 02 00 00 00          mov    edx,0x2
956:        be 00 00 00 00          mov    esi,0x0
95b:        48 89 c7                mov    rdi,rax
95e:        e8 bd fe ff ff          call   820 <setvbuf@plt>
963:        48 8d 3d 1e 03 00 00    lea    rdi,[rip+0x31e]        # c88 <_fini+0xc>
96a:        e8 31 fe ff ff          call   7a0 <puts@plt>
96f:        48 8d 3d 2a 03 00 00    lea    rdi,[rip+0x32a]        # ca0 <_fini+0x24>
976:        e8 25 fe ff ff          call   7a0 <puts@plt>
97b:        48 8d 45 e0             lea    rax,[rbp-0x20]
97f:        ba 20 00 00 00          mov    edx,0x20
984:        be 00 00 00 00          mov    esi,0x0
989:        48 89 c7                mov    rdi,rax
98c:        e8 3f fe ff ff          call   7d0 <memset@plt>
991:        48 8d 3d 10 03 00 00    lea    rdi,[rip+0x310]        # ca8 <_fini+0x2c>
998:        e8 03 fe ff ff          call   7a0 <puts@plt>
99d:        48 8d 3d 6e 03 00 00    lea    rdi,[rip+0x36e]        # d12 <_fini+0x96>
9a4:        b8 00 00 00 00          mov    eax,0x0
9a9:        e8 12 fe ff ff          call   7c0 <printf@plt>
9ae:        48 8d 45 e0             lea    rax,[rbp-0x20]
9b2:        ba 00 02 00 00          mov    edx,0x200
9b7:        48 89 c6                mov    rsi,rax
9ba:        bf 00 00 00 00          mov    edi,0x0
9bf:        e8 2c fe ff ff          call   7f0 <read@plt>
9c4:        48 8d 3d 4a 03 00 00    lea    rdi,[rip+0x34a]        # d15 <_fini+0x99>
9cb:        e8 d0 fd ff ff          call   7a0 <puts@plt>
9d0:        90                      nop
9d1:        c9                      leave
9d2:        c3                      ret

As you can see, you can easily list the exported symbols and disassemble functions of your choice. Just don't forget the -M intel argument, or you will get AT&T syntax. Without describing the difference here, just remember that Intel syntax is the de facto standard in most CTF write-ups, as it is much easier to read.
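If you want those symbol offsets inside a script, the objdump -T output is easy enough to parse. The following is a minimal sketch that assumes the column layout shown above; the function names parse_dynsyms and dump_dynsyms are made up for this example, and the parser simply takes the first column as the offset and the last column as the symbol name.

```python
import re
import subprocess


def parse_dynsyms(text: str) -> dict[str, int]:
    """Map defined dynamic symbol names to offsets from `objdump -T` output."""
    symbols = {}
    for line in text.splitlines():
        fields = line.split()
        # Symbol lines start with a zero-padded hex address (8 or 16 digits).
        if len(fields) < 4 or not re.fullmatch(r"[0-9a-f]{8}|[0-9a-f]{16}", fields[0]):
            continue
        # *UND* entries are imports (exit, puts, ...), not defined in this file.
        if "*UND*" in fields:
            continue
        symbols[fields[-1]] = int(fields[0], 16)
    return symbols


def dump_dynsyms(path: str) -> dict[str, int]:
    """Run objdump on a file and parse the result (requires binutils)."""
    proc = subprocess.run(
        ["objdump", "-T", path], capture_output=True, text=True, check=True
    )
    return parse_dynsyms(proc.stdout)


SAMPLE = """\
libret2csu.so:     file format elf64-x86-64

0000000000000000      DF *UND*      0000000000000000  GLIBC_2.2.5 exit
000000000000093a g    DF .text      0000000000000099  Base        pwnme
00000000000009d3 g    DF .text      00000000000002a7  Base        ret2win
"""

if __name__ == "__main__":
    for name, offset in parse_dynsyms(SAMPLE).items():
        print(f"{name}: {offset:#x}")
```

With the library on disk you would call dump_dynsyms("libret2csu.so") instead of feeding in the sample text.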

Objdump is also useful if you just need to quickly grab the offset of a symbol, for example the offset of system in glibc:

❯ objdump -T /lib/x86_64-linux-gnu/libc.so.6 | grep system
000000000012d600 g    DF .text      0000000000000063  GLIBC_2.2.5 svcerr_systemerr
0000000000048f20 g    DF .text      000000000000002d  GLIBC_PRIVATE __libc_system
0000000000048f20  w   DF .text      000000000000002d  GLIBC_2.2.5 system
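Such an offset is relative to wherever the library ends up in memory, so it only becomes an address once you know the load base. A small sketch of that arithmetic follows; the base address is a made-up example value, and the offset is only valid for the exact libc build shown in the output above.

```python
# Offset of system taken from the objdump output above.
# It differs between libc builds, so always re-check it on the target.
SYSTEM_OFFSET = 0x48F20


def resolve(libc_base: int, offset: int) -> int:
    """Turn a symbol offset into a runtime address for a given load base."""
    # Libraries are mapped at page granularity, so a base that is not
    # 4 KiB aligned is almost certainly a wrong leak or a miscalculation.
    if libc_base & 0xFFF:
        raise ValueError("libc base should be page-aligned")
    return libc_base + offset


if __name__ == "__main__":
    example_base = 0x7FFFF7DC0000  # hypothetical base, for illustration only
    print(hex(resolve(example_base, SYSTEM_OFFSET)))
```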

Radare2

Another disassembly-only tool, but with a few more features and helpful visualizations to assist you during reversing. Radare2 is open source and free. You can start it with r2 libret2csu.so and run the analysis by typing aaa . After that you can start reversing by listing all functions with afl and printing a function's disassembly with pdf :

View of Radare2

As you can see, you get some nice visual helpers to see where a jump goes. You also get references and fancy colors.

Additionally radare offers a more GUI-like mode you can enter with v .

View of Radare2-Visual Mode

You can use ? to get a help screen with available commands.

Another feature of radare is the ability to use plugins. For example the retdec plugin will let you integrate the retdec decompiler into radare. Also, if you like radare, but want something more fancy looking than a terminal, take a look at Cutter. It gives you a relatively nice GUI and should ease the pain of learning how to use radare.

There are so many more features, and you could probably start your own blog just to go through them all. If you're actually interested, I recommend going through this great repo.

Ghidra

The first tool on our list that actually combines a disassembler and a decompiler. It was released in 2019 by the NSA. Yeah, that's right. THE NSA. But it has gained some traction and quite a few people use it, especially because it's free. To this day there are some conspiracy theories about possible backdoors, so you may want to use it in a VM.

View of Ghidra

As you can see, while not pretty, it gives you a quite readable C-like code view to assist you in your reversing. The tool itself is based on Java, which is why I don't really like to use it. The UI feels heavy and unintuitive, but that's really just preference.

As always, there are many more features (and plugins). Since it's free, it is probably your most complete option, but also a little overkill for small CTF-like binaries. Grab your copy on the official website and see if it suits you.

IDA Pro

IDA Pro is the "big player" and the standard in the industry. However, it's expensive (well into four figures) and also completely overkill for CTFs. I have never used it due to its cost, so I can't really say anything other than "it exists".

Binary Ninja

Another paid option, but not as expensive as IDA. Luckily, Binary Ninja has a free cloud version, so you can try it out first. This is my tool of choice since it has a very usable UI and an incredibly well-working disassembly engine. With its recent introduction of HLIL (High Level Intermediate Language) it also has some decompiling features.

View of Binary Ninja

The personal license costs $300 per year, but as stated earlier there is a cloud version available. It has most of the features of the licensed version (even HLIL) and will be enough for CTF challenges. Best of all, it's free. With its plugin support, the paid version is easily worth the money, and it has become my favorite tool.

Loading plugins is dead simple using the integrated plugin manager. One plugin that comes in handy for CTFs is VulnFanatic. It helps to find simple bugs and highlights them for you.

Analyzing the Challenge

Now that you should have some kind of tool for static analysis running, let's take a look at the challenge binary:

main:
00400607  push    rbp {__saved_rbp}
00400608  mov     rbp, rsp {__saved_rbp}
0040060b  call    pwnme
00400610  mov     eax, 0x0
00400615  pop     rbp {__saved_rbp}
00400616  retn     {__return_addr}

We can see that the main function of ret2csu just calls pwnme, which is in fact not part of the binary but imported as an external symbol. We can easily guess that it has to be part of the provided library, so let's look at that next using Binary Ninja's awesome HLIL.

0000095e  setvbuf(fp: *stdout, buf: nullptr, mode: 2, size: 0)
0000096a  puts(str: "ret2csu by ROP Emporium")
00000976  puts(str: "x86_64\n")
0000098c  void var_28
0000098c  memset(&var_28, 0, 0x20)
00000998  puts(str: "Check out https://ropemporium.co…")
000009a9  printf(format: data_d12)
000009bf  read(fd: 0, buf: &var_28, nbytes: 0x200)
000009d2  return puts(str: "Thank you!")

In line 8 (the read call) we can see that 0x200 bytes are being read into var_28. In this HLIL view we can't see how much space was actually reserved for that variable, but switching to disassembly shows us: 0000097b lea rax, [rbp-0x20 {var_28}] . The buffer starts only 0x20 bytes below the saved base pointer, so reading 0x200 bytes into it leads to an easy buffer overflow.
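The numbers from the disassembly already tell us what an overflowing input has to look like. Here is a minimal sketch of that layout, assuming the standard frame produced by push rbp / mov rbp,rsp (buffer, then the saved rbp, then the return address). The function name build_overflow and the return address used in the demo are placeholders, not values from a working exploit.

```python
import struct

BUF_TO_RBP = 0x20      # buffer starts at rbp-0x20 (lea rax,[rbp-0x20])
SAVED_RBP_SIZE = 8     # the saved rbp sits between buffer and return address
READ_SIZE = 0x200      # read(0, buf, 0x200)


def build_overflow(return_address: int) -> bytes:
    """Fill the buffer and saved rbp, then overwrite the return address."""
    padding = b"A" * BUF_TO_RBP + b"B" * SAVED_RBP_SIZE
    payload = padding + struct.pack("<Q", return_address)  # little-endian u64
    if len(payload) > READ_SIZE:
        raise ValueError("payload does not fit into a single read()")
    return payload


if __name__ == "__main__":
    payload = build_overflow(0x4141414141414141)  # placeholder address
    print(f"offset to return address: {BUF_TO_RBP + SAVED_RBP_SIZE}")
    print(f"bytes left for a ROP chain: {READ_SIZE - len(payload)}")
```

So the return address sits 40 bytes into our input, and read() leaves plenty of room behind it for a longer chain.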

There is also this monster of a function:

Disassembly of ret2win

Looking closer, we can see that the lower part is basically just opening and decrypting our flag file. This is what we want, so we should take a closer look at the branches that lead there. And to show off how awesome Binary Ninja is, we use the HLIL representation:

Decompilation of ret2win

I mean, just compare those two images. Obviously this will save you a lot of time during a competition. To be fair, radare with retdec and Ghidra provide equally useful output on this binary. Anyway, back to analyzing.

As you can see, the function just checks whether RDI, RSI and RDX are set to some magic values. These registers hold the first three function arguments on x64 Linux (System V calling convention). The check happens three times, so we can't just jump past it and get straight to the "print flag" section. We have to actually set those arguments and call the function.
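The mapping from arguments to registers is fixed by the System V AMD64 calling convention, so it can be written down as a small table. The sketch below does that; the three magic values are an assumption taken from the public ROP Emporium challenge description and should be confirmed against your own decompiler output, and registers_for_call is a name invented for this example.

```python
import struct

# Integer/pointer argument registers on x86-64 Linux, in order (System V ABI).
SYSV_ARG_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]

# ASSUMPTION: magic values as given in the public challenge description.
# Verify them in the decompiled ret2win before relying on them.
MAGIC_ARGS = (0xDEADBEEFDEADBEEF, 0xCAFEBABECAFEBABE, 0xD00DF00DD00DF00D)


def registers_for_call(*args: int) -> dict[str, int]:
    """Return which register must hold which value to call f(*args)."""
    if len(args) > len(SYSV_ARG_REGS):
        raise ValueError("further arguments are passed on the stack")
    return dict(zip(SYSV_ARG_REGS, args))


if __name__ == "__main__":
    for reg, value in registers_for_call(*MAGIC_ARGS).items():
        # Each value would end up on the stack as a little-endian 8-byte word.
        print(f"{reg} = {value:#018x}  bytes: {struct.pack('<Q', value).hex()}")
```

This is the register state we have to reach before transferring control to ret2win.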

Countermeasures

Now that we have a good grasp of what the binary does (calling a function in a library), what the vulnerability is (a simple buffer overflow) and what we have to do (set three registers and call ret2win), we should also look at which countermeasures are implemented. Since everything happens in the library and not the binary itself, it's enough to check the libret2csu.so file:

❯ checksec libret2csu.so
[*] '/root/dev/ctf/ret2csu/libret2csu.so'
    Arch:     amd64-64-little
    RELRO:    Partial RELRO
    Stack:    No canary found
    NX:       NX enabled
    PIE:      PIE enabled

We can see it has no canary, so we can easily overflow into the saved return pointer. NX is enabled, so we can't just load some shellcode onto the stack and execute it. PIE is enabled as well, which limits us to partial overwrites.
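If you are curious where checksec gets this information from, most of it sits in the ELF headers. The following is a rough sketch of the idea for little-endian ELF64 files, not a reimplementation of checksec: the function name elf64_mitigations is made up, the canary test is a crude byte search, and Full RELRO detection (which needs the dynamic section) is left out.

```python
import struct

ET_DYN = 3                  # shared object or position-independent executable
PT_GNU_STACK = 0x6474E551   # program header describing stack permissions
PT_GNU_RELRO = 0x6474E552   # program header marking the read-only-after-reloc area
PF_X = 0x1                  # "executable" segment flag


def elf64_mitigations(data: bytes) -> dict[str, bool]:
    """Guess basic mitigations from the raw bytes of a little-endian ELF64."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("expected a little-endian ELF64 file")
    (e_type,) = struct.unpack_from("<H", data, 16)
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)

    nx = False
    relro = False
    for i in range(e_phnum):
        # p_type and p_flags are the first two 32-bit fields of each entry.
        p_type, p_flags = struct.unpack_from("<II", data, e_phoff + i * e_phentsize)
        if p_type == PT_GNU_STACK:
            nx = not (p_flags & PF_X)
        elif p_type == PT_GNU_RELRO:
            relro = True

    return {
        # Shared libraries are always ET_DYN, so this is trivially true for a .so.
        "pie": e_type == ET_DYN,
        "nx": nx,
        "relro_segment": relro,  # present for both Partial and Full RELRO
        # Heuristic: binaries built with canaries reference __stack_chk_fail.
        "canary": b"__stack_chk_fail" in data,
    }
```

To try it, pass in the file contents, for example elf64_mitigations(open("libret2csu.so", "rb").read()).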

Next steps

We are now at a point where we can start our exploit development. For this we will need some scripting and debugging capabilities. But since this post is already long enough, I will end it here, to be continued at a later time. I hope you got some insight into what tools are available for reversing a simple pwn challenge. There are many more articles from other great people comparing those tools and giving even more in-depth information. If you have any comments (or know other great tools I should check out), leave me a comment on Twitter.