http://blog.oxff.net/#sw3uk6k6mvkn2ia2s5za

2011-05-02 13:47

Defcon CTF #18 Prequals: bin400

This year's Defcon CTF Prequals, I've played on the lollersk8erz dropping from r0flcopterz team & we made it fifth, so see you in Las Vegas (also presenting dirtbox, a highly scalable x86/Windows Emulator at Black Hat)! Thanks again to ddtek at this place.

The most interesting challenge to me was bin400 (Tora's writeup). All you got was a 216Kb stripped ELF binary for 32bit Linux and at first glance with IDA, the code graphs were huge. However after digging a little through main, some commandline help got obvious:

$ ./b400_a559d40c0233c314.bin -help
Usage: dvm <-options> <ddtekfile>
Options:
  -cp <filepath>
  -heapsize <size> (e.g. 65536 or 128k or 1M)

So we're dealing with a virtual machine here. Glancing at some other strings, I quickly got the impression, this might be some Java VM:

java/lang/NullPointerException
java/lang/IndexOutOfBoundsException
java/lang/ArrayIndexOutOfBoundsException
...

Googling for "DVM Java Virtual Machine" led me to the assumption, I was dealing with the Dalvik Virtual Machine for Android. However, I could not find any embedded dex file headers in the binary's memory dump and some other strings did not match the sourcecode I downloaded from Android OpenSource.

Continuing my search, I finally found out via a lot of googling that this VM was based on Sun CLDC, some mobile phone Java VM that is purely interpreter based. This VM supports a feature called ROMIZING where a .c file is generated from all class files to be included with the VM for faster loading and compiled in. Apparently, they did exactly this with some J2ME runtime and their static class Main for bin400.

Now jcc, the part of CLDC that does this inclusion, strips away all Class headers and basically only what is preserved is the Java bytecode somewhere in .rodata plus some meta information spread across interpeter oriented structures. Reconstructing the original class files appeared to complicated for me, so what would I do?

By looking at the KVM source (the right name for this supposedly DVM) and in IDA, I had quickly uncovered the main interpreter loop in the binary. Since I was using IDA in my Windows VM for remote debugging the application, I decided to stick with IDAPython for the following steps, too.

My idea was to basically just dump bytecodes from the interpreter, as it executes them. So I extracted some bytecode table from KVM and wrote a little IDA Python script, that would dump bytecodes plus a little extra information:

opcodes = [
    "NOP",                
    "ACONST_NULL",        
    "ICONST_M1",          
    "ICONST_0",           
    # ....
    "CHECKCAST_FAST",         
    "INSTANCEOF_FAST",        
    "CUSTOMCODE"
]

log = open('b400_trace.txt', 'at')

while True:
        RunTo(0x0804E01B)

if GetDebuggerEvent(WFNE_SUSP, -1) != 16:
                break

eax = GetRegValue('eax')
        esi = GetRegValue('esi')
        op = opcodes[eax]

info = ''

if op == 'INVOKEVIRTUAL_FAST':
                RunTo(0x804E827)

if GetDebuggerEvent(WFNE_SUSP, -1) != 16:
                        break

name = Dword(GetRegValue('eax') + 12)
                name = GetString(name + 8, -1, ASCSTR_C)

name2 = Dword(GetRegValue('eax') + 8)
                name2 = GetString(name2 + 8, -1, ASCSTR_C)

info = 'of %s/%s' % (name2, name)
        elif op == 'ARRAYLENGTH':
                array = Dword(GetRegValue('ebx'))
                length = Dword(array + 8)
                data = GetString(array + 12, length, ASCSTR_C)

info = ' %x["%s":%s]-> %x' % (array, data, data.encode('hex'), length)
        elif op == 'BIPUSH':
                val = Byte(esi+1)
                info = ' %x' % val

log.write('%x: %s %s\n' % (esi, op, info))
        log.flush()

Tracing this way through the binary was painfully slow, but setting HW read breakpoints on bytecode where I wanted to continue analysis, running the binary till there and then tracing again worked well.

When just run with no parameters, the binary opened a listening socket on :5566 and when connecting presented you with the following challenge:

$ nc localhost 5566
Please submit your key for verification:
TESTINPUT
Hmm, not quite right.
Please submit your key for verification:
quit
Goodbye

So what I did was simply tracing through the binary and trying to identify interesting parts (of course I developed the tracer during this, so instruction parameter interpretation was added, when I decided some bytecode might be interesting). Eventually, you'll reach the returns of Sun's receiver implementation and the main logic again:

807cac8: ARRAYLENGTH  936f3a4["AAAAAAAAAAAAAAAAAAAAAA":41414141414141414141414141414141414141414141]-> 16
807cac9: GETSTATICP_FAST 
807cacc: ARRAYLENGTH  936f544["Y<92>s4Q^E#<96>^?t$d7<96>t4o^UPp<95>\":5992cf73345105a6fcc92396fb7f74fffe24edfe64379674346f15e75070955ca1f4]-> 22
807cacd: IF_ICMPEQ 
807cad0: ICONST_0 
807cad1: IRETURN 

Apparently, our input length was compared to the length of some hardcoded constant, so I've tried to give it a 0x22 characters long input and reached the following sequence:

807b55e: ARRAYLENGTH  936e8cc["^A^B^C^D^E":0102030405]-> 100
807b55f: IF_ICMPGE 
807b562: ALOAD_0 
807b563: GETFIELDP_FAST 
807b566: ILOAD_2 
807b567: ILOAD_2 
807b568: I2B 
807b569: BASTORE 
807b56a: IINC 
807b56d: GOTO 
807b559: ILOAD_2 
807b55a: ALOAD_0 
807b55b: GETFIELDP_FAST 
807b55e: ARRAYLENGTH  936e8cc["^A^B^C^D^E^F":010203040506]-> 100

Which already hinted at RC4 encryption quite hardly (building up a 0x100 byte long permutation table). And indeed, it then used the string "ddtek" to shuffle around the table:

807b58a: ARRAYLENGTH  936e9f4["ddtek":646474656b]-> 5
...
807b577: ARRAYLENGTH  936e8cc["d^A^B^C^D^E^F^G^H^K^L^M^N^O^P^Q^R^S^T^U^V^W^X^Y^ZESC^\^]^^^_ !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcefghijklmnopqrstuvwxyz{|}~^?<80><81><82><83><84><85><86><87><88><89>
<8A><8B><8C><8D><8E><8F><90><91><92><93><94><95><96><97><98><99><9A><9B><9C><9D><9E><9F>
":640102030405060708090a0b0c0d0e0f101112131415161718191a1b1c1d1e1f202122232425262728292a2b2c2d2e2f303132333435363738393a3b3c3d3e3f404142434445464748494a4b4c4d4e4f505152535455565758595a5b5c5d5e5f6061626365666768696a6b6c6d6e6f707172737475767778797a7b7c7d7e7f808182838485868788898a8b8c8d8e8f909192939495969798999a9b9c9d9e9fa0a1a2a3a4a5a6a7a8a9aaabacadaeafb0b1b2b3b4b5b6b7b8b9babbbcbdbebfc0c1c2c3c4c5c6c7c8c9cacbcccdcecfd0d1d2d3d4d5d6d7d8d9dadbdcdddedfe0e1e2e3e4e5e6e7e8e9eaebecedeeeff0f1f2f3f4f5f6f7f8f9fafbfcfdfeff]-> 100

So I simply tried decrypting the previously mentioned 0x22 bytes constant with RC4 and key "ddtek" and this indeed gave me the correct solution to this level. What sounds quite easy here, took me about 6 hours to figure out and implement during the contest... :)