Hackatua - February 12, 2023

Basic Buffer Overflow Introduction

Basic C example of Buffer Overflow with high-level explanation

Among software vulnerabilities, buffer overflow is undoubtedly one of the most frequently mentioned in the cybersecurity world. It has long allowed attackers and security researchers to cause denials of service (DoS), change the expected behavior of programs and even achieve remote code execution (RCE).

Although exploiting it is technically complex and requires a detailed understanding of how memory works during program execution, the concept itself is quite simple. To keep the explanation short we will make some simplifications that are not entirely accurate, but that are enough to understand the idea.

Demonstrative example

To understand it, let's first analyze a small C program and its behavior with different data inputs. The first thing we are going to do is to create a secretKeeper.c file with the following content:

#include <stdio.h>
#include <string.h>

int main(void) {
  int isAdmin = 0;
  char userPassword[16];

  printf("Enter the password: ");
  scanf("%s", userPassword);

  if(strcmp("SecretPass$%123", userPassword)) {
    printf("Invalid password!!!\n");
  }
  else {
    printf("Correct password!!!\n");
    isAdmin = 1;
  }

  if(isAdmin) {
    printf("The user is logged as admin!!!\n");
    printf("Here is the admin secret: I hate linux...\n");
  }

  return 0;
}

Although this program does not make much sense in the real world, it serves to demonstrate in a very simple way what buffer overflow is all about. The program does the following:

We import the dependencies needed to work with inputs and strings.
We declare the variable isAdmin with the value 0, and create the variable userPassword, reserving 16 bytes of memory for a text string (15 characters plus the null character that marks the end of the string).
We ask the user to enter the password and store it in the variable userPassword.
We compare the password entered by the user with the password of the program.
- If it is incorrect, we display a wrong password message.
- If it is correct we show a correct password message and update the isAdmin variable to 1 to indicate that the password was correct.
We check the value of isAdmin and, if it is not 0, we show the terrible secret of the system administrator.

As we can see, the program is quite simple and, in appearance, harmless, so now we are going to compile it and test it. To do this you need to have the C compiler (gcc) installed and run the following command:

gcc secretKeeper.c -o secretKeeper -m32

With this command we simply tell gcc that we want to compile our secretKeeper.c file, that the result should be written to secretKeeper (-o), and that it should be compiled for a 32-bit architecture (-m32). On a 64-bit system, -m32 may also require the 32-bit development libraries to be installed.

Now that we have everything ready let's start with the tests. The first test we can do is to pass an incorrect password and check that it doesn't reveal the terrible secret:

┌─[parrot@parrot]─[~/Learning/buffer-overflow]
└──╼ $./secretKeeper
Enter the password: incorrectPass
Invalid password!!!

Perfect, we get the expected result. The password is not correct and does not reveal the secret.

Now we can enter the correct password to see if it reveals the secret:

┌─[parrot@parrot]─[~/Learning/buffer-overflow]
└──╼ $./secretKeeper
Enter the password: SecretPass$%123
Correct password!!!
The user is logged as admin!!!
Here is the admin secret: I hate linux...

And indeed, if the password is correct, it reveals the terrible and inadmissible secret of the administrator.

This is where things get strange. Now we are going to enter a very long password, specifically one character longer than the space we have allocated for the password entered by the user. If we go back to the program we see that we had reserved a length of 16, so we will use a password 17 characters long:

┌─[parrot@parrot]─[~/Learning/buffer-overflow]
└──╼ $./secretKeeper
Enter the password: qwerqwerqwerqwerA
Invalid password!!!
The user is logged as admin!!!
Here is the admin secret: I hate linux...

What is going on here? On the one hand it is telling us that the password is incorrect, but on the other hand it is giving away the administrator's secret. This can only mean that the value of isAdmin has somehow been modified, even though the program never entered the else branch that sets it to 1.

To make sure of this, let's add a printf("isAdamin: %d\n", isAdmin); just before the if(isAdmin) { that decides whether or not to show the admin secret, so that we can see the value of isAdmin. We make the change, compile again, re-run the 3 cases above and check whether our theory is correct:

┌─[parrot@parrot]─[~/Learning/buffer-overflow]
└──╼ $gcc secretKeeper.c -o secretKeeper -m32

┌─[parrot@parrot]─[~/Learning/buffer-overflow]
└──╼ $./secretKeeper
Enter the password: incorrectPass
Invalid password!!!
isAdamin: 0

┌─[parrot@parrot]─[~/Learning/buffer-overflow]
└──╼ $./secretKeeper
Enter the password: SecretPass$%123
Correct password!!!
isAdamin: 1
The user is logged as admin!!!
Here is the admin secret: I hate linux...

┌─[parrot@parrot]─[~/Learning/buffer-overflow]
└──╼ $./secretKeeper
Enter the password: qwerqwerqwerqwerA
Invalid password!!!
isAdamin: 65
The user is logged as admin!!!
Here is the admin secret: I hate linux...

In the first 2 cases we observe the expected behavior: if the password is incorrect the value of isAdmin is 0 and, if the password is correct, the value is 1 and the secret is displayed. However, in the third case the value of isAdmin is not 0 but 65 and, consequently, the secret is also displayed even though the password is wrong. What is going on here? As you can imagine, what is happening is a buffer overflow.

And the question that occurs to me next is, what happens if I enter an even longer password? Well, let's try it:

┌─[✗]─[parrot@parrot]─[~/Learning/buffer-overflow]
└──╼ $./secretKeeper
Enter the password: qwerqwerqwerqwer12345678
Invalid password!!!
isAdamin: 875770417
The user is logged as admin!!!
Here is the admin secret: I hate linux...
Segmentation fault

Segmentation fault! This time the operating system has detected an invalid memory access and terminated the program. It seems that we have gone too far, but what is happening?

What is a buffer overflow?

During the execution of a program, the variables that the program needs are stored in memory, in a region that, to keep things simple, we will call the buffer. We can imagine this buffer as a row of little boxes, one byte each, in which the information necessary for the execution of the current function is stored. In our case we have reserved 16 "little boxes" for the password entered by the user, followed by the space for the integer value of isAdmin (an int normally takes 4 boxes, but for now we can picture it as a single value sitting right after the password).

Now consider the case where we entered the incorrect password "incorrectPass". Its 13 characters, plus the null character that marks the end of the string, occupy the first 14 boxes of userPassword, the last 2 boxes are unused, and isAdmin still holds 0.

In this case the password entered by the user fits in the space reserved for it, so the program behaves as we expect. But what happens if we enter a longer password, such as "qwerqwerqwerqwerA"? Its first 16 characters fill all 16 boxes of userPassword, and the 17th character, "A", is written into the next box, which belongs to isAdmin.

The content of the variable userPassword has overflowed the space allocated for it and, as a result, the isAdmin variable has been overwritten. In fact, the value the program reported for isAdmin in that case, 65, is the ASCII code of the letter "A". This is what is called a buffer overflow.

And why does this overflow occur?

The reason this overflow occurs is that C and C++ provide several functions that do not check whether the data they are about to write fits in the space allocated for the destination buffer, so they overwrite whatever is stored in the adjacent memory. Some of these functions are:

scanf
sprintf
strcat
strcpy
gets
...

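If we keep overwriting memory beyond isAdmin, we eventually reach the data that tells the program where execution has to continue once the current function finishes (the saved return address). Corrupting that data is the most likely cause of the segmentation fault we saw earlier, and overwriting it with a carefully chosen value lets an attacker gain control of the program to the point of, in the worst case, executing commands remotely.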

In future posts we will dig into how memory works in more detail, explain the different types of buffer overflow, and how we can take advantage of this to gain control of the program and make it execute the instructions that we want. We will also explore the different protections that exist and how we can bypass them.