Basic Buffer Overflow Introduction
Basic C example of Buffer Overflow with high-level explanation
Among software vulnerabilities, one of the most frequently mentioned in the cybersecurity world is undoubtedly the buffer overflow. Attackers and security researchers have long used it to cause denial of service (DoS), change the expected behavior of programs and even achieve remote code execution (RCE).
Although exploiting it is technically complex and requires a detailed understanding of how memory works during program execution, the concept itself is quite simple. To explain it, we will make some simplifications that are not entirely accurate but are enough to understand the idea.
Demonstrative example
To understand it, let's first analyze a small C program and its behavior with different inputs. The first thing we are going to do is create a secretKeeper.c file with the following content:
#include <stdio.h>
#include <string.h>

int main(void) {
    int isAdmin = 0;
    char userPassword[16];

    printf("Enter the password: ");
    scanf("%s", userPassword);

    if(strcmp("SecretPass$%123", userPassword)) {
        printf("Invalid password!!!\n");
    } else {
        printf("Correct password!!!\n");
        isAdmin = 1;
    }

    if(isAdmin) {
        printf("The user is logged as admin!!!\n");
        printf("Here is the admin secret: I hate linux...\n");
    }

    return 0;
}
Although this program does not make much sense in the real world, it serves to demonstrate in a very simple way what buffer overflow is all about. The program does the following:
- We include the headers needed to work with input/output and strings.
- We declare the variable isAdmin with the value 0, and create the variable userPassword, allocating space in memory for a text string of 16 characters (15 plus the terminating null character).
- We ask the user to enter the password and store it in the variable userPassword.
- We compare the password entered by the user with the password of the program.
  - If it is incorrect, we display a wrong password message.
  - If it is correct, we show a correct password message and update the isAdmin variable to 1 to indicate that the password was correct.
- We check the value of isAdmin and, if it is not 0, we show the terrible secret of the system administrator.
As we can see, the program is quite simple and appears harmless, so now we are going to compile it and test it. To do this you need to have a C compiler (gcc) installed and run the following command:
gcc secretKeeper.c -o secretKeeper -m32
With this command we simply tell gcc that we want to compile our secretKeeper.c file, that the result should be written to secretKeeper (-o), and that it should be compiled for a 32-bit architecture (-m32).
Now that we have everything ready let's start with the tests. The first test we can do is to pass an incorrect password and check that it doesn't reveal the terrible secret:
┌─[parrot@parrot]─[~/Learning/buffer-overflow]
└──╼ $./secretKeeper
Enter the password: incorrectPass
Invalid password!!!
Perfect, we get the expected result. The password is not correct and the program does not reveal the secret.
Now we can enter the correct password to see if it reveals the secret:
┌─[parrot@parrot]─[~/Learning/buffer-overflow]
└──╼ $./secretKeeper
Enter the password: SecretPass$%123
Correct password!!!
The user is logged as admin!!!
Here is the admin secret: I hate linux...
And indeed, if the password is correct, it reveals the terrible and inadmissible secret of the administrator.
This is where things get strange. Now we are going to enter a longer password, specifically one character longer than the space we allocated for it. If we go back to the program we see that we reserved 16 characters, so we will use a 17-character password:
┌─[parrot@parrot]─[~/Learning/buffer-overflow]
└──╼ $./secretKeeper
Enter the password: qwerqwerqwerqwerA
Invalid password!!!
The user is logged as admin!!!
Here is the admin secret: I hate linux...
What is going on here? On the one hand the program tells us that the password is incorrect, but on the other hand it gives away the administrator's secret. This can only mean that the value of isAdmin has somehow been modified, even though execution never entered the branch that sets it to 1.
To make sure of this, let's add a printf("isAdamin: %d\n", isAdmin); to show the value of isAdmin just before the if(isAdmin) { that decides whether or not to show the admin secret. This way we can check whether our theory is correct, so we make the change, compile again, re-run the three cases above and check:
┌─[parrot@parrot]─[~/Learning/buffer-overflow]
└──╼ $gcc secretKeeper.c -o secretKeeper -m32
┌─[parrot@parrot]─[~/Learning/buffer-overflow]
└──╼ $./secretKeeper
Enter the password: incorrectPass
Invalid password!!!
isAdamin: 0
┌─[parrot@parrot]─[~/Learning/buffer-overflow]
└──╼ $./secretKeeper
Enter the password: SecretPass$%123
Correct password!!!
isAdamin: 1
The user is logged as admin!!!
Here is the admin secret: I hate linux...
┌─[parrot@parrot]─[~/Learning/buffer-overflow]
└──╼ $./secretKeeper
Enter the password: qwerqwerqwerqwerA
Invalid password!!!
isAdamin: 65
The user is logged as admin!!!
Here is the admin secret: I hate linux...
In the first two cases we observe the expected behavior: if the password is incorrect the value of isAdmin is 0 and, if the password is correct, the value is 1 and the secret is displayed. However, in the third case, the value of isAdmin is not 0 but 65 and, consequently, the secret is also displayed. What is going on here? As you can imagine, what is happening is a buffer overflow.
And the next question that comes to mind is: what happens if I enter an even longer password? Well, let's try it:
┌─[✗]─[parrot@parrot]─[~/Learning/buffer-overflow]
└──╼ $./secretKeeper
Enter the password: qwerqwerqwerqwer12345678
Invalid password!!!
isAdamin: 875770417
The user is logged as admin!!!
Here is the admin secret: I hate linux...
Segmentation fault
Segmentation fault! This time the program was terminated because of an invalid memory access. It seems that we have gone too far, but what is happening?
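One hint about what is happening is that 875770417 is not a random number. The sketch below splits a 32-bit value into its four bytes, lowest byte first, and reads them as characters. The helper name int_to_chars is hypothetical (it is not part of secretKeeper.c), and "lowest byte first" assumes the little-endian byte order used by x86 processors.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not part of secretKeeper.c): splits a 32-bit value
 * into its four bytes, lowest byte first, and stores them as a string.
 * "Lowest byte first" is the order a little-endian CPU such as x86 uses
 * when it keeps an int in memory. */
void int_to_chars(uint32_t value, char out[5])
{
    assert(out != NULL);

    for (int i = 0; i < 4; i++) {
        /* Shift the wanted byte down to the lowest position and keep it. */
        out[i] = (char)((value >> (8 * i)) & 0xFF);
    }
    out[4] = '\0';
}
```

Calling int_to_chars(875770417, out) leaves "1234" in out, which are exactly the four characters typed right after the first 16. Calling it with 65 leaves "A".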
What is a buffer overflow?
During the execution of a program, the variables it needs are stored in memory, in what we will call, for simplicity, the buffer. We can imagine this buffer as a row of little boxes holding the information needed to execute the current function. For example, in our case we have reserved 16 "little boxes" for the password entered by the user, followed by another one for the integer value of isAdmin (in reality an int takes 4 bytes here, but one box is enough for our simplified picture).
Now, if we look at the case where we entered the incorrect password "incorrectPass", the buffer would hold the 13 characters of the password followed by the terminating null character, and isAdmin would remain untouched at 0.
As we can see, in this case the password entered by the user fits in the space reserved for it, so the program behaves as we expect. But what happens if we enter a longer password, such as "qwerqwerqwerqwerA"? The first 16 characters fill the boxes reserved for the password, and the final "A" lands in the box that belongs to isAdmin.
As we can see, there has been an overflow: the content of the variable userPassword has spilled past the space allocated for it and overwritten the isAdmin variable. In addition, the value the program printed for isAdmin in that case, 65, matches the ASCII code of the letter "A". This is what is called a buffer overflow.
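The "little boxes" picture can be reproduced safely with a small model. The sketch below is only an illustration under stated assumptions: it assumes that isAdmin sits immediately after the 16 bytes of userPassword, that an int takes 4 bytes, and that bytes are stored in little-endian order. A real compiler is free to arrange the variables differently. The name simulate_is_admin is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PASSWORD_SIZE 16  /* same size as userPassword in secretKeeper.c */
#define INT_SIZE 4        /* assumed size of int in the 32-bit build */

/* Hypothetical model of the two variables as one run of "little boxes":
 * 16 bytes for the password immediately followed by 4 bytes for isAdmin.
 * Returns the value isAdmin would end up with under that assumed layout. */
uint32_t simulate_is_admin(const char *input)
{
    assert(input != NULL);

    unsigned char boxes[PASSWORD_SIZE + INT_SIZE] = {0}; /* isAdmin starts at 0 */
    size_t capacity = sizeof boxes;

    /* Mimic scanf("%s"): copy every character plus the terminating '\0',
     * ignoring the 16-byte limit of the password. The copy is clamped to
     * the size of the model so this sketch never writes out of bounds. */
    size_t wanted = strlen(input) + 1;
    size_t copied = wanted < capacity ? wanted : capacity;
    memcpy(boxes, input, copied);

    /* Read the 4 boxes after the password as a little-endian integer. */
    uint32_t is_admin = 0;
    for (int i = 0; i < INT_SIZE; i++) {
        is_admin |= (uint32_t)boxes[PASSWORD_SIZE + i] << (8 * i);
    }
    return is_admin;
}
```

Under these assumptions, simulate_is_admin("incorrectPass") returns 0, simulate_is_admin("qwerqwerqwerqwerA") returns 65 and simulate_is_admin("qwerqwerqwerqwer12345678") returns 875770417, the same values the modified program printed.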
And why does this overflow occur?
This overflow occurs because C and C++ have several functions that do not check that the length of what they are going to write is less than or equal to the space allocated in the buffer, so they overwrite the memory that follows it. Some of these functions are:
- scanf (with an unbounded %s, as in our example)
- sprintf
- strcat
- strcpy
- gets
- ...
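The common thread in these functions is that they are never told how large the destination is. As a minimal sketch of the opposite approach, the function below reads a line with fgets, which does receive the buffer size. The name read_password and its return convention are hypothetical choices for this illustration, not part of the original program. A smaller change with a similar effect would be to limit the field width, as in scanf("%15s", userPassword).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not in the original program): reads one line from
 * `in` into `buf` without ever writing more than `size` bytes.
 * Returns 1 on success, 0 on end of input, read error or a too-long line. */
int read_password(FILE *in, char *buf, size_t size)
{
    assert(in != NULL && buf != NULL && size > 0);

    /* fgets stores at most size - 1 characters plus the terminating '\0'. */
    if (fgets(buf, (int)size, in) == NULL) {
        buf[0] = '\0';
        return 0;
    }

    size_t len = strcspn(buf, "\n");
    if (buf[len] == '\n') {
        buf[len] = '\0';   /* the whole line fit: drop the newline */
        return 1;
    }

    /* No newline stored: check what comes next in the stream. */
    int c = getc(in);
    if (c == EOF || c == '\n') {
        return 1;          /* the line ended exactly where the buffer did */
    }

    /* The line is too long: discard the rest of it and reject the input. */
    do {
        c = getc(in);
    } while (c != '\n' && c != EOF);
    buf[0] = '\0';
    return 0;
}
```

In secretKeeper.c, the scanf line could then become a call such as read_password(stdin, userPassword, sizeof userPassword), treating a return value of 0 as an invalid password. With this sketch, the 17-character input is rejected instead of spilling into isAdmin.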
If we keep overwriting memory beyond isAdmin, we eventually reach the data that tells the program where execution has to continue (the saved return address). Controlling that value allows an attacker to take control of the program to the point of, in the worst case, executing commands remotely.
In future posts we will dig into how memory works in more detail, explain the different types of buffer overflow, and how we can take advantage of this to gain control of the program and make it execute the instructions that we want. We will also explore the different protections that exist and how we can bypass them.