Buffer Overflow

From COMP15212 Wiki
Revision as of 15:42, 31 July 2019 by pc>Yuron
Depends on Security • Memory

The buffer overflow vulnerability is a security issue worthy of particular mention because it is a ‘classic’ attack which illustrates several general principles and some possible defences.

The boundary of a software structure, such as an array, will not be checked in hardware; it is too expensive to provide hardware checking on a word-by-word basis and hardware usually relies on page-level checks. Testing array limits in software is also (time) expensive; it may be done as part of the software environment (e.g. Java) or it may be left to the programmer (e.g. C). Programmers sometimes (make that “usually”) omit the checks because they ‘know’ the index is within the limits.
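As a rough illustration of the kind of software check being discussed, here is a minimal sketch in C. The names table, TABLE_SIZE, checked_store() and checked_load() are invented for this example and do not come from any library; the point is only that the test costs one comparison per access, which is the expense programmers are tempted to skip.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TABLE_SIZE 8            /* hypothetical array size for this sketch */

static int table[TABLE_SIZE];

/* Store 'value' at 'index' only if the index lies inside the array.
   Returns true on success, false if the index was rejected. */
bool checked_store(size_t index, int value)
{
    if (index >= TABLE_SIZE)    /* the check that tends to be omitted */
        return false;
    table[index] = value;
    return true;
}

/* Read back an element; the caller is expected to pass a valid index. */
int checked_load(size_t index)
{
    assert(index < TABLE_SIZE); /* debug-build check; removed if NDEBUG is defined */
    return table[index];
}
```

Because size_t is unsigned, a ‘negative’ index converted to size_t becomes a very large value and is rejected by the same comparison.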

A classic security flaw is when this is a false assumption and an attacker can input more bytes than an allocated space will hold. The space is typically an input buffer and the input can overflow the area, unchecked.

A ‘classic’ example is the C library function gets(), which fetches bytes from the stdin stream until the next ‘newline’ (or EOF). It is impossible to tell in advance how many characters (bytes) will be fetched, so it is impossible to guarantee a big enough buffer. One infamous ‘hack’ in the past involved an exploitation of the Unix finger daemon, which used this library call. Never use gets(); it was removed from the C standard in C11. The problem is addressed by fgets(), which has a similar function but includes an argument to limit the size (and, unlike gets(), keeps the newline in the buffer). Thus fgets(p_buffer, buffer_size, stdin); provides a safe alternative.
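The bounded read can be sketched as a small helper. This is only one possible wrapper, and read_line() is a name made up for this example; the only library calls assumed are the standard fgets(), strcspn() and strlen().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Read one line from 'stream' into 'buffer', which holds 'size' bytes.
   Returns the number of characters stored, or -1 on EOF or error.
   fgets() stores at most size-1 characters plus a terminating '\0';
   any excess input is left in the stream for the next read. */
int read_line(char *buffer, int size, FILE *stream)
{
    assert(buffer != NULL && size > 0);
    if (fgets(buffer, size, stream) == NULL)
        return -1;
    buffer[strcspn(buffer, "\n")] = '\0';   /* strip the newline, if present */
    return (int) strlen(buffer);
}
```

A caller might write char line[64]; read_line(line, sizeof line, stdin); however long the input line is, no more than 64 bytes of line are ever written.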

The figure below shows a possible attack. The attacker knows the size of the buffer and something about the calling convention of the compiler used. By carefully overfilling the buffer the function return address can be overwritten with something else, causing the current function to ‘return’ to the ‘wrong’ place when it has completed. If the address of the stack is also known, this ‘wrong’ place could be within the buffer itself – which the attacker has loaded with new code. The attacker now has control of this process.

If this vulnerability is within a system call the attacker will have full O.S. privilege. This is a Bad Thing.

[Figure: Buffer Overflow]

There are defences:

  • Don’t write code assuming that inputs will always be legal.
  • Use a “canary”. This is an extra, random value placed amongst the “Other stuff…” by the compiler. By checking this before the function return it is likely that the intrusion will be detected before the malicious code can be executed.
  • Make the target stack space non-executable. Modern memory protection can allow the MMU to distinguish between a data read and an instruction fetch; forbidding the latter on the stack segment (which must have data read and write access) would cause a segmentation fault at “My address” in the above example.
  • Make the address to jump to (“My address”) impossible to predict. See ASLR: notice that “My address” was a specific value supplied by the attacker.
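The canary idea can be imitated in ordinary C without actually corrupting a real stack. The sketch below is a simulation under stated assumptions: struct frame is a hypothetical stand-in for a stack frame (a real compiler chooses its own layout), the canary is a fixed value passed in by the caller rather than a random one, and unchecked_fill() plays the part of an unbounded read such as gets().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define BUFFER_SIZE 8

/* Hypothetical model of a stack frame, from low address to high. */
struct frame {
    char     buffer[BUFFER_SIZE];
    uint32_t canary;            /* guard value the "compiler" inserted */
    uint32_t return_address;    /* what the attacker wants to overwrite */
};

/* Deliberately unchecked copy: writes 'length' bytes starting at the
   buffer and carries on into whatever follows it. The assert only keeps
   the simulation inside the one struct object. */
void unchecked_fill(struct frame *f, const void *input, size_t length)
{
    assert(length <= sizeof *f);
    memcpy(f, input, length);
}

/* The check made just before "returning": has the guard value changed? */
bool canary_intact(const struct frame *f, uint32_t expected)
{
    return f->canary == expected;
}
```

An input long enough to reach return_address has to pass through canary first, so an overwrite is detected unless the attacker can guess the guard value; that is why real canaries are chosen at random.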

An overflow attack may not even need to run code. If the “Other stuff…” in the figure above contains variables (it almost certainly does!) then it may be possible to change something there which will (for example) change the outcome of:

if (operation == allowed) ...

from false to true.
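A data-only overflow of this kind can be sketched as follows. Everything here is illustrative: struct session, its field names and the two functions are invented for the example, and the layout assumes a 4-byte int stored directly after the 8-byte buffer, which is typical but not guaranteed.

```c
#include <assert.h>
#include <string.h>

#define NAME_SIZE 8

/* Hypothetical layout: a permission flag stored right after an input buffer. */
struct session {
    char name[NAME_SIZE];
    int  allowed;               /* 0 = refused, non-zero = permitted */
};

/* Unchecked copy standing in for an unbounded read such as gets(). */
void set_name_unchecked(struct session *s, const char *input, size_t length)
{
    assert(length <= sizeof *s);    /* keeps the simulation inside one object */
    memcpy(s, input, length);
}

/* Bounded version: never writes beyond the name field. */
void set_name_checked(struct session *s, const char *input, size_t length)
{
    if (length >= NAME_SIZE)
        length = NAME_SIZE - 1;     /* truncate, leaving room for the '\0' */
    memcpy(s->name, input, length);
    s->name[length] = '\0';
}
```

Feeding twelve ‘A’ characters to set_name_unchecked() leaves allowed non-zero although no code was injected, whereas set_name_checked() truncates the same input and leaves the flag alone.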

Given that buffer overflows have been understood for decades, one might expect them not to occur any more. However, that is not the case. In fact, buffer overflows are still outright common, and one day even you, the reader, might write code at work that is vulnerable to one.

Here’s an example from WhatsApp from 2019 … Or, for the gamers among us, the Fusée Gelée exploit for the Nintendo Switch (and other Tegra X1 devices) also makes use of a buffer overflow.

Address Space Layout Randomisation (ASLR)

Rather than loading code and data at fixed (and therefore predictable) addresses some systems will put these at different (hard to predict) places, different for each instance of execution. This makes it considerably harder for an attacker (which may be another program) to find a critical target to force an attack.

ASLR is supported by many modern operating systems. It can make attacks considerably more difficult by making targets harder to locate; it does not prevent attacks.

Does your computer do this? Try the following simple variant on an early exercise:

// Do I use ASLR?
#include <stdlib.h>		// Contains: exit()
#include <stdio.h>		// Contains: printf()

int global;			// Globally scoped variable

int main (int argc, char *argv[])	// The 'root' program; execution start
{
  int local;			// Variable local to 'main'

  // %p prints a pointer; the casts to (void *) are what %p expects
  printf("           Code at %p\n", (void *) &main);
  printf("Global variable at %p\n", (void *) &global);
  printf(" Local variable at %p\n", (void *) &local);

  exit(0);			// Finish, reporting success
}

Running that will indicate the (virtual) addresses of the code, static data (global variables) and stack (local variables) of the process. Running it again may reveal that some of these change from run to run. If any do, the most likely to change is the last, because the stack is typically a mixture of data and code pointers (mostly return addresses). If an attack can change a code pointer it may be able to divert execution to its own code. To do this successfully it also needs to know where that code is. If this is always the same for some set of similarly configured computers, an attacker can determine the target statically, in advance. If the target is selected dynamically (at run-time) it is harder to hit!

Notice that the vulnerability derives from an error in the original application. A mistake in an application – such as not checking array bounds – might lead to a take-over of that process. This could then compromise anything the legitimate owner of the process could. This is not the fault of the operating system design although the O.S. may act to make a vulnerability harder to exploit.

Of course, if such code is running with ‘root’ privileges it can then make other ‘holes’ in the defences.

Also refer to: Operating System Concepts, 10th Edition: Chapter 16.2.2, pages 628-631
