Race Conditions
| Depends on | Memory |
|---|---|
Race conditions are not an issue in single-threaded systems; nor are they an issue where only a single thread/process can write to a variable or claim a resource. However, if there are multiple interacting threads (or processes, or processors) then this class of problem may emerge.
Race conditions can occur when different threads/processes have write access to the same variables, records, resources, files … Because the threads run asynchronously, the overall behaviour of the system is non-deterministic: the result of the code can be altered by the way the threads are scheduled, which is not controlled by the programmer.
There’s a short (7 mins.) video introduction here.
Here is some pseudocode:
x := 1; // Shared variable
parallel
  { // Thread 1
    <stuff>
    x := x + 1;
    <more stuff>
  },
  { // Thread 2
    <different stuff>
    x := 2;
    <yet more stuff>
  }
print(x)
What is printed? Either the x := x + 1; came first, in which case the answer is 2 because the result is then overwritten, or the x := 2; won the race, in which case the answer is 3. The point is that the answer is not determined by the programmer, so it is likely to be an error and Bad Things might happen.
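A real race is hard to reproduce on demand, so the following Python sketch forces each of the two possible schedules in turn instead of leaving it to the scheduler. It mirrors the pseudocode above; the function name run_schedule and the use of threading.Event to impose an order are illustrative assumptions, not part of the original example.

```python
import threading

def run_schedule(first):
    """Run both threads, forcing the thread named by `first` to go first.

    `first` is either "increment" (Thread 1) or "assign" (Thread 2).
    """
    state = {"x": 1}                    # shared variable
    first_done = threading.Event()      # set once the first thread has written

    def increment():                    # Thread 1: x := x + 1
        if first != "increment":
            first_done.wait()
        state["x"] = state["x"] + 1
        if first == "increment":
            first_done.set()

    def assign():                       # Thread 2: x := 2
        if first != "assign":
            first_done.wait()
        state["x"] = 2
        if first == "assign":
            first_done.set()

    threads = [threading.Thread(target=increment),
               threading.Thread(target=assign)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["x"]

print(run_schedule("increment"))  # 2: the increment is overwritten
print(run_schedule("assign"))     # 3: the increment is applied to 2
```

Without the event, either result could appear on any given run, which is exactly the problem.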
If spotted in advance, it is typically possible to identify critical sections of code and guard them with some form of mutual exclusion element. This ensures that only one thread at a time can execute a guarded section, so its effect no longer depends on how the threads happen to be interleaved within it.
Race hazards typically appear rarely and unpredictably, making them difficult to find and debug (so it’s best not to introduce them in the first place!). Whilst not confined to O.S. code, the multi-threaded nature of many system applications is ‘fertile ground’ for race conditions, and an O.S. will often provide primitives such as mutexes and semaphores to help protect against them.
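As a minimal sketch of guarding a critical section, the Python example below has several threads increment a shared counter, with the read-modify-write step protected by a threading.Lock. The function name count_with_lock and its parameters are hypothetical, chosen only for illustration.

```python
import threading

def count_with_lock(n_threads=4, n_increments=10_000):
    """Increment a shared counter from several threads under a lock."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(n_increments):
            # Critical section: read, add and write back as one guarded step.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

print(count_with_lock())  # 40000
```

With the lock in place every increment is counted, so the total is always n_threads * n_increments. If the lock is removed, updates may be lost whenever two threads read the same old value, although this may happen only rarely.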
Here is another definition (and example) you might find useful.
TOCTOU
Time Of Check to Time Of Use is a term used in the classification of certain bugs caused by race hazards. Fundamentally, the issue is that a test is made to see if something is permitted and then, as a separate step, it is done. Conditions can change between the check and the use unless the test and the commit are performed as a single atomic operation.
Here’s an illustrative anecdote.
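One common place this shows up is claiming a file. The Python sketch below contrasts a check-then-use version with one that lets the operating system perform the test and the creation in a single step, using the exclusive-creation mode "x" of open(). The function names claim_racy and claim_atomic and the file name are hypothetical.

```python
import os
import tempfile

def claim_racy(path):
    """Check, then use: another process could create `path` between the
    exists() test and the open() call, and both would believe they won."""
    if not os.path.exists(path):          # time of check
        with open(path, "w") as f:        # time of use
            f.write("claimed\n")
        return True
    return False

def claim_atomic(path):
    """Test and commit in one step: mode "x" fails if the file exists."""
    try:
        with open(path, "x") as f:
            f.write("claimed\n")
        return True
    except FileExistsError:
        return False

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "resource.lock")
    print(claim_atomic(path))  # True: this caller created the file
    print(claim_atomic(path))  # False: the file was already claimed
```

Run sequentially, both functions give the same answers. The difference only matters when another thread or process acts in the gap between the check and the use, which is why such bugs tend to escape testing.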