Operating Systems Notes Part 3 – Synchronization and Deadlocks

Preface: I make no pretence of teaching anyone synchronization with this part, since it condenses over a year’s worth of material I had previously learned in grade 13 (Technician’s Degree), so if you want to learn from this, you’ll have to read a lot on Wikipedia while reading this.

A Critical Section is a section of code that must never have more than one process inside it at any time, for fear of Race Conditions. To protect a critical section from this, a protection mechanism must abide by three rules:

  1. Mutual Exclusion – No two processes can co-exist in a critical section at once. In order to prove that a mechanism abides by this rule, you must assume that two processes are currently in the critical section and by backtracking their steps, find that this is an impossible state.
  2. Progress – A process that is not currently in a critical section must never be allowed to block another from entering. In order to prove that a mechanism abides by this rule, you must show that the condition xi (where xi = true means process i can enter the critical section) cannot be kept false indefinitely by a process j.
  3. Bounded Wait (aka No Starvation) – A process’s waiting time to enter the critical section is bounded by a finite time; that is, a process will never be locked out of a critical section indefinitely. In order to prove that a mechanism abides by this rule, you must assume a process is barred from entering the critical section and prove that it will eventually enter it.

Proven solutions to this problem using simple constructs include Peterson’s Algorithm and Dekker’s Algorithm for two processes, and Lamport’s Bakery Algorithm, which solves the problem for n processes.
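To make this concrete, here is a minimal sketch of Peterson’s Algorithm for two threads (the Java class and names are my own illustration). Note that in Java the shared variables must be volatile, or the compiler and CPU may reorder the accesses and break mutual exclusion:

    // Peterson's Algorithm for two threads (ids 0 and 1).
    // volatile keeps the accesses sequentially consistent under the JMM.
    class Peterson {
        private volatile boolean flag0 = false; // thread 0 wants in
        private volatile boolean flag1 = false; // thread 1 wants in
        private volatile int victim = 0;        // whose turn it is to yield

        void lock(int me) {
            if (me == 0) flag0 = true; else flag1 = true; // declare interest
            victim = me;                                  // let the other go first
            // spin while the other thread is interested and we are the victim
            while ((me == 0 ? flag1 : flag0) && victim == me) { /* busy-wait */ }
        }

        void unlock(int me) {
            if (me == 0) flag0 = false; else flag1 = false;
        }
    }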

A big problem that an algorithm must consider is that the scheduler can pre-emptively stop one process and hand execution to another. This causes a problem, since many statements in 3rd-generation languages, like C, are not atomic and can be interrupted in the middle of execution. A TSL Instruction (Test-and-Set Lock) is one solution to this, but it’s pretty complicated, so I won’t be getting into it here.
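For a taste of the idea anyway, here is a hedged sketch of a spinlock built on Java’s AtomicBoolean, whose getAndSet behaves like a TSL instruction exposed to the programmer (the lock class itself is my own illustration):

    import java.util.concurrent.atomic.AtomicBoolean;

    // A spinlock built on atomic test-and-set, the primitive a TSL
    // instruction provides in hardware.
    class TasLock {
        private final AtomicBoolean locked = new AtomicBoolean(false);

        void lock() {
            // getAndSet atomically stores true and returns the old value;
            // we busy-wait until the old value was false, i.e. we won the lock.
            while (locked.getAndSet(true)) { /* busy-wait */ }
        }

        void unlock() {
            locked.set(false);
        }
    }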

Semaphores are small atomic structures. A semaphore has a value and two operations – Up (increase the value by 1) and Down (wait until the value is greater than zero, then decrease it by 1). Semaphore operations are, by definition, atomic. A Mutex is simply a semaphore with a value of either 1 or 0 (continue or wait when downed). Unless you know what you’re doing, always down and up semaphores in a nested (LIFO) manner (that is, the first semaphore downed is the last one upped), otherwise you risk deadlocks. (Note: a semaphore might have negative values, but this is irrelevant for us).
Semaphores can achieve their atomicity in one of two ways: TSL Instructions (which are fair, but create a busy-waiting mechanism that wastes CPU time) and Disabling Interrupts (which means that nothing will disturb the semaphore’s internal work; normally unacceptable outside internal system implementation, but since this is an internal implementation, we forgive it).
Semaphores are usable in such problems as the Bounded Buffer and Producer-Consumer, and they appear in many of the problems below as well, just not as the main player.
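As an illustration, here is a hedged sketch of a bounded-buffer producer-consumer using java.util.concurrent.Semaphore (the capacity and names are my own assumptions):

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.concurrent.Semaphore;

    // Bounded buffer guarded by three semaphores: empty counts free slots,
    // full counts occupied slots, and mutex (a binary semaphore) guards the queue.
    class BoundedBuffer<T> {
        private final Deque<T> items = new ArrayDeque<>();
        private final Semaphore empty = new Semaphore(10); // 10 free slots
        private final Semaphore full  = new Semaphore(0);  // nothing to consume yet
        private final Semaphore mutex = new Semaphore(1);

        void produce(T item) throws InterruptedException {
            empty.acquire();      // down: wait for a free slot
            mutex.acquire();      // down: enter the critical section
            items.addLast(item);
            mutex.release();      // up: leave the critical section (LIFO order)
            full.release();       // up: one more item to consume
        }

        T consume() throws InterruptedException {
            full.acquire();       // down: wait for an item
            mutex.acquire();
            T item = items.removeFirst();
            mutex.release();
            empty.release();      // up: one more free slot
            return item;
        }
    }

Note how downing mutex before empty in produce would risk exactly the kind of deadlock the LIFO rule above warns about: a producer would sleep on empty while holding mutex, locking every consumer out.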

Message Passing can simulate semaphores, since the operation receive blocks until a message is received (that is, until someone used send), in the same way that down blocks until someone has used up. The advantage of this approach is that you can easily change the critical section’s protection to work in a distributed environment.
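A minimal sketch of the simulation in Java, using a BlockingQueue as the mailbox (the class and token type are my own assumptions):

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    // A semaphore simulated with message passing: up sends a token,
    // down blocks until a token can be received.
    class MessageSemaphore {
        private final BlockingQueue<Object> mailbox = new LinkedBlockingQueue<>();

        MessageSemaphore(int initialValue) throws InterruptedException {
            for (int i = 0; i < initialValue; i++) up();
        }

        void up() throws InterruptedException {
            mailbox.put(new Object()); // send: deposit one token
        }

        void down() throws InterruptedException {
            mailbox.take();            // receive: blocks until a token arrives
        }
    }

In a distributed setting the mailbox would simply be replaced by a network channel; the blocking semantics stay the same.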

Monitors are a higher-level construct, used in .NET (C#’s lock) and Java (synchronized), that allow operation declarations inside them. These operations are all mutually exclusive (they all share the same critical section).
The two functions monitors offer to the declared operations are wait(condition) and signal(condition), which act like a mutex’s down and up; both are only accessible from inside the monitor, and both can act on any of the conditions defined by the monitor (see the example below). Since signaling on a condition means that a waiting operation might wake up and start running, signaling is only allowed at the end of an operation, or it will endanger the mutual exclusion. (Note: in the Java implementation of the producer-consumer problem with monitors, you must use notifyAll and not notify, since notify will wake up one process, but not necessarily the process we want to wake (there is only one condition); with notifyAll, the awakened processes compete over which will be allowed to run, which prevents starvation.)
Monitors are usable in such problems as the Bounded Buffer and Producer-Consumer, but in many others as well.
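Here is a hedged Java sketch of such a producer-consumer monitor (the names and capacity are mine): synchronized provides the mutual exclusion, and wait/notifyAll play the roles of wait(condition)/signal(condition):

    import java.util.ArrayDeque;
    import java.util.Deque;

    // A producer-consumer monitor: every synchronized method is part of
    // the same critical section; wait/notifyAll are the condition operations.
    class MonitorBuffer<T> {
        private static final int CAPACITY = 10;
        private final Deque<T> items = new ArrayDeque<>();

        synchronized void produce(T item) throws InterruptedException {
            while (items.size() == CAPACITY) {
                wait(); // buffer full: sleep until a consumer makes room
            }
            items.addLast(item);
            notifyAll(); // notify() might wake the wrong process, as noted above
        }

        synchronized T consume() throws InterruptedException {
            while (items.isEmpty()) {
                wait(); // buffer empty: sleep until a producer adds an item
            }
            T item = items.removeFirst();
            notifyAll();
            return item;
        }
    }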

Barriers are a synchronization mechanism at which every process must wait upon arrival; when the last one reaches it, they are all simultaneously released. No idea what they’re good for.
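For what it’s worth, the textbook use is phased computation, where no process may start phase n+1 before everyone has finished phase n. A minimal sketch using Java’s CyclicBarrier (the worker count and printouts are my own):

    import java.util.concurrent.BrokenBarrierException;
    import java.util.concurrent.CyclicBarrier;

    // Three workers each finish a phase, then block at the barrier;
    // the last to arrive releases all of them at once.
    public class BarrierDemo {
        public static void main(String[] args) {
            final int parties = 3;
            CyclicBarrier barrier = new CyclicBarrier(parties);
            for (int i = 0; i < parties; i++) {
                final int id = i;
                new Thread(() -> {
                    try {
                        System.out.println("worker " + id + " finished its phase");
                        barrier.await(); // blocks until all three have arrived
                        System.out.println("worker " + id + " released");
                    } catch (InterruptedException | BrokenBarrierException e) {
                        Thread.currentThread().interrupt();
                    }
                }).start();
            }
        }
    }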

More problems that are worth mentioning, but that I’m not going to talk about, are the Sleeping Barber, Readers-Writers and the One-Way Tunnel (cars may only travel through the tunnel in one direction at a time).

Deadlocks

A simple deadlock occurs when a process A holds a resource R1 and waits for R2, while a process B holds R2 and waits for R1. Systems that have all four of the following characteristics are prone to deadlocks:

  1. Mutual Exclusion of resources – Each resource can only be used by one process at a time.
  2. Hold and Wait – A process can hold a resource while waiting for another.
  3. No Pre-emption of resources – A resource can only be released by the process holding it.
  4. Circular Wait – Two or more processes can wait for resources being held by other waiting processes.

If you model your processes (usually as circles) and resources (usually as squares) as vertices of a directed graph, and add requests for and holdings of resources as edges, a deadlock might happen if the graph has a cycle in it. If the resources involved each have only one instance available for holding, a deadlock will happen; if they have more than one instance, it’s not a sure thing.
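A hedged sketch of what such a cycle check might look like (plain depth-first search on a directed graph; the representation is my own choice):

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;

    // Vertices are processes and resources; a "requests" edge goes
    // process -> resource and a "holds" edge goes resource -> process.
    // A cycle in this graph means a deadlock is possible.
    class DeadlockDetector {
        private final Map<String, List<String>> edges = new HashMap<>();

        void addEdge(String from, String to) {
            edges.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        }

        boolean hasCycle() {
            Set<String> done = new HashSet<>();   // fully explored vertices
            Set<String> onPath = new HashSet<>(); // vertices on the current DFS path
            for (String v : edges.keySet()) {
                if (dfs(v, done, onPath)) return true;
            }
            return false;
        }

        private boolean dfs(String v, Set<String> done, Set<String> onPath) {
            if (onPath.contains(v)) return true;  // back edge: we found a cycle
            if (done.contains(v)) return false;
            onPath.add(v);
            for (String next : edges.getOrDefault(v, List.of())) {
                if (dfs(next, done, onPath)) return true;
            }
            onPath.remove(v);
            done.add(v);
            return false;
        }
    }

For example, the simple deadlock above is the cycle A → R2 → B → R1 → A.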

How do we handle deadlocks then? There are several schools of thought:

  1. Prevent them from happening by removing one of the aforementioned conditions for deadlock-prone systems:

    1. Do away with mutual exclusion on the devices themselves, or use spooling (buffering), where only the spooler accesses the resource and many processes can access the spooler. Not very good for many devices, and the spools may grow too big.
    2. Do not allow holding and waiting by either forcing a process to declare all of its required resources up-front and then arranging them in a way that won’t cause a deadlock (not really realistic in most cases) or by forcing a process to release every resource it holds before requesting a new one (now that’s just crazy talk… :P).
    3. Allow resource pre-emption, though it might cause incorrect execution (imagine your printer printing half a page from one application, half from another and then continuing with the two halves it has left).
    4. Prevent circular waiting by either:
      • Allowing processes to hold only one resource at a time (not a good idea), or
      • Numbering the resources and only allowing allocation in ascending order, i.e. only resources with higher numbers than those currently held (will work, but why do that?).

    All in all, no silver bullet.

  2. Avoid deadlocks by granting resources only if it’s safe to do so. You can use algorithms like the Banker’s Algorithm to make sure that resources are allocated to a process only if it can still finish execution (see the sketch after this list).
    Again, we need to know in advance which resources each process wants, which is not realistic most of the time. Avoidance is simply not practical.
  3. Detect deadlocks by running a cycle detection algorithm on the graph model (like the sketch above) every k minutes, and also whenever CPU utilization drops hard and there are at least two processes in hold-and-wait mode. Recover from them by creating synthetic pre-emption on the resource (rarely a good thing), by rolling back (a checkpoint mechanism must be implemented first, which slows everything down and takes space; implemented in DBMSs) or by simply killing one of the processes involved. Detection and Recovery is very hard and not really cost-effective.
  4. Ignore them (bury your head in the sand) and let the user handle the problem (reboot). It’s the simplest and most cost-effective solution, since deadlocks are really rare and the previous three methods of tackling them are either unrealistic or very CPU- and I/O-intensive.
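Coming back to avoidance, here is a hedged sketch of the safety check at the heart of the Banker’s Algorithm, simplified to a single resource type (the data layout and names are my own assumptions):

    // A state is safe if the processes can be ordered so that each one's
    // remaining need fits in the free resources, letting it finish and
    // release everything it holds.
    class Banker {
        static boolean isSafe(int available, int[] allocated, int[] maxNeed) {
            int n = allocated.length;
            boolean[] finished = new boolean[n];
            int free = available;
            for (int done = 0; done < n; ) {
                boolean progress = false;
                for (int i = 0; i < n; i++) {
                    // process i can finish if its remaining need fits in free
                    if (!finished[i] && maxNeed[i] - allocated[i] <= free) {
                        free += allocated[i]; // it finishes and releases its hold
                        finished[i] = true;
                        done++;
                        progress = true;
                    }
                }
                if (!progress) return false; // nobody can finish: unsafe state
            }
            return true;
        }
    }

A request is granted only if the state that would result from granting it still passes isSafe; otherwise the requesting process waits.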

A classic deadlock is shown in the Dining Philosophers’ Problem.
