Lecture 10 - Deadlock


Agenda

Announcements

  • No more about monitors or synchronization problems in class for now. Be sure to read those sections in the text and the online notes carefully.
  • Deadlock

    The idea of deadlock has come up a few times so far:

    A group of processes is deadlocked if each process is waiting for an event that can be caused only by one of the other waiting processes.

    Consider a one-lane tunnel or underpass:

  • One-lane traffic inside the tunnel
  • The lane in the tunnel is a shared resource
  • If a deadlock like this one occurs, we must tell one of the cars that it has to give up its right to the resource and roll back
  • This may mean more than one car has to back up
  • Starvation is possible if we keep allowing cars from one direction to enter
  • The traffic analogy can become more complex with two-way intersections - gridlock!

    We have seen potential for deadlock with semaphores:

    P0              P1
    wait(Q);        wait(R);
    wait(R);        wait(Q);
    ...             ...
    signal(R);      signal(Q);
    signal(Q);      signal(R);
    ...             ...

    This is a situation that came up with the original "solution" to the dining philosophers.
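
    Here is a minimal sketch of this situation in C with POSIX threads, using two mutexes to stand in for the semaphores Q and R (the thread functions and the main driver are just for illustration, not from the notes). If each thread grabs its first lock before the other grabs its second, neither can ever proceed:

    #include <pthread.h>
    #include <stdio.h>

    /* Two locks standing in for semaphores Q and R. */
    static pthread_mutex_t Q = PTHREAD_MUTEX_INITIALIZER;
    static pthread_mutex_t R = PTHREAD_MUTEX_INITIALIZER;

    /* P0 acquires Q, then R. */
    static void *p0(void *arg) {
        (void)arg;
        pthread_mutex_lock(&Q);
        pthread_mutex_lock(&R);   /* may block forever if P1 already holds R */
        /* ... critical section ... */
        pthread_mutex_unlock(&R);
        pthread_mutex_unlock(&Q);
        return NULL;
    }

    /* P1 acquires R, then Q -- the opposite order. */
    static void *p1(void *arg) {
        (void)arg;
        pthread_mutex_lock(&R);
        pthread_mutex_lock(&Q);   /* may block forever if P0 already holds Q */
        /* ... critical section ... */
        pthread_mutex_unlock(&Q);
        pthread_mutex_unlock(&R);
        return NULL;
    }

    int main(void) {
        pthread_t t0, t1;
        pthread_create(&t0, NULL, p0, NULL);
        pthread_create(&t1, NULL, p1, NULL);
        pthread_join(t0, NULL);   /* with unlucky timing, this never returns */
        pthread_join(t1, NULL);
        printf("no deadlock this time\n");
        return 0;
    }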

    We will consider processes that need access to more general "resources." These could be any non-preemptable resources, such as tape drives or CD burners, in addition to things like the semaphores we have been using.

    Requirements for Deadlock

    Four conditions must hold simultaneously for deadlock to arise:

    1. Mutual exclusion: only one process at a time can use a resource
    2. Hold and wait: a process is holding, but not necessarily using, at least one resource and is waiting to acquire additional resources held by other processes
    3. No preemption: a resource can be released only voluntarily by the process holding it, after that process has completed its task
    4. Circular wait: there exists a circular chain of waiting processes, each of which is waiting for a resource held by the next process in the chain

    If any one of these conditions does not hold, deadlock cannot occur.

    Resource Allocation Graphs

    We model systems for our study of deadlock using directed graphs.

    The vertices of our graph are processes, represented by circles, and resources, represented by squares, which may have a number of "dots" inside to indicate multiple equivalent instances of a resource.

    Edges from a process to a resource indicate a request by that process for an instance of that resource.

    Edges from a resource to a process indicate that an instance of the resource has been granted to that process.

    An example of a resource allocation graph:

    There are no cycles in this graph, so we are sure that there is no deadlock. There are two unfulfilled requests, P1's request for R1, and P2's request for R3, but we are sure things can continue. Since P3 isn't waiting for anything, it will eventually release R3. P2 can then do its thing, allowing it to release R1, which allows P1 to do its thing. But what if P3 requests an instance of R2?

    Deadlock! P3 cannot continue until it acquires an R2. But we know that P2 can't continue until it gets an R3, which isn't going to happen until P3 can go. But P1 is waiting for an R1, which isn't available until P2 can go... We have a problem. No process can continue.

    However, a cycle alone is not enough to guarantee deadlock:

    Here, there is a cycle, but P2 and P4 are not waiting for anything. That means that eventually they will free their resources, and the other processes can get what they need.

    If there is a single instance of each resource, a cycle in the resource allocation graph means we have deadlock. When there are multiple instances, it only means there is a possibility of deadlock.
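
    For the single-instance case, the check amounts to looking for a cycle in a directed graph. Here is a rough sketch of that idea (the node numbering and the adjacency-matrix representation are invented here for illustration, not part of the notes):

    #include <stdio.h>

    #define N 6            /* total nodes: processes and resources together */
    enum { WHITE, GRAY, BLACK };

    /* adj[u][v] != 0 means an edge u -> v: either a request (process -> resource)
     * or an assignment (resource -> process). */
    static int adj[N][N];
    static int color[N];

    /* Returns 1 if a cycle is reachable from node u. */
    static int dfs(int u) {
        color[u] = GRAY;
        for (int v = 0; v < N; v++) {
            if (!adj[u][v]) continue;
            if (color[v] == GRAY) return 1;            /* back edge: cycle found */
            if (color[v] == WHITE && dfs(v)) return 1;
        }
        color[u] = BLACK;
        return 0;
    }

    int main(void) {
        /* Example mapping (arbitrary): P0 = node 0, P1 = node 1, R0 = node 2,
         * R1 = node 3.  The edges below form the cycle P0 -> R0 -> P1 -> R1 -> P0,
         * so with single-instance resources this state is deadlocked. */
        adj[0][2] = 1;   /* P0 requests R0      */
        adj[2][1] = 1;   /* R0 assigned to P1   */
        adj[1][3] = 1;   /* P1 requests R1      */
        adj[3][0] = 1;   /* R1 assigned to P0   */

        int cycle = 0;
        for (int u = 0; u < N; u++)
            if (color[u] == WHITE && dfs(u)) { cycle = 1; break; }

        printf(cycle ? "cycle: possible deadlock\n" : "no cycle: no deadlock\n");
        return 0;
    }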

    Dealing With Deadlocks

    Most common approach: ignore the problem! Deadlock probably doesn't happen that often, and it's hard to deal with, so forget about it; if it does happen, we just chalk it up to an OS bug or something.

    But that's not good enough for us. We're going to look at prevention, avoidance, and detection and recovery.

    Deadlock Prevention

    Here, we make sure there is no deadlock by guaranteeing that one of the four necessary conditions cannot occur.
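
    For example, circular wait cannot occur if we impose a total ordering on all resources and require every process to acquire them in that order. A small sketch of that idea with two locks (the ranking scheme and function names are just an illustration, not from the notes):

    #include <pthread.h>

    /* Give every lock a fixed global rank and always acquire in rank order. */
    static pthread_mutex_t Q = PTHREAD_MUTEX_INITIALIZER;   /* rank 1 */
    static pthread_mutex_t R = PTHREAD_MUTEX_INITIALIZER;   /* rank 2 */

    /* Both processes now take Q before R, so the circular wait from the
     * earlier semaphore example can no longer form. */
    static void *worker(void *arg) {
        (void)arg;
        pthread_mutex_lock(&Q);   /* lower rank first */
        pthread_mutex_lock(&R);
        /* ... use both resources ... */
        pthread_mutex_unlock(&R);
        pthread_mutex_unlock(&Q);
        return NULL;
    }

    int main(void) {
        pthread_t a, b;
        pthread_create(&a, NULL, worker, NULL);
        pthread_create(&b, NULL, worker, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        return 0;
    }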

    Deadlock Avoidance

    If we have some additional information available, we can use it to avoid deadlock.

    For example, we could require that each process declare a priori the maximum number of instances of each resource that it will ever need.

    These values, along with the current number of resources allocated to each process and the number of resources available, form a resource allocation state.

    We want to ensure that the system stays in a safe state. A safe state is one in which, given the current allocations, there is guaranteed to be an ordering of the processes that will allow all processes to get the resources they need and (eventually) run to completion. The first process to run must be able to satisfy its resource needs with what it is allocated plus what is available. The next process must be able to satisfy its needs with what it is allocated, with what is free, and what will become free when the first process releases its resources. And so on. We show that a state is safe by producing that safe sequence.

    An unsafe state does not guarantee deadlock: even if no safe sequence exists, it's possible that one or more processes will never actually request their maximum allocation, so processes might still complete and free resources. But there is no guarantee.

    By maintaining a safe state, we can ensure no deadlock.

    We can represent these "maximum claims" by augmenting our resource allocation graph to include claim edges:

    Here, P1 has been granted R1 and has a claim edge indicating that it may also need R2. P2 is currently requesting R1 and also may need R2.

    This is a safe state, as <P1, P2> is a safe sequence.

    Now suppose P2 is granted R2:

    There is no deadlock yet, since there's a chance that P1 will not need R2 before completing, thereby freeing up R1 for P2. But the state is not safe, since we cannot guarantee a safe sequence. It's possible that P1 will need R2 before it releases R1, and deadlock would follow.

    For single-instance resources, we need only ensure that we do not grant any request that would introduce a cycle in the augmented resource graph.

    For multiple instances of resources, we need something a little more complicated:

    Banker's Algorithm (Dijkstra)

    When a process requests a resource, the request is granted only if the resulting state would still be safe; otherwise the process must wait, even if the resource is available.

    We use an example to see the Banker's Algorithm in action. (See class handout [PDF])
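
    Here is a rough sketch of the safety check at the heart of the Banker's Algorithm (the state used in main is a standard textbook example, not necessarily the one in the handout):

    #include <stdio.h>
    #include <string.h>

    #define P 5   /* number of processes      */
    #define R 3   /* number of resource types */

    /* Returns 1 and fills seq[] with a safe sequence if the state described by
     * avail, max, and alloc is safe; returns 0 otherwise. */
    static int is_safe(int avail[R], int max[P][R], int alloc[P][R], int seq[P]) {
        int work[R], finish[P] = {0}, count = 0;
        memcpy(work, avail, sizeof(work));

        while (count < P) {
            int progress = 0;
            for (int i = 0; i < P; i++) {
                if (finish[i]) continue;
                int ok = 1;
                for (int j = 0; j < R; j++)
                    if (max[i][j] - alloc[i][j] > work[j]) { ok = 0; break; }
                if (!ok) continue;
                /* Pretend process i runs to completion and releases everything. */
                for (int j = 0; j < R; j++) work[j] += alloc[i][j];
                finish[i] = 1;
                seq[count++] = i;
                progress = 1;
            }
            if (!progress) return 0;   /* no process can finish: state is unsafe */
        }
        return 1;
    }

    int main(void) {
        int avail[R]    = {3, 3, 2};
        int max[P][R]   = {{7,5,3},{3,2,2},{9,0,2},{2,2,2},{4,3,3}};
        int alloc[P][R] = {{0,1,0},{2,0,0},{3,0,2},{2,1,1},{0,0,2}};
        int seq[P];

        if (is_safe(avail, max, alloc, seq)) {
            printf("safe sequence:");
            for (int i = 0; i < P; i++) printf(" P%d", seq[i]);
            printf("\n");
        } else {
            printf("unsafe state\n");
        }
        return 0;
    }

    To decide whether a request can be granted, we would tentatively apply it to avail and alloc and grant it only if is_safe still returns 1.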

    Deadlock Detection and Recovery

    Since the Banker's Algorithm might be too expensive to run every time a resource is requested (in fact, as Tanenbaum points out, no one uses it), we might just let processes get resources, then "once in a while" check for deadlock by running the "safety algorithm" or by checking for cycles in the resource allocation graph.
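
    The detection check has the same shape as the safety check above, but it works with each process's current outstanding requests rather than its remaining maximum needs. A sketch, with a made-up state in main that deadlocks P0 and P1:

    #include <stdio.h>
    #include <string.h>

    #define P 4   /* processes      */
    #define R 2   /* resource types */

    /* Marks deadlocked[i] = 1 for every process caught in a deadlock, given the
     * current allocation and the currently outstanding requests. */
    static void detect(int avail[R], int alloc[P][R], int req[P][R],
                       int deadlocked[P]) {
        int work[R], finish[P];
        memcpy(work, avail, sizeof(work));

        /* A process holding nothing cannot be part of a deadlock. */
        for (int i = 0; i < P; i++) {
            finish[i] = 1;
            for (int j = 0; j < R; j++)
                if (alloc[i][j]) { finish[i] = 0; break; }
        }

        int progress = 1;
        while (progress) {
            progress = 0;
            for (int i = 0; i < P; i++) {
                if (finish[i]) continue;
                int ok = 1;
                for (int j = 0; j < R; j++)
                    if (req[i][j] > work[j]) { ok = 0; break; }
                if (!ok) continue;
                /* Assume process i can finish and release what it holds. */
                for (int j = 0; j < R; j++) work[j] += alloc[i][j];
                finish[i] = 1;
                progress = 1;
            }
        }
        for (int i = 0; i < P; i++) deadlocked[i] = !finish[i];
    }

    int main(void) {
        /* Made-up state: P0 and P1 each hold one resource and request the
         * other's, so they deadlock; P2 and P3 hold nothing and are unaffected. */
        int avail[R]    = {0, 0};
        int alloc[P][R] = {{1,0},{0,1},{0,0},{0,0}};
        int req[P][R]   = {{0,1},{1,0},{0,0},{0,0}};
        int deadlocked[P];

        detect(avail, alloc, req, deadlocked);
        for (int i = 0; i < P; i++)
            if (deadlocked[i]) printf("P%d is deadlocked\n", i);
        return 0;
    }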

    If we take this approach and discover that there is a deadlock, how do we deal with it?

    We could try to preempt a resource, though that might force the process that held the resource to "undo" some of its computation and repeat it when it gets the resource back. This is complicated and can be costly (have you ever tried to "unwrite" a partially written CD?).

    Since that option is so complex, we might just terminate processes to free their resources and have them restart. How do we choose a victim process? We could kill all deadlocked processes, but it is usually better to terminate one at a time until the deadlock is broken. Ideally we try to minimize the cost - avoid terminating a process that is almost done, or one that doesn't hold enough resources to make much of a difference, or the boss's process.