[nolan@nprescott.com] $  cat weblog archive feed

Postgres Deadlocks

2019-06-15

I had an interesting bug to work through this week, a deadlock between multiple concurrent threads writing into a Postgres database. Here I outline the gist of the problem and a solution.

Background Information

Django provides a context manager called transaction.atomic that allows for block-scoping a transaction as shown below:

with transaction.atomic():
__exit__
BEGIN;
COMMIT;

… intermediate states between the steps are not visible to other concurrent transactions

Postgres Manual, 3.4. Transactions

This means only that we won't see half-finished work. It does not imply any guards in place against concurrent writes and locks are visible across transactions.

To guard concurrent writes, we use the FOR UPDATE clause on a SELECT statement to induce row-level locking before any updates. However this has the potential to deadlock in the case of concurrent writes; as demonstrated below. The problem is the fact that row-level locks are acquired incrementally in the order of the query results, leaving open the opportunity for two or more transactions to interfere with each other.

Thread 1

def _update():
    with transaction.atomic():
        ids = (Widget
               .objects
               .select_for_update()
               .filter(widget_id_id__in=[1, 2, 3]))
        for id in ids:
            # update widget model
            # save model

deferToThread(_update)

In this scenario, the database returns widget model rows in the order 1, 3, 2.

Lock row 1

1foo
2foo
3qux
4bar
5qux

Lock row 3

1foo
2foo
3qux
4bar
5qux

Row 2 is locked, await thread 2

1foo
2foo
3qux
4bar
5qux

Thread 2

def _update():
    with transaction.atomic():
        ids = (Widget
               .objects
               .select_for_update()
               .filter(widget_id_id__in=[2, 3, 5]))
        for id in ids:
            # update widget model
            # save model

deferToThread(_update)

In this scenario, the database returns widget model rows in the order 2, 3, 5.

Lock row 2

1foo
2foo
3qux
4bar
5qux

Row 3 is locked, await thread 1

1foo
2foo
3qux
4bar
5qux

Deadlock

1foo
2foo
3qux
4bar
5qux

A Solution

The solution I've opted for here is to just introduce an ORDER BY clause on the SELECT statement, to guarantee that rows are locked in a consistent order even between transactions. In the above example, if the order were enforced as 1, 2, 3 and 2, 3, 5, thread 1 might acquire a lock, then thread 2, at which point thread 1 would correctly await the completion of thread 2, which is (in this example) never blocked. This is actually addressed in the documentation:

The best defense against deadlocks is generally to avoid them by being certain that all applications using a database acquire locks on multiple objects in a consistent order. In the example above, if both transactions had updated the rows in the same order, no deadlock would have occurred.

But, of course, you won't know to look for it until you understand the problem — which can sometimes feel like a chicken and egg problem!