background

Improved Deadlock Detection

The upcoming ArangoDB version 2.8 (currently in devel) will provide a much better deadlock detection mechanism than its predecessors.

The new deadlock detection mechanism will kick in automatically when it detects operations that are mutually waiting for each other. In case it finds such deadlock, it will abort one of the operations so that the others can continue and overall progress can be made.

In previous versions of ArangoDB, deadlocks could make operations wait forever, requiring the server to be stopped and restarted.

How deadlocks can occur

Here’s a simple example for getting into a deadlock state:

Transaction A wants to write to collection c1 and to read from collection c2. In parallel, transaction B wants to write to collection c2 and read from collection c1. If the sequence of operations is interleaved as follows, then the two transactions prevent each other from making progress:

  • transaction A successfully acquires write-lock on c1
  • transaction B successfully acquires write-lock on c2
  • transaction A tries to acquire read-lock on c2 (and must wait for B)
  • transaction B tries to acquire read-lock on c1 (and must wait for A)

Here’s these such two transactions being started from two ArangoShell instances in parallel (left is A, right is B):

deadlock a b

(note that this screenshot is from 2.8 and the automatic deadlock detection had already detected the deadlock and aborted one of the transactions)

In general, deadlocks can occur only when multiple operations (AQL queries or other transactions) try to access the same resources (collections) at the same time, and only if the operations already have already acquired some locks on these resources. And finally each operation needs to involve more than one collection, so there is the potential for already having acquired some locks but having to wait for others.

Dynamically added collections

Most operations will just work fine and will not cause any deadlocks. This is especially true for all operations that involve only a single collection. This leaves multi-collection AQL queries and multi-collection userland transactions.

Normally these will also work fine. This is because when a query or transaction starts, it will tell the transaction manager about the resources (collections) it will need. The transaction manager can then acquire the required resources in a deterministic fashion that prevents deadlocks. If all queries and transactions properly announce upfront which collections they will access, there will also be no deadlocks.

But for some operations its hard to predict at transaction start which collections will be accessed. This includes some AQL functions that can dynamically access collection data without having to specify the collection name anywhere in the query.

A good example for this is the GRAPH_EDGES AQL function, which will get a graph name as its first input parameter, but not the names of the underlying edge collection(s). When this function is used in an AQL query, the query parser will just find a function parameter containing a graph name but doesn’t know it’s a collection name.

GRAPH_EDGES("myEdges", [ { type: "friend" } ])

The "myEdges" graph name will look like any other string to the parser. It does not know about the contexts in which strings may have special meanings.

Note that even if this would be fixed, the problem won’t go away entirely: a function call parameter in a query isn’t necessarily a constant but can be an arbitrary expression:

FOR doc IN collection
  RETURN GRAPH_EDGES(CONCAT(doc.graphName, '-test'), [ doc.example ])

At least in this case the AQL query parser won’t find a collection name, so when the AQL query starts it is yet unknown which collections will be accessed. Only at runtime when the function is actually executed, the collection names will be looked up by finding the graph description in the _graphs system collection. Then the edge collections participating in the graph will be added to the query dynamically. Only this dynamic addition adds the potential for deadlock.

This dynamic addition of collections in unavoidable for conveniently querying data from collections whose names are unknown when the query starts.

Deadlock detection

Whenever transaction manager detects a deadlock in ArangoDB 2.8, it will automatically abort one of the blocking transactions. The transaction will be rolled back and all modifications it has made will be reverted. The operation will fail with error code 29 (deadlock detected) and raise an exception that the user can handle in the calling code.

Deadlocks will be found if two transactions mutually lock each other as seen in the screenshot above, but also for more complex setups. The following screenshot shows four parallel transactions that block each other indirectly.

threeway-deadlock

The top left window (transaction 1) will block the one in the top right (transaction 2), and is itself blocked by the transaction in the bottom left (transaction 3).

The transaction in the top right window (transaction 2) blocks the one in the bottom left (transaction 3), and is itself blocked by the one in the top left (transaction 1).

Transaction 3 (bottom left) is blocked by transaction 2 (top right). Transaction 4 (bottom right) does exactly the same as transaction 3.

With these transactions, we end up in this waiting state:

  • T1 waits for T3 and T4
  • T2 waits for T1
  • T3 waits for T2
  • T4 waits for T2

This waiting state is cyclic (T1 < T3 < T2 < T1) and therefore no progress can be made. This is exactly a situation in which the transaction manager will abort one of the transactions.

No configuration is required for the deadlock detection mechanism. It will always be active and cannot be configured or turned off.

Jan Steemann

Jan Steemann

After more than 30 years of playing around with 8 bit computers, assembler and scripting languages, Jan decided to move on to work in database engineering. Jan is now a senior C/C++ developer with the ArangoDB core team, being there from version 0.1. He is mostly working on performance optimization, storage engines and the querying functionality. He also wrote most of AQL (ArangoDB’s query language).

4 Comments

  1. ra0o0f on November 23, 2015 at 4:21 pm

    if i’m not wrong acquire lock on collection level is like table-level lock in relational db, it is the top level of locking and isolation behavior that relational databases use, the most that could result in deadlock. if you look at https://en.wikipedia.org/wiki/Isolation_(database_systems) you see this is like `Serializable` isolation level. i know in ArangoDB transactions are running server-side and are naturally faster than relational databases, but lower level of locking like document-lock could avoid deadlocks better

    • jsteemann on November 23, 2015 at 4:36 pm

      Right. However, even with document-level locks you will be able to produce deadlocks when transactions run concurrently and want to modify the same documents in different order. Easy to reproduce with row-level locking engines such as InnoDB, and unavoidable if you really want to run transactions in parallel and have consistency. An alternative is of course to always abort a transaction that cannot get a (document-level) lock instantly and let the client retry.

  2. H.P. Grahsl on November 25, 2015 at 7:00 pm

    Thanks for sharing good news. As I’ve done very basic deadlock detection years ago I would be interested in some details:
    1) Which deadlock detection algorithm is used? I once implemented a variant of the GoodLock (Lock Tree) algorithm.
    2) Which strategy is used to decide which transaction to abort / roll back?
    Maybe you can shed some light on this so that I don’t have do dig my way through the sources 🙂 THX in advance!

    • jsteemann on December 1, 2015 at 1:39 pm

      Sorry for getting back late. The algorithm used for deadlock detection is rather simple. I am not sure if it has a name (please let me know if you know!), but I’ll try to describe it.

      It needs two simple data structures, a writers table, and a lookup table. The writers table contains the writer threads that have successfully owned/locked a resource. The lookup table contains information about which operations are blocked on others. It is only populated if resources can be acquired instantly.

      The writers table is used as follows: every writer that successfully acquires a resource is entered into an internal writers table. When a writer releases a resource, it is removed from that writers table.

      The lookup table is used as follows: whenever either a reader or a writer cannot instantly acquire a resource, it will be entered into another lookup table, together with the id of the writer that blocks it. If that lookup table is otherwise empty nothing more will happen, and once the writer that blocks is finished, the waiting operations can proceed. If however the lookup table is not empty, we’ll check if the writer we’re waiting for is itself blocked by other operations. That is done recursively until we either find a cycle (then it’s a deadlock situation) or are done (in this case it’s not a deadlock and eventually there’ll be progress).

      Example scenario: we have two threads T1 and T2 which want to perform operations on resources R1 and R2 in this chronological order:

      1) T1 write-locks R1
      2) T2 write-locks R2
      3) T1 wants to read-lock R2
      4) T2 wants to read-lock R1

      In this scenario T1 would block at 3) because it waits for T2, and T2 would block at 4) because it waits for T1.

      With the above algorithm that would result in
      1) T1 is entered into writers table for R1
      2) T2 is entered into writers table for R2
      3) T1 is entered into lookup table for R2, waiting for T2 (the writer that currently owns R2). lookup table is recursively traversed starting at T2. T2 itself does not block on anything else, so everything’s still fine
      4) T2 is entered into lookup table for R1, waiting for T1 (the writer that currently owns R1). lookup table is recursively traversed starting at T1. this lookup will find that T1 is itself blocked on T2 (which is ourselves). So we have a cycle and a deadlock!

      That should also work with more that two concurrent operations.

      To answer the second question: whenever a deadlock is detected using the above algorithm, the transaction that detects the deadlock is getting rolled back.

      I hope this helps.

Leave a Comment





Get the latest tutorials, blog posts and news: