home shape

An implementation of phase-fair reader/writer locks

We were in search for some C++ reader/writer locks implementation that allows a thread to acquire a lock and then optionally pass it on to another thread. The C++11 and C++14 standard library lock implementations std::mutex and shared_mutex do not allow that (it would be undefined behaviour – by the way, it’s also undefined behaviour when doing this with the pthreads library).

Additionally, we were looking for locks that would neither prefer readers nor writers, so that there will be neither reader starvation nor writer starvation. And then, we wanted concurrently queued read and write requests that compete for the lock to be brought into some defined execution order. Ideally, queued operations that cannot instantly acquire the lock should be processed in approximately the same order in which they queued.

Interested in trying out ArangoDB? Fire up your cluster in just few clicks with ArangoDB ArangoGraph: the Cloud Service for ArangoDB. Start your free 14-day trial here.

We finally found some nice paper describing a variety of locks, and we were intrigued by the properties of Phase-fair RW locks (section 3.2.4, page 9 in the paper):

“A RW lock is phase-fair iff it has the following properties:

PF1 reader phases and writer phases alternate;
PF2 writers are subject to FIFO ordering, but only with regard to other writers;
PF3 at the start of each reader phase, all currently-unsatisfied read requests are
satisfied (exactly one write request is satisfied at the start of a writer phase); and
PF4 during a reader phase, newly-issued read requests are satisfied only if there are
no unsatisfied write requests pending”

So we decided to build a C++ implementation based on what is described in the paper.

The paper provides some pseudo-code for a ticket-based lock-less implementation using multiple atomic variables. We tried that out, and it used to work fine in practice. But then we ran into the problem that we would not only need the `lock` and `unlock` operations: we also needed an operation that tries to lock and return if the lock cannot be acquired instantly (`try_lock`). And we needed that combined with a configurable timeout after which `try_lock` would give up. And we wanted this `try_lock` operation to properly queue until it can acquire the lock successfully or times out.

That turned out to be rather complex to implement with the multiple atomic variables approach suggested in the paper: the approach there suggests that for acquiring the lock in write mode, two atomic variables need to be changed (page 14 in the paper):


procedure writelock(L : ˆ pftlock)
var ticket, w : unsigned integer;
ticket := fetch and add(&L−>win, 1);
await ticket = L−>wout;
w := PRES or (ticket and PHID);
ticket := fetch and add(&L−>rin, w);
await ticket = L−>rout

For the `try_lock` operation the lock acquisition will fail if the lock cannot be acquired. Ideally, a `try_lock` operation will not change the state of the lock if it cannot be acquired instantly. But with the multiple atomic variable approach, this will become tricky, given that there are several atomic variables that will need to be read and modified for entering the lock in write mode.

So we decided that even if it is 2018 and lock-free programming is definitely on the advance, sticking with a completely lock-free implementation will make the lock code a lot more complex. And for a lock implementation that controls critical parts of the application, it is definitely an advantage to have a solution that is more simple and robust than having a more efficient but complex solution.

So we went with an implementation that uses some `std::mutex` under the hood, and `std::condition_variable` to wake up already queued waiters. So we could rely on the correctness of `std::mutex` and `std::condition_variable` for all our modifications of the internal lock state and notifications. Given the fact that it is a read-write lock and may lock out other operations anyway (in case there is another writer), we thought this would be a good initial tradeoff.

That approach allowed us to come up with rather simple code for the `lock` and `unlock` operations. And the `try_lock` implementation was only a bit more complicated because we needed `try_lock` to work with a configurable timeout, and have it run a configurable callback function to notify the application in case the lock cannot be acquired instantly but will go into a waiting state.

We have run some initial experiments with this, and it seems that it indeed provides the semantics (especially the fairness) we desire. We will now run additional tests with that lock implementation to see how it performs.

Our current lock implementation code is in an experimental feature branch, which can be found on Github.

As the rest of the ArangoDB source code, the code is open source and licensed under the Apache 2 license, so please feel free to use it in other projects or modify it.

Read here the comparison: Lockless programming with atomics in C++ 11 vs. mutex and RW-locks

We are also interested in comments from other users of this code and eager to get feedback on how it worked out and performs in other contexts! Let us know on Community Slack.

Jan Steemann

Jan Steemann

After more than 30 years of playing around with 8 bit computers, assembler and scripting languages, Jan decided to move on to work in database engineering. Jan is now a senior C/C++ developer with the ArangoDB core team, being there from version 0.1. He is mostly working on performance optimization, storage engines and the querying functionality. He also wrote most of AQL (ArangoDB’s query language).

Leave a Comment





Get the latest tutorials, blog posts and news: