Throughput Enhancements: Boosting ArangoDB Performance

We’ve recently been working on improving ArangoDB’s throughput, especially when using ArangoDB’s HTTP interface.

In this post, I will show some of the improvements already achieved, though the work is not yet finished. Therefore, the results shown here are still somewhat preliminary.

We wanted to measure improvements for ArangoDB’s HTTP interface, and so we used wrk as an external HTTP load generator.

During the tests, wrk called specific URLs on a local ArangoDB instance on an otherwise idle machine. The tests were run against ArangoDB 2.6 and devel. The ArangoDB instances were started with their default configuration.

wrk was invoked with varying numbers of client connections and threads, so the tests cover serial and concurrent/parallel requests:

wrk -c $CONNECTIONS -t $THREADS -d 10 $URL

The number of connections ($CONNECTIONS) and threads ($THREADS) were both varied from 1 to 8. wrk requires at least as many connections as threads.
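The combinations described above can be enumerated with a small loop. The following is only a sketch: it assumes the powers of two from 1 to 8 that appear in the results tables, uses a placeholder URL (the tested URLs are not given in this post), and by default only prints the wrk commands instead of running them.

```shell
#!/usr/bin/env bash
# Sketch: enumerate the thread/connection combinations from the results tables.
# URL is a placeholder assumption; the post does not state the tested URL.
URL="${URL:-http://127.0.0.1:8529/}"

# Print one "threads connections" pair per line, skipping combinations
# with fewer connections than threads (wrk does not accept those).
combinations() {
  local threads connections
  for threads in 1 2 4 8; do
    for connections in 1 2 4 8; do
      if [ "$connections" -ge "$threads" ]; then
        echo "$threads $connections"
      fi
    done
  done
}

# With DRY_RUN=1 (the default in this sketch) the commands are only printed.
# Set DRY_RUN=0 to actually invoke wrk.
run_all() {
  local threads connections
  combinations | while read -r threads connections; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "wrk -c $connections -t $threads -d 10 $URL"
    else
      wrk -c "$connections" -t "$threads" -d 10 "$URL"
    fi
  done
}

run_all
```

This yields ten combinations, which matches the ten rows of the full results tables.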

The first URL tested was a route in a simple Foxx application that inserts the data shipped in the HTTP request into a collection on the server. The internals of the route should not matter here, as this post focuses on the throughput improvements.

Following are the results for calling the route with wrk, comparing the stable ArangoDB version (2.6.3) with the current development version (head of devel branch as of today). The table shows the number of documents that were inserted during the 10 seconds the wrk client ran:

Threads      Connections        2.6       devel
      1                1      12569       20157 
      1                2      28094       36031   
      1                4      46310       66524 
      1                8      46798       80667

As can be seen above, devel handled many more requests than 2.6 even with a single connection (i.e. serial client requests). Throughput was about 60 % higher in this case.

When increasing the number of client connections, the number of requests handled by devel was also higher than that of 2.6, with improvements between around 25 and 70 %.
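The percentages quoted here follow directly from the single-thread rows of the table. As a quick check, this sketch computes the relative improvement with integer arithmetic (rounded down to whole percent); the function name is just an illustration.

```shell
#!/usr/bin/env bash
# Relative improvement of devel over 2.6 in whole percent.
# Integer division, so the result is rounded down.
improvement_pct() {
  local stable=$1 devel=$2
  echo $(( (devel - stable) * 100 / stable ))
}

# Single-thread rows from the Foxx results table: connections, 2.6, devel
while read -r connections stable devel; do
  echo "connections=$connections improvement=$(improvement_pct "$stable" "$devel")%"
done <<'EOF'
1 12569 20157
2 28094 36031
4 46310 66524
8 46798 80667
EOF
```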

When increasing the number of client load generation threads, the picture doesn’t change much. Here’s the full table of results:

Threads      Connections        2.6       devel
      1                1      12569       20157 
      1                2      28094       36031   
      1                4      46310       66524 
      1                8      46798       80667

      2                2      28931       36326    
      2                4      47181       67654    
      2                8      47594       88617 

      4                4      46553       67585   
      4                8      47531       86935 

      8                8      46431       91953 

The next test consisted of inserting documents into a collection again, but using the built-in HTTP API for creating documents instead of a user-defined Foxx application. Throughput is expected to be higher than in the Foxx case because the built-in method is hard-wired and only serves a single purpose, whereas the Foxx route is user-definable and capable of doing fancy things, such as validating data, restricting access etc.
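For readers who want to try something similar, here is a rough sketch of how wrk could be pointed at the built-in document API. It is not the setup used for the numbers in this post. It assumes the 2.x-style endpoint POST /_api/document?collection=..., the default port 8529, a hypothetical collection named throughput_test, and an arbitrary JSON payload. wrk sends GET requests unless a Lua script changes the method, so the sketch writes such a script and prints the resulting command.

```shell
#!/usr/bin/env bash
# Assumptions (not from the post): default endpoint 127.0.0.1:8529, a
# hypothetical collection "throughput_test", and an arbitrary JSON payload.
BASE_URL="${BASE_URL:-http://127.0.0.1:8529}"
COLLECTION="${COLLECTION:-throughput_test}"

# URL of the document-creation endpoint in the 2.x HTTP API.
insert_url() {
  echo "$BASE_URL/_api/document?collection=$COLLECTION"
}

# Write a small wrk Lua script that turns every request into a JSON POST.
write_post_script() {
  cat > "$1" <<'EOF'
wrk.method = "POST"
wrk.body   = '{"value": 1}'
wrk.headers["Content-Type"] = "application/json"
EOF
}

WORKDIR="$(mktemp -d)"
SCRIPT="$WORKDIR/post.lua"
write_post_script "$SCRIPT"

# Only print the command; run it by hand against a test instance.
echo "wrk -c 8 -t 8 -d 10 -s $SCRIPT '$(insert_url)'"
```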

Here are the results for calling the hard-wired insertion route, again for 2.6 and devel:

Threads      Connections        2.6       devel
      1                1     102133      112843 
      1                2     185529      210795 
      1                4     335607      373070
      1                8     518354      576034

      2                2     181237      196482 
      2                4     345455      363255
      2                8     474558      550835

      4                4     318331      355328
      4                8     483388      516100

      8                8     482369      527395

devel provides higher throughput than 2.6 for this route as well. Improvements ranged from roughly 5 to 15 %. That’s not as impressive as in the Foxx route case above, but still a welcome improvement.

And of course we’ll try to improve the throughput further.

This article was published first on Jan’s Blog – Throughput Enhancements.

Jan Steemann


After more than 30 years of playing around with 8 bit computers, assembler and scripting languages, Jan decided to move on to work in database engineering. Jan is now a senior C/C++ developer with the ArangoDB core team, being there from version 0.1. He is mostly working on performance optimization, storage engines and the querying functionality. He also wrote most of AQL (ArangoDB’s query language).
