
Benchmark: PostgreSQL, MongoDB, Neo4j, OrientDB and ArangoDB

In this blog post – which is a roundup of the performance blog series – I want to complete the picture of our NoSQL performance test and include some of the supportive feedback from the community. First of all, thanks for all your comments, contributions and suggestions to improve this open-source NoSQL performance test (Github). This blog post describes a complete overhaul of the test; there is no need to read all the previous articles to get the picture – have a look at the appendix below for all the details on hardware and software, the dataset and the tests used in this NoSQL performance comparison.

In response to many requests, I have now added PostgreSQL to the comparison, a popular RDBMS that supports a JSON data type. The relational data model is a perfect addition to our test suite, which now covers common project use cases (reads/writes and ad-hoc queries) as well as some social-network-related queries – implemented in tables, documents and/or graphs. How does a multi-model approach perform against its specialized, single-model counterparts?

For this edition of the performance test I have also updated the software sources, replacing the custom preview/snapshot versions with the latest available products (releases or release candidates) of the respective databases, and bumped the Node.js version to 4.1.1. In response to user feedback I have also added another test – returning the whole profile data when requesting neighbors of neighbors – and increased the number of test cases for shortest path (40 instead of 19) and neighbors of neighbors (1,000 instead of 500 vertices), due to performance improvements of all databases in the test field.

Before I dig into performance numbers and test details:

This is a vendor-initiated test that – of course – wants to show that its database is competitive by setting the scene and choosing the weapons: here, node.js as the driver of choice and a setup where the data fits into memory, using a 16-core machine on GCE with 60 GB of RAM. Nevertheless, the setup was not chosen to simply benefit ArangoDB, but to provide a comparable basis for the tests, with basic use cases and node.js as a (not that uncommon) client that is supported by every vendor (see the Appendix).

ArangoDB currently works best when the data fits completely into memory. The performance will suffer if the dataset is much bigger than the memory. We are working on this issue and will provide an improved storage engine in the near future.

I want to be as trustworthy as possible, so I have published all the data, settings and test scripts in a public Github repository nosql-tests. No magic, no tricks – check the code and make your own tests!

Interested in trying out ArangoDB? Fire up your cluster in just a few clicks with ArangoDB ArangoGraph: the Cloud Service for ArangoDB. Start your free 14-day trial here.

Brief Test Description and Results

The following performance tests compare the same types of queries in different databases.

For these tests, I’ve used a dataset that enables us to test basic db operations as well as graph-related queries – a social network site with user profiles and a friendship relation – Pokec from Stanford University’s SNAP. I won’t measure every possible database operation. Rather, we focus on queries that are sensible for nearly every project and some that are typical for a social network. We perform single reads and writes of profiles, we compute an ad-hoc aggregation to get an overview of the age distribution, we ask for friends of friends, and we ask for shortest friendship paths. These queries are run against all tested databases, irrespective of the data model they use internally. As a result, we get a performance comparison between specialized solutions and multi-model databases.

The throughput measurements on the test machine for ArangoDB define the baseline (100%) for the comparisons. Lower percentages point to higher throughput and accordingly, higher percentages indicate lower throughput.

I performed the following tests, all implemented in JavaScript running in node.js 4.1.1:

  • single read: single document reads of profiles (100,000 different documents)
  • single write: single document writes of profiles (100,000 different documents)
  • aggregation: ad-hoc aggregation over a single collection (1,632,803 documents).
    Here, we compute statistics about the age distribution for everyone in the network, simply counting which age occurs how often.
  • neighbors: finding (distinct) direct neighbors plus the neighbors of the neighbors, returning IDs (for 1,000 vertices)
  • neighbors with data: finding (distinct) direct neighbors plus the neighbors of the neighbors and return their profiles (for 100 vertices)
  • shortest path: finding 40 shortest paths (in a highly connected social graph). This answers the question of how closely two people are connected in the social network.

For our tests we run the workloads 5 times and average the results. Each test starts with an individual warm-up phase that allows the databases to load the data into memory, and every test iteration starts from scratch so that we don’t end up merely comparing caches.

New to multi-model and graphs? Check out our free ArangoDB Graph Course.

NoSQL Performance Test – Overall Results

The tests show that multi-model databases can compete with single model databases. MongoDB is faster at single document reads but couldn’t compete when it comes to aggregations or 2nd neighbors selections. Note: the shortest path query was not tested for MongoDB as it would have to be implemented completely on the client side.

Let’s go a step further and have a look at what exactly I tested in each use case so that you can understand what happens. The single reads and writes are not that hard to grasp, so I’ll concentrate on the aggregation and graph functionality here.

What does the age-distribution look like in the social network?

Test: Aggregation

In this test, we aggregate over a single collection (1,632,803 documents). We compute statistics about the age distribution for everyone in the network by simply counting which age occurs how often. We did not put a secondary index on this attribute in any of the databases, so they all have to perform a full collection scan and count – a typical ad-hoc query.
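
To make the workload concrete, here is a minimal sketch of how such an ad-hoc aggregation can be phrased in AQL and issued from node.js. It assumes a collection named profiles with an AGE attribute and uses the current arangojs API (the benchmark itself ran arangojs 3.x, whose calls differ slightly), so treat it as an illustration rather than the exact test code from the repository:

    const { Database, aql } = require("arangojs");
    const db = new Database({ url: "http://localhost:8529", databaseName: "pokec" }); // assumed endpoint and db name

    // Count how often each age occurs, scanning the whole profiles collection.
    async function ageDistribution() {
      const cursor = await db.query(aql`
        FOR p IN profiles
          COLLECT age = p.AGE WITH COUNT INTO num
          RETURN { age: age, count: num }
      `);
      return cursor.all();
    }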

The aggregation in ArangoDB is efficient, taking 1.25 seconds on average for the 1.6M documents, which defines the baseline of 100%. Only an explicit table column age in PostgreSQL is – as expected – much faster, processing the aggregation in 0.61 seconds. Of course, that’s a good use case for an RDBMS. As PostgreSQL offers the JSON data type as well, you might want to check the performance there: nope, 17.5 seconds is beyond anything you would want to accept. All other databases are much slower than ArangoDB, from a factor of 2.5 for MongoDB to 20 for OrientDB.

Who is part of my extended friend network?

Test: Neighbors Search

Finding direct neighbors plus the neighbors of the neighbors for 1,000 vertices.

This looks like a case for graph databases, but it isn’t necessarily. At least Neo4j and OrientDB can’t stand out in this test, despite it being a simple graph traversal. ArangoDB is really fast, taking just 464 ms on average – no graph database comes close. That’s because looking up neighbors to a known depth is faster with an index lookup than by following the outbound links stored in every vertex.
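
For reference, a distinct friends-of-friends lookup of this kind can be written as a single AQL traversal. The sketch below uses today’s traversal syntax (the 2.7 release tested here expressed this with AQL graph functions instead) and assumes a vertex collection profiles and an edge collection relations:

    const { Database, aql } = require("arangojs");
    const db = new Database({ url: "http://localhost:8529", databaseName: "pokec" }); // assumed endpoint and db name

    // Distinct direct neighbors plus neighbors of neighbors of one person, IDs only.
    async function neighbors2(key) {
      const cursor = await db.query(aql`
        FOR v IN 1..2 OUTBOUND CONCAT("profiles/", ${key}) relations
          OPTIONS { bfs: true, uniqueVertices: "global" }
          RETURN v._key
      `);
      return cursor.all();
    }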

How many people are between [me] and [Barack Obama]?

Test: Shortest Path

Finding 40 shortest paths (in a highly connected social graph). This answers the question of how closely two people are connected in the social network.

The shortest path is a specialty of graph databases, so I didn’t even try to implement something similar in PostgreSQL or MongoDB. ArangoDB needs 61 ms on average to process the 40 shortest paths.
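
For completeness, today’s AQL also has a built-in shortest-path traversal. Again, this is a hedged sketch with assumed collection names, not the 2.7-era query used in the benchmark:

    const { Database, aql } = require("arangojs");
    const db = new Database({ url: "http://localhost:8529", databaseName: "pokec" }); // assumed endpoint and db name

    // Vertices on one shortest friendship path between two people.
    async function shortestPath(fromKey, toKey) {
      const cursor = await db.query(aql`
        FOR v IN OUTBOUND SHORTEST_PATH
          CONCAT("profiles/", ${fromKey}) TO CONCAT("profiles/", ${toKey})
          relations
          RETURN v._key
      `);
      return cursor.all();
    }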

Conclusion

The test results show that ArangoDB can compete with the leading databases in their fields and also with the other multi-model database, OrientDB. Memory is our pain point and it will be addressed in the next major release. With a flexible data model, you can use a multi-model database in many different situations without the need to learn new technologies. A short selection of real-life tasks has been given here.

Please have a look at our repository, do your own tests, and share the results. Different hardware – different results: Your mileage may vary and your requirements differ – so use this repo as a boilerplate and extend it with your own tests. If you want to verify our results, please use the same hardware configuration.

I appreciate your contribution and trust in open-source benchmarks.

Learn how to speed up your AQL queries. Get A Primer on Query Performance Optimization in ArangoDB

Appendix – Details about data, machines, software and tests

The data

Pokec is the most popular online social network in Slovakia. I used a snapshot of its data provided by the Stanford University SNAP. It contains profile data from 1,632,803 people. The corresponding friendship graph has 30,622,564 edges. The profile data contain gender, age, hobbies, interests, education etc., but the individual JSON documents are very diverse, because many fields are empty for many people. Profile data are in the Slovak language. Friendships in Pokec are directed. The uncompressed JSON data for the vertices need around 600 MB and the uncompressed JSON data for the edges require around 1.832 GB. The diameter of the graph (longest shortest path) is 11, but the graph is highly connected, as is normal for a social network. This makes the shortest path problem particularly hard.

The hardware

All benchmarks were done on a virtual machine of type n1-standard-16 in Google Compute Engine with 16 virtual cores (on these, a virtual core is implemented as a single hardware hyper-thread on a 2.3 GHz Intel Xeon E5 v3 (see Haswell)) and altogether 60 GB of RAM. The data was stored on a 256 GB SSD drive, directly attached to the server. The client was an n1-standard-8 (8 vCPU, 30 GB RAM) in the same network.

The software

I wanted to use a client/server model, thus I needed a language to implement the tests, and I decided that it has to fulfill the following criteria:

  • Each database in the comparison must have a reasonable driver.
  • It is not one of the native languages our contenders are implemented in, because this would potentially give an unfair advantage for some. This ruled out C++ and Java.
  • The language must be reasonably popular and relevant in the market.
  • The language should be available on all major platforms.

This essentially left JavaScript, PHP, Python and Ruby. I decided to use JavaScript with node.js 4.1.1, because it’s popular and known to be fast, in particular with network workloads.

For each database I used the most up-to-date JavaScript driver that was recommended by the respective database vendor.

I have used

  • ArangoDB V2.7.0 RC2 for x86_64 (Driver: arangojs@3.9.1)
  • MongoDB V3.0.6 for x86_64, using the WiredTiger storage engine (Driver: mongodb@2.0.45)
  • Neo4j Enterprise Edition V2.3.0 M3 running on JDK 1.7.0_79 (Driver: neo4j@2.0.0-RC2)
  • OrientDB 2.2 alpha – Community Edition (Driver: orientjs@2.1.0)
  • PostgreSQL 9.4.4 (Driver: pg-promise@1.11.0)

All databases were installed on the same machine. I have done my best to tune the configuration parameters; for example, I switched off transparent huge pages and configured up to 40,000 open file descriptors for each process. Furthermore, I’ve adapted community- and vendor-provided configuration parameters from Michael Hunger (Neo4j) and Luca Garulli (OrientDB) to improve individual settings.

The tests

I have made sure for each experiment that the database has a chance to load all relevant data into RAM. Some DBs allow explicit load commands for collections, others do not. Therefore, I have increased cache sizes accordingly where relevant and used full collection scans as a warm-up procedure.

I don’t want to benchmark query caches or the like – a database might need a warm-up phase, but you can’t compare databases based on cache size or efficiency. Whether a cache is useful or not depends highly on the individual use case, namely on executing the same query multiple times.

For the single document tests, I use individual requests for each document but use keep-alive and allow multiple simultaneous connections, since I wanted to test throughput rather than latency.

Whenever the driver allowed configuring this, I chose to use a TCP/IP connection pool of up to 25 connections. Note that the ArangoDB driver does not use HTTP pipelining, whereas the MongoDB driver seems to do a corresponding thing for its binary protocol, which can help to increase throughput. For more detailed information about each individual database see below.
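
For the HTTP-based drivers this boils down to node’s keep-alive agent; the following lines are only an illustration of what “up to 25 connections with keep-alive” means, not the exact driver configuration from the repository:

    const http = require("http");

    // Reuse TCP connections and cap the pool at 25 sockets per host.
    const agent = new http.Agent({ keepAlive: true, maxSockets: 25 });

    // Drivers that accept a custom agent can then send all their requests
    // through this pool instead of opening a new connection per request.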

I discuss each of the six tests separately:

single document reads (100,000 different documents)

In this test we store 100,000 ids of people in the node.js client and try to fetch the corresponding profiles from the database, each in a separate query. In node.js, everything happens in a single thread but asynchronously. To fully load the database connections we first submit all queries to the driver and then await all the callbacks using the node.js event loop. We measure the wallclock time from just before we start sending queries until the last answer has arrived. Obviously, this measures throughput of the driver/database combination and not latency, therefore we give as a result the complete wallclock time for all requests.
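
The measurement pattern itself is simple; here is a schematic sketch in the callback style of the node.js 4 era, where fetchProfile is a hypothetical placeholder for whatever driver call the respective database needs, not a function from the repository:

    // Submit all read requests at once, then wait for every callback and
    // measure the wallclock time for the whole batch (throughput, not latency).
    function runReadTest(ids, fetchProfile, done) {
      var start = Date.now();
      var pending = ids.length;
      ids.forEach(function (id) {
        fetchProfile(id, function (err) {
          if (err) { console.error(err); }
          if (--pending === 0) {
            done(Date.now() - start); // total milliseconds for all requests
          }
        });
      });
    }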

single document writes (100,000 different documents)

For this test we proceed similarly: We load 100,000 different documents into the node.js client and then measure the wallclock time needed to send all of them to the database, using individual queries. We again first schedule all requests to the driver and then wait for all callbacks using the node.js event loop. As above, this is a throughput measurement.

single document writes sync (100,000 different documents)

Same as before, but this time each write waits until the data has been synced to disk – which is the default behavior of Neo4j. To be fair, we have introduced this additional test into the comparison.

aggregation over a single collection (1,632,803 documents)

In this test we do an ad-hoc aggregation over all 1,632,803 profile documents and count how often each value of the AGE attribute occurs. We did not put a secondary index on this attribute in any of the databases, so they all have to perform a full collection scan and count. We only measure a single request, since this is enough work to get an accurate measurement. The amount of data scanned should be more than any CPU cache can hold, so we should see real RAM accesses, but usually no disk accesses because of the above warm-up procedure.

finding the neighbors and the neighbors of the neighbors (distinct, for 1,000 vertices)

This is the first test related to the network use case. For each of altogether 1,000 vertices we find all neighbors and all neighbors of all neighbors, which amounts to finding the friends and friends of friends of a person, and return a distinct set of friend IDs. This is a typical graph matching problem, considering paths of length 1 or 2. For the non-graph database MongoDB, we can use the aggregation framework to compute the result. In PostgreSQL we can use a relational table with from/to ID columns backed by an index. In the Pokec dataset used here we get 18,972 neighbors and 852,824 neighbors of neighbors for our 1,000 queried vertices.
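
As an illustration, the PostgreSQL variant can be issued from node.js with pg-promise roughly as follows; the statement mirrors the one discussed in the comments below, the connection string is a placeholder, and the exact code in the repository may differ:

    var pgp = require("pg-promise")();
    var db = pgp("postgres://user:password@localhost:5432/pokec"); // placeholder credentials

    // Distinct direct neighbors plus neighbors of neighbors of one profile ID.
    function neighbors2(profileId) {
      return db.any(
        "SELECT _to FROM relations WHERE _from = $1 " +
        "UNION DISTINCT " +
        "SELECT _to FROM relations WHERE _to != $1 AND _from IN " +
        "  (SELECT _to FROM relations WHERE _from = $1)",
        [profileId]
      );
    }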

finding the neighbors and the neighbors of the neighbors with profile data (distinct, for 100 vertices)

As there was a complaint that for a real use case we need to return more than IDs, I’ve added a test case neighbors with profiles that addresses this concern and returns the complete profiles. In our test case we retrieve 84,972 profiles from the first 100 vertices we query. The complete set of 853k profiles (1,000 vertices) would have been too much for nodejs.

finding 40 shortest paths (in a highly connected social graph)

This is a pure graph test with a query that is particularly suited for a graph database. We ask the database in 40 different requests to find a shortest path between two given vertices in our social graph. Due to the high connectivity of the graph, such a query is hard, since the neighborhood of a vertex grows exponentially with the radius. Shortest path is notoriously bad in more traditional databases, because the answer involves an a priori unknown number of steps in the graph, usually leading to an a priori unknown number of joins.

Originally we picked 20 random pairs of vertices, but it turned out that for one of the pairs there is no path in the graph at all. We excluded that one for the first measurements, because Neo4j, which did altogether quite well at shortest paths, was exceedingly slow to notice that there is no such path. After the first published performance test the vendors improved their tools, so that we could increase the number of shortest paths to 40 – which is enough to get an accurate measurement. Note however that the time for different pairs varies considerably, because it depends on the length of the shortest path as well as, sometimes, on the order in which edges are traversed.

We finish the description with a few more detailed comments for each individual database:

ArangoDB:

ArangoDB allows specifying the value of the primary key attribute _key, as long as the unique constraint is not violated. It automatically creates a primary hash index on that attribute, as well as an edge index on the _from and _to attributes in the friendship relation (edge collection). No other indexes were used.

MongoDB:

Since MongoDB treats the edges just as documents in another collection, I helped it with the graph queries by creating two more indexes on the _from and _to attributes of the friendship relation. Due to the absence of graph operations, I implemented neighbors of neighbors using the aggregation framework, as suggested by Hans-Peter Grasl, and did not even try to do shortest paths.
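
For what it’s worth, creating such secondary indexes with the node.js MongoDB driver of that era looks roughly like this; database and collection names are assumptions, not necessarily those used in the repository:

    var MongoClient = require("mongodb").MongoClient;

    MongoClient.connect("mongodb://localhost:27017/pokec", function (err, db) {
      if (err) { throw err; }
      var relations = db.collection("relations");
      // Secondary indexes so edges can be looked up by either endpoint.
      relations.createIndex({ _from: 1 }, function () {
        relations.createIndex({ _to: 1 }, function () {
          db.close();
        });
      });
    });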

Please note: The write performance of MongoDB 3.0.6 declined significantly. I re-validated the test with MongoDB 3.0.3 and measured the known fast results from the previous tests. (103 sec vs. 324 sec. for 100,000 single writes synced). Nevertheless, as there is no indication that it’s a bug in one of those versions, I stay with the latest release.

Neo4j:

In Neo4j the attribute values of the profile documents are stored as properties of the vertices. For a fair comparison, I created an index on the _key attribute. Neo4j claims to use “index-free adjacency” for the edges, so I did not add another index on edges.

I got the configuration parameters from the vendor (thanks to Michael Hunger) and added the writes-with-sync-to-disk test, as this is the default (and only) behavior Neo4j supports. After the first performance test I also got a custom-built Neo4j 2.3 snapshot from Michael Hunger that improved the performance of Neo4j. With the Enterprise Version 2.3 (M3) it looks like the improvements found their way into the official releases so that everyone can benefit. Open source is such a cool thing.

OrientDB:

OrientDB 2.0.9 was the 4th best database in most disciplines of the first test. The developers used the published results to analyze some bottlenecks and improved the performance of OrientDB within two weeks after the first published blog post (2.1 RC4). I could now switch from the provided 2.2 preview snapshot to the current 2.2 alpha, which seems to include all the performance improvements of the snapshot.

Please note: There is an OrientDB blog post in response, but it compares apples with oranges by activating / implementing query caches – just in OrientDB – to improve the results.

Postgres:

I have used PostgreSQL with the user profiles stored in a table with two columns: the profile ID and a JSON data type for the whole profile data. In a second approach, I used classical relational data modelling with all profile attributes as columns in a table – just for comparison.
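
A hedged sketch of the two layouts described here – the JSON variant and the tabular variant – using pg-promise; table and column names are my own assumptions, not necessarily those from the repository:

    var pgp = require("pg-promise")();
    var db = pgp("postgres://user:password@localhost:5432/pokec"); // placeholder credentials

    // Variant 1: profile ID plus the whole profile as a JSON value.
    db.none("CREATE TABLE profiles_json (id BIGINT PRIMARY KEY, data JSON)")
      .then(function () {
        // Variant 2: classical relational modelling, one column per attribute.
        return db.none(
          "CREATE TABLE profiles_tabular (" +
          "  id BIGINT PRIMARY KEY, age INT, gender TEXT /* ...more columns... */)"
        );
      });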

Resources and Contribution

All code used in this test can be downloaded from my Github repository and all the data is published in a public Amazon S3 bucket. The tar file consists of two folders data (database) and import (source files).

Everybody is welcome to contribute by testing other databases and sharing the results.

60 Comments

  1. Caner on October 13 2015, at 4:43 pm

    Can you add RethinkDB to the tests, please?

    • Claudius Weinberger on October 13 2015, at 5:02 pm

      Claudius from ArangoDB here. You are not the first who asked us to include RethinkDB. I started to look into it but I’m not very familiar with RethinkDB. If somebody from the community would help and could have a look at the queries that would be great. Just ping me at claudius (at) arangodb.com

  2. Tom on October 13 2015, at 5:26 pm

    You claim that “I have done our best to tune the configuration parameters best” but could you show the actual configuration for each database?

    • Claudius Weinberger on October 13 2015, at 5:49 pm

      Good idea. We will add the config files to the repos as well.

  3. Claudius Weinberger on October 13 2015, at 5:56 pm

    I will also include Postgresql with JSONB. This will take 1 or 2 days because I have to rerun the complete test. The differences in daily performance on GCE are too big to just run the additional test with Postgresql and JSONB. It will be interesting to see the difference in time and space between these two formats.

    • Barry Jones on October 13 2015, at 6:06 pm

      Thanks, I was about to hop on and ask for the same thing. I saw where you said you weren’t using indexes on any solutions, which is fine for a test but doesn’t really provide a great “in a production environment” measure.

      Would be really interesting to add a “with index” benchmark in addition to the rest of these to get a closer to real world picture. That should be reflected in write time with an index as well as read time.

      • Claudius Weinberger on October 13 2015, at 7:12 pm

        Claudius the author of the post.

        Sorry, but I have to disagree regarding the indexes. First of all, the relations have indexes on the from and to fields. The profiles table/collection has a primary index. The requests to these tables primarily go to these fields. Only the AGE field does not have an index for the aggregation. In my experience of the last 20 years, I was often in the situation that a system in production had to handle ad-hoc queries, too. This is what I had in mind when I designed the test suite.

      • Scott Brickey on October 13 2015, at 7:13 pm

        I would disagree with your statement about lacking indexes.

        As mentioned in the second paragraph, this is about ad-hoc queries. While the examples may be common enough to justify adding to the application as built-in metrics (or lookups), the intention of the article and performance metrics are focused on unexpected lookups.

        That said, I’ll grant you that GetDocument(id) and SetDocument() operations aren’t exactly ad-hoc lookups… I assume they were added because you can’t use a system that provides awesome ad-hoc queries if the base needs suck hard.

        All this said, I do think there’s validity in recognizing that an OLTP system and a reporting system *can* be separate systems. While I’m sure many people strive to build and prove a single “all purpose” system is possible, I think most people would agree that the tradeoffs between possible and practical can justify separate systems

    • Chris Lee on October 13 2015, at 6:51 pm

      I was going to ask for the same thing, also as Barry said indexes are used in Postgres. I like how you guys are being responsive and open with these tests – even if the results don’t show ArangoDB to be the clear winner, the way you are handling this creates trust.

  4. Pinaraf on October 13 2015, at 6:26 pm

    I see no index in your PostgreSQL test (at least the tabular one on github), did I miss something ?

    • Caio Hamamura on May 11 2016, at 3:36 pm

      Relational databases without indexes aren’t relational databases… that said I stick with PostgreSQL, since even without them it was faster or comparable to the others.

      • Claudius Weinberger on May 17 2016, at 10:13 pm

        You are completely right. All tests ran against indexes except the aggregation. The aggregation is an example of an ad-hoc query.

  5. Raivo Laanemets on October 13 2015, at 6:28 pm

    There is no nodejs 4.4.1 released yet. Article possibly meant 4.1.1

  6. Konstantin on October 13 2015, at 7:31 pm

    What about Aerospike? Have you tested it?

  7. Evan Summers on October 14 2015, at 12:37 pm

    great work – thanks

    as a fan of Redis, I’d like to see Redis figures – but this would probably make everyone else look slow? 😉 Because Redis is all in-memory. You do say that is the sweet spot for Arango though. To be fair, disk-based storage engines will always be slower, but that is a typical and deliberate trade-off, i.e. to support large databases on disk, in a persistent and ACID manner.

    Keep up great work on ArangoDB – looks like an exciting product, in a tremendously exciting and innovative field.

    • Claudius Weinberger on October 14 2015, at 1:40 pm

      This is already on my list. First I have to understand how to model the data in Redis for our use case.

      • s.molinari on October 17 2015, at 7:08 am

        I don’t think Redis fits the use case. It isn’t made to be anything close to a graph database. I am even surprised Mongo and Postgres are in the mix.

        • Claudius Weinberger on October 17 2015, at 4:48 pm

          In this case you didn’t understand the intention of my test. I really think that graph databases have to compete with other databases if they want to claim a general purpose approach.

  8. Claudius Weinberger on October 14 2015, at 2:58 pm

    Max,

    I have to disagree. But let me explain it in more detail, please.

    My aim was to show how performant the database itself is. This was the reason why I decided not to use caches in any of the use cases. If I understand correctly what you describe, then Neo4j has something like a query plan cache. That’s great, but in the end it’s a cache. Every database has to do something similar for a request: it has to analyse the query and execute it. So if we used this feature of Neo4j, we would also have to use caches on the other databases. To add the same shortest paths as we use in the test to the warm-up would be more of a cheat than an optimisation. Michael Hunger already made an optimisation and put some shortest path queries into the warm-up, but not the same ones as in the test.

    Using Java instead of Node.js would also be an unintended optimisation. In that case, we would have to use each database’s native language – e.g. C++ for ArangoDB – and you can be sure our results would be much better as well. In the appendix, I already described the criteria for choosing Node.js:

    “I wanted to use a client/server model, thus I needed a language to implement the tests, and I decided that it has to fulfill the following criteria:

    – Each database in the comparison must have a reasonable driver.
    – It is not one of the native languages our contenders are implemented in, because this would potentially give an unfair advantage for some. This ruled out C++ and Java.
    – The language must be reasonably popular and relevant in the market.
    – The language should be available on all major platforms.

    This essentially left JavaScript, PHP, Python and Ruby. I decided to use JavaScript with node.js 4.1.1, because it’s popular and known to be fast, in particular with network workloads.”

    To answer your last question: No, ArangoDB can’t run embedded. I personally think it’s a bad idea to run a database embedded. This is not the right place to discuss it, but nevertheless I want to mention two reasons: the first is scaling and the second is staying language agnostic.

    • stann on December 16 2015, at 12:17 pm

      It would be nice to see the results with caches enabled for all the databases where available

  9. Veniamin Gvozdikov on October 15 2015, at 11:18 am

    Hi, Can you add Tarantool (http://tarantool.org) to the next benchmarks?

  10. AskAsya on October 16 2015, at 12:04 am

    It appears that while you set ArangoDB to sync the wal (write-ahead-log) every second, for MongoDB you set the full set of datafiles to sync every second. You probably want to use journalCommitInterval to make them more “equivalent”.

    • Claudius Weinberger on October 19 2015, at 1:33 pm

      I will look into this and will make a test. If it makes a difference I will include it in the next update. Thanks for the hint!

      • TerryS on October 20 2015, at 5:22 pm

        DataStax Acquired Aurelius in february: https://groups.google.com/forum/#!topic/aureliusgraphs/WTNYYpUyrvw

        They will still work on the project, but not as much as they used to: “Titan isn’t going anywhere and we are still working on it. The development velocity and community activity will be significantly less in the future unless the community steps in and helps.”

  11. s.molinari on October 17 2015, at 7:09 am

    I’d like to suggest testing Titan too. It is actually a more popular graph database than OrientDB or Arango.

    • Claudius Weinberger on October 17 2015, at 5:02 pm

      I put it on my list. Currently the test is a single-machine environment and I’m not sure if this is a good fit for Titan. I will have a look. Do you believe that Titan is still alive? I’ve noticed that there are not that many changes in the last update of Titan. Does Titan still have an active community?

      • TerryS on October 21 2015, at 9:32 am

        DataStax Acquired Aurelius in february: https://groups.google.com/forum/#!topic/aureliusgraphs/WTNYYpUyrvw

        They will still work on the project, but not as much as they used to: “Titan isn’t going anywhere and we are still working on it. The development velocity and community activity will be significantly less in the future unless the community steps in and helps.”

        (apparently I posted this as a reply to another post, so I’m moving it to the right place)

  12. Claudius Weinberger on October 23 2015, at 11:30 pm

    I wrote this because you were surprised that MongoDB and PostgreSQL are in the mix. In this case, you didn’t understand the intention.

    In terms of Redis, I asked Salvatore directly because I’m not sure at this point. I think he can judge it best

    • s.molinari on October 24 2015, at 7:16 am

      Yeah, I wrote what I wrote not reading what I had wrote. Hehehe…

      Ok. I understand having MongoDB and Postgres in the mix. It shows a graph database can do graph data AND also covers the other requirements one might have from an application datastore.

      Still, I say don’t waste your time with Redis and trying to make the data fit in Redis. That effort right there says a lot about the use case and how Redis simply doesn’t match as a comparable datastore.

  13. Clément Prévost on October 29 2015, at 4:30 pm

    Awesome post! I’m surprised by the OrientDB and Neo4j results and I’m also impressed by ArangoDB’s perfs. As far as I know, OrientDB is also an in-memory graph DB, so what design decision makes ArangoDB this efficient?

    As a heavy PostgreSQL user I also must ask: do you think that PostgreSQL’s pgRouting extension would be relevant for the scope of this post for shortest-path and neighbors-of-neighbors computation? If yes, I would be glad to PR the repo with some code.

    • Max Neunhöffer on October 30 2015, at 2:56 pm

      Disclaimer: I am one of the core developers of ArangoDB, so I may be able to provide some answers to your first question.

      The performance difference between ArangoDB and OrientDB of single writes and single reads could be explained by the fact that ArangoDB is written in C++ and OrientDB in Java, which can easily explain a factor of 2 in performance. Since we send an individual request for each document, it is likely that differences in the whole chain from DB driver to the storage engine play a larger role than the actual database engine here, because the whole test is probably I/O or network bound rather than CPU bound.

      So let’s look at the other tests, for example aggregation.

      In ArangoDB, we keep the actual JSON documents in the database in a form that allows rapid access to subvalues. If we, for example, need the AGE attribute of all 1.6M documents, then we

      – find the actual document with a constant time lookup
      – and then find the binary value of the attribute again with a constant time lookup in the document

      This design dramatically increases the aggregation performance in connection with a query execution engine that organises a batched pipeline to shoot documents as fast as possible through the execution engine. Thus we have made sure that the two crucial ingredients for a fast aggregation are OK: Quick access to the data and a fast query execution engine.

      Looking at the absolute time results, aggregating 1.6M values of the AGE attribute in 1.25s means roughly 781ns per value (and that is single-threaded), which is reasonable for a generic query engine. Spending more than 10us per value seems a bit over the top to me.

      For the shortest path test other things matter. In ArangoDB we have paid detailed attention to the edge index. It guarantees to deliver the k adjacent edges of a given vertex in time O(k) with a very small constant. We essentially need one hash lookup and can then follow a doubly linked list. Our data structure in addition allows constant time insertion and deletion of individual edges. We strongly believe that these features matter most for performance of basically all graph algorithms, rather than any arguments involving “index-free adjacency”, which I shall not repeat here needlessly.

      On top of this functionality we use a shortest path algorithm that starts searching from both sides at the same time and uses a good priority queue inside to decide which vertex to work on next. We have chosen a binary heap, which is not optimal with respect to complexity, but performs best in practice because of smaller constants. Furthermore, it allows a neat trick for the special case that the edges have no “length” and we only want to find a path with the smallest number of edge steps. Namely, we start with a deque and only “transform” it (at zero cost!) into a binary heap when the first insertion happens that is not at the end, which does not happen at all in the special case. Thus, in the published benchmark we run entirely in the deque case and can enjoy amortised constant time both for insertion and for taking off the first element. For cases with proper edge lengths we fall back to the binary heap, getting logarithmic complexity for finding the next vertex to work on, which is still OK, as other tests we conducted have confirmed.

      Note that the social graph at hand is highly connected, so that the neighbourhood size increases exponentially with distance, leading to particularly hard shortest path problems.

      Finally, the friends of friends test again stresses only the neighbour lookup features and the query engine, which I have discussed above at length.

      Hopefully, this can shed some light on the performance results. Please feel free to ask me personally (max at arangodb.com) if you need more information.

  14. Frédéric Mériot on November 18 2015, at 1:04 pm

    Can you add Titan as a graph database to compare? I know that it’s a bit more complex to install but I’d be interested to see the comparison.

  15. Frédéric Mériot on November 18 2015, at 4:37 pm

    Can you compare graph performance with Titan ?

  16. Nate Gardner on May 3 2016, at 10:29 pm

    I’m interested in comparing ArangoDB to Teradata. I’ll be working on creating a test, but if you want to add it to your list, it would be another good example comparing to tabular.

    • Claudius Weinberger on May 17 2016, at 10:09 pm

      Nate, that would be great. Let me know when you finish it.

    • Claudius Weinberger on August 23 2016, at 12:01 pm

      Nate,

      any news on that?

      • Nate Gardner on August 23 2016, at 4:54 pm

        I have done informal experiments showing that in some cases ArangoDB is over 2000 times faster than Teradata. However, I have not yet completed the formal benchmark. Part of the question is how prepared the data can be in Teradata. The argument I received from Teradata experts is essentially that with carefully crafted SQL, Teradata can effectively do everything as a single read after the data has been prepared. I don’t know how much data prep is fair for the benchmark, but in any case, I’ve found Teradata to be tremendously slow even for simple SELECT * FROM queries because it really seems more oriented to data warehousing than data retrieval. I’ve also had to deal with errors in Teradata running out of spool space. These were enough to persuade my colleagues of ArangoDB’s potential as our backend. So far I’ve been very impressed watching various complex graph traversals be improved by tens of times against equivalent Solr joins, and thousands of times against raw Teradata queries.

        I’ll let you know when the benchmark is done

  17. Caio Hamamura on May 11 2016, at 8:11 pm

    The select statement for neighbors in postgresql is done in the wrong way!

    You don’t do:
    select _to from relations where _from = $1 union distinct select _to from relations where _to != $1 and _from in (select _to from relations where _from = $1);

    You should do:
    select _to from relations r1 where _from=$1 union all select r2._to from relations r1 JOIN relations r2 ON r1._to = r2._from where r1._from=$1;

    This makes a HUGE difference. In my machine:
    WHERE IN: 20300 ms
    JOIN: 12ms

    • Claudius Weinberger on May 17 2016, at 11:17 pm

      Thanks for that.

      I tried out your suggested query, but it does not return the same result.

      Your query:
      ===========================================================
      executing distinct neighbors of 1st and 2nd degree for 1000 elements
      total number of neighbors2 found: 1118899
      PostgreSQL: neighbors2, 1118899 items
      Total Time for 1118899 requests: 2399 ms
      average: 0.0021440719850495888 ms
      ===========================================================

      My query:
      ===========================================================
      executing distinct neighbors of 1st and 2nd degree for 1000 elements
      total number of neighbors2 found: 852824
      PostgreSQL: neighbors2, 852824 items
      Total Time for 852824 requests: 2073 ms
      average: 0.0024307477275498815 ms
      ===========================================================

      As you can see the total number should be 852824.

  18. Dário Marcelino on June 22 2016, at 10:20 am

    Please add BlazeGraph, it seems to have been Wikidata’s choice after a comprehensive selection process: https://docs.google.com/spreadsheets/d/1MXikljoSUVP77w7JKf9EXN40OB-ZkMqT8Y5b2NYVKbU/edit#gid=0

    • Claudius Weinberger on June 23 2016, at 5:36 pm

      Interesting. I will have a look. But also the Wikidata sheet is interesting. They missed a lot about ArangoDB and even about ArangoDB 3.

      • Dário Marcelino on June 24 2016, at 3:36 pm

        Yes I agree, their analysis on ArangoDB did not seem very thorough.

  19. Peter Lyapin on August 4 2016, at 1:09 pm

    It is a good comparison for one operation at a time, but what about performance under load?

    If you run the same kind of workload but using multiple client connections at the same time, how would the results look? I have a case where I use both PostgreSQL and ArangoDB to store the same data, and when I run a load test with 10 to 100 concurrent users, with PostgreSQL the response time is always less than a second, while the response time for ArangoDB grows in direct proportion to the number of concurrent users. It looks like ArangoDB is processing every single query one after another and all the other queries are waiting in a long queue.

    Can you recommend something please?

  20. Claudius Weinberger on August 23 2016, at 11:56 am

    I will, but at the moment I’m struggling with other stuff. Sorry for the delay.

    • John on August 23 2016, at 12:24 pm

      No problem! I was just curious. I know you’ve been busy with 3.0 this summer. Thanks again

  21. Jean-Marc Autexier on March 22 2017, at 4:43 pm

    It’s now beginning of 2017, and titan resurrected as JanusGraph http://janusgraph.org/

  22. Tristan Wibberley on October 7 2017, at 3:32 am

    How does this compare to Dgraph (https://dgraph.io/)?

    • ArangoDB database on January 10 2018, at 9:01 pm

      Sorry, we can’t say anything about the performance of Dgraph. The 2018 version of our benchmark will be released soon (planned for Feb 2018), but Dgraph will not be part of it, because we want to show that multi-model can at least compete with the leading solutions on their respective home turf, and Neo4j is representing the graph model

  23. yehosef on October 17 2017, at 11:00 am

    Could we get an updated test with the newer/newest versions of these databases? Even if the improvements are not as dramatic, the multi-model and other benefits of ArangoDB still make it very compelling. But it’s still valuable to see how it compares for a particular model/use case.

    • ArangoDB database on January 10 2018, at 7:48 pm

      Sorry for the very late reply! We will publish the 2018 version of the benchmark soon. Planned is February 2018. Stay tuned

  24. ArangoDB database on January 10 2018, at 9:07 pm

    Sorry for the late reply! Thanks for the feedback. We will think about your idea for the next version of the benchmark. We tried our best to show the results as neutral as possible.

  25. ArangoDB database on January 10 2018, at 9:14 pm

    Sorry for the late approval of your comment! Update of the performance benchmark with most recent GA versions of all databases is planned for Feb 2018

  26. ArangoDB database on January 10 2018, at 9:41 pm

    Sorry for the late reply! We will release the 2018 version soon (planned release Feb 2018)

  27. ArangoDB database on January 10 2018, at 9:46 pm

    Sorry for the late reply. Update with the 2018 version will be published soon (planned Feb 2018)
