Multi-Model Benchmark Round 1: Database Evaluation

The latest edition of the NoSQL Performance Benchmark (2018) has been released.

It’s time for another update of my NoSQL performance blog series. This hopefully concludes the first part of this series with the initial databases ArangoDB, MongoDB, Neo4J and OrientDB and I can now start to check out other databases. I’m getting a lot of requests to test others as well and I’ll try to add them as soon as possible. Pull requests to my repository are also more than welcome. Remember it is all open-source.

The first set of benchmarks was started as a proof that multi-model can compete with specialized solutions and I started with the corresponding top dogs (Neo4J and MongoDB) for graphs and documents. After the first blog post, we were asked by the community to include OrientDB as the other multi-model database, too, which makes sense and therefore I expanded the initial lineup.

Concluding the tests did take a bit longer than expected, because vendors took up the challenge and improved their products with impressive results – as we asked them to do. Still, for each iteration we needed some time to run all tests, see below. However, on the upside, everyone can benefit from the improvements, which is an awesome by-product of the benchmark tests.

For the impatient awaiting new test results: running the tests takes a lot of time, especially on cloud machines. But cloud machines are the only way to allow everyone to verify the results. In my experience, server performance can easily fluctuate by more than 10% from day to day, and the underlying hardware can be upgraded from one day to the next, rendering old results useless. Therefore, in order to get comparable results, I needed to rerun all tests for all databases. Each run of the tests consists of 5 cycles, which takes a lot of wall-clock time. We tried to react to improvements suggested by others as fast as possible given the above restrictions (and restrictions imposed by our other projects).
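To make the fluctuation argument concrete, here is a minimal sketch of how one might time a workload over 5 cycles and compare two days of results. It is not the benchmark's actual harness; the function names and the use of the median are assumptions chosen for illustration.

```python
import statistics
import time
from typing import Callable, List


def run_cycles(workload: Callable[[], None], cycles: int = 5) -> List[float]:
    """Run the workload `cycles` times; return wall-clock seconds per cycle."""
    timings = []
    for _ in range(cycles):
        start = time.perf_counter()
        workload()
        timings.append(time.perf_counter() - start)
    return timings


def relative_spread(timings: List[float]) -> float:
    """(max - min) / median: a rough measure of fluctuation within one run."""
    return (max(timings) - min(timings)) / statistics.median(timings)


def median_drift(day_a: List[float], day_b: List[float]) -> float:
    """Relative change of the median timing between two runs."""
    median_a = statistics.median(day_a)
    return abs(statistics.median(day_b) - median_a) / median_a
```

If `median_drift` between two days exceeds the effect being measured (say, more than 0.10), the older numbers are no longer comparable and every database has to be rerun on the same day.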

The tests should be as reproducible as possible. We want people to use and check the results for themselves. Therefore we needed to create scripts to set up the data files; sometimes this is a simple call to an import program provided by the database, sometimes it is an ETL definition file. There have been complaints that we do not react immediately. For instance, with OrientDB we started out with a node importer, which was faulty when it restarted. With the help of Luca we got an ETL script on the 3rd of July, but we still needed to sort out import issues with UTF-8 and reports about missing nodes (aka Profiles) reported on the 5th of July. Again, it is not a trivial task to check 30 million edges. On the other hand, even a single missing edge can have an incredible impact if it changes a shortest path. Therefore these inconsistencies needed to be sorted out. It turned out that the original dataset is clean, that our parallel node importer created too many edges (we did not notice this at first, which is a shame, so we paid extra attention when creating them anew and updated the data files as soon as they became stable), and that the parallel ETL script lost nodes when running. The final solution is now to run the ETL script single-threaded; this solution has been published in my repository as a convenient import script. Note that a single run takes 1.5 hours.
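The two failure modes described above (duplicated edges and lost edges) and their effect on a shortest path can be illustrated with a small sketch. This is not the script from the repository; the function names are hypothetical, edges are assumed to be directed (source, target) pairs, and the graph is a toy example.

```python
from collections import Counter, deque
from typing import Dict, Iterable, List, Optional, Tuple

Edge = Tuple[str, str]


def compare_edges(source: Iterable[Edge], imported: Iterable[Edge]):
    """Return (missing, extra) edge counts of the import relative to the source."""
    src, imp = Counter(source), Counter(imported)
    # Counter subtraction keeps only positive counts.
    return dict(src - imp), dict(imp - src)


def shortest_path_length(edges: Iterable[Edge], start: str, goal: str) -> Optional[int]:
    """Breadth-first search over directed edges; None if goal is unreachable."""
    adjacency: Dict[str, List[str]] = {}
    for a, b in edges:
        adjacency.setdefault(a, []).append(b)
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


source = [("a", "b"), ("b", "d"), ("a", "c"), ("c", "e"), ("e", "d")]
# A faulty import: ("a", "b") is duplicated and ("b", "d") is lost.
imported = [("a", "b"), ("a", "b"), ("a", "c"), ("c", "e"), ("e", "d")]

missing, extra = compare_edges(source, imported)
print(missing, extra)
print(shortest_path_length(source, "a", "d"), shortest_path_length(imported, "a", "d"))
```

In this toy graph the single lost edge lengthens the path from a to d from 2 hops to 3, which is why edge counts alone are not a sufficient check.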

The discussion regarding the OrientDB Node.js driver escalated into a flame war. In retrospect, the discussion about the driver was a complete waste of time, because the only change to the driver has been that the name of the original author was replaced by a different name, which feels a bit eerie for an open-source project. All the impressive improvements have been made inside the server. You can think what you like about benchmark tests; they must always be taken with a grain of salt. But even with this simple test, the newest version of OrientDB is now a factor of a thousand better than the old one for the shortest path.

In the meantime a blog by someone else has been published. We feel a bit like Project Voldemort, because any mention of ArangoDB has been anonymised in this new blog. I’m not that afraid to mention the name: OrientDB, OrientDB, OrientDB 😉 But the results are a bit strange. I tried to reproduce the results, using both our freshly created database and the database provided for download in that blog, as well as the versions 2.1Rc5 and 2.2alpha provided there and the current version 2.2 from the git repository. The results are not fully understandable. The aggregation time given in the blog is 5ms. Achieving such a result without any query cache would be hard, even in assembler. With 8 cores and 8 hyper-cores at 2.3 GHz (GCE spec), and assuming one instruction per cycle per core, you get about 37 million instructions per millisecond, or roughly 184 million in 5ms. With 1.6 million profiles, that is roughly 115 instructions per profile entry. This does not even take into account loading data into the L1 cache, the limited parallelism of hyper-cores, or the need for an optimal distribution over 16 cores.
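The back-of-the-envelope estimate above can be written out as a short calculation. The one-instruction-per-cycle figure is an assumption that reproduces the numbers in the text; the function name is hypothetical.

```python
def instructions_per_item(
    logical_cores: int,
    clock_hz: float,
    elapsed_ms: float,
    items: float,
    instructions_per_cycle: float = 1.0,  # assumption, not a measured value
) -> float:
    """Upper bound on instructions available per item in the given time."""
    instructions_per_ms = logical_cores * clock_hz * instructions_per_cycle / 1000.0
    return instructions_per_ms * elapsed_ms / items


# 8 cores + 8 hyper-cores at 2.3 GHz, 5 ms, 1.6 million profiles
budget = instructions_per_item(16, 2.3e9, 5.0, 1.6e6)
print(round(budget))  # roughly 115 instructions per profile
```

Even this figure is optimistic, since it assumes perfect scaling over all 16 logical cores and no time lost to memory access.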

Therefore I used the OrientDB console to verify the result for the aggregation:

OrientDB console v.2.2-SNAPSHOT (build UNKNOWN@r; 2015-07-09 15:36:50+0000) www.orientdb.com 
orientdb {db=pokec}> select AGE,count(*) from Profile group by AGE; 
114 item(s) found. Query executed in 32.668 sec(s).

This matches the number of items expected and the timings of the node driver and not the 5ms published in the blog. Unfortunately, the OrientDB blog does not allow for comments, so we cannot ask what the magic trick is nor are we allowed to post to the Google group. Therefore we currently stick to the observed runtimes on the Google machine. We would love to get feedback, if you can confirm the 5ms for the aggregation using either the console or the benchmark. Our blog is open for comments, but moderated to avoid flamewars and pure marketing articles. Feel free to publish your results here.

Further optimizations of ArangoDB

We have optimized our shortest path in ArangoDB 2.7.0 alpha 1 as well. The results make us very proud. We believe this shows how much improvement is possible with C++. The calculation of the shortest paths needs only 34ms in ArangoDB 2.7. The neighbors test has also become 20% faster. We have not yet exploited all possibilities. For us, as well as for all others, the Node.js driver is currently the bottleneck. We are working on a new solution here. Other areas will also show that many improvements are still possible.

We do not want to remain behind other offers. If someone has performance problems with ArangoDB, we will be happy to help and I am sure that together we will find a solution. I do not advertise this like a shopping channel solution with a money-back guarantee. It comes free of charge for ArangoDB if it is a bug – that is what we expect of ourselves.

Finally a short disclaimer

We are mostly in memory: you need at least 14 GByte of free memory to run the tests. If you use a smaller machine, the results will be different and worse for us. Neo4J has a very small memory footprint when running the tests and will “benefit” from less memory. We use a typical client/server setup; if your application allows you to embed a Java-based database, you might get a performance boost by avoiding network communication. On the other hand, if you can embed a C++ driver, ArangoDB will be much faster than with the node driver. So you have to decide what your target architecture will look like before basing any decision on this benchmark.

Updated Results

We used ArangoDB 2.7.0 Alpha, OrientDB 2.2 Alpha, Neo4J 2.3 Snapshot provided by Michael Hunger and MongoDB 3.0.3.

[Figure: ArangoDB Chart.001]

[Figure: ArangoDB Chart.002d]


Update

OrientDB has revealed the secret behind the unbelievable performance improvements in 2.2 alpha. For computing the aggregation they need no more than 5ms, which is faster than any analytic engine. The secret is that with 2.2 alpha OrientDB has implemented a kind of query cache (see the update at the bottom in “Our Take on NoSQL DBMS Benchmarks”).

We did not test the query caches, and we explained why at the start of our performance series: we wanted to test the performance of the databases and their algorithms, not the efficiency of the query cache implementations (see the first blog post “Native multi-model can compete with pure document and graph databases”).

Other tested products have also implemented such functionality, namely Neo4J and ArangoDB 2.7. We have explicitly not used these caches. If only one product uses a query cache and then advertises it as an incredible performance improvement, this is comparing apples and oranges. If one adds such a feature and uses it for a benchmark, it should of course be communicated clearly to all users or potential users. Or even better, run two comparisons: one with and one without the cache enabled in all products.
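Why a query cache changes what is being measured can be seen in a toy sketch. This is an in-memory model written for illustration only; it does not reflect how OrientDB, Neo4J or ArangoDB implement their caches, and the class and attribute names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, List


class ProfileStore:
    """Toy store with an optional result cache for one aggregation query."""

    def __init__(self, profiles: Iterable[dict], use_query_cache: bool = False):
        self._profiles: List[dict] = list(profiles)
        self._use_cache = use_query_cache
        self._cache: Dict[str, Dict[int, int]] = {}
        self.full_scans = 0  # how often the data was actually aggregated

    def count_by_age(self) -> Dict[int, int]:
        """Equivalent of: select AGE, count(*) from Profile group by AGE."""
        if self._use_cache and "count_by_age" in self._cache:
            return self._cache["count_by_age"]  # no scan, just a lookup
        self.full_scans += 1
        result = dict(Counter(p["AGE"] for p in self._profiles))
        if self._use_cache:
            self._cache["count_by_age"] = result
        return result

    def insert(self, profile: dict) -> None:
        self._profiles.append(profile)
        self._cache.clear()  # any write invalidates cached results


profiles = [{"AGE": 20}, {"AGE": 20}, {"AGE": 35}]
cached = ProfileStore(profiles, use_query_cache=True)
uncached = ProfileStore(profiles, use_query_cache=False)
for store in (cached, uncached):
    store.count_by_age()
    store.count_by_age()
print(cached.full_scans, uncached.full_scans)
```

With the cache enabled, a repeated query measures a dictionary lookup rather than the aggregation algorithm, so a benchmark that repeats the same query needs the cache either on or off in all products.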


admin

13 Comments

  1. Nicolas Harraudeau on July 15, 2015 at 12:02 pm

    OrientDB “memory usage” should be in green in the last table.

    • Claudius Weinberger on July 15, 2015 at 1:22 pm

      That’s right, sorry. I’ve changed the image.

      To be honest it should be orange because I now use OrientDB with a fixed schema. This is a deviation from the initial test definition, but I want to avoid another point in the discussion. Luca provided only this version and before he starts blaming the data-set again, I’ve used his data-set (it does not make that much of a difference for the performance anyhow, but it saves memory).

      • Nicolas Harraudeau on July 15, 2015 at 1:36 pm

        Good to know.

        My remark was not an accusation, just a simple remark.
        Even if there is always room for criticism in such a task, I personally liked the way you updated these benchmarks with other companies’ suggestions. Seemed fair enough.
        Thank you again for your work.

  2. Ben on July 15, 2015 at 3:28 pm

    I really took the time and effort to run the tests. Thanks for open sourcing them and for answering some additional questions I had around the setup so quickly, which saved me a lot of time. Since multi-model and especially the graph functionality are currently my requirements for a DB, I just focused on ArangoDB, Neo4J and OrientDB.

    First some context info:
    – I used the machines as in the description
    – Even though I would never use alphas/betas in production, I went for ArangoDB 2.7 alpha, OrientDB 2.2 alpha and Neo4J Community 2.3.0-M02 (I could not get the Neo4J version used in the test)
    – I ran the test 3 times per DB
    – Memory usage is not my main concern and the set up for measuring was kind of time consuming, so I dropped it for my benchmark.

    My results are well within the range of this post. The one striking difference was the aggregation by Neo4J (in the version I used), which was worse than in the test. The other results are similar and definitely within an acceptable and normal deviation from the results in the post. Here are my results:

    ArangoDB

    shortest: 0.036
    neighbors 2nd: 0.191
    single read: 22.230
    single write: 23.177
    single write sync: 135.949
    aggregation: 1.761

    Neo4J

    shortest: 0.474
    neighbors 2nd: 3.170
    single read: 165.320
    single write:
    single write sync: 227.201
    aggregation: 12.407

    OrientDB

    shortest: 0.276
    neighbors 2nd: 3.475
    single read: 51.150
    single write: 20.617
    single write sync:
    aggregation: 31.092

  3. Dário Marcelino on July 15, 2015 at 4:16 pm

    It’s great to see you guys are still updating the benchmark and even better that you’ll add more DBs. Like other people, I would love to see how, for example, Postgres compares.

    I have a question which has been a controversy point in the past: is the latest version of neighbours2 fetching whole documents?

    BTW, even though I understand why beta and alpha versions are being compared here, and it’s interesting to see the progress made, using production versions is the definitive test since it’s what people actually use. Beta and alpha versions are useful in giving a glimpse into the future but they can lose some of their performance while maturing to production grade.

    • Claudius Weinberger on July 15, 2015 at 4:32 pm

      Yes, we will include some frequently requested databases in the next round. And yes, we will extend the neighbors tests to include the documents as well with the next round of databases. We have not included it now, because we were waiting for an officially sanctioned OrientDB SQL to do it. We are not allowed to post in the OrientDB mailing list, but someone asked for a statement there. If we get an answer we will include OrientDB as well.

      • Dário Marcelino on July 20, 2015 at 12:12 pm

        I see, I assume the SQL in question is:

        > SELECT set(out_Relation.key, out_Relation.out.Relation._key) FROM Profile WHERE _key = :key

        I don’t see anything wrong with it but I’m not an OrientDB SQL expert, I mainly use the Node.js driver. Hopefully someone more knowledgeable will intervene.

        Thanks

        • fceller on July 21, 2015 at 5:05 pm

          @dmarcelino:disqus the select statement returns only the keys. We were asked to do a second test, where we return the whole profile documents (instead of just the keys).

  4. disqus_8NK4qKyTJQ on July 16, 2015 at 9:48 pm

    When comparing the drivers, in which order does the performance improve? For example C++, nodejs, javascript, blueprint/gremlin, go and the other drivers. I’m interested in one of the most promising drivers. Thank you

    • Claudius Weinberger on July 19, 2015 at 11:47 am

      That depends on the database, the language the database is programmed in, your infrastructure, architecture and more. So it’s hard to answer your question.

  5. CoDEmanX on July 18, 2015 at 4:22 pm

    Unbelievable how much they lie at Orient Technologies… They call the dataset dirty although there was just a problem with the import, they change benchmark conditions and don’t mention that they used caches (hello?! It needs to be computed at some point…), they pretend they improved their DB without acknowledging your benchmark that started the discussion and they say their new Node.js driver would do any better, although they did nothing but a name change. Honestly, I hope you consider anti-competition charges because of false advertising messages if they continue like that.

  6. Zack on January 7, 2016 at 9:19 am

    Hi, I am really sold on the idea of ArangoDB and would like to use the database; however, from my tests I have found ArangoDB aggregation to be slow. For example, if you replace the count aggregation in the aggregation test with calculating the average or sum, ArangoDB is slow compared to MongoDB.

  7. Alan Plum on January 26, 2016 at 6:29 pm

    ArangoDB is a database system, so it is an alternative to MySQL, PostgreSQL, MongoDB and so on, not something you install on top of them. What driver you use depends on what programming language you want to use to build your application.
