Speeding Up Array Operations: ArangoDB Performance Tips

Last week a further optimization slipped into 2.6. It can provide significant speedups for AQL queries that pass huge array/object bind parameters into V8-based functions.

It started with an ArangoDB user reporting that a specific query ran unexpectedly slowly. The part of the query that caused the problem was simple and looked like this:

FOR doc IN collection
  FILTER doc.attribute == @value
  RETURN TRANSLATE(doc.from, translations, 0)

In the original query, translations was a big, constant object literal. Think of something like the following, but with a lot more values:

{
  "p1" : 1,
  "p2" : 2,
  "p3" : 40,
  "p4" : 9,
  "p5" : 12
}

The translations were used to replace an attribute value in existing documents with the matching value from a lookup table computed outside the AQL query.

The number of values in the translations object varied from query to query, with no upper bound. A query could well run with 50,000 lookup values in the translations object.
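
To make the role of the lookup table concrete, here is a minimal sketch in plain JavaScript of the lookup-with-default behavior that the TRANSLATE(doc.from, translations, 0) call in the query relies on. The function name translate and the sample documents are hypothetical and only for illustration; this is not ArangoDB's actual implementation.

```javascript
// Hypothetical stand-in for the lookup semantics used in the query above:
// return lookup[value] if the key exists, otherwise the default value.
function translate(value, lookup, defaultValue) {
  return Object.prototype.hasOwnProperty.call(lookup, value)
    ? lookup[value]
    : defaultValue;
}

// The small example table from above; the real one could hold 50,000 entries.
const translations = { p1: 1, p2: 2, p3: 40, p4: 9, p5: 12 };

// Made-up documents, only for illustration.
const sampleDocs = [{ from: "p3" }, { from: "p5" }, { from: "unknown" }];

const translated = sampleDocs.map((doc) => translate(doc.from, translations, 0));
console.log(translated); // [ 40, 12, 0 ]
```

Each document triggers one lookup, so the table itself is read-only input to the query. That is why its size, rather than its content, is what matters for the performance problem described here.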

(more…)

Arango Weekly 26: OrientDB Benchmark & Latest Updates

Last week we published a benchmark post: Native multi-model can compete with pure document and graph databases. The article attracted a lot of attention on HN and social media. Many readers asked us to add the multi-model database OrientDB to the benchmark. We did, and today we published Performance comparison between ArangoDB, MongoDB, Neo4j and OrientDB.

We would appreciate and welcome your feedback.

(more…)

Performance Comparison: ArangoDB vs MongoDB, Neo4j, OrientDB

The latest edition of the NoSQL Performance Benchmark (2018) has been released.

My recent blog post “Native multi-model can compete” has sparked considerable interest on HN and other channels. As expected, the community immediately suggested improvements to the published code base, and I have already published updated results several times (special thanks go to Hans-Peter Grahsl, Aseem Kishore, Chris Vest and Michael Hunger).

Please note: An update is available (June ’15), and a new performance test with PostgreSQL has been added.

Here are the latest figures and diagrams:

[Chart: performance results, r105]

The aim of the exercise was to show that a multi-model database can successfully compete with specialized players on their own turf with respect to performance and memory consumption. It is therefore not surprising that quite a few interested readers have asked whether I could include OrientDB, the other prominent native multi-model database.

(more…)

Working with ArangoDB: Insights from Francis at Boostport

As an open-source project, we are always happy to learn about new projects that use ArangoDB, and we are thankful for any feedback on how working with ArangoDB and/or interacting with the team has helped your projects develop. If you have a story you want to share, please get in touch.

Recently we received some nice feedback from Francis (Boostport), which reached us with the launch of his new product. Parts of Boostport are built using Foxx JavaScript extensions on ArangoDB:

“I really enjoyed working on ArangoDB. It’s very stable, well-documented and the API docs are very clear, as I use the REST api. Support is also top-notch. Bugs were often fixed hours or days after discovery and in one case, after I submitted an enhancement request for custom AQL functions, Jan implemented them over the next few days.

For me, the most important feature is being able to build custom AQL functions in javascript. This allowed me to easily perform analytics on social data, generate the appropriate views and send them back to the client for consumption. Finally, I also really liked how minimal configuration is needed to get it up and running. As a plus, I like how you can set up replication using the REST interface. :)”

(more…)

Public Key Infrastructure: Setup Guide for Debian & Ubuntu

We want to have a full chain of trust for our Debian packages. Therefore the SUSE Open Build Service (OBS) signs them. We publish the key alongside the repository.

However, one can do better and do the validation right on apt-get install arangodb. Here’s how: (more…)

Arango Weekly 25: Updates, Releases & Insights

This week we released the third alpha of ArangoDB 2.6. Using this alpha3 release, we produced our first benchmark article. Claudius wrote a blog post: Native multi-model can compete with pure document and graph databases – which shows that ArangoDB is really not that bad. We would love to hear your feedback.
(more…)

Multi-Model Benchmark: Assessing ArangoDB’s Versatility

Claudius Weinberger, CEO ArangoDB

TL;DR Native multi-model databases combine different data models like documents or graphs in one tool and even allow mixing them in a single query. How can this concept compete with a pure document store like MongoDB or a graph database like Neo4j? I, along with a lot of folks in the community, have asked that question.

So here are some benchmark results: 100k reads → competitive; 100k writes → competitive; friends-of-friends → superior; shortest-path → superior; aggregation → superior.

Feel free to comment, join the discussion on HN and contribute – it’s all on GitHub.

(more…)

Getting Unique Values: Efficient Data Retrieval in ArangoDB

While paging through the issues in the ArangoDB issue tracker I came across issue #987, titled “Trying to get distinct document attribute values from a large collection fails”.

The issue was opened around 10 months ago, when ArangoDB 2.2 was current. We have improved AQL performance somewhat since then, so I was eager to see how the query would perform in ArangoDB 2.6, especially compared to 2.2.

For reproduction I quickly put together some example data to run the query on:

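// Run in the ArangoDB shell: create a collection with 4 million small documents
var db = require("org/arangodb").db;
var c = db._create("test");
for (var i = 0; i < 4 * 1000 * 1000; ++i) {
  // "value" cycles through 0..99, so there are only 100 distinct values
  c.save({ _key: "test" + i, value: (i % 100) });
}
// flush the write-ahead log and wait until the data has been collected
require("internal").wal.flush(true, true);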
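
As a sanity check on what a distinct-values query over this data should return, here is a small sketch in plain JavaScript that runs outside ArangoDB. It uses a scaled-down stand-in for the collection (40,000 documents instead of 4 million, an assumption made only to keep the example quick) and a Set in place of an AQL query.

```javascript
// Plain JavaScript, run outside ArangoDB: a scaled-down stand-in for the
// "test" collection above (40,000 documents instead of 4 million).
const docs = [];
for (let i = 0; i < 40 * 1000; ++i) {
  docs.push({ _key: "test" + i, value: i % 100 });
}

// Collect the distinct values of the "value" attribute.
const distinctValues = [...new Set(docs.map((doc) => doc.value))];
console.log(distinctValues.length); // 100
```

Whatever the collection size, the result set stays at 100 values, so the cost of the query comes from scanning the documents, not from the size of the result.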

(more…)

ArangoDB 2.6 Alpha3: Testing New Features & Performance

The 2.6 release preparations are on track, with a third alpha release available for testing today. Please download the latest alpha build and give us your valuable feedback.

We put great effort into speeding up core ArangoDB functionality to make AQL queries perform much better than in earlier versions of ArangoDB.

The queries that improved most in 2.6 over 2.5 include:

  • FILTER conditions: simple FILTER conditions we’ve tested are 3 to 5 times faster
  • simple joins using the primary index (_key attribute), hash index or skiplist index are 2 to 3.5 times faster
  • sorting on a string attribute is 2.5 to 3 times faster
  • extracting the _key or other top-level attributes from documents is 4 to 5 times faster
  • COLLECT statements: simple COLLECT statements we’ve tested are 7 to 15 times faster

More details on the performance improvements and the test setup will be published in a follow-up blog post. For now, try out the 2.6 alpha3 version – we’ve done our very best to make ArangoDB a lot faster. ;)

What’s new in ArangoDB 2.6

For a full list of changes and improvements, please consult the change-log. Over the next week we might also add some more functionality to 2.6, mainly some improvements in the shortest-path implementation and other graph-related AQL queries.
(more…)

MERII Hummingbird A80 Optimus Cluster: ArangoDB Deployment

For running ArangoDB in clusters and doing performance tests, we wanted a non-virtualized set of decent hardware with a fast Ethernet connection, enough RAM (since that’s what Arango needs) and a multicore CPU. Since you need a bunch of them, cheap ARM development boards come to mind. The original Raspberry Pi (we have those) is out of the game because V8 no longer supports it. The now available Pi 2 doesn’t cut it, since its Ethernet NIC is connected via USB (as on the original Pi). The Odroid series offers only one of the two: fast Ethernet or enough RAM. The Cubieboard 4 wasn’t available yet, but its Allwinner A80 SoC seemed a good choice. Then we met the Merii Optimus board, which seems to be almost the same as the PCDuino (now renamed to Arches) with the A80. While we got a bunch of them for a decent price over at Pollin, the upstream support wasn’t that good.

However, with some help from the SunXi-Linux Project we started flashing OS images to replace the preloaded Android image with the Merii Linux image. Since the userland of the Merii image is pretty sparse, we wanted something more usable. There is already a how-to on running Ubuntu, but it requires a Windows host. We prefer a Linux host and want to run Debian. Since the new Pi 2 is also able to run regular Debian on ARMv7, we picked the root fs from sjoerd.

(more…)
