Upcoming ArangoDB 3.7 and Storage Engines

TL;DR

ArangoDB has supported two storage engines for a while: RocksDB and MMFiles. While ArangoDB started out with just the MMFiles storage engine in its early days, RocksDB became the default storage engine in the 3.4 release. Because of the drawbacks of MMFiles, ArangoDB 3.6 deprecated the old storage engine, and with the upcoming 3.7 release we plan to remove support for it entirely. This blog post provides the background on why storage engines matter, why we chose to deprecate the MMFiles storage engine, and what you should be aware of when migrating from MMFiles to the RocksDB storage engine.

ArangoDB 3.4: Enhancements in RocksDB Storage Engine

With ArangoDB 3.4 we finally made the RocksDB storage engine the default. This decision was made after a year of constant improvements to the engine to make it suitable for all our customers’ use cases.

ArangoDB 3.4 GA
Full-text Search, GeoJSON, Streaming & More

The ability to see your data from various perspectives is the idea of a multi-model database. Having the freedom to combine these perspectives into a single query is the idea behind native multi-model in ArangoDB. Extending this freedom is the main thought behind the release of ArangoDB 3.4.

We’re always excited to put a new version of ArangoDB out there, but this time it’s something special. This new release includes two huge features: a C++-based full-text search and ranking engine called ArangoSearch, and greatly extended capabilities for geospatial queries through the integration of the Google™ S2 Geometry Library and GeoJSON.

Speeding Up Dump Restore in ArangoDB: Enhanced Data Recovery

Many ArangoDB users rely on our `arangodump` and `arangorestore` tools as an integral part of their backup and recovery procedures. As such, we want to make the use of these tools, especially `arangodump`, as fast as possible. We’ve been working hard toward this goal in preparation for the upcoming 3.4 release.

We’ve made a number of low-level server-side changes to significantly reduce overhead and improve throughput. Additionally, we’ve put some work into rewriting much of the code for the client tools to allow dumping and restoring collections in parallel, using a number of worker threads specified by `--threads n`.

Index types and how indexes are used in ArangoDB: Part II

In the first part of this article we dived deep into what indexes are currently available in ArangoDB (3.2 and 3.3), also briefly looking at what improvements are coming with ArangoDB 3.4. Read Part I here.

In this Part II, we are going to focus on how to actually add indexes to a data model and speed up specific queries.

Adding indexes to the data model

The goal of adding an extra index to the data model is to speed up a certain query or even multiple queries.

One of the first steps when developing AQL queries should be to review the output of the explain command. A query can be explained using ArangoDB’s web UI or from the ArangoShell. In the ArangoShell it is as simple as `db._explain(query)`, where `query` is the AQL query string. To explain a query that also has bind parameters, pass them separately into the command, e.g. `db._explain(query, bindParameters)`.
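
As a minimal sketch of the explain call with bind parameters, the snippet below keeps the query string and its parameters in separate variables. The collection name `users`, the attribute `age`, and the bind parameter `minAge` are hypothetical and chosen only for illustration; the `db` object is assumed to be the one the ArangoShell provides.

```javascript
// Hypothetical collection ("users") and attribute ("age"), for illustration only.
const query = `
  FOR u IN users
    FILTER u.age >= @minAge
    RETURN u.name
`;

// Bind parameters are passed separately from the query string;
// the key "minAge" matches the @minAge placeholder in the query.
const bindParameters = { minAge: 21 };

// `db` exists only inside the ArangoShell, so the call is guarded
// to let the snippet load in other JavaScript environments as well.
if (typeof db !== "undefined") {
  db._explain(query, bindParameters);
}
```

Keeping values such as `minAge` out of the query string means the same query text can be explained with different parameter values.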

How We Wronged Neo4j & PostgreSQL: Update of ArangoDB Benchmark 2018

Recently, we published the latest findings of our Performance Benchmark 2018, which included Neo4j, PostgreSQL, MongoDB, OrientDB and, of course, ArangoDB. For all databases, we tested bread-and-butter tasks such as single reads/writes and aggregation in a client/server setup, as well as shortest path queries, which are a speciality of graph databases. Our goal was and is to demonstrate that a native multi-model database like ArangoDB can at least compete with the leading single-model databases on their home turf.

Traditionally, we are transparent with our benchmarks, have learned plenty from community feedback, and want to keep it that way. Unfortunately, we did something wrong in our latest version; this update explains what happened and how we fixed it.

Index types and how indexes are used in ArangoDB: Part I

As in other database systems, indexes can be used in ArangoDB to speed up data retrieval queries, sometimes by many orders of magnitude. Getting the indexes set up the right way is essential for good query performance, so this is an important topic that affects most ArangoDB installations.

This is Part I of how indexes are used by ArangoDB where we discuss what types of indexes are available in the database. In Part II, we will dig deeper into how to actually add indexes to a data model and speed up specific queries. Read Part II here.

NoSQL Performance Benchmark 2018 – MongoDB, PostgreSQL, OrientDB, Neo4j and ArangoDB

ArangoDB, as a native multi-model database, competes with many single-model storage technologies. When we started the ArangoDB project, one of the key design goals was and still is to at least be competitive with the leading single-model vendors on their home turf. Only then does a native multi-model database make sense. To prove that we are meeting our goals and are competitive, we occasionally run and publish an update to the benchmark series. This time we included MongoDB, PostgreSQL (tabular & JSONB), OrientDB and Neo4j.

Performance Impact of Meltdown and Spectre V1 Patches on ArangoDB

To investigate the impact of the Meltdown and Spectre patches on the performance of ArangoDB, we ran benchmark tests with the two storage engines available in ArangoDB (MMFiles & RocksDB). We used the arangobench benchmark and test tool for these tests.

The tests include 10 different test cases with changing test parameters like concurrency, batch requests and asynchronous execution.

ArangoDB | RocksDB Integration: Performance Enhancement

I have varying levels of familiarity with Google’s original leveldb and three of its derivatives. RocksDB is one of the three. In each of the four leveldb offerings, the code is optimized for a given environment. Google’s leveldb is optimized for a cell phone, which has much more limited resources than a server. RocksDB is optimized for flash arrays on large servers (per various RocksDB wiki pages). Note that a flash array is a device of much higher throughput than a SATA or SSD drive or array. It is a device that sits on the processor’s bus. RocksDB’s performance benchmark page details a server with 24 logical CPU cores, 144 GB of RAM, and two FusionIO flash PCI devices. Each FusionIO device cost about $10,000 at the time of the post. So RocksDB is naturally tuned for extremely fast and expensive systems. Here is an example ArangoDB import on a machine similar to the RocksDB performance tester:
