AQL Optimizer Improvements in ArangoDB 2.8

With the 2.8 beta phase coming to an end, it’s time to shed some light on the improvements in the 2.8 AQL optimizer. This blog post summarizes a few of them, focusing on the query optimizer. A follow-up post explaining dedicated new AQL features will come soon.

AQL Function Speedups: ArangoDB 2.8 Enhancements

While working on the upcoming ArangoDB 2.8, we have reimplemented some AQL functions in C++ for improved performance. AQL queries that use these functions may run faster with the new implementations.

The following list shows the AQL functions for which a C++ implementation has been added in 2.8. The other C++-based AQL function implementations added since ArangoDB 2.5 are also still available. Here’s the list of functions added in 2.8:

Using Multiple Indexes per Collection in ArangoDB

The query optimizer in ArangoDB 2.8 has been improved in terms of how it can make use of indexes. In previous versions of ArangoDB, the query optimizer could use only one index per collection used in an AQL query. When using a logical OR in a FILTER condition, the optimizer did not use any index for the collection in order to ensure the result is still correct.

This is much better in 2.8. Now the query optimizer can use multiple indexes on the same collection for FILTER conditions that are combined with a logical OR.
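To make the idea concrete, here is a minimal JavaScript sketch of answering an OR condition with one lookup per index and then merging the results. It is only an illustration under stated assumptions: the in-memory `docs` array, the `buildIndex` helper and the `filterWithOr` function are hypothetical and do not reflect ArangoDB’s actual optimizer or index code.

```javascript
// Illustrative sketch only: not ArangoDB's actual optimizer or index code.
// Hypothetical in-memory "collection" with two hash-style indexes on it.
const docs = [
  { _key: "1", name: "alice", city: "Berlin" },
  { _key: "2", name: "bob", city: "Cologne" },
  { _key: "3", name: "carol", city: "Berlin" },
];

// Build a lookup table from attribute value to the matching documents.
function buildIndex(collection, attribute) {
  const index = new Map();
  for (const doc of collection) {
    const bucket = index.get(doc[attribute]) || [];
    bucket.push(doc);
    index.set(doc[attribute], bucket);
  }
  return index;
}

const byName = buildIndex(docs, "name");
const byCity = buildIndex(docs, "city");

// Evaluate `name == wantedName OR city == wantedCity` with one lookup per
// index, then merge both result lists and drop duplicates by _key.
function filterWithOr(wantedName, wantedCity) {
  const seen = new Map();
  const candidates = [
    ...(byName.get(wantedName) || []),
    ...(byCity.get(wantedCity) || []),
  ];
  for (const doc of candidates) {
    seen.set(doc._key, doc);
  }
  return [...seen.values()];
}

console.log(filterWithOr("alice", "Berlin").map((doc) => doc._key));
```

The deduplication step matters because a document that satisfies both branches of the OR (here the document with key "1") would otherwise appear twice in the result.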

Index Speedups in ArangoDB 2.8: Enhancements

The upcoming 2.8 version of ArangoDB will provide several improvements in the area of index usage and query optimization.

First of all, hash and skiplist indexes can now index individual array values. A dedicated post on this will follow shortly. Second, the query optimizer can make use of multiple indexes per collection for queries with OR-combined filter conditions. This again is a subject for another post. Third, there have been some speed improvements due to changes in the general index handling code. This is what this post is about.

Using Bind Parameters in the AQL Editor: ArangoDB

The AQL editor in the web interface is useful for running ad hoc AQL queries and trying things out. It provides a feature to explain the query and inspect its execution plan. This can be used to check whether the query uses indexes, and which ones.

So far the AQL editor only supported query strings with literal values and lacked support for bind parameters. Queries issued by application code, however, often use bind parameters for security reasons. This frequently prevented copying and pasting queries between the application code and the AQL editor without making manual adjustments.

ArangoDB 2.7 GA: Significant Improvements

It was long awaited, and now we’ve finished it: ArangoDB 2.7, a new major release, is ready for download. First of all, a big thanks to our community for your great support. We’ve implemented a lot of your ideas. After your feedback on RC1 and RC2, we are happy to bring a new major release to the world. With ArangoDB 2.7 we increased performance even further and improved query handling a lot.

Building AQL Query Strings: Tips and Best Practices

I recently wrote two recipes about generating AQL query strings. Both are now part of the ArangoDB cookbook.

After that, GitHub user tracker1 suggested in GitHub issue 1457 to take the ES6 template string variant even further, using a generator function for string building, and also using promises and ES7 async/await.

We can’t use ES7 async/await in ArangoDB at the moment because V8 lacks support for it, but the suggested template string generator function seemed to be an obvious improvement that deserved inclusion in ArangoDB.

Basically, the suggestion is to use regular JavaScript variables/expressions in the template string and have them substituted safely.

With regular AQL bind parameters, a query looks like this:

var bindVars = { name: "test" };
var query = `FOR doc IN collection 
         FILTER doc.name == @name 
         RETURN doc._key`;
db._query(query, bindVars);

This is immune to parameter injection, because the query string and the bind parameter value are passed in separately. But it’s not very ES6-y.
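The template string idea described above can be sketched as an ES6 tagged template function that turns every interpolated value into a bind parameter. This is a minimal illustration, not the implementation that ships with ArangoDB: the function name `aqlSketch` and the generated parameter names (`value0`, `value1`, ...) are assumptions made for this example.

```javascript
// Hypothetical helper for illustration only; the name `aqlSketch` and the
// generated bind parameter names (value0, value1, ...) are assumptions.
function aqlSketch(strings, ...values) {
  const bindVars = {};
  let query = strings[0];
  values.forEach((value, i) => {
    const paramName = "value" + i;
    // The value itself is stored separately from the query string ...
    bindVars[paramName] = value;
    // ... and only a placeholder is spliced into the query text.
    query += "@" + paramName + strings[i + 1];
  });
  return { query, bindVars };
}

const userName = "test";
const built = aqlSketch`FOR doc IN collection
  FILTER doc.name == ${userName}
  RETURN doc._key`;

console.log(built.query);
console.log(built.bindVars);
```

Because the interpolated value never becomes part of the query text, the result keeps the injection safety of regular bind parameters. In arangosh, the two parts could then be passed on as `db._query(built.query, built.bindVars)`.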

AQL Object Literal Simplification: ArangoDB Query Optimization

ArangoDB’s devel branch recently saw a change that makes writing some AQL queries a bit simpler.

The change introduces an optional shorthand notation for object attributes in the style of ES6’s enhanced object literal notation.

For example, consider the following query that groups values by age attribute and counts the number of documents per distinct age value:

FOR doc IN collection
  COLLECT age = doc.age WITH COUNT INTO length
  RETURN { age: age, length: length } 

The object declaration in the last line of the query is somewhat redundant because one has to type identical attribute names and values:

RETURN { age: age, length: length } 

In this case, the new shorthand notation simplifies the RETURN to:

RETURN { age, length }

In general, the shorthand notation can be used for all object literals when there is an attribute name that refers to a query variable of the same name.

It can also be mixed with the longer notation, e.g.:

RETURN { age, length, dateCreated: DATE_NOW() }
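For readers less familiar with the ES6 notation that inspired this change, the same three forms look like this in plain JavaScript. The variable values are made up for the example, and `Date.now()` merely stands in for AQL’s `DATE_NOW()`.

```javascript
// Plain JavaScript analogue of the AQL shorthand; the values are made up.
const age = 42;
const length = 3;

// Long form: attribute name and variable name are typed twice.
const longForm = { age: age, length: length };

// ES6 shorthand: the attribute name is taken from the variable name.
const shortForm = { age, length };

// Shorthand and long notation can be mixed in one object literal.
const mixed = { age, length, dateCreated: Date.now() };

console.log(JSON.stringify(longForm) === JSON.stringify(shortForm));
console.log(Object.keys(mixed));
```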

Mastering AQL: Return Distinct Values

Last week saw the addition of RETURN DISTINCT to AQL. This is a new shortcut syntax for making result sets unique.

For this purpose it can be used as an easier-to-memorize alternative to the already existing COLLECT statement. COLLECT is very flexible and can be used for multiple purposes, but it is syntactic overkill for making a result set unique.

The new RETURN DISTINCT syntax makes queries easier to write and understand.

Here’s a non-scientific proof for this claim:

Compare the following queries, which both return each distinct age attribute value from the collection:

FOR doc IN collection
  COLLECT age = doc.age
  RETURN age

With RETURN DISTINCT:

FOR doc IN collection
  RETURN DISTINCT doc.age

Clearly, the query using RETURN DISTINCT is more intuitive, especially for AQL beginners. Apart from that, using RETURN DISTINCT will save a bit of typing compared to the longer COLLECT-based query.

Internally both COLLECT and RETURN DISTINCT will work by creating an AggregateNode. The optimizer will try the sorted and the hashed variants for both, so they should perform about the same.

However, the result of a RETURN DISTINCT does not have any guaranteed order, so the optimizer will not insert a post-SORT for it. It may do so for a regular COLLECT.
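The difference between the sorted and the hashed variants can be illustrated with a small JavaScript sketch. It is not ArangoDB’s AggregateNode implementation; the function names `distinctSorted` and `distinctHashed` and the sample data are assumptions for this example.

```javascript
// Illustrative only: not ArangoDB's AggregateNode implementation.
const ages = [31, 25, 31, 40, 25];

// Sorted variant: sort the input first, then drop every value that is
// equal to its predecessor. The output is ordered as a side effect.
function distinctSorted(values) {
  const sorted = [...values].sort((a, b) => a - b);
  return sorted.filter((value, i) => i === 0 || value !== sorted[i - 1]);
}

// Hashed variant: remember the values already seen. No sorting happens,
// so the output order is whatever the hash structure happens to produce.
function distinctHashed(values) {
  return [...new Set(values)];
}

console.log(distinctSorted(ages)); // [ 25, 31, 40 ]
console.log(distinctHashed(ages)); // [ 31, 25, 40 ]
```

A JavaScript `Set` happens to keep insertion order, whereas a hash table in a database gives no such guarantee. That is why a sort step after the hashed variant is needed whenever ordered output is expected, and why it can be skipped when no order is promised.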

As mentioned before, COLLECT is more flexible than RETURN DISTINCT. Notably, COLLECT is superior to RETURN DISTINCT when the result set should be made unique using more than one criterion, e.g.

FOR doc IN collection
  COLLECT status = doc.status, age = doc.age
  RETURN { status, age }

This is currently not achievable via RETURN DISTINCT, as it only works with a single criterion.

The Great AQL Shootout: ArangoDB 2.5 vs 2.6 Comparison

For last week’s ArangoDB 2.6 release, we’ve put together some performance tests. The tests compare AQL query execution times in 2.5 and 2.6.

The results look quite promising: 2.6 outperformed 2.5 for all tested queries, mostly by factors of 2 to 5. A few dedicated AQL features in the tests got boosted even more, resulting in query execution time reductions of 90 % and more. Finally, the tests also revealed a dedicated case for which 2.6 provides a several hundredfold speedup.

More good news: not a single one of the test queries ran slower in 2.6 than in 2.5.
