Mastering AQL: Return Distinct Values | ArangoDB Blog

July 27 2015,/Query Language

Last week saw the addition of the RETURN DISTINCT for AQL queries. This is a new shortcut syntax for making result sets unique.

For this purpose it can be used as an easier-to-memorize alternative for the already existing COLLECT statement. COLLECT is very flexible and can be used for multiple purposes, but it is syntactic overkill for making a result-set unique.

New to multi-model and graphs? Check out our free ArangoDB Graph Course.

The new RETURN DISTINCT syntax makes queries easier to write and understand.

Here’s a non-scientific proof for this claim:

Compare the following queries, which both return each distinct age attribute value from the collection:

FOR doc IN collection
  COLLECT age = doc.age
  RETURN age

With RETURN DISTINCT:

FOR doc IN collection
  RETURN DISTINCT doc.age

Clearly, the query using RETURN DISTINCT is more intuitive, especially for AQL beginners. Apart from that, using RETURN DISTINCT will save a bit of typing compared to the longer COLLECT-based query.

Internally both COLLECT and RETURN DISTINCT will work by creating an AggregateNode. The optimizer will try the sorted and the hashed variants for both, so they should perform about the same.

However, the result of a RETURN DISTINCT does not have any guaranteed order, so the optimizer will not insert a post-SORT for it. It may do so for a regular COLLECT.

As mentioned before, COLLECT is more flexible than RETURN DISTINCT. Notably, COLLECT is superior to RETURN DISTINCT when the result set should be made unique using more than one criterion, e.g.

FOR doc IN collection
  COLLECT status = doc.status, age = doc.age, 
  RETURN { status, age }

This is currently not achievable via RETURN DISTINCT, as it only works with a single criterion.

Jan Steemann

After more than 30 years of playing around with 8 bit computers, assembler and scripting languages, Jan decided to move on to work in database engineering. Jan is now a senior C/C++ developer with the ArangoDB core team, being there from version 0.1. He is mostly working on performance optimization, storage engines and the querying functionality. He also wrote most of AQL (ArangoDB’s query language).

July 27 2015,Jan Steemann

Webinar on Fireside chat with Chief Product and Technology Officer. Watch Now

Mastering AQL: Return Distinct Values | ArangoDB Blog

Jan Steemann

Leave a Comment Cancel Reply

Tags

Quick Links

Info

About Us

Stay In Touch