ArangoDB 2.8: New Features and Enhancements

January 26, 2016 / Releases

We welcome 2016 with our first big news yet – the release of ArangoDB 2.8!

You can now use new AQL keywords to traverse a graph even more conveniently, a big deal for those who like to get the maximum out of their connected data. ArangoDB is getting faster with every iteration: in this release we have implemented several AQL functions and arithmetic operations in fast C++ code, and we have further improved optimizer rules and indexing to help you get things done faster. Download ArangoDB 2.8 here.

Array Indexes

The new array indexes are a major improvement to ArangoDB that you will love and never want to be without again. Hash indexes and skiplist indexes can now be defined for array values as well, so it’s really fast to access documents by individual array values. Let's assume you want to retrieve articles that are tagged with “graphdb”. You can now use an index on the tags array:

  { 
    text: "Here's what I want to retrieve...",
    tags: [ "graphdb", "ArangoDB", "multi-model" ] 
  }

A hash index on tags, created with ensureHashIndex("tags[*]"), can be used to find all documents that have "graphdb" somewhere in their tags array with the following AQL query:

  FOR doc IN documents 
    FILTER "graphdb" IN doc.tags[*] 
    RETURN doc
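
To make the idea concrete, here is a minimal Python sketch of what indexing individual array values means. It is a conceptual model only, not ArangoDB's implementation; the helper name build_array_index, the second sample document, and the _key values are made up for illustration.

```python
from collections import defaultdict


def build_array_index(documents, attribute):
    """Map every individual array value to the keys of the documents containing it."""
    index = defaultdict(set)
    for doc in documents:
        for value in doc.get(attribute, []):
            index[value].add(doc["_key"])
    return index


# Sample data: the first document mirrors the article above, the rest is invented.
documents = [
    {"_key": "a1", "text": "Here's what I want to retrieve...",
     "tags": ["graphdb", "ArangoDB", "multi-model"]},
    {"_key": "a2", "text": "Something else", "tags": ["documents"]},
]

tags_index = build_array_index(documents, "tags")

# One lookup per tag instead of scanning every document's tags array.
matches = sorted(tags_index.get("graphdb", set()))
```

The point of the sketch is that a lookup by a single tag no longer has to inspect the tags array of every document.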

Have fun with these new indexes!

AQL Graph Traversal

Next up are the AQL graph traversals mentioned above. The query language AQL adds the keywords GRAPH, OUTBOUND, INBOUND and ANY for use in graph traversals. Using plain AQL in ArangoDB 2.8, you can create a shopping list for your friends' birthday gifts: up to 5 ideas per friend, related to products they already own and ordered by price.

FOR friend IN OUTBOUND @me isFriendOf
  LET toBuy = (
    FOR bought IN OUTBOUND friend hasBought
      FOR combinedProduct IN OUTBOUND bought combinedProducts
        SORT combinedProduct.price
        LIMIT 5
        RETURN combinedProduct
  )
  RETURN { friend, toBuy }

You can improve this list by limiting the result to friends who have a birthday within the next 2 months (assuming birthdays are stored as date strings such as birthday: "1970-01-15").

  LET maxDate = DATE_ADD(DATE_NOW(), 2, 'months')
  ...
  FILTER DATE_ISO8601(DATE_YEAR(DATE_NOW()),DATE_MONTH(friend.birthday),DATE_DAY(friend.birthday)) < maxDate
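
The date check can be sketched in Python as follows. This is a hedged illustration and not a translation of the AQL fragment: unlike the filter above, it also requires the birthday to fall on or after today and looks at next year too, so a December-to-January wrap is covered. The helper names add_months and birthday_soon are hypothetical, and treating a February 29 birthday as February 28 in non-leap years is an assumption.

```python
import calendar
from datetime import date


def add_months(day, months):
    """Shift `day` forward by `months`, clamping to the last day of the target month."""
    month_index = day.month - 1 + months
    year = day.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def birthday_soon(birthday, today, months=2):
    """True if the next occurrence of `birthday` lies in [today, today + months)."""
    born = date.fromisoformat(birthday)
    max_date = add_months(today, months)
    for year in (today.year, today.year + 1):
        # Assumption: Feb 29 birthdays count as Feb 28 in non-leap years.
        last_day = calendar.monthrange(year, born.month)[1]
        candidate = date(year, born.month, min(born.day, last_day))
        if today <= candidate < max_date:
            return True
    return False


# Example call with the sample birthday format from the text.
soon = birthday_soon("1970-01-15", date(2016, 1, 1))
```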


Using Dave as a bind parameter for @me, we get the following result for our shopping tour:

[
  {
    "friend": {
      "name": "Julia",
      "_id": "users/Julia",
      "_rev": "1868379126",
      "_key": "Julia"
    },
    "toBuy": [
      {
        "price": 12,
        "name": "SanDisk Extreme SDHC UHS-I/U3 16GB Memory Card",
        "_id": "products/SanDisk16",
        "_rev": "2012820470",
        "_key": "SanDisk16"
      },
      {
        "price": 21,
        "name": "Lightweight Tripod 60-Inch with Bag",
        "_id": "products/Tripod",
        "_rev": "2003514358",
        "_key": "Tripod"
      },
      {
        "price": 99,
        "name": "Apple Pencil",
        "_id": "products/ApplePencil",
        "_rev": "2019177462",
        "_key": "ApplePencil"
      },
      {
        "price": 169,
        "name": "Smart Keyboard",
        "_id": "products/SmartKeyboard",
        "_rev": "2020160502",
        "_key": "SmartKeyboard"
      }
    ]
  },
  {
    "friend": {
      "name": "Debby",
      "city": "Dallas",
      "_id": "users/Debby",
      "_rev": "1928803318",
      "_key": "Debby"
    },
    "toBuy": [
      {
        "price": 12,
        "name": "Lixada Bag for Self Balancing Scooter",
        "_id": "products/LixadaScooterBag",
        "_rev": "2018194422",
        "_key": "LixadaScooterBag"
      }
    ]
  }
]

Usage of these new keywords as collection names, variable names or attribute names in AQL queries will not be possible without quoting. For example, the following AQL query will still work as it uses a quoted collection name and a quoted attribute name:

FOR doc IN `OUTBOUND`
  RETURN doc.`any`

Please have a look at the documentation for further details.

Syntax for managed graphs:

FOR vertex[, edge[, path]] IN MIN[..MAX] OUTBOUND|INBOUND|ANY startVertex GRAPH graphName

Working on collection sets:

FOR vertex[, edge[, path]] IN MIN[..MAX] OUTBOUND|INBOUND|ANY startVertex edgeCollection1, .., edgeCollectionN
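
As a rough mental model of what MIN and MAX control, the following Python sketch walks outbound edges and collects every vertex found between min_depth and max_depth steps from the start vertex. It is a simplification under stated assumptions, not ArangoDB's traversal engine: the function name traverse_outbound and the sample edges are invented, edges are plain dicts with _from and _to, and the defaults (MIN of 1, MAX equal to MIN when omitted) are assumed here.

```python
def traverse_outbound(edges, start, min_depth=1, max_depth=None):
    """Collect vertices reached from `start` in min_depth..max_depth outbound steps."""
    if max_depth is None:
        max_depth = min_depth  # assumption: MAX falls back to MIN when omitted
    results = []

    def visit(vertex, depth):
        if depth >= min_depth:
            results.append(vertex)
        if depth == max_depth:
            return  # the depth limit also guarantees termination on cycles
        for edge in edges:
            if edge["_from"] == vertex:
                visit(edge["_to"], depth + 1)

    visit(start, 0)
    return results


# Invented sample edge collection: a -> b -> c and a -> d.
edges = [
    {"_from": "v/a", "_to": "v/b"},
    {"_from": "v/b", "_to": "v/c"},
    {"_from": "v/a", "_to": "v/d"},
]

direct = traverse_outbound(edges, "v/a")             # one step only
two_steps = traverse_outbound(edges, "v/a", 1, 2)    # one or two steps
```

For INBOUND the sketch would follow edges from _to to _from instead, and for ANY it would follow both directions.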

AQL COLLECT … AGGREGATE

Additionally, there is a cool new aggregation feature that was added after the beta releases. AQL introduces the keyword AGGREGATE for use in AQL COLLECT statements.

Using AGGREGATE allows more efficient aggregation (incrementally while building the groups) than previous versions of AQL, which built group aggregates afterwards from the total of all group values.
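
The difference between the two strategies can be illustrated with a small Python sketch. It models the idea only and says nothing about how ArangoDB implements it internally; both function names and the sample documents (with gender and age attributes) are made up for illustration.

```python
def aggregate_afterwards(docs):
    """Old style: collect all values per group first, then aggregate each list."""
    groups = {}
    for doc in docs:
        groups.setdefault(doc["gender"], []).append(doc["age"])
    return {g: {"minAge": min(ages), "maxAge": max(ages)}
            for g, ages in groups.items()}


def aggregate_incrementally(docs):
    """AGGREGATE style: update a small running state per group while grouping."""
    result = {}
    for doc in docs:
        age = doc["age"]
        state = result.get(doc["gender"])
        if state is None:
            result[doc["gender"]] = {"minAge": age, "maxAge": age}
        else:
            state["minAge"] = min(state["minAge"], age)
            state["maxAge"] = max(state["maxAge"], age)
    return result


# Invented sample data.
docs = [
    {"gender": "f", "age": 31},
    {"gender": "m", "age": 25},
    {"gender": "f", "age": 47},
    {"gender": "m", "age": 19},
]

incremental = aggregate_incrementally(docs)
```

Both functions return the same result, but the incremental version never holds the full list of values for a group, only the running minimum and maximum.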

AGGREGATE can be used inside a COLLECT statement only. If used, it must follow the declaration of grouping keys:

FOR doc IN collection
  COLLECT gender = doc.gender AGGREGATE minAge = MIN(doc.age), maxAge = MAX(doc.age)
  RETURN { gender, minAge, maxAge }

or, if no grouping keys are used, it can follow the COLLECT keyword:

FOR doc IN collection
  COLLECT AGGREGATE minAge = MIN(doc.age), maxAge = MAX(doc.age)
  RETURN { minAge, maxAge }

Only specific expressions are allowed on the right-hand side of each AGGREGATE assignment:

- On the top level, the expression must be a call to one of the supported aggregation functions LENGTH, MIN, MAX, SUM, AVERAGE, STDDEV_POPULATION, STDDEV_SAMPLE, VARIANCE_POPULATION, or VARIANCE_SAMPLE.
- The expression must not refer to variables introduced in the COLLECT itself.

Over the past few weeks we have already published blog posts on several new features and enhancements in ArangoDB 2.8, so have a look at the posts on AQL function speedups and automatic deadlock detection (which is backported to 2.7.5 as well). The blog post about using multiple indexes per collection is worth reading, as is the index speedups article. In the web interface you can now use bind parameters in the AQL editor.

There is a lot more to read in the changelog of ArangoDB 2.8, and we will present some of the features in more detail in upcoming blog posts. You can find the latest documentation on docs.arangodb.com.

Ingo Friepoertner

Ingo is dealing with all the good ideas from the ArangoDB community, customers and industry experts to improve the value provided by the company’s native multi-model approach. In former positions he worked as a product owner and tech consultant, building custom software solutions for large companies in various industries. Ingo holds a diploma in business informatics from FHDW University of Applied Sciences.

January 26 2016,Ingo Friepoertner

4 Comments

Jens Petter Abrahamsen on January 29 2016, at 3:28 pm

Does the new traversal syntax (INBOUND, OUTBOUND) support any options? E.g. how to treat duplicates, visitor function and such?

- Michael Hackstein on February 4 2016, at 9:29 am
  
  Hi, the intention of the new traversal syntax is to be simpler and use fewer options than the function-style traversal. This simplification allows us to use several shortcuts and optimizations internally and gives users an easier entry point. We also think they cover a lot of use cases already.
  
  Therefore there is not (yet) any plan for further options beyond the direction and the number of steps for these traversals. The “visitor” function should be implemented in the later AQL statements (which covers every visitor that just returns or counts attributes). Duplicates can be removed using the DISTINCT modifier.
  
  However if these features are not powerful enough you can still use the function-style traversals which are indeed more generic and powerful, but also more complicated to configure.
  
  - Jens Petter Abrahamsen on February 5 2016, at 1:23 pm
    
    About the visitor function, just to make it clear, are you saying that one would e.g. have code like:
    let vertices = (AQL using outbound/inbound/any with optional DISTINCT and filter etc)
    Then just run a regular for v in vertices VISITOR(v)
    
    something like that?
    
    - Michael Hackstein on February 5 2016, at 3:07 pm
      
      If you want to use DISTINCT, this is exactly the way to go. DISTINCT is executed in the return step, so you need to save the distinct set of vertices before executing the VISITOR on all of them.
      
      If you do not want to use DISTINCT you can also write:
      FOR v IN OUTBOUND @start @@edge FILTER … RETURN VISITOR(v)
      which will be more efficient
      
