
ArangoDB 2.6 Alpha3: Testing New Features & Performance

The 2.6 release preparations are on track, with a third alpha release available for testing today. Please download the latest alpha build and give us your valuable feedback.

We put great effort into speeding up core ArangoDB functionality so that AQL queries perform much better than in earlier versions of ArangoDB.

The queries that improved most in 2.6 over 2.5 include:

  • FILTER conditions: simple FILTER conditions we’ve tested are 3 to 5 times faster
  • simple joins using the primary index (_key attribute), hash index or skiplist index are 2 to 3.5 times faster
  • sorting on a string attribute is 2.5 to 3 times faster
  • extracting the _key or other top-level attributes from documents is 4 to 5 times faster
  • COLLECT statements: simple COLLECT statements we’ve tested are 7 to 15 times faster

More details on the performance improvements and the test setup will be published in a follow-up blog post. For now, try out the 2.6 alpha3 version – we’ve done our very best to make ArangoDB a lot faster. ;)

What’s new in ArangoDB 2.6

For a full list of changes and improvements please consult the change-log. Over the next week we might also add some more functionality to 2.6, mainly some improvements in the shortest-path implementation and other graph related AQL queries.

Some main features:

  • Added an UPSERT statement to AQL that combines INSERT and UPDATE / REPLACE. The UPSERT searches for a matching document using a user-provided example. If no document matches the example, the insert part of the UPSERT statement is executed. If there is a match, the update / replace part is carried out:
    UPSERT { page: 'index.html' }               /* search example */
    INSERT { page: 'index.html', pageViews: 1 } /* insert part */
    UPDATE { pageViews: OLD.pageViews + 1 }     /* update part */
    IN pageViews
    
  • ArangoDB now provides a dedicated collection export API, which can take snapshots of entire collections. The export API is available at endpoint POST /_api/export?collection=.... The API has the same return value structure as the already established cursor API (POST /_api/cursor). Read more in the export documentation.
  • Alternative implementation for AQL COLLECT that uses a hash table for grouping. The alternative method does not require its input elements to be sorted. It will be taken into account by the optimizer for COLLECT statements that do not use an INTO clause.
  • The fulltext index now also indexes text values contained in direct sub-objects of the indexed attribute. Previous versions of ArangoDB only indexed the attribute value if it was a string. Sub-attributes of the index attribute were ignored when fulltext indexing.

    Now, if the index attribute value is an object, the object’s values will each be included in the fulltext index if they are strings. If the index attribute value is an array, the array’s values will each be included in the fulltext index if they are strings.

    For example, with a fulltext index present on the translations attribute, the following text values will now be indexed:

    var c = db._create("example");
    c.ensureFulltextIndex("translations");
    c.insert({ translations: { en: "fox", de: "Fuchs", fr: "renard", ru: "лиса" } });
    c.insert({ translations: "Fox is the English translation of the German word Fuchs" });
    c.insert({ translations: [ "ArangoDB", "document", "database", "Foxx" ] });
    
    c.fulltext("translations", "лиса").toArray();       // returns only first document
    c.fulltext("translations", "Fox").toArray();        // returns first and second documents
    c.fulltext("translations", "prefix:Fox").toArray(); // returns all three documents
    
  • Added batch document removal and lookup commands
    collection.lookupByKeys(keys)
    collection.removeByKeys(keys)
    
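The difference between sorted and hash-based grouping mentioned in the COLLECT item above can be sketched in plain JavaScript. This is only an illustration of the general technique, not ArangoDB's implementation; the helper names `distinctSorted` and `distinctHashed` are made up for this example, and group keys are assumed to be primitive values.

```javascript
// Sorted-input grouping: assumes rows are already sorted by `attribute`,
// so a new group starts whenever the value changes.
function distinctSorted(sortedRows, attribute) {
  const result = [];
  for (const row of sortedRows) {
    const value = row[attribute];
    if (result.length === 0 || result[result.length - 1] !== value) {
      result.push(value);
    }
  }
  return result;
}

// Hash-based grouping: a Set serves as the hash table,
// so the order of the input rows does not matter.
function distinctHashed(rows, attribute) {
  const seen = new Set();
  for (const row of rows) {
    seen.add(row[attribute]);
  }
  return Array.from(seen); // values in first-seen order
}

const unsortedRows = [
  { city: "Cologne" },
  { city: "Berlin" },
  { city: "Cologne" }
];

// The hash variant copes with unsorted input...
const hashedGroups = distinctHashed(unsortedRows, "city");
// ...while the sorted variant reports "Cologne" twice unless the rows are sorted first.
const brokenGroups = distinctSorted(unsortedRows, "city");
```

The hash variant trades the up-front sort for extra memory to hold the table, which is presumably why it is left to the optimizer to choose between the two.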

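The fulltext rule from the feature list above (strings are indexed directly, and for objects and arrays only their direct string values are indexed) can be written down as a small function. This is a sketch of the rule as described in this post, not the server's actual code; `fulltextCandidates` is a hypothetical name.

```javascript
// Hypothetical helper: returns the text values that would be handed to the
// fulltext index for the value of the indexed attribute.
function fulltextCandidates(value) {
  if (typeof value === "string") {
    return [value];
  }
  if (Array.isArray(value)) {
    // only direct string members of the array
    return value.filter((v) => typeof v === "string");
  }
  if (value !== null && typeof value === "object") {
    // only direct string values of the object, no deeper nesting
    return Object.values(value).filter((v) => typeof v === "string");
  }
  return []; // numbers, booleans, null: nothing to index
}

const fromObject = fulltextCandidates({ en: "fox", de: "Fuchs", nested: { x: "deep" } });
const fromArray = fulltextCandidates(["ArangoDB", 42, "Foxx"]);
const fromString = fulltextCandidates("Fox is the English translation");
```

Here `fromObject` contains only "fox" and "Fuchs", because the nested object is not a direct string value.
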
There are several more great features and improvements in 2.6 alpha2, just read on in the changelog.
We have a couple of changes in the API as well as changed behavior in 2.6, also stated in the changelog.

Known issues in alpha2

The following issues in alpha2 are already fixed in devel and will be part of the next alpha release:

  • (already fixed in devel): starting the server may print errors about non-existing collections when there exist multiple databases
  • (already fixed in devel): memleak in server queues manager, leading to the server continuously leaking a small amount of memory
  • (already fixed in devel): when arangod is run with authentication turned on, cookie authentication for the web interface may fail after a restart of the server, and the web interface may not work properly
  • (already fixed in devel): cluster crashes when creating an edge collection

Ingo Friepoertner

Ingo is dealing with all the good ideas from the ArangoDB community, customers and industry experts to improve the value provided by the company’s native multi-model approach. In former positions he worked as a product owner and tech consultant, building custom software solutions for large companies in various industries. Ingo holds a diploma in business informatics from FHDW University of Applied Sciences.

8 Comments

  1. CoDEmanX on June 1 2015, at 11:15 pm

    Both windows installers appear to be broken, or is there a dependency such as Visual C++ Redist 2015 RC?

    Also, a blank new server installation won’t start because the subfolders in var for the journal files and arangodb-apps are missing.

  2. Wilfried Gösgens on June 2 2015, at 9:49 am

    Hm, it seems replacing those NSIS binaries you suggested doesn’t work (this way). Since we did our best to shorten path names, let’s try again with the original one. I don’t know whether these need other DLLs; their wiki didn’t say anything like that…

  3. Wilfried Gösgens on June 2 2015, at 3:31 pm

    The .exe packages have been replaced; the new ones should work again.

    • CoDEmanX on June 3 2015, at 10:45 am

      I can confirm that the new 64bit installer can be started without problems.

      What about the folder issues? If there were blank folders in the zip packages, there should be no problem. On the other hand, I would actually expect arangod to create these based on the given configuration if they do not exist and enough permissions are available for the target path.

  4. Jeff Pang on June 5 2015, at 9:39 am

    Hi,

    Will there be an HTTP API for the new UPSERT command? Right now, I am using the batch request HTTP API and I can’t find a way to do an “UPSERT” operation in batch.

    • jsteemann on June 5 2015, at 11:46 am

      Note: unfortunately, code formatting does not seem to work at all here…

      `UPSERT` is currently an AQL keyword, so upserts can be used within AQL queries. There is neither a dedicated `db.collection.upsert()` method nor a REST API for it. The reason for this is that there is no “natural” way to refer to an existing value (in case of the UPDATE) within JSON. For example, consider the following (potential) upsert command running in the ArangoShell:

      db.collection.upsert(
        { city: "Cologne" },                         /* search value */
        { city: "Cologne", views: 1 },               /* INSERT case */
        { views: views + 1, lastUpdate: Date.now() } /* UPDATE case, won’t work */
      );

      The question is: how to refer to an existing value in the UPDATE case? I guess we would need to come up with a special syntax for this, e.g.

      {
        views: {
          "$inc": 1
        },
        lastUpdate: Date.now()
      }

      (where the “$inc” would indicate that an existing attribute is to be increased, a bit like in MongoDB).

      This would need to be implemented for each type of operation and operand present in AQL. For example, in AQL you can do (contrived example): “UPSERT … INSERT … UPDATE { views: OLD.views + OLD.previousViews * 2 }”. In this case, there is again no natural way to express this. Something I could come up with for this is the following, but it looks very clumsy and error-prone:

      {
        views: {
          "$set": {
            "$add": [
              { "$get": "views" },
              { "$mul": [
                { "$get": "previousViews" },
                2
              ] }
            ]
          }
        }
      }

      Any ideas for expressing the UPDATE case elegantly in JSON are welcome!
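To make the cost of such an operator syntax concrete, here is a rough sketch of what an evaluator for it might look like. Everything in it is hypothetical: the `$get`, `$add`, `$mul` and `$set` operators are only the ones floated in the comment above, none of this exists in ArangoDB, and the `evaluate` function is invented for illustration.

```javascript
// Hypothetical evaluator for the operator syntax proposed above.
// `old` stands in for the existing document (what AQL calls OLD).
function evaluate(expr, old) {
  if (expr === null || typeof expr !== "object") {
    return expr; // plain literal such as 2 or "text"
  }
  if ("$get" in expr) {
    return old[expr.$get];
  }
  if ("$add" in expr) {
    return expr.$add.map((e) => evaluate(e, old)).reduce((a, b) => a + b, 0);
  }
  if ("$mul" in expr) {
    return expr.$mul.map((e) => evaluate(e, old)).reduce((a, b) => a * b, 1);
  }
  if ("$set" in expr) {
    return evaluate(expr.$set, old);
  }
  throw new Error("unsupported operator in " + JSON.stringify(expr));
}

// Equivalent of the AQL expression OLD.views + OLD.previousViews * 2
const updateExpression = {
  "$set": {
    "$add": [
      { "$get": "views" },
      { "$mul": [{ "$get": "previousViews" }, 2] }
    ]
  }
};

const newViews = evaluate(updateExpression, { views: 10, previousViews: 3 });
```

Even this toy version needs one branch per operator, which hints at how much would have to be duplicated to cover everything AQL can already express.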

      • Jeff Pang on June 18 2015, at 4:54 am

        Sorry if I sound a little ignorant, but would it not be possible to write it similarly to AQL’s JSON-like expressions? That is, using OLD as a special keyword/object, much like the Date object is a special class for accessing datetime functions, i.e.

        db.collection.upsert(
          { city: "Cologne" },           /* search value */
          { city: "Cologne", views: 1 }, /* INSERT case */
          { views: OLD.views + OLD.previousViews * 2, lastUpdate: Date.now() }
        );

        If this is possible, the added benefit would be that you don’t need to learn 2 separate ways to do the same thing (AQL and Shell)

        • jsteemann on June 18 2015, at 9:51 am

          We could introduce a special object named “OLD” in the shell.
          That would prevent anyone from using their own variable named “OLD” in the shell, but that would probably be no major show stopper.

          More severely, the expression “OLD.views + OLD.previousViews * 2” would still be evaluated in JavaScript in the shell, before it is sent to the server for evaluation. As operators can’t be overloaded in JavaScript, the + and * operations will be carried out in JavaScript, and will not take into account the specialness of the OLD object.

          JavaScript will execute the above upsert command like this (simplified to using just the “views” attribute):

          // build update value
          let temp0 = getObjectProperty(globalScope, "OLD");
          let temp1 = getObjectProperty(temp0, "previousViews");
          let temp2 = mul(temp1, 2);
          let temp3 = getObjectProperty(temp0, "views");
          let temp4 = add(temp3, temp2);
          let temp5 = { };
          setObjectProperty(temp5, "views", temp4);

          // get object
          let temp6 = getObjectProperty(globalScope, "db");
          let temp7 = getObjectProperty(temp6, "collection");

          // build function call arguments
          let temp8 = [ ];
          arrayPush(temp8, temp5);

          // call function
          callFunction(temp7, "upsert", temp8);

          Prefetching the value for OLD from the server before evaluating the expressions won’t work either, because the “upsert” function will be executed only after all its arguments are fully evaluated.
          So I’m afraid I don’t see a good way of implementing this with a pseudo “OLD” object in the shell.

          And when thinking of the REST API, there would need to be some representation of “OLD” in JSON. { “views”: OLD.views } won’t work, and { “views”: “OLD.views” } won’t work either (because it cannot be distinguished from a regular string).
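The evaluation-order problem described in this reply can be reproduced in any JavaScript runtime. The snippet below is a small stand-alone demonstration, not ArangoShell code: `OLD` and `upsert` are stand-ins defined only for this example, and the getter merely records when the property is read.

```javascript
const log = [];

// Stand-in for a special shell object; it has no access to the stored document.
const OLD = {
  get views() {
    log.push("read OLD.views");
    return undefined; // nothing has been fetched from the server at this point
  }
};

// Stand-in for a shell method; it only sees the finished argument values.
function upsert(searchValue, insertDoc, updateDoc) {
  log.push("upsert called");
  return updateDoc;
}

const received = upsert(
  { city: "Cologne" },
  { city: "Cologne", views: 1 },
  { views: OLD.views + 1 } // evaluated here, before upsert() runs
);

// log is ["read OLD.views", "upsert called"], and received.views is NaN
```

By the time `upsert` runs, the addition has already happened in the shell, so the function receives a plain value rather than an expression it could forward to the server.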
