ArangoDB 2.6 Alpha3: Testing New Features & Performance
The 2.6 release preparations are on track, with a third alpha release available for testing today. Please download the latest alpha build and send us your feedback.
We put great effort into speeding up core ArangoDB functionality so that AQL queries perform much better than in earlier versions of ArangoDB.
The queries that improved most in 2.6 over 2.5 include:
- FILTER conditions: simple FILTER conditions we’ve tested are 3 to 5 times faster
- simple joins using the primary index (_key attribute), hash index or skiplist index are 2 to 3.5 times faster
- sorting on a string attribute is 2.5 to 3 times faster
- extracting the _key or other top-level attributes from documents is 4 to 5 times faster
- COLLECT statements: simple COLLECT statements we’ve tested are 7 to 15 times faster
More details on the performance improvements and the test setup will be published in a follow-up blog post. For now, try out the 2.6 alpha3 version: we’ve done our very best to make ArangoDB a lot faster. ;)
What’s new in ArangoDB 2.6
For a full list of changes and improvements please consult the changelog. Over the next week we might also add some more functionality to 2.6, mainly some improvements in the shortest-path implementation and other graph-related AQL queries.
Some main features:
- Added the AQL UPSERT statement, which is a combination of both INSERT and UPDATE / REPLACE. The UPSERT will search for a matching document using a user-provided example. If no document matches the example, the insert part of the UPSERT statement will be executed. If there is a match, the update / replace part will be carried out:

    UPSERT { page: 'index.html' }                 /* search example */
    INSERT { page: 'index.html', pageViews: 1 }   /* insert part */
    UPDATE { pageViews: OLD.pageViews + 1 }       /* update part */
    IN pageViews

- ArangoDB now provides a dedicated collection export API, which can take snapshots of entire collections. The export API is available at endpoint POST /_api/export?collection=... and has the same return value structure as the already established cursor API (POST /_api/cursor). Read more in the export documentation.
- Alternative implementation for AQL COLLECT that uses a hash table for grouping. The alternative method does not require its input elements to be sorted. It will be taken into account by the optimizer for COLLECT statements that do not use an INTO clause.
- The fulltext index now also indexes text values contained in direct sub-objects of the indexed attribute. Previous versions of ArangoDB only indexed the attribute value if it was a string. Sub-attributes of the index attribute were ignored when fulltext indexing.
  Now, if the index attribute value is an object, the object’s values will each be included in the fulltext index if they are strings. If the index attribute value is an array, the array’s values will each be included in the fulltext index if they are strings.
  For example, with a fulltext index present on the translations attribute, the following text values will now be indexed:

    var c = db._create("example");
    c.ensureFulltextIndex("translations");
    c.insert({ translations: { en: "fox", de: "Fuchs", fr: "renard", ru: "лиса" } });
    c.insert({ translations: "Fox is the English translation of the German word Fuchs" });
    c.insert({ translations: [ "ArangoDB", "document", "database", "Foxx" ] });

    c.fulltext("translations", "лиса").toArray();       // returns only first document
    c.fulltext("translations", "Fox").toArray();        // returns first and second documents
    c.fulltext("translations", "prefix:Fox").toArray(); // returns all three documents

- Added batch document removal and lookup commands:

    collection.lookupByKeys(keys)
    collection.removeByKeys(keys)

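To make the UPSERT semantics from the feature list more concrete, here is a minimal in-memory sketch in plain JavaScript. It is not how ArangoDB implements the statement: the helper names `matchesExample` and `upsert`, and the array standing in for the pageViews collection, are made up for illustration.

```javascript
// Hypothetical in-memory model of UPSERT semantics (not ArangoDB code).

// Returns true if every attribute of `example` has the same value in `doc`
// (top-level, primitive values only, to keep the sketch short).
function matchesExample(doc, example) {
  return Object.keys(example).every((key) => doc[key] === example[key]);
}

// `update` receives the matched document, playing the role of OLD in AQL.
function upsert(collection, example, insertDoc, update) {
  const old = collection.find((doc) => matchesExample(doc, example));
  if (old === undefined) {
    collection.push({ ...insertDoc });   // insert part
    return "inserted";
  }
  Object.assign(old, update(old));       // update part
  return "updated";
}

const pageViews = [];
const countView = () =>
  upsert(
    pageViews,
    { page: "index.html" },
    { page: "index.html", pageViews: 1 },
    (OLD) => ({ pageViews: OLD.pageViews + 1 })
  );

const first = countView();  // no match yet, so the insert part runs
const second = countView(); // match found, so the update part runs
```

After both calls the stand-in collection holds a single document whose pageViews attribute is 2, which mirrors what repeated execution of the AQL statement is described to do.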
There are several more great features and improvements in 2.6 alpha2, just read on in the changelog.
2.6 also comes with a couple of API changes and some changed behavior, which are listed in the changelog as well.
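The difference between the sorted and the hash-based COLLECT variants mentioned in the feature list can be pictured with a small plain-JavaScript sketch. It only illustrates the general idea of grouping with and without sorted input; the function names are invented here, and this is not the actual optimizer or executor code.

```javascript
// Illustration only: two ways to count documents per group value.

// Sorted grouping: needs its input sorted by the group key, because a
// group is closed as soon as the key changes.
function collectSorted(docs, key) {
  const sorted = [...docs].sort((a, b) =>
    a[key] < b[key] ? -1 : a[key] > b[key] ? 1 : 0
  );
  const groups = [];
  for (const doc of sorted) {
    const last = groups[groups.length - 1];
    if (last !== undefined && last.value === doc[key]) {
      last.count += 1;
    } else {
      groups.push({ value: doc[key], count: 1 });
    }
  }
  return groups;
}

// Hash grouping: one pass over unsorted input, groups live in a hash table.
function collectHashed(docs, key) {
  const table = new Map();
  for (const doc of docs) {
    table.set(doc[key], (table.get(doc[key]) || 0) + 1);
  }
  return [...table].map(([value, count]) => ({ value, count }));
}

const docs = [{ city: "Cologne" }, { city: "Berlin" }, { city: "Cologne" }];
const bySort = collectSorted(docs, "city");
const byHash = collectHashed(docs, "city");
```

Both functions find the same groups and counts. In this sketch the hash variant returns them in first-seen order rather than sorted by group value, which is the price of skipping the sort.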
Known issues in alpha2
The following issues in alpha2 are already fixed in devel and will be part of the next alpha release:
- (already fixed in devel): starting the server may print errors about non-existing collections when there exist multiple databases
- (already fixed in devel): memleak in server queues manager, leading to the server continuously leaking a small amount of memory
- (already fixed in devel): when arangod is run with authentication turned on, cookie authentication for the web interface may fail after a restart of the server, and the web interface may not work properly
- (already fixed in devel): cluster crashes when creating an edge collection
8 Comments
Both windows installers appear to be broken, or is there a dependency such as Visual C++ Redist 2015 RC?
Also, a blank new server installation won’t start because it is missing the subfolders in var for the journal files and arangodb-apps.
Hm, it seems replacing those NSIS binaries you suggested doesn’t work (this way). Since we did our best to shorten path names, let’s try again with the original one. I don’t know whether these need other DLLs; their wiki didn’t say anything like that…
The .exe packages have been replaced; the new ones should work again.
I can confirm that the new 64bit installer can be started without problems.
What about the folder issues? If there were blank folders in the zip packages, there should be no problem. On the other hand, I would actually expect arangod to create these based on the given configuration if they do not exist and enough permissions are available for the target path.
Hi,
Will there be an HTTP API for the new UPSERT command? Right now, I am using the batch request HTTP API and I can’t find a way to do an “UPSERT” operation in a batch.
Note: unfortunately, code formatting does not seem to work at all here…
`UPSERT` is currently an AQL keyword, so upserts can be used within AQL queries. There is neither a dedicated `db.collection.upsert()` method nor a REST API for it. The reason for this is that there is no “natural” way to refer to an existing value (in case of the UPDATE) within JSON. For example, consider the following (potential) upsert command running in the ArangoShell:
db.collection.upsert(
  { city: "Cologne" },                          /* search value */
  { city: "Cologne", views: 1 },                /* INSERT case */
  { views: views + 1, lastUpdate: Date.now() }  /* UPDATE case, won't work */
);
The question is: how to refer to an existing value in the UPDATE case? I guess we would need to come up with a special syntax for this, e.g.
{
  views: {
    "$inc" : 1
  },
  lastUpdate: Date.now()
}
(where the “$inc” would indicate that an existing attribute is to be increased, a bit like in MongoDB).
This would need to be implemented for each type of operation and operand present in AQL. For example, in AQL you can do (contrived example): “UPSERT … INSERT … UPDATE { views: OLD.views + OLD.previousViews * 2 }”. In this case, there is again no natural way to express this. Something I could come up with for this is the following, but it looks very clumsy and error-prone:
{
  views: {
    "$set" : {
      "$add" : [
        { "$get": "views" },
        { "$mul": [
            { "$get": "previousViews" },
            2
          ]
        }
      ]
    }
  }
}
Any ideas for expressing the UPDATE case elegantly in JSON are welcome!
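As a rough sketch of what evaluating such an operator syntax could involve, the following plain JavaScript interprets the hypothetical "$get", "$add", "$mul", "$set" and "$inc" operators from the examples above against an existing document. None of this exists in ArangoDB; the function names and the exact operator semantics are assumptions made for illustration.

```javascript
// Hypothetical evaluator for the operator syntax sketched in this comment.
// Nothing here is an ArangoDB API.

// Evaluates an expression against the existing document `old`.
function evaluate(expr, old) {
  if (expr === null || typeof expr !== "object") {
    return expr; // plain literal such as 2 or "Cologne"
  }
  if ("$get" in expr) {
    return old[expr.$get];
  }
  if ("$add" in expr) {
    return expr.$add.reduce((sum, e) => sum + evaluate(e, old), 0);
  }
  if ("$mul" in expr) {
    return expr.$mul.reduce((product, e) => product * evaluate(e, old), 1);
  }
  throw new Error("unknown operator in " + JSON.stringify(expr));
}

// Builds the updated document from `old` and an update specification.
function applyUpdate(old, update) {
  const result = { ...old };
  for (const [attribute, spec] of Object.entries(update)) {
    const isObject = spec !== null && typeof spec === "object";
    if (isObject && "$inc" in spec) {
      result[attribute] = (old[attribute] || 0) + spec.$inc;
    } else if (isObject && "$set" in spec) {
      result[attribute] = evaluate(spec.$set, old);
    } else {
      result[attribute] = spec; // plain value, stored as-is
    }
  }
  return result;
}

const old = { city: "Cologne", views: 10, previousViews: 4 };

const incremented = applyUpdate(old, { views: { $inc: 1 } });

const computed = applyUpdate(old, {
  views: {
    $set: { $add: [{ $get: "views" }, { $mul: [{ $get: "previousViews" }, 2] }] },
  },
});
```

Even this small subset needs a branch per operator, which gives a feel for why covering every operation and operand available in AQL this way would be a lot of work.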
Sorry if I sound a little ignorant, but is it not possible to write it similarly to AQL’s JSON expression? I mean using OLD as a special keyword/object, just as the Date object is a special class for accessing datetime functions, i.e.
db.collection.upsert(
  { city: "Cologne" },            /* search value */
  { city: "Cologne", views: 1 },  /* INSERT case */
  { views: OLD.views + OLD.previousViews * 2, lastUpdate: Date.now() }
);
If this is possible, the added benefit would be that you don’t need to learn 2 separate ways to do the same thing (AQL and Shell)
We could introduce a special object named “OLD” in the shell.
That would prevent anyone from using their own variable named “OLD” in the shell, but that would probably be no major showstopper.
More seriously, the expression “OLD.views + OLD.previousViews * 2” would still be evaluated in JavaScript in the shell, before it is sent to the server for evaluation. As operators can’t be overloaded in JavaScript, the + and * operations will be carried out in JavaScript, and will not take into account the specialness of the OLD object.
JavaScript will execute the above upsert command like this (simplified to using just the “views” attribute):
// build update value: OLD.views + OLD.previousViews * 2
let temp0 = getObjectProperty(globalScope, "OLD");
let temp1 = getObjectProperty(temp0, "views");
let temp2 = getObjectProperty(temp0, "previousViews");
let temp3 = mul(temp2, 2);
let temp4 = add(temp1, temp3);
let temp5 = { };
setObjectProperty(temp5, "views", temp4);
// get object
let temp6 = getObjectProperty(globalScope, "db");
let temp7 = getObjectProperty(temp6, "collection");
// build function call arguments
let temp8 = [ ];
arrayPush(temp8, temp5);
// call function
callFunction(temp7, "upsert", temp8);
Prefetching the value for OLD from the server before evaluating the expressions won’t work either, because the “upsert” function will be executed only after all its arguments are fully evaluated.
So I don’t see a good way of implementing this with a pseudo “OLD” object in the shell I’m afraid.
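The evaluation-order problem can be reproduced in plain JavaScript outside the shell. The sketch below uses a stand-in `OLD` object and a stub `upsert` function, both hypothetical, to show that the arithmetic has already been performed on the client by the time the function is called.

```javascript
// Hypothetical stand-ins: neither `OLD` nor this `upsert` exists in arangosh.
const log = [];

// A placeholder object that records every property access.
const OLD = new Proxy({}, {
  get(target, name) {
    if (typeof name === "symbol") return undefined; // ignore internal lookups
    log.push("read OLD." + name);
    return { placeholder: name }; // cannot be a real server-side value
  },
});

// Stub that only records what it was actually given.
function upsert(search, insertDoc, updateDoc) {
  log.push("upsert called");
  return updateDoc;
}

const received = upsert(
  { city: "Cologne" },
  { city: "Cologne", views: 1 },
  { views: OLD.views + OLD.previousViews * 2 }
);
```

The log shows both reads of `OLD` happening before "upsert called", and the stub receives the string "[object Object]NaN" as the views value. The expression itself never reaches the function, so there is nothing left to send to the server.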
And when thinking of the REST API, there would need to be some representation of “OLD” in JSON. { “views”: OLD.views } won’t work, and { “views”: “OLD.views” } won’t work either (because it cannot be distinguished from a regular string).
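Both JSON cases can be checked quickly in plain JavaScript. The last part of the sketch shows the route that remains available: sending the expression as AQL text. The request body shape (`query`, `bindVars`) follows the cursor API (POST /_api/cursor); the collection name `pages` and the bind variable name `city` are made up for illustration.

```javascript
// Case 1: a bare OLD.views is not valid JSON at all.
let bareIsValidJson = true;
try {
  JSON.parse('{ "views": OLD.views }');
} catch (err) {
  bareIsValidJson = false;
}

// Case 2: quoted, it parses, but only as an ordinary string value.
const quoted = JSON.parse('{ "views": "OLD.views" }');
const looksLikeAnyString = typeof quoted.views === "string";

// What works: ship the expression inside an AQL query string.
// `pages` and `@city` are illustrative names, not taken from the post.
const cursorRequestBody = JSON.stringify({
  query:
    "UPSERT { city: @city } " +
    "INSERT { city: @city, views: 1 } " +
    "UPDATE { views: OLD.views + 1 } " +
    "IN pages",
  bindVars: { city: "Cologne" },
});
```

Inside the query string, OLD is interpreted by the AQL parser on the server, so no special JSON representation is needed.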