Full-Text Index Enhancements: ArangoDB Search Optimization
This post is about improvements to the fulltext index in ArangoDB 2.6. The improvements address the problem that non-string attributes were ignored during fulltext indexing.
Effectively this prevented string values inside arrays or objects from being indexed. Though this behavior was documented, it considerably limited the usefulness of the fulltext index. Several users requested that the fulltext index be able to index array and object attributes, too.
This has finally been accomplished, so the fulltext index in 2.6 supports indexing arrays and objects!
Read on in Jan’s blog post about Fulltext Index Enhancements.
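To give a rough idea of what indexing arrays and objects involves, here is a minimal Python sketch that walks a JSON-like attribute value and collects the words from every string it finds. It is a conceptual illustration only, not ArangoDB's implementation: the function names `collect_strings` and `index_words` are made up for this example, and the real index's tokenization and nesting rules are not described in this excerpt.

```python
import re


def collect_strings(value):
    """Recursively yield every string found in a JSON-like value."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, list):
        for item in value:
            yield from collect_strings(item)
    elif isinstance(value, dict):
        for item in value.values():
            yield from collect_strings(item)
    # Numbers, booleans and None carry no text and are skipped.


def index_words(value):
    """Return the set of lower-cased words found anywhere in the value.

    The word splitting here (regex on word characters) is an assumption
    for illustration, not the tokenizer ArangoDB uses.
    """
    words = set()
    for text in collect_strings(value):
        words.update(re.findall(r"\w+", text.lower()))
    return words


# Hypothetical attribute value mixing an array, an object and a number.
attribute = {"tags": ["Graph Database", "NoSQL"], "author": {"name": "Jan"}, "year": 2015}
print(sorted(index_words(attribute)))
```

A plain string attribute goes through the same path, so the old string-only behavior is just the simplest case of the recursive walk.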
String Comparison Performance: ArangoDB Query Optimization
We’ve been using Callgrind with its powerful frontend KCachegrind for quite some time to analyse where the hot spots can be found inside of ArangoDB. One thing always accounting for a huge chunk of the resource usage was string comparison. Yes, string comparison isn’t as cheap as one may think, but it’s been even a bit more than one would expect. And since much of the business of a database is string comparison, it’s used a lot.
ArangoDB and V8 use the ICU Library for these purposes (with no alternatives on the market) – so basically we heavily rely on the performance of the ICU library. However, one line in the ICU change-log – ‘Performance: string comparisons significantly faster’ – made us listen up.
So it was a crystal clear objective to take advantage of these performance improvements. As we use the ICU bundled with V8, we had to make sure it would work smoothly for V8 first ;-). After rolling out the upgrade, we wanted to know whether everything was working fine with valgrind etc., and get some figures on how large the actual improvement is.
(more…)
ArangoDB at Strata + Hadoop World London (5-7 May)
Max Neunhöffer from ArangoDB shows a use case for multi-model NoSQL databases at Strata + Hadoop World 2015 in London. Join his session on 6 May.
Multi-model databases and the art of aircraft maintenance
We describe a case study in aircraft fleet management, where we needed a database that would store data about all the different parts and subunits of an aircraft. A single aircraft already contains some 6,000,000 parts, not counting components.
The particular structure of the queries arising naturally from day-to-day processes quickly led to the insight that no single data model was sufficient to ensure satisfactory performance. Whichever data model we tried, there was always at least one important query that would either take ages to complete or be unbearably complicated, or indeed both.
(more…)
Return Value Optimization for AQL: ArangoDB Query Efficiency
While searching for further AQL query optimizations last week, we found that intermediate AQL query results were copied one time too often in some cases.
Precisely, the data that a query’s ReturnNode will return to the caller was copied into the ReturnNode’s own register. Since ReturnNodes never modify their input data, this called for what compilers know as return-value optimization.
2.6 will now optimize away these copies in many cases, and my blog post Return Value Optimization for AQL shows performance benefits of 10-25% that can be expected due to the optimization.
ArangoDB 2.5.3: Maintenance Release for Enhanced Stability
This version is deprecated. Download the new version of ArangoDB
The third maintenance release for ArangoDB 2.5 is available for download. This maintenance release addresses some issues in ArangoDB 2.5 and supports future releases. (more…)
ArangoDB Team in Silicon Valley: Innovation and Collaboration
ArangoDB’s outpost in the Bay Area is getting more and more crowded. CTO Frank @fceller has joined the team of our CEO Claudius @weinberger and ArangoDB’s lead developers Max @neunhoef & Michael @mchacki. The latter two have been in San Francisco for a while already.
You can meet our team by attending several Meetups, the Collision Conf in Vegas (5-7 May) or at the @GeekdomSF office at Folsom Street #100, near Moscone Center. Get in touch, grab a coffee and join the discussions about NoSQL and multi-model databases. We are here to stay, at least until the end of May. (more…)
Exporting Data for Offline Processing in PHP: ArangoDB Guide
A few weeks ago I wrote about ArangoDB’s specialized export API.
The export API is useful when the goal is to extract all documents from a given collection and to process them outside of ArangoDB.
The export API can provide quick and memory-efficient snapshots of the data in the underlying collection, making it suitable for extracting all documents of the collection. It can provide the data much faster than an AQL query that extracts all documents.
In this post I’ll show how to use the export API to extract data and process it with PHP.
Please read the full blog post Exporting Data for Offline Processing.
AQL Functions Enhancements: Boosting ArangoDB Query Capabilities
Waiting for a git pull to complete over an 8 KiB/s internet connection is boring. So I thought I’d rather use the idle time and quickly write about some performance improvements for certain AQL functions that were recently completed and that will become available with ArangoDB 2.6.
The improvements affect the following AQL functions:
UNSET(): remove specified attributes from an object/document
KEEP(): keep only specified attributes of an object/document
MERGE(): merge the attributes of multiple objects/documents
This blog post shows a few example queries that will benefit from reductions in query execution time of 50 % to more than 60 % due to the changes made to these functions.
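For readers who don't use these functions every day, the following Python sketch mirrors what the three AQL functions do, using plain dictionaries as stand-ins for documents. The lower-case function names are made up for this illustration, and the code says nothing about how ArangoDB implements or optimizes the functions; it only shows their effect on a document.

```python
def unset(doc, *names):
    """Like UNSET(doc, name1, name2, ...): drop the named attributes."""
    return {key: value for key, value in doc.items() if key not in names}


def keep(doc, *names):
    """Like KEEP(doc, name1, name2, ...): keep only the named attributes."""
    return {key: value for key, value in doc.items() if key in names}


def merge(*docs):
    """Like MERGE(doc1, doc2, ...): combine attributes; later documents win."""
    result = {}
    for doc in docs:
        result.update(doc)
    return result


# Hypothetical example document.
user = {"name": "jane", "email": "jane@example.com", "password": "secret"}

print(unset(user, "password"))
print(keep(user, "name"))
print(merge(user, {"active": True}, {"name": "Jane"}))
```

All three helpers return a new dictionary and leave the input untouched, which matches how the AQL functions produce a new value rather than modifying the stored document.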
Efficient Data Collection with Hash Tables: ArangoDB Insights
ArangoDB 2.6 will feature an alternative hash implementation of the AQL COLLECT operation. The new implementation can speed up some AQL queries that cannot exploit indexes on the COLLECT group criteria.
This blog post provides a preview of the feature and shows some nice performance improvements. It also explains the COLLECT-related optimizer parts and how the optimizer will decide whether to use the new or the traditional implementation.
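The general difference between grouping via sorting and grouping via a hash table can be sketched in a few lines of Python. This is a generic illustration of the two techniques, assuming a simple "count per group" query; the function names are invented here and the code is not taken from ArangoDB.

```python
from itertools import groupby


def collect_sorted(rows, key):
    """Sort-based grouping: sort the input first, then count runs of equal keys."""
    ordered = sorted(rows, key=key)
    return [(group, len(list(members))) for group, members in groupby(ordered, key=key)]


def collect_hash(rows, key):
    """Hash-based grouping: one pass over unsorted input, counting in a dict."""
    counts = {}
    for row in rows:
        group = key(row)
        counts[group] = counts.get(group, 0) + 1
    return list(counts.items())


# Hypothetical input rows.
rows = [{"city": "Cologne"}, {"city": "Berlin"}, {"city": "Cologne"}, {"city": "Aachen"}]


def by_city(row):
    return row["city"]


print(collect_sorted(rows, by_city))
print(collect_hash(rows, by_city))
```

The sort-based variant needs its input ordered by the group key, which is cheap when a sorted index can supply that order and costly otherwise. The hash-based variant skips the sort but returns groups in no particular key order, so a sort of the (usually much smaller) result may still be needed afterwards.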
Creating Multi-Game Highscore Lists: ArangoDB Techniques
I just came across a question about how to create highscore lists or leaderboards in ArangoDB, and how they would work when compared to Redis sorted sets.
This blog post tries to give an answer on the topic and also detailed instructions and queries for setting up highscore lists with ArangoDB. The additional section “Extensions” explains slightly more advanced highscore list use cases like multi-game highscore lists, joining data and maintaining a “last updated” date.
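As a language-neutral warm-up, this is what a multi-game highscore list computes, sketched in plain Python on in-memory data. The tuple layout (game, player, score) and the function name `top_scores` are assumptions made for this example; the blog post itself works with ArangoDB collections and queries, which are not reproduced here.

```python
import heapq
from collections import defaultdict


def top_scores(scores, n):
    """Return the n highest (score, player) pairs per game, best first."""
    by_game = defaultdict(list)
    for game, player, score in scores:
        by_game[game].append((score, player))
    return {game: heapq.nlargest(n, entries) for game, entries in by_game.items()}


# Hypothetical score records: (game, player, score).
scores = [
    ("chess", "ann", 10),
    ("chess", "bob", 30),
    ("chess", "cy", 20),
    ("go", "ann", 5),
]

for game, entries in top_scores(scores, 2).items():
    print(game, entries)
```

In a database the same result would typically come from a sorted index on game and score rather than from scanning every record, which is the kind of setup such a highscore list depends on at scale.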
(more…)