Using Multiple Indexes per Collection in ArangoDB
The query optimizer in ArangoDB 2.8 has been improved in terms of how it can make use of indexes. In previous versions of ArangoDB, the query optimizer could use only one index per collection used in an AQL query. When using a logical OR in a FILTER condition, the optimizer did not use any index for the collection in order to ensure the result is still correct.
This is much better in 2.8. Now the query optimizer can use multiple indexes on the same collection for FILTER conditions that are combined with a logical OR.
For all following queries, I have set up a collection named test
, which has two isolated hash indexes on the attributes value1
and value2
, and a skiplist index on attribute value3
.
Let’s first try an AQL queries that uses a logical OR on two different attributes of the collection:
FOR doc IN test
FILTER doc.value1 == 11 || doc.value2 == 19
RETURN doc
The execution plan for this query in 2.7 reveals that query will perform a full collection scan and cannot use indexes because of the logical OR on two different attributes:
Execution plan:
Id NodeType Est. Comment
1 SingletonNode 1 * ROOT
2 EnumerateCollectionNode 0 - FOR doc IN test /* full collection scan */
3 CalculationNode 0 - LET #1 = doc.`value1` == 11 || doc.`value2` == 19
4 FilterNode 0 - FILTER #1
5 ReturnNode 0 - RETURN doc
Indexes used:
none
Optimization rules applied:
none
Running the same query in 2.8 / devel will produce a much better execution plan:
Execution plan:
Id NodeType Est. Comment
1 SingletonNode 1 * ROOT
6 IndexNode 2 - FOR doc IN test /* hash index scan, hash index scan */
3 CalculationNode 2 - LET #1 = doc.`value1` == 11 || doc.`value2` == 19
4 FilterNode 2 - FILTER #1
5 ReturnNode 2 - RETURN doc
Indexes used:
By Type Collection Unique Sparse Selectivity Fields Ranges
6 hash test false false 100.00 % [ `value1` ] doc.`value1` == 11
6 hash test false false 100.00 % [ `value2` ] doc.`value2` == 19
Optimization rules applied:
Id RuleName
1 use-indexes
Multiple indexes will also be used if different index types are accessed, or for non-equality filter conditions. For example, the following query will make use of the two hash indexes and also the skiplist index:
FOR doc IN test
FILTER doc.value1 == 11 || doc.value2 == 19 || doc.value3 > 42
RETURN doc
Here is its execution plan from 2.8:
Execution plan:
Id NodeType Est. Comment
1 SingletonNode 1 * ROOT
6 IndexNode 3 - FOR doc IN test /* hash index scan, hash index scan, skiplist index scan */
3 CalculationNode 3 - LET #1 = doc.`value1` == 11 || doc.`value2` == 19 || doc.`value3` > 42
4 FilterNode 3 - FILTER #1
5 ReturnNode 3 - RETURN doc
Indexes used:
By Type Collection Unique Sparse Selectivity Fields Ranges
6 hash test false false 100.00 % [ `value1` ] doc.`value1` == 11
6 hash test false false 100.00 % [ `value2` ] doc.`value2` == 19
6 skiplist test false false n/a [ `value3` ] doc.`value3` > 42
Optimization rules applied:
Id RuleName
1 use-indexes
For comparison, here is the non-optimized plan from 2.7 for the same query:
Execution plan:
Id NodeType Est. Comment
1 SingletonNode 1 * ROOT
2 EnumerateCollectionNode 0 - FOR doc IN test /* full collection scan */
3 CalculationNode 0 - LET #1 = doc.`value1` == 11 || doc.`value2` == 19 || doc.`value3` > 42
4 FilterNode 0 - FILTER #1
5 ReturnNode 0 - RETURN doc
Indexes used:
none
Optimization rules applied:
none
Still the query optimizer will not be able to use any indexes on a collection when there are multiple FILTER conditions combined with logical OR and at least one of them is not satisfiable by an index of the collection. In this case it has no other choice but to do a full collection scan.
For queries that combine multiple FILTER conditions with a logical AND, the optimizer will still try to pick the most selective index for the query and use it for the collection.
3 Comments
Leave a Comment
Get the latest tutorials, blog posts and news:
how to do full text search in a collection with multiple attributes ? plsssssss some expert answer ???
A fulltext index can be created on a single attribute. However, if the given attribute value is an array or object, the fulltext index will index the direct subvalues, e.g.
{ “text” : “some value” } => will index “some value”
{ “text” : [ “some value”, “foobar” ] } => will index “some value” and “foobar”
{ “text” : { “what” : “some value”, “else” : “foobar” } } => will index “some value” and “foobar”
how will it handle indexing if I have an array and in that array I have another array which have my desired key:value……that means I have to index the value which lies inside a nested array……?????