Sort-Limit Optimization in AQL
Sometimes we want sorted output from a query and, for whatever reason, cannot use an index to do the sorting. In ArangoDB, we already cover this critical case with finely tuned query execution code. Sometimes though, we do not need to return all SORT
LIMIT
In many cases though, it is possible to maintain just a reduced set of data in memory, equivalent in size to that required by LIMIT
Example Sort-Limit query
Consider the following simple example.
FOR doc IN records
SORT doc.value
LIMIT 100, 10
RETURN doc
Suppose that records
value
records
records
This saves us a great deal of memory usage, and can in fact save us quite a bit of time as well. Suppose we run essentially the basic query above with a few different configurations as a benchmark.
Performance improvement results
First, let’s set up our collection with the following query:
FOR i IN 1..@collectionSize
INSERT {value: RAND()}
INTO records
Then let’s test the lookup performance with this query:
FOR doc IN records
SORT doc.value
LIMIT @limit
RETURN doc
Tested on a developer laptop with a 7th-gen Core i7, we get the following results.
Note that as the number of items retained approaches the total number of input elements, the speedup drops, but we still get considerable memory savings. As the number gets even closer to the input, we may actually get some small slowdown. In the cases where the memory savings will still be significant, we the optimizer may choose to enable the “sort-limit” rule even if there will be a small slowdown. If the memory savings will be trivial, however, the optimizer will not apply the “sort-limit” rule, in order to avoid such a slowdown.
If you experience an undesired slowdown on a particular
Feel free to check out the optimization for yourself with one of our recent nightly Docker builds, or look forward to improved performance with our 3.5 release!
Get the latest tutorials, blog posts and news: