To put the results into perspective, note that each query returns one result per document in the test collection. The three queries perform 2, 3, and 6 times the collection size in document lookups, respectively; for a collection with 100k documents, that is 200k, 300k, and 600k lookups. For all three queries, the result set is the same size as the whole collection, and the collection is read in full multiple times. The execution times are therefore relatively high, which is intentional: it gives us reliable and comparable results for the performance test. In principle, speedups are similar for smaller result sets as well.
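The lookup counts quoted above are plain multiplication. As a minimal sketch (the names `LOOKUPS_PER_DOCUMENT` and `total_lookups` are made up for illustration and are not part of the benchmark code), they can be reproduced for any collection size like this:

```python
# Lookup factors for the three test queries, as stated in the text:
# each query performs this many lookups times the collection size.
LOOKUPS_PER_DOCUMENT = [2, 3, 6]


def total_lookups(collection_size: int) -> list[int]:
    """Return the total number of document lookups for each of the three queries."""
    return [factor * collection_size for factor in LOOKUPS_PER_DOCUMENT]


if __name__ == "__main__":
    # 100k documents on the single server, 10k documents on the cluster.
    for size in (100_000, 10_000):
        print(f"{size} documents -> lookups per query: {total_lookups(size)}")
```

For the 100k collection this gives the 200k, 300k, and 600k lookups mentioned above.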
Single server tests
All single server tests are run against a collection with 100,000 documents.
Test name | Collection size | Runs | Avg time (s) | Avg speedup |
---|---|---|---|---|
Self join, without splicing | 100k | 5 | 0.5680 | |
Self join, with splicing | 100k | 5 | 0.4875 | 1.17 |
Side-by-side joins, without splicing | 100k | 5 | 1.6362 | |
Side-by-side joins, with splicing | 100k | 5 | 0.9170 | 1.78 |
Nested joins, without splicing | 100k | 5 | 3.2399 | |
Nested joins, with splicing | 100k | 5 | 2.7808 | 1.17 |
Cluster tests
The cluster tests are run on a cluster of three machines, each running one Agent, one Coordinator, and one DB-Server. The test collection contains 10,000 documents. Note that the lookups in the collection cannot make use of any document locality, so every lookup has to be executed on every DB-Server.
Test name | Collection size | Runs | Avg time (s) | Avg speedup |
---|---|---|---|---|
Self join, without splicing | 10k | 5 | 36.4326 | |
Self join, with splicing | 10k | 5 | 1.2663 | 28.8 |
Side-by-side joins, without splicing | 10k | 5 | 76.1388 | |
Side-by-side joins, with splicing | 10k | 5 | 2.7961 | 27.2 |
Nested joins, without splicing | 10k | 5 | 196.8784 | |
Nested joins, with splicing | 10k | 5 | 7.7322 | 25.5 |
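
The speedup column in both tables appears to be the ratio of the average time without splicing to the average time with splicing. The following sketch recomputes it from the averages listed above; the `RESULTS` list and the `speedup` helper are illustrative names, not part of the original test harness.

```python
# (test name, setup, avg seconds without splicing, avg seconds with splicing),
# copied from the two result tables.
RESULTS = [
    ("Self join", "single server", 0.5680, 0.4875),
    ("Side-by-side joins", "single server", 1.6362, 0.9170),
    ("Nested joins", "single server", 3.2399, 2.7808),
    ("Self join", "cluster", 36.4326, 1.2663),
    ("Side-by-side joins", "cluster", 76.1388, 2.7961),
    ("Nested joins", "cluster", 196.8784, 7.7322),
]


def speedup(without_splicing: float, with_splicing: float) -> float:
    """Speedup assumed here to be the ratio of the two average runtimes."""
    return without_splicing / with_splicing


if __name__ == "__main__":
    for name, setup, before, after in RESULTS:
        print(f"{name} ({setup}): {speedup(before, after):.2f}x")
```

The recomputed ratios match the published column after rounding. Keep in mind that the single server and cluster runs use different collection sizes (100k versus 10k documents), so only the speedups, not the absolute times, are comparable across the two tables.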