k Shortest Paths Queries in AQL
The AQL feature k Shortest Paths allows you to query not just a shortest path between two documents in an ArangoDB database, but the list of paths between them, sorted by length, or weight.
To walk you through how to use this feature, we will setup a small graph that represents train connections in Europe, giving travel time as weight on each edge, and then query for some short connections using k Shortest Paths queries in AQL.
See a quick feature overview here:
Setup
Create a new graph TrainGraph
with document collection places
, and an edge collection connections
;
and copy and paste the following two queries to setup the example graph
FOR p IN [ "Inverness", "Aberdeen", "Leuchars", "StAndrews", "Edinburgh", "Glasgow", "York", "Carlisle", "Birmingham", "London", "Brussels", "Cologne", "Toronto" ] INSERT { _key: p } INTO "places"
FOR c IN [ { _from: "places/Inverness", _to: "places/Aberdeen", travelTime: 3 } , { _from: "places/Aberdeen", _to: "places/Inverness", travelTime: 2.5 } , { _from: "places/Aberdeen", _to: "places/Leuchars", travelTime: 1.5 } , { _from: "places/Leuchars", _to: "places/Aberdeen", travelTime: 1 } , { _from: "places/Leuchars", _to: "places/Edinburgh", travelTime: 1.5 } , { _from: "places/Edinburgh", _to: "places/Leuchars", travelTime: 3 } , { _from: "places/Edinburgh", _to: "places/Glasgow", travelTime: 1 } , { _from: "places/Glasgow", _to: "places/Edinburgh", travelTime: 1 } , { _from: "places/Edinburgh", _to: "places/York", travelTime: 3.5 } , { _from: "places/York", _to: "places/Edinburgh", travelTime: 4 } , { _from: "places/Glasgow", _to: "places/Carlisle", travelTime: 1 } , { _from: "places/Carlisle", _to: "places/Glasgow", travelTime: 1 } , { _from: "places/Carlisle", _to: "places/York", travelTime: 2.5 } , { _from: "places/York", _to: "places/Carlisle", travelTime: 3.5 } , { _from: "places/Carlisle", _to: "places/Birmingham", travelTime: 2.0 } , { _from: "places/Birmingham", _to: "places/Carlisle", travelTime: 1 } , { _from: "places/Birmingham", _to: "places/London", travelTime: 1.5 } , { _from: "places/London", _to: "places/Birmingham", travelTime: 2.5 } , { _from: "places/Leuchars", _to: "places/StAndrews", travelTime: 0.2 } , { _from: "places/StAndrews", _to: "places/Leuchars", travelTime: 0.2 } , { _from: "places/York", _to: "places/London", travelTime: 1.8 } , { _from: "places/London", _to: "places/York", travelTime: 2.0 } , { _from: "places/London", _to: "places/Brussels", travelTime: 2.5 } , { _from: "places/Brussels", _to: "places/London", travelTime: 3.5 } , { _from: "places/Brussels", _to: "places/Cologne", travelTime: 2 } , { _from: "places/Cologne", _to: "places/Brussels", travelTime: 1.5 } ] INSERT c INTO "connections"
We can now explore the resulting graph in ArangoDB’s graph viewer:
Example Queries
Suppose we want to plan a trip from Leuchars to Cologne. If we want to know which connection has the fewest stations, we could use the following query:
FOR v,e IN OUTBOUND SHORTEST_PATH "places/Leuchars" TO "places/Cologne" GRAPH "TrainGraph" RETURN [v,e]
which gives the connection Leuchars, Edinburgh, York, London, Brussels, Cologne. We can obtain the same result by using K_SHORTEST_PATHS
with LIMIT 1
, which restricts the result to one returned path:
FOR path IN OUTBOUND K_SHORTEST_PATHS "places/Leuchars" TO "places/Cologne" GRAPH "TrainGraph" LIMIT 1 RETURN path
Note that K_SHORTEST_PATHS
returns every path as a JSON object with components vertices
, edges
, and weight
.
WARNING: you should always use LIMIT
with K_SHORTEST_PATHS
, as not giving a limit can lead to an expensive, exhaustive search through the graph)
Increasing the LIMIT
will give alternative routes that might be a little longer than the shortest route, for example
FOR path IN OUTBOUND K_SHORTEST_PATHS "places/Leuchars" TO "places/Cologne" GRAPH "TrainGraph" LIMIT 3 RETURN path
Here, we also get Leuchars, Edinburgh, York, Carlisle, Birmingham, London, Brussels, Cologne, and*Leuchars, Edinburgh, Glasgow, Carlisle, Birmingham, London, Brussels, Cologne.
What if we want to take travel time into account? We can tell K_SHORTEST_PATHS
the name of an attribute that contains the travel time!
FOR path IN OUTBOUND K_SHORTEST_PATHS "places/Leuchars" TO "places/Cologne" GRAPH "TrainGraph" OPTIONS { weightAttribute: "travelTime", defaultWeight: 1000 } LIMIT 3 RETURN path
Which returns the routes Leuchars, Edinburgh, York, London, Brussels, Cologne with weight 11.3,
Leuchars, Edinburgh, Glasgow, Carlisle, Birmingham, London, Brussels, Cologne with weight 11.5, and
Leuchars, Edinburgh, York, Carlisle, Birmingham, London, Brussels, Cologne with weight 12.3.
Should we try planning a route that does not exist, such as a trip from London to Toronto
FOR path IN OUTBOUND K_SHORTEST_PATHS "places/London" TO "places/Toronto" GRAPH "TrainGraph" OPTIONS { weightAttribute: "travelTime", defaultWeight: 1000 } LIMIT 3 RETURN path
We can get lucky and get the answer that there is no such route quickly, but we might also have to wait for a long time for ArangoDB to conclude that there is no such path: It might have to search the whole of the European train network, and the whole of the Canadian train network, to find that there is no train connection between those two.
Example Queries
k Shortest Paths are a useful tool when you want to query your graph database for alternative paths to the shortest path that are just deviating a bit in terms of cost.
The K_SHORTEST_PATHS
keyword also supports the INBOUND
and ANY
modifier to specify which direction edges should be considered in. You can find its documentation here.