## In Transit to Constant Time Shortest-Path Queries in Road Networks. Bast, Funke, Matijevic, Sanders, Schultes. Algorithm Engineering and Experiments (ALENEX)

1. Beats previous best results by two orders of magnitude <was also published in Science, but the article is so short I’m not taking notes on it>
2. In the worst case, Dijkstra’s cannot be below linear because it may have to examine all nodes and edges
1. The only way to beat this is by preprocessing
3. Constant query time can be achieved by superlinear storage <Upon looking up that reference, it really looks linear.  The response may be slightly suboptimal (but this amount is constant in the size of the graph)>
4. The improvements in quality and speed here exploit the fact that road networks have particular properties, like low branching factors and the fact that highways are designed to be used for long distance travel…
5. Idea of transit node is “that for every pair of nodes that are ‘not too close’ to each other, the shortest path between them passes through at least one of these transit nodes.”
1. Additionally, even when travelling very far, the number of transit nodes that must be used is very small (like 10)
6. In the US road data they consider, 24 million nodes, and 10,000 transit nodes
7. Example (paper has figure): “Finding the optimal travel time between two points (flags) somewhere between Saarbrücken and Karlsruhe amounts to retrieving the 2 × 4 access nodes (diamonds), performing 16 table lookups between all pairs of access nodes, and checking that the two disks defining the locality filter do not overlap. Transit nodes that are not relevant for the depicted query are drawn as small squares.”
8. Mentions that there are special classes of algorithms for working on planar graphs, which roads (basically) are
9. They have a previous paper that hierarchically builds paths based on increasingly long paths <should check out http://algo2.iti.kit.edu/schultes/hwy/esa06HwyHierarchies.pdf>
1. This paper is by the same authors, and ends up doing something very similar to transit nodes (they are just nodes high in the hierarchy).  In the earlier paper access nodes were computed on the fly whereas here they are precomputed
2. It also doesn’t deal with how to determine whether the distance in the distance table is indeed the shortest distance
10. A* is actually one of the best methods for real world planning (when done around landmarks and the triangle inequality).  Here, access nodes can be used as the landmarks
11. The hierarchy approach also needs to be grounded in actual geographic information to work <at small scales?>
12. Nodes are labeled either close/not close to individual access nodes
13. Access nodes are found by a backward Dijkstra’s search from each transit node, run until all paths still being explored pass through another transit node; the starting transit node is then recorded as an access node for every node reached by a path that does not pass through another transit node
14. There are a number of ways to implement a locality filter; the one they recommend is to examine everything that is reachable within a cluster in the hierarchy
15. This information allows for what is effectively constant time queries for shortest paths
16. There is another way to implement the algorithm that is grid based – uses 2 grids of different resolution
1. Needs some optimization to function efficiently
2. Computing access nodes in grid-based implementation is very simple
17. With the optimizations, long-distance queries via transit nodes are actually faster (1 million x) than local queries.  Although local queries that don’t go through access nodes are only about 1% of queries on average <not sure how this is measured>, they still dominate the running time
1. There is a scheme also that allows speedups for local queries as well
18. In the grid setting, there are tradeoffs between space, number of transit nodes <both increasing with resolution> and #local queries that don’t go through transit nodes <decreasing with resolution>
20. The exact algorithm used to compute access nodes and transit nodes is confusing, and their diagram doesn’t help either.  It’s written in English but would be much more understandable written in math (or at least pseudocode)
21. Queries are super-easy to run (if nodes are more than 4 cells apart, path must go through a transit node, which already have pairwise distances computed)
22. Can do hierarchical <seems to be just multiple griddings of different resolutions but I think there is another paper that may go into it more thoroughly>
23. Do a greedy set cover to optimize storage for access nodes
24. Implemented on data with 24m nodes, 60m edges
25. Run on a ’07 dual-proc machine, 99% of queries have an avg time of 12 microseconds (adding the extra 1% moves it to 63 microseconds – are those the ones that don’t go through transit nodes?)
26. Computation times are not “Dijkstra rank dependent”
27. <Ok that’s not the end of the paper – just for the grid part, now moving on to a more fully hierarchical method>
28. Highway hierarchies
29. A highway edge is an edge that lies on the shortest path between two vertices, but is not in the local neighborhood of either of those vertices
30. They also may contract edges based on a bypassability criterion, but that isn’t yet discussed at length
31. The hierarchy is then constructed by recursively building highway networks on top of each preceding layer
32. With increasing path distance, paths will go increasingly higher in the highway hierarchy
33. So the nodes existing at some level in the hierarchy can be used as the transit nodes
34. #transit nodes controlled by neighborhood size and level of hierarchy
35. “Note that there is a difference between the level of the highway hierarchy and the layer of transit node search” <on this read through the distinction is actually not clear to me – looks like they are basically equivalent but in reverse order?>
36. The description of how access nodes are found is glossed over – another paper is referenced <which I will try and get to> and point #20 is referenced
37. Present 2 methods of the algorithm, which have different tradeoffs in terms of space, preprocessing, and query time
38. Search is done top-down
39. “For a given node pair (s, t), in order to get a complete description of the shortest s-t-path, we first perform a transit node query and determine the layer i that is used to obtain the shortest path distance. Then, we have to determine the path from s to the forward access node u to layer i, the path from the backward access node v to t, and the path from u to v”
40. There is still room for improvement for the locality filters
41. For US data, 99% of cases are handled at the top layer
42. Search times start low and increase as searches are still local but further away, and then drop extremely low once nodes are far enough that transit nodes can be leveraged
43. <On to conclusion>
44. “Building on highway hierarchies, this can be achieved using a moderate amount of additional storage and precomputation but with an extremely low query time. The geometric grid approach on the other hand allows for very low space consumption at the cost of slightly higher preprocessing and query times.”
45. “There are many interesting ways to choose transit nodes. For example nodes with high node reach [9, 6] could be a good starting point. Here, we can directly influence |T|, and the resulting reach bound might help defining a simple locality filter. However, it seems that geometric reach or travel time reach do not reflect the inhomogeneous density of real world road networks. Hence, it would be interesting if we could efficiently approximate reach based on the Dijkstra rank.
Another interesting approach might be to start with some locality filter that guarantees uniformly small local searches and to view it as an optimisation problem to choose a small set of transit nodes that cover all the local search spaces.”
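
Since the paper describes the query only in prose, here is a minimal sketch of how I understand the grid-based transit node query from points 7 and 21: apply the locality filter, and if the nodes are far enough apart, take the minimum over all pairs of access nodes using the precomputed distance table. This is my own reconstruction, not the authors’ code. The names `is_local`, `transit_query`, `cell`, `access`, `table` and `fallback` are hypothetical, the toy numbers are made up, the graph is treated as undirected, and “more than 4 cells apart” is assumed to mean more than 4 cells horizontally or vertically.

```python
from itertools import product


def is_local(cell_s, cell_t, min_gap=4):
    """Grid locality filter (assumed form): a query is 'local' unless the
    two cells are more than min_gap apart horizontally or vertically."""
    dx = abs(cell_s[0] - cell_t[0])
    dy = abs(cell_s[1] - cell_t[1])
    return max(dx, dy) <= min_gap


def transit_query(s, t, cell, access, table, fallback):
    """Return the shortest-path distance between s and t.

    cell:     node -> (x, y) grid cell
    access:   node -> list of (transit_node, distance to that transit node)
    table:    (transit_node, transit_node) -> precomputed distance
    fallback: function (s, t) -> distance, used for local queries
              (e.g. a Dijkstra-style search; not shown here)
    """
    if is_local(cell[s], cell[t]):
        return fallback(s, t)
    # Non-local: the shortest path must pass through some access node of s
    # and some access node of t, so a handful of table lookups suffice.
    return min(
        d_su + table[u, v] + d_vt
        for (u, d_su), (v, d_vt) in product(access[s], access[t])
    )


# Toy data (made up): s and t are 9 cells apart, n is next to s.
cell = {"s": (0, 0), "t": (9, 0), "n": (1, 0)}
access = {
    "s": [("A", 2), ("B", 5)],
    "t": [("C", 3), ("D", 1)],
    "n": [("A", 1)],
}
table = {("A", "C"): 10, ("A", "D"): 14, ("B", "C"): 6, ("B", "D"): 9}


def toy_local_search(a, b):
    # Stand-in for a real local search between nearby nodes.
    return 7


far = transit_query("s", "t", cell, access, table, toy_local_search)
near = transit_query("s", "n", cell, access, table, toy_local_search)
print(far, near)
```

With two access nodes on each side the far query does 4 table lookups (the example in point 7 does 16), which is why the query time is effectively independent of the distance between the nodes. The local case is exactly the 1% from point 17 that dominates running time.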