WWW2021

Sketch-based Algorithms for Approximate Shortest Paths in Road Networks

Gaurav Aggarwal, Sreenivas Gollapudi, Raghavender, Ali Kemal Sinop

被引用 6 次

摘要

Computing distances and finding shortest paths in massive realworld networks is a fundamental algorithmic task in network analysis. There are two main approaches to solving this task. On one hand are traversal-based algorithms like bidirectional breadth-first search (BiBFS), which have no preprocessing step but are slow on individual distance inquiries. On the other hand are indexing-based approaches, which create and maintain a large index. This allows for answering individual inquiries very fast; however, index creation is prohibitively expensive even for moderately large networks. For a graph with 30 million edges, the index created by the state-of-the-art is about 40 gigabytes. We seek to bridge these two extremes: quickly answer distance inquiries without the need for costly preprocessing. In this work, we propose a new algorithm and data structure, WormHole, for approximate shortest path computations. WormHole leverages structural properties of social networks to build a sublinearly sized index, drawing upon the explicit coreperiphery decomposition of Ben-Eliezer et al. [WSDM'22]. Empirically, the preprocessing time of WormHole improves upon index-based solutions by orders of magnitude: for a graph with over a billion edges, indexing takes only a few minutes. Furthermore, individual inquiries are consistently much faster than in BiBFS. The acceleration comes at the cost of a minor accuracy trade-off. Nonetheless, our empirical evidence demonstrates that WormHole accurately answers essentially all inquiries within a maximum additive error of 2 (and an average additive error usually much less than 1). We complement these empirical results with provable theoretical guarantees, showing that WormHole, utilizing a sublinear index, requires 𝑛 𝑜 (1) node queries per distance inquiry in random power-law networks. In contrast, any approach without a preprocessing step (including BiBFS) requires 𝑛 Ω (1) queries for the same task. WormHole offers several additional advantages over existing methods: (i) it does not require reading the whole graph and can thus be used in settings where access to the graph is ratelimited; (ii) unlike the vast majority of index-based algorithms, it returns paths, not just distances; and (iii) for faster inquiry times, it can be combined effectively with other index-based solutions, by running them only on the sublinear core.