Hub Labels on the database for large-scale graphs with the COLD framework

Published: 01 October 2017

Abstract

Shortest-path computation on graphs is one of the most thoroughly studied problems in algorithmic theory. An aspect that has only recently attracted attention is the use of databases in combination with graph algorithms, so-called distance oracles, to compute shortest-path queries on large graphs. To this end, we propose a novel, efficient, pure-SQL framework for answering exact distance queries on large-scale graphs, implemented entirely on an open-source database engine. Our COLD framework (COmpressed Labels on the Database) can answer multiple distance queries (vertex-to-vertex, one-to-many, k-Nearest Neighbors, Reverse k-Nearest Neighbors, Reverse k-Farthest Neighbors and Top-k Range) not handled by previous methods, making it a complete database solution for a variety of practical large-scale graph applications. Our experiments show that COLD outperforms existing approaches (including popular graph databases) in terms of query time and efficiency, while requiring significantly less storage space than these methods.
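
For readers unfamiliar with the hub labels named in the title, the following is a minimal sketch of how a vertex-to-vertex distance query can be expressed in plain SQL over a label table. It is not the COLD schema or its compressed label format: the table name labels, its columns (v, hub, dist), the helper names, and the hand-made labels for a four-vertex undirected path graph are all assumptions made for illustration. SQLite is used only so the example runs with the Python standard library.

```python
import sqlite3

# Hypothetical toy graph (undirected path): 1 -(2)- 2 -(3)- 3 -(4)- 4
# Hand-made hub labels: each row says "hub is in the label of v at distance dist".
# The labels satisfy the cover property: every pair of vertices shares a hub
# that lies on a shortest path between them.
TOY_LABELS = [
    (1, 1, 0), (1, 2, 2),
    (2, 2, 0),
    (3, 3, 0), (3, 2, 3),
    (4, 4, 0), (4, 3, 4), (4, 2, 7),
]

def build_demo_db():
    """Create an in-memory database holding the toy labels (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE labels ("
        " v INTEGER NOT NULL,"
        " hub INTEGER NOT NULL,"
        " dist INTEGER NOT NULL,"
        " PRIMARY KEY (v, hub))"
    )
    conn.executemany("INSERT INTO labels (v, hub, dist) VALUES (?, ?, ?)", TOY_LABELS)
    return conn

def distance(conn, s, t):
    """Exact s-t distance: minimum of d(s,h) + d(h,t) over hubs h common to both labels.

    Returns None when the two labels share no hub (no path in this sketch).
    """
    row = conn.execute(
        "SELECT MIN(a.dist + b.dist)"
        " FROM labels AS a JOIN labels AS b ON a.hub = b.hub"
        " WHERE a.v = ? AND b.v = ?",
        (s, t),
    ).fetchone()
    return row[0]

if __name__ == "__main__":
    conn = build_demo_db()
    print(distance(conn, 1, 4))  # path 1-2-3-4 has length 2 + 3 + 4
    print(distance(conn, 3, 4))  # direct edge of weight 4
```

The query is a self-join on the hub column followed by a MIN aggregate, which is why this kind of index maps naturally onto a relational engine. A directed graph would need separate forward and backward label tables; that is omitted here for brevity.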


Published In

Geoinformatica  Volume 21, Issue 4
October 2017
197 pages

Publisher

Kluwer Academic Publishers

United States

Author Tags

  1. Databases
  2. Hub labels
  3. K-nearest neighbor
  4. Large-scale graphs
  5. One-to-many
  6. Query processing
  7. Reverse k-farthest neighbor
  8. Reverse k-nearest neighbor
  9. Shortest-paths
  10. Top-k range
  11. kNN
