research-article

Effective and efficient relational community detection and search in large dynamic heterogeneous information networks

Published: 01 June 2020

Abstract

Community search in heterogeneous information networks (HINs) has attracted much attention in graph analysis. Given a query vertex, the goal is to find a densely connected subgraph that contains it. In practice, the user may need to restrict the number of connections between vertices, but none of the existing methods can handle such queries. In this paper, we propose the relational constraint, which allows the user to specify fine-grained connection requirements between vertices. Based on this, we define the relational community, together with the problems of detecting and searching relational communities. For the detection problem, we propose an efficient solution with near-linear time complexity. For the search problem, although it is shown to be NP-hard and even hard to approximate, we devise two efficient approximate solutions. We further design the round index to accelerate the search algorithm and show that it handles dynamic graphs by its nature. Extensive experiments on both synthetic and real-world graphs evaluate the effectiveness and efficiency of the proposed methods.
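To make the idea of a relational constraint concrete, the following is a minimal sketch and not the paper's algorithm. It assumes, purely for illustration, that a constraint is a triple `(src_label, dst_label, k)` meaning "every vertex of type `src_label` in the community needs at least `k` neighbours of type `dst_label` inside the community". The function name `relational_peel` and the input layout are hypothetical. The sketch repeatedly removes violating vertices, in the spirit of k-core peeling, and touches each edge a constant number of times per constraint.

```python
from collections import defaultdict, deque


def relational_peel(labels, edges, constraints):
    """Return the vertices that survive peeling under typed degree constraints.

    labels:      dict mapping vertex -> type label (assumed input layout)
    edges:       iterable of undirected (u, v) pairs over labelled vertices
    constraints: list of (src_label, dst_label, k) triples (assumed form)
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    alive = set(labels)

    # cnt[v][lab] = number of still-alive neighbours of v carrying label lab
    cnt = {v: defaultdict(int) for v in labels}
    for v in labels:
        for w in adj[v]:
            cnt[v][labels[w]] += 1

    # Group the requirements by the label they apply to.
    need = defaultdict(list)
    for src, dst, k in constraints:
        need[src].append((dst, k))

    def violates(v):
        return any(cnt[v][dst] < k for dst, k in need[labels[v]])

    # Seed the queue with every vertex that already breaks a constraint.
    queue = deque(v for v in alive if violates(v))
    queued = set(queue)

    while queue:
        v = queue.popleft()
        alive.discard(v)
        # Removing v lowers the typed degree of each surviving neighbour.
        for w in adj[v]:
            if w in alive:
                cnt[w][labels[v]] -= 1
                if w not in queued and violates(w):
                    queued.add(w)
                    queue.append(w)
    return alive


if __name__ == "__main__":
    # Toy author/paper graph; all names here are made up for illustration.
    labels = {"a1": "author", "a2": "author", "a3": "author",
              "p1": "paper", "p2": "paper", "p3": "paper"}
    edges = [("a1", "p1"), ("a1", "p2"), ("a2", "p1"),
             ("a2", "p2"), ("a3", "p3"), ("a3", "p1")]
    constraints = [("author", "paper", 2), ("paper", "author", 2)]
    print(sorted(relational_peel(labels, edges, constraints)))
```

In the toy graph, `p3` has only one author, so it is removed first; that leaves `a3` with a single paper, so `a3` is removed as well. A search for the community of a given query vertex could then take the connected component of that vertex among the survivors, though the search problem described in the abstract is stated to be NP-hard and is not captured by this sketch.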

Published In

Proceedings of the VLDB Endowment, Volume 13, Issue 10
June 2020
193 pages
ISSN: 2150-8097

Publisher

VLDB Endowment