Finding Subgraphs with Maximum Total Density and Limited Overlap in Weighted Hypergraphs

Published: 12 February 2024

Abstract

Finding dense subgraphs in large (hyper)graphs is a key primitive in a variety of real-world application domains, including social network analytics, event detection, biology, and finance. In most such applications, one typically aims to find several (possibly overlapping) dense subgraphs, which might correspond to communities in social networks or to interesting events. While a large amount of work is devoted to finding a single densest subgraph, perhaps surprisingly, the problem of finding several dense subgraphs with limited overlap in weighted hypergraphs has not been studied in a principled way, to the best of our knowledge. In this work, we define and study a natural generalization of the densest subgraph problem in weighted hypergraphs, where the main goal is to find at most k subgraphs with maximum total aggregate density, while satisfying an upper bound on the pairwise weighted Jaccard coefficient, i.e., the ratio of the weight of the intersection to the weight of the union of the node sets of two subgraphs. After showing that this problem is NP-hard, we devise an efficient algorithm that comes with provable guarantees in some cases of interest, as well as an efficient practical heuristic. Our extensive evaluation on large real-world hypergraphs confirms the efficiency and effectiveness of our algorithms.
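The objective and the overlap constraint described in the abstract can be illustrated with a small Python sketch. It is not the authors' algorithm: it only evaluates a candidate solution. The abstract does not define density, so the sketch assumes the common definition (total weight of the hyperedges fully contained in a node set, divided by the number of nodes in that set). The function names, the node-weight dictionary, and the threshold name `alpha` are hypothetical and chosen for illustration only.

```python
from itertools import combinations


def density(nodes, hyperedges):
    """Assumed definition: weight of hyperedges fully inside `nodes`, divided by |nodes|."""
    nodes = frozenset(nodes)
    if not nodes:
        return 0.0
    inside = sum(w for members, w in hyperedges if frozenset(members) <= nodes)
    return inside / len(nodes)


def weighted_jaccard(a, b, node_weight):
    """Weight of the intersection of two node sets divided by the weight of their union."""
    a, b = set(a), set(b)
    union_weight = sum(node_weight[v] for v in a | b)
    if union_weight == 0:
        return 0.0
    return sum(node_weight[v] for v in a & b) / union_weight


def is_feasible(subgraphs, node_weight, k, alpha):
    """At most k node sets, and every pair has weighted Jaccard coefficient <= alpha."""
    if len(subgraphs) > k:
        return False
    return all(
        weighted_jaccard(a, b, node_weight) <= alpha
        for a, b in combinations(subgraphs, 2)
    )


def total_density(subgraphs, hyperedges):
    """Objective value: sum of the densities of the chosen node sets."""
    return sum(density(s, hyperedges) for s in subgraphs)


# Toy weighted hypergraph: each hyperedge is (set of member nodes, weight).
hyperedges = [({1, 2, 3}, 2.0), ({2, 3, 4}, 1.0), ({4, 5}, 3.0)]
node_weight = {1: 1, 2: 2, 3: 1, 4: 1, 5: 1}

candidate = [{1, 2, 3}, {4, 5}]
print(is_feasible(candidate, node_weight, k=2, alpha=0.5))
print(total_density(candidate, hyperedges))
```

In this toy instance the two candidate node sets are disjoint, so their weighted Jaccard coefficient is zero and the pair is feasible for any nonnegative threshold. Replacing the second set with {2, 3, 4} makes the sets overlap on nodes 2 and 3 and raises the coefficient to 3/5.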

Published In

ACM Transactions on Knowledge Discovery from Data, Volume 18, Issue 4
May 2024
707 pages
EISSN: 1556-472X
DOI: 10.1145/3613622

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 February 2024
Online AM: 02 January 2024
Accepted: 15 December 2023
Revised: 27 October 2023
Received: 16 June 2022
Published in TKDD Volume 18, Issue 4

Author Tags

  1. Densest subgraphs
  2. weighted hypergraphs

Qualifiers

  • Research-article

Funding Sources

  • Hong Kong RGC
  • French National Agency (ANR)
