research-article

A Counting-based Approach for Efficient k-Clique Densest Subgraph Discovery

Published: 30 May 2024

Abstract

Densest subgraph discovery (DSD) is a fundamental topic in graph mining. It has been studied extensively and has found real applications in fields such as biology, finance, and social networks. A typical DSD problem is the k-clique densest subgraph (CDS) problem, which seeks a subgraph of a given graph that maximizes the ratio of the number of k-cliques it contains to the number of its vertices. This problem has received considerable attention and is widely used to identify larger "near-cliques". Existing CDS solutions, whether k-core based or convex programming based, often need to enumerate almost all the k-cliques, which is very inefficient because real-world graphs usually contain a vast number of k-cliques. To improve efficiency, we propose a novel framework based on the Frank-Wolfe algorithm that needs only k-clique counting rather than k-clique enumeration; counting is often much faster than enumeration. Based on this framework, we develop an efficient approximation algorithm by employing the state-of-the-art k-clique counting algorithm and proposing several optimization techniques. We have performed an extensive experimental evaluation on 14 large real-world graphs, and the results demonstrate the high efficiency of our algorithms. In particular, our algorithm is up to seven orders of magnitude faster than the state-of-the-art algorithm with the same accuracy guarantee.
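To make the objective concrete, the following is a minimal sketch of the k-clique density defined above (number of k-cliques divided by number of vertices), together with an exhaustive search for the densest subgraph on a toy graph. It is a brute-force illustration only, not the counting-based framework proposed in the paper; the function names and the example edge list are assumptions introduced here.

```python
from itertools import combinations


def k_cliques(edges, vertices, k):
    """Return all k-cliques of the subgraph induced by `vertices` (brute force)."""
    vertices = set(vertices)
    adj = {v: set() for v in vertices}
    for u, v in edges:
        if u in adj and v in adj:  # keep only edges inside the induced subgraph
            adj[u].add(v)
            adj[v].add(u)
    return [
        c
        for c in combinations(sorted(vertices), k)
        if all(b in adj[a] for a, b in combinations(c, 2))
    ]


def k_clique_density(edges, vertices, k):
    """k-clique density: (# k-cliques) / (# vertices) of the induced subgraph."""
    vertices = set(vertices)
    if not vertices:
        return 0.0
    return len(k_cliques(edges, vertices, k)) / len(vertices)


def densest_subgraph_bruteforce(edges, k):
    """Try every vertex subset; exponential time, usable only on toy graphs."""
    all_vertices = sorted({v for e in edges for v in e})
    best_density, best_subset = 0.0, ()
    for size in range(1, len(all_vertices) + 1):
        for subset in combinations(all_vertices, size):
            d = k_clique_density(edges, subset, k)
            if d > best_density:
                best_density, best_subset = d, subset
    return best_density, best_subset


# Hypothetical toy graph: a 4-clique on {0, 1, 2, 3} with a path 3-4-5 attached.
EDGES = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]

if __name__ == "__main__":
    print(k_clique_density(EDGES, range(6), 3))   # whole graph: 4 triangles / 6 vertices
    print(densest_subgraph_bruteforce(EDGES, 3))  # the 4-clique: 4 triangles / 4 vertices
```

For k = 3 the whole toy graph has density 4/6, while the 4-clique alone has density 4/4 = 1, so the attached path vertices are excluded. Enumerating cliques this way is exactly the cost that a counting-based method aims to avoid on real graphs.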


    Published In

    Proceedings of the ACM on Management of Data  Volume 2, Issue 3
    SIGMOD
    June 2024
    1953 pages
    EISSN:2836-6573
    DOI:10.1145/3670010
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. clique densest subgraph
    2. convex programming
    3. graph density

    Qualifiers

    • Research-article

    Funding Sources

    • This work was supported in part by NSFC under Grants 62102341 and 62302421, the Guangdong Talent Program under Grant 2021QN02X826, the Shenzhen Science and Technology Program under Grants JCYJ20220530143602006 and ZDSYS 20211021111415025, and the Basic and Applied Basic Research Fund in Guangdong Province under Grant 2023A1515011280. This paper was also supported by the Shenzhen Stability Science Program and the Guangdong Key Lab of Mathematical Foundations for Artificial Intelligence.
