DOI: 10.1145/3583780.3615038
Research article

Robust Graph Clustering via Meta Weighting for Noisy Graphs

Published: 21 October 2023

Abstract

How can we find meaningful clusters in a graph robustly against noise edges? Graph clustering (i.e., dividing nodes into groups of similar ones) is a fundamental problem in graph analysis with applications in various fields. Recent studies have demonstrated that graph neural network (GNN) based approaches yield promising results for graph clustering. However, we observe that their performance degrades significantly on graphs with noise edges, which are prevalent in practice. In this work, we propose MetaGC for robust GNN-based graph clustering. MetaGC employs a decomposable clustering loss function, which can be rephrased as a sum of losses over node pairs. We add a learnable weight to each node pair, and MetaGC adaptively adjusts the weights of node pairs using meta-weighting so that the weights of meaningful node pairs increase and the weights of less-meaningful ones (e.g., noise edges) decrease. We show empirically that MetaGC learns weights as intended and consequently outperforms the state-of-the-art GNN-based competitors, even when they are equipped with separate denoising schemes, on five real-world graphs under varying levels of noise. Our code and datasets are available at https://github.com/HyeonsooJo/MetaGC.
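The key idea in the abstract — a clustering loss that decomposes into a sum of per-node-pair terms, each scaled by a learnable weight — can be illustrated with a small sketch. The snippet below is not the authors' implementation (see their repository for that); it is a minimal, assumption-laden example using a soft-modularity-style loss, with a per-pair weight matrix standing in for the weights that MetaGC would learn via meta-weighting.

```python
# Illustrative sketch (not the paper's code): a decomposable,
# modularity-style clustering loss written as an explicit sum over
# node pairs, each term scaled by a per-pair weight w[i][j].
# In MetaGC these weights would be produced and updated by a
# meta-weighting scheme; here they are just a fixed matrix.

def pairwise_clustering_loss(adj, assign, weights=None):
    """Negative soft modularity, decomposed into per-node-pair terms.

    adj:     n x n adjacency matrix (0/1 entries, no self-loops)
    assign:  n x k soft cluster-assignment rows (one per node)
    weights: n x n per-pair weights; uniform 1.0 if None
    """
    n = len(adj)
    deg = [sum(row) for row in adj]
    two_m = sum(deg)  # 2 * number of edges
    loss = 0.0
    for i in range(n):
        for j in range(n):
            w = 1.0 if weights is None else weights[i][j]
            # modularity-matrix entry: observed minus expected edge
            b = adj[i][j] - deg[i] * deg[j] / two_m
            # soft probability that i and j share a cluster
            s = sum(assign[i][c] * assign[j][c] for c in range(len(assign[i])))
            loss -= w * b * s / two_m  # one per-pair loss term
    return loss

# toy graph: two triangles (nodes 0-2 and 3-5) joined by a bridge edge (2, 3)
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for i, j in edges:
    adj[i][j] = adj[j][i] = 1

# hard assignment: first triangle -> cluster 0, second -> cluster 1
assign = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 3

print(pairwise_clustering_loss(adj, assign))  # = -5/14, i.e. negative modularity
```

Because the loss is a plain sum over pairs, scaling any single pair's weight toward zero removes exactly that pair's contribution and its gradient — which is how down-weighting a suspected noise edge can keep it from pulling two clusters together during training. In MetaGC the weights are not hand-set as here but adjusted by minimizing a meta-objective.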

Supplementary Material

MP4 File (full0435-video.mp4)
Presentation video for the paper.


Cited By

  • (2024) ACDM: An Effective and Scalable Active Clustering with Pairwise Constraint. In Proceedings of the 33rd ACM International Conference on Information and Knowledge Management (CIKM '24), 643-652. DOI: 10.1145/3627673.3679601. Online publication date: 21 October 2024.

Published In

CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management
October 2023
5508 pages
ISBN:9798400701245
DOI:10.1145/3583780

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. graph clustering
  2. meta weighting
  3. robust learning

Qualifiers

  • Research-article

Funding Sources

  • Institute of Information & Communications Technology Planning & Evaluation

Conference

CIKM '23

Acceptance Rates

Overall acceptance rate: 1,861 of 8,427 submissions (22%)
