DOI: 10.1145/3580305.3599262
Research Article

B2-Sampling: Fusing Balanced and Biased Sampling for Graph Contrastive Learning

Published: 04 August 2023

Abstract

Graph contrastive learning (GCL), which aims for an embedding space where semantically similar nodes are closer together, has been widely applied to graph-structured data. Researchers have proposed many approaches to define positive and negative pairs (i.e., semantically similar and dissimilar pairs) on the graph, which serve as labels for learning embedding distances. Despite their effectiveness, these approaches typically suffer from two learning challenges. First, the number of candidate negative pairs is enormous, so it is non-trivial to select representative ones that train the model effectively. Second, the heuristics used to define positive and negative pairs (e.g., graph views or meta-path patterns) are sometimes unreliable, introducing considerable noise into both the "labelled" positive and negative pairs. In this work, we propose a novel sampling approach, B2-Sampling, to address both challenges in a unified way. On the one hand, we use balanced sampling to select the most representative negative pairs with respect to both topological and embedding diversity. On the other hand, we use biased sampling to learn and correct the labels of the most error-prone negative pairs during training. The balanced and biased samplings can be applied iteratively to discriminate and correct training pairs, boosting the performance of GCL models. B2-Sampling is designed as a framework that supports many known GCL models. Our extensive experiments on node classification, node clustering, and graph classification tasks show that B2-Sampling significantly improves the performance of GCL models with acceptable runtime overhead. Our website [11] (https://sites.google.com/view/b2-sampling/home) provides access to our code and additional experimental results.
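
The abstract describes the two sampling stages only at a high level. As a hedged illustration of the balanced-sampling idea, the Python sketch below stratifies an anchor's candidate negatives into joint buckets of topological distance (hop count) and embedding distance, then draws round-robin across buckets so the chosen negatives cover both diversity axes. The function name, bucketing scheme, and parameters are illustrative assumptions, not the paper's exact algorithm.

```python
# A minimal sketch of balanced negative sampling (assumed interpretation):
# stratify candidates by (hop distance, embedding distance) and sample
# round-robin across strata, so selected negatives span both axes.
import random
from collections import defaultdict

import networkx as nx
import numpy as np


def balanced_negative_sampling(graph, anchor, embeddings, num_samples,
                               num_emb_buckets=4, max_hops=4):
    """Pick `num_samples` negatives spread over (hop, embedding-distance) buckets."""
    # Topological distance: BFS hop counts from the anchor, capped at `max_hops`.
    hops = nx.single_source_shortest_path_length(graph, anchor, cutoff=max_hops)

    candidates = [v for v in graph.nodes if v != anchor]
    if not candidates:
        return []
    dists = {v: float(np.linalg.norm(embeddings[v] - embeddings[anchor]))
             for v in candidates}

    # Equi-width embedding-distance buckets over the observed range.
    lo, hi = min(dists.values()), max(dists.values())
    width = (hi - lo) / num_emb_buckets or 1.0  # guard against zero range

    buckets = defaultdict(list)
    for v in candidates:
        hop = hops.get(v, max_hops + 1)  # nodes beyond max_hops share one stratum
        emb_bin = min(int((dists[v] - lo) / width), num_emb_buckets - 1)
        buckets[(hop, emb_bin)].append(v)

    # Round-robin across strata: one draw per non-empty bucket per pass.
    selected, keys = [], list(buckets)
    while len(selected) < num_samples and keys:
        for k in list(keys):
            if buckets[k]:
                selected.append(buckets[k].pop(random.randrange(len(buckets[k]))))
                if len(selected) == num_samples:
                    break
            else:
                keys.remove(k)
    return selected
```

In a GCL training loop, such a routine would replace a uniform random draw of negatives for each anchor, so every mini-batch contrasts against negatives spread over both distance axes rather than clustered at one scale.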

Supplementary Material

M4V File (rtfp0421-2min-promo.m4v)
A typical workflow of graph contrastive learning involves designing contrasting heuristics and a contrastive objective. For a given node, the heuristics define its augmented counterparts in different graph views as positives and all other nodes as negatives. However, it is non-trivial to select representative pairs from the sizable pool of negatives to train the model, and such definitions are sometimes unreliable, causing considerable noise in the "labeled" positive and negative pairs. We therefore use balanced sampling to select the most representative negative pairs, uniformly distributed over topological and embedding distances, and we leverage the slow-learning effect, using biased sampling to correct the labels of the most error-prone negative pairs. B2-Sampling serves as a plug-in within the overall graph contrastive learning paradigm. Extensive experiments show that B2-Sampling improves the performance of most graph contrastive learning methods.
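
The "slow learning effect" mentioned above is not defined on this page; a common reading is that false negatives resist being pushed apart, so their similarity stays high across epochs even as true negatives separate. Under that assumption, the sketch below tracks each negative pair's cosine similarity over a sliding window of epochs and flags the most persistently similar pairs for relabelling. The class name, window length, and quantile threshold are hypothetical choices for illustration, not the paper's published procedure.

```python
# A hedged sketch of biased sampling / label correction: flag negative pairs
# whose similarity stays high for a full window of epochs as likely false
# negatives (the assumed "slow learning effect").
import numpy as np


class BiasedLabelCorrector:
    def __init__(self, window=5, flip_quantile=0.95):
        self.window = window                # epochs of similarity history to keep
        self.flip_quantile = flip_quantile  # flip only the most suspicious pairs
        self.history = {}                   # (u, v) -> recent cosine similarities

    def observe(self, pair, emb_u, emb_v):
        """Record the current cosine similarity of a sampled negative pair."""
        sim = float(np.dot(emb_u, emb_v) /
                    (np.linalg.norm(emb_u) * np.linalg.norm(emb_v) + 1e-12))
        hist = self.history.setdefault(pair, [])
        hist.append(sim)
        if len(hist) > self.window:
            hist.pop(0)  # keep only the most recent `window` epochs

    def error_prone_pairs(self):
        """Return pairs whose similarity never dropped over a full window,
        i.e. candidates to be relabelled as positives."""
        # Score each fully-observed pair by its *minimum* similarity in the
        # window: a high minimum means the pair stayed similar throughout.
        scores = {p: min(h) for p, h in self.history.items()
                  if len(h) == self.window}
        if not scores:
            return []
        cutoff = np.quantile(list(scores.values()), self.flip_quantile)
        return [p for p, s in scores.items() if s >= cutoff]
```

A trainer would call observe() on each sampled negative pair once per epoch and, before the next epoch, move the pairs returned by error_prone_pairs() out of the negative set (or treat them as positives), which is the label-correction role biased sampling plays in the described workflow.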

References

[1] Luis Bulla. 1994. An index of evenness and its associated diversity measure. Oikos (1994), 167--171.
[2] Ming Chen, Zhewei Wei, Bolin Ding, Yaliang Li, Ye Yuan, Xiaoyong Du, and Ji-Rong Wen. 2020b. Scalable graph neural networks via bidirectional propagation. Advances in Neural Information Processing Systems, Vol. 33 (2020), 14556--14566.
[3] Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. 2020a. A simple framework for contrastive learning of visual representations. In International Conference on Machine Learning. PMLR, 1597--1607.
[4] Ching-Yao Chuang, Joshua Robinson, Yen-Chen Lin, Antonio Torralba, and Stefanie Jegelka. 2020. Debiased contrastive learning. Advances in Neural Information Processing Systems, Vol. 33 (2020), 8765--8775.
[5] Xavier Glorot and Yoshua Bengio. 2010. Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics. JMLR Workshop and Conference Proceedings, 249--256.
[6] John M Hammersley. 1950. The distribution of distance in a hypersphere. The Annals of Mathematical Statistics (1950), 447--452.
[7] Kaveh Hassani and Amir Hosein Khasahmadi. 2020. Contrastive multi-view representation learning on graphs. In International Conference on Machine Learning. PMLR, 4116--4126.
[8] Yannis Kalantidis, Mert Bulent Sariyildiz, Noe Pion, Philippe Weinzaepfel, and Diane Larlus. 2020. Hard negative mixing for contrastive learning. Advances in Neural Information Processing Systems, Vol. 33 (2020), 21798--21809.
[9] Diederik P Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).
[10] Kibok Lee, Yian Zhu, Kihyuk Sohn, Chun-Liang Li, Jinwoo Shin, and Honglak Lee. 2021. i-Mix: A domain-agnostic strategy for contrastive representation learning. ICLR (2021).
[11] Mengyue Liu, Yun Lin, Jun Liu, Bohao Liu, Qinghua Zheng, and Jin Song Dong. 2023. B2-Sampling Website. [Online; accessed 2 Feb 2023]. https://sites.google.com/view/b2-sampling/home.
[12] Péter Mernyei and Cătălina Cangea. 2020. Wiki-CS: A Wikipedia-based benchmark for graph neural networks. arXiv preprint arXiv:2007.02901 (2020).
[13] Edward F Moore. 1959. The shortest path through a maze. In Proc. Int. Symp. Switching Theory, 1959. 285--292.
[14] Hyun Oh Song, Yu Xiang, Stefanie Jegelka, and Silvio Savarese. 2016. Deep metric learning via lifted structured feature embedding. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4004--4012.
[15] Jiezhong Qiu, Qibin Chen, Yuxiao Dong, Jing Zhang, Hongxia Yang, Ming Ding, Kuansan Wang, and Jie Tang. 2020. GCC: Graph contrastive coding for graph neural network pre-training. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 1150--1160.
[16] Joshua David Robinson, Ching-Yao Chuang, Suvrit Sra, and Stefanie Jegelka. 2021. Contrastive learning with hard negative samples. In ICLR.
[17] Florian Schroff, Dmitry Kalenichenko, and James Philbin. 2015. FaceNet: A unified embedding for face recognition and clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 815--823.
[18] Fan-Yun Sun, Jordan Hoffmann, Vikas Verma, and Jian Tang. 2019. InfoGraph: Unsupervised and semi-supervised graph-level representation learning via mutual information maximization. arXiv preprint arXiv:1908.01000 (2019).
[19] Puja Trivedi, Ekdeep Singh Lubana, Yujun Yan, Yaoqing Yang, and Danai Koutra. 2022. Augmentations in graph contrastive learning: Current methodological flaws & towards better practices. In Proceedings of the ACM Web Conference 2022. 1538--1549.
[20] Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio. 2017. Graph attention networks. arXiv preprint arXiv:1710.10903 (2017).
[21] Petar Veličković, William Fedus, William L Hamilton, Pietro Liò, Yoshua Bengio, and R Devon Hjelm. 2019. Deep Graph Infomax. ICLR (Poster) (2019).
[22] Xiao Wang, Nian Liu, Hui Han, and Chuan Shi. 2021. Self-supervised heterogeneous graph neural network with co-contrastive learning. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining. 1726--1736.
[23] Yanling Wang, Jing Zhang, Haoyang Li, Yuxiao Dong, Hongzhi Yin, Cuiping Li, and Hong Chen. 2022. ClusterSCL: Cluster-aware supervised contrastive learning on graphs. In Proceedings of the ACM Web Conference 2022. 1611--1621.
[24] Chao-Yuan Wu, R Manmatha, Alexander J Smola, and Philipp Krähenbühl. 2017. Sampling matters in deep embedding learning. In Proceedings of the IEEE International Conference on Computer Vision. 2840--2848.
[25] Mike Wu, Milan Mosse, Chengxu Zhuang, Daniel Yamins, and Noah Goodman. 2020. Conditional negative sampling for contrastive learning of visual representations. arXiv preprint arXiv:2010.02037 (2020).
[26] Jun Xia, Lirong Wu, Ge Wang, Jintao Chen, and Stan Z Li. 2022. ProGCL: Rethinking hard negative mining in graph contrastive learning. In International Conference on Machine Learning. PMLR, 24332--24346.
[27] Yaochen Xie, Zhao Xu, Jingtun Zhang, Zhengyang Wang, and Shuiwang Ji. 2021. Self-supervised learning of graph neural networks: A unified review. arXiv preprint arXiv:2102.10757 (2021).
[28] Yonghui Yang, Le Wu, Richang Hong, Kun Zhang, and Meng Wang. 2021. Enhanced graph learning for collaborative filtering via mutual information maximization. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval. 71--80.
[29] Yuning You, Tianlong Chen, Yongduo Sui, Ting Chen, Zhangyang Wang, and Yang Shen. 2020. Graph contrastive learning with augmentations. Advances in Neural Information Processing Systems, Vol. 33 (2020), 5812--5823.
[30] Shaofeng Zhang, Meng Liu, Junchi Yan, Hengrui Zhang, Lingxiao Huang, Xiaokang Yang, and Pinyan Lu. 2022. M-Mix: Generating hard negatives via multi-sample mixing for contrastive learning. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 2461--2470.
[31] Deli Zhao, Jiapeng Zhu, and Bo Zhang. 2019. Latent variables on spheres for sampling and spherical inference. (2019).
[32] Han Zhao, Xu Yang, Zhenru Wang, Erkun Yang, and Cheng Deng. 2021. Graph debiased contrastive learning with joint representation clustering. In International Joint Conference on Artificial Intelligence (IJCAI). 3434--3440.
[33] Yanqiao Zhu, Yichen Xu, Qiang Liu, and Shu Wu. 2021a. An empirical study of graph contrastive learning. arXiv preprint arXiv:2109.01116 (2021).
[34] Yanqiao Zhu, Yichen Xu, Feng Yu, Qiang Liu, Shu Wu, and Liang Wang. 2020. Deep graph contrastive representation learning. GRL@ICML (2020).
[35] Yanqiao Zhu, Yichen Xu, Feng Yu, Qiang Liu, Shu Wu, and Liang Wang. 2021b. Graph contrastive learning with adaptive augmentation. In Proceedings of the Web Conference 2021. 2069--2080.


Published In

KDD '23: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, August 2023, 5996 pages. ISBN 9798400701030. DOI: 10.1145/3580305.

Publisher

Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. graph contrastive learning
    2. negative sampling
    3. neural network

Conference

KDD '23

Acceptance Rates

Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%
