research-article

Neural Subgraph Counting with Wasserstein Estimator

Authors:

Wenjie Zhang

SIGMOD '22: Proceedings of the 2022 International Conference on Management of Data

Pages 160 - 175

https://doi.org/10.1145/3514221.3526163

Published: 11 June 2022

Abstract

Subgraph counting is a fundamental graph analysis task that is widely used in many applications. Because subgraph counting is NP-complete and hence intractable, approximate solutions have been widely studied, but they fail to work with large and complex query graphs. Machine learning (ML) techniques have recently been applied to this problem as an alternative, yet existing ML approaches either support only very small data graphs or cannot make full use of the data graph information, which inherently limits their scalability, estimation accuracy, and robustness.
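To make the task concrete, the following minimal sketch counts matches of a small labeled query graph in a toy data graph by brute force. It assumes one common definition of a match (an injective vertex mapping that preserves labels and query edges); the abstract does not state which definition the paper uses. The function name `count_embeddings` and the toy graphs are illustrative assumptions, not taken from the paper.

```python
from itertools import permutations

def count_embeddings(query_edges, query_labels, data_edges, data_labels):
    """Count injective, label- and edge-preserving maps from query to data vertices.

    Assumption: undirected graphs, and every distinct mapping counts once
    (so symmetric queries are counted once per automorphism).
    """
    data_adj = {frozenset(e) for e in data_edges}
    q_vertices = sorted(query_labels)
    d_vertices = sorted(data_labels)
    count = 0
    # Enumerate every ordered choice of |V(q)| distinct data vertices.
    # This grows roughly as |V(G)|^|V(q)|, which is why exact counting
    # does not scale and estimators are studied instead.
    for image in permutations(d_vertices, len(q_vertices)):
        mapping = dict(zip(q_vertices, image))
        if any(query_labels[u] != data_labels[mapping[u]] for u in q_vertices):
            continue
        if all(frozenset((mapping[u], mapping[v])) in data_adj
               for u, v in query_edges):
            count += 1
    return count

# Toy data graph: labeled 4-cycle 0-1-2-3-0 plus the chord 0-2.
data_labels = {0: "A", 1: "B", 2: "A", 3: "B"}
data_edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]

# Query 1: a single edge whose endpoints are labeled A and B.
edge_count = count_embeddings(
    [("x", "y")], {"x": "A", "y": "B"}, data_edges, data_labels)

# Query 2: a triangle labeled A, B, A.
triangle_count = count_embeddings(
    [("x", "y"), ("y", "z"), ("x", "z")],
    {"x": "A", "y": "B", "z": "A"}, data_edges, data_labels)

print(edge_count, triangle_count)
```

The enumeration over ordered vertex tuples is what becomes infeasible as the query or the data graph grows, which motivates approximate and learned estimators.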

In this paper, we propose a novel approximate subgraph counting algorithm, NeurSC, that exploits and combines information from both the query graphs and the data graphs effectively and efficiently. It consists of two components: (1) an extraction module that adaptively generates simple yet representative substructures from the data graph for each query graph, and (2) an estimator, WEst, that first computes representations from the individual and joint distributions of the query and data graphs and then estimates subgraph counts with the learned representations. Furthermore, we design a novel Wasserstein discriminator in WEst that minimizes the Wasserstein distance between query and data graphs by updating the network parameters using the vertex correspondence relationship between query and data graphs. In this way, WEst better captures the correlation between query and data graphs, which is essential to the quality of the estimation. We conduct experimental studies on seven large real-life labeled graphs to demonstrate the superior performance of NeurSC in terms of estimation accuracy and robustness.
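The abstract does not describe how the Wasserstein discriminator is implemented, so the sketch below is not WEst. It only illustrates the quantity involved: the Wasserstein-1 distance between two small sets of vertex representations, under the simplifying assumptions that both sets have the same size and uniform weights. In that case an optimal transport plan is a one-to-one matching, which can be read as a vertex correspondence. The function name `wasserstein1_uniform` and the toy vectors are hypothetical.

```python
from itertools import permutations
from math import dist

def wasserstein1_uniform(xs, ys):
    """Exact W1 between two equal-size point sets with uniform weights.

    Brute force over all one-to-one matchings; usable only for tiny inputs.
    Returns (distance, matching) where matching[i] is the index in ys
    assigned to xs[i].
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equally many points on both sides")
    n = len(xs)
    best_cost, best_match = float("inf"), None
    for perm in permutations(range(n)):
        # Average Euclidean transport cost under this matching.
        cost = sum(dist(xs[i], ys[j]) for i, j in enumerate(perm)) / n
        if cost < best_cost:
            best_cost, best_match = cost, perm
    return best_cost, best_match

# Hypothetical 2-D vertex representations of a query graph.
query_emb = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

# Same points in a different order: distance 0, matching recovers the order.
data_emb = [(0.0, 1.0), (0.0, 0.0), (1.0, 0.0)]
distance, match = wasserstein1_uniform(query_emb, data_emb)

# Every point shifted by (3, 4): the distance equals the shift length.
shifted_emb = [(x + 3.0, y + 4.0) for x, y in query_emb]
shifted_distance, _ = wasserstein1_uniform(query_emb, shifted_emb)

print(distance, match, shifted_distance)
```

A learned discriminator, as named in the abstract, would estimate such a distance with a trained network rather than by enumeration; the enumeration here is only for intuition on toy inputs.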

Index Terms

Neural Subgraph Counting with Wasserstein Estimator
1. Information systems
  1. Information systems applications

Published In

SIGMOD '22: Proceedings of the 2022 International Conference on Management of Data

June 2022

2597 pages

ISBN: 9781450392495

DOI: 10.1145/3514221

General Chair: Zachary Ives, University of Pennsylvania (USA)

Program Chairs: Angela Bonifati, Lyon 1 University (France); Amr El Abbadi, University of California, Santa Barbara (USA)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMOD: ACM Special Interest Group on Management of Data

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

Australian Research Council

Conference

SIGMOD/PODS '22: International Conference on Management of Data

Sponsor: SIGMOD

June 12 - 17, 2022

Philadelphia, PA, USA

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%
