research-article
Open access

Assessing the Quality of a Knowledge Graph via Link Prediction Tasks

Published: 05 March 2024

Abstract

Knowledge Graph (KG) construction is the prerequisite for all other KG research and applications. Researchers and engineers have proposed various approaches to building KGs for their use cases. But how can we know whether a constructed KG is good or bad? Is it correct and complete? Is it consistent and robust? In this paper, we propose a method called LP-Measure that assesses the quality of a KG via link prediction tasks, without using a gold standard or other human labour. Although, in theory, LP-Measure can only assess consistency and redundancy rather than the more desirable correctness and completeness, empirical evidence shows that it can quantitatively distinguish good KGs from bad ones, even in terms of incorrectness and incompleteness. Compared with the most commonly used manual assessment, LP-Measure is an automated evaluation, which saves time and human labour.
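
The general idea of scoring a KG by how well its own held-out links can be predicted can be illustrated with a minimal sketch. This is not the LP-Measure procedure itself, which the abstract does not specify. The frequency-based predictor, the function name `mean_reciprocal_rank`, and the toy triples are all assumptions made for illustration; a real study would likely use a trained embedding model instead.

```python
from collections import Counter, defaultdict


def mean_reciprocal_rank(train, test):
    """MRR of a frequency baseline that predicts the tail of held-out triples.

    Hypothetical helper: candidate tails for a relation are ranked by how
    often they occur as that relation's tail in the training triples.
    """
    tail_counts = defaultdict(Counter)
    for _, relation, tail in train:
        tail_counts[relation][tail] += 1

    # Every entity seen anywhere in the KG is a candidate tail.
    entities = sorted({e for h, _, t in train + test for e in (h, t)})

    total = 0.0
    for _, relation, tail in test:
        counts = tail_counts[relation]
        # Higher training frequency ranks first; ties are broken by name.
        ranked = sorted(entities, key=lambda e: (-counts[e], e))
        total += 1.0 / (ranked.index(tail) + 1)
    return total / len(test)


# Toy KG whose triples follow one consistent pattern (assumed data).
clean_train = [("paris", "is_a", "city"), ("rome", "is_a", "city"),
               ("berlin", "is_a", "city")]
clean_test = [("madrid", "is_a", "city")]

# Toy KG with the same heads but inconsistent tails (assumed data).
noisy_train = [("paris", "is_a", "city"), ("rome", "is_a", "berlin"),
               ("berlin", "is_a", "paris")]
noisy_test = [("madrid", "is_a", "rome")]

clean_score = mean_reciprocal_rank(clean_train, clean_test)
noisy_score = mean_reciprocal_rank(noisy_train, noisy_test)
print(f"clean KG MRR: {clean_score:.2f}, noisy KG MRR: {noisy_score:.2f}")
```

In this toy setting the consistent KG scores higher than the inconsistent one, which is the kind of separation the abstract reports. The example only probes consistency; it says nothing about whether the triples are factually correct.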

Published In

NLPIR '23: Proceedings of the 2023 7th International Conference on Natural Language Processing and Information Retrieval
December 2023
336 pages
ISBN: 9798400709227
DOI: 10.1145/3639233
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Knowledge Graph
  2. Link Prediction
  3. Quality Assessment

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

NLPIR 2023
