research-article

Incorporating topic and property for knowledge base synchronization

Published: 13 June 2024

Abstract

Open-domain knowledge bases are widely used in many applications, so keeping them fresh is critical. Most existing studies update an open knowledge base by predicting the change frequencies of its entities and then updating the unstable ones. However, a knowledge base contains diverse entities and properties with complex structural information, and many entities are time-sensitive. In this work, we propose a novel topic-aware entity stability prediction framework that incorporates the property and topic features of entities to update the knowledge base efficiently. To handle the complex entity structure and the variety of entity properties, we first build an entity property graph for each entity, with its property names as edges and its property values as nodes. Using the constructed entity property graph, we then analyze the topic information of the entities and propose a topic classifier based on unsupervised clustering to further improve prediction accuracy. To address time sensitivity, we measure each entity's monthly average update frequency, computed from the revision history of its source encyclopedia webpage, and use it as the basis for labeling the entity's stability. Finally, we formulate the prediction task as a binary classification problem and solve it with an entity stability predictor in which the topic information serves as strong supervision. Extensive experiments on collections of real-world entities demonstrate the superior performance of the proposed method and show the benefit of each new module in the framework.
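
The two data-preparation steps named in the abstract, building an entity property graph and labeling stability from the monthly average update frequency, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the choice to attach every property edge to a central entity node, the inclusive month count, and the threshold of 1.0 revisions per month are all assumptions made for illustration; the abstract does not specify them.

```python
from datetime import date


def build_property_graph(entity, properties):
    """Build a toy entity property graph.

    Property values become nodes and property names label the edges.
    Assumption: each edge runs from a central entity node to a value node.
    """
    nodes = {entity}
    edges = []
    for name, value in properties.items():
        nodes.add(value)
        edges.append((entity, name, value))
    return nodes, edges


def monthly_update_frequency(revision_dates, start, end):
    """Average number of revisions per calendar month in [start, end].

    Assumption: months are counted inclusively from start's month to end's month.
    """
    months = (end.year - start.year) * 12 + (end.month - start.month) + 1
    count = sum(1 for d in revision_dates if start <= d <= end)
    return count / months


def stability_label(frequency, threshold=1.0):
    """Binary label: 1 = unstable, 0 = stable.

    The threshold is hypothetical; the source does not state a value.
    """
    return 1 if frequency > threshold else 0


# Hypothetical entity and revision history, for illustration only.
nodes, edges = build_property_graph(
    "Paris", {"country": "France", "population": "2.1M"}
)
revisions = [date(2023, 1, 5), date(2023, 1, 20), date(2023, 3, 2)]
freq = monthly_update_frequency(revisions, date(2023, 1, 1), date(2023, 3, 31))
label = stability_label(freq)
```

In the framework described above, labels of this kind would serve as targets for the binary entity stability predictor, while the graph would be the input to the graph-based model; the sketch stops short of both.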

Published In

Knowledge and Information Systems, Volume 66, Issue 10
Oct 2024
779 pages

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 13 June 2024
Accepted: 01 June 2024
Revision received: 21 December 2023
Received: 12 October 2023

Author Tags

  1. Knowledge bases
  2. Temporal validity
  3. Entity stability prediction
  4. Graph neural network

Qualifiers

  • Research-article

Funding Sources

  • This work was supported by the National Natural Science Foundation of China.
