DOI: 10.1145/3626772.3657800
research-article

NativE: Multi-modal Knowledge Graph Completion in the Wild

Published: 11 July 2024

Abstract

Multi-modal knowledge graph completion (MMKGC) aims to automatically discover unobserved factual knowledge in a given multi-modal knowledge graph (MMKG) by jointly modeling the triple structure and the multi-modal information of entities. Real-world MMKGs, however, are both diverse and imbalanced: modality information can span many types (e.g., image, text, numeric, audio, video), but it is unevenly distributed among entities, so some entities are missing some modalities. Existing works usually focus on common modalities such as image and text and neglect this imbalanced distribution of modality information. To address these issues, we propose NativE, a comprehensive framework for MMKGC in the wild. NativE introduces a relation-guided dual adaptive fusion module that enables adaptive fusion for any modalities, and it employs a collaborative modality adversarial training framework to augment the imbalanced modality information. We construct a new benchmark called WildKGC with five datasets to evaluate our method. Empirical comparisons with 21 recent baselines confirm the superiority of our method, which consistently achieves state-of-the-art performance across different datasets and various scenarios while remaining efficient and generalizable. Our code and data are released at https://github.com/zjukg/NATIVE.
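The abstract does not spell out how the fusion module works, so the following is only a generic sketch of the underlying idea: adaptively weighting whichever modalities an entity actually has, with the weights depending on the relation. It is not NativE's module. The function name `fuse_modalities`, the `relation_scores` input, and the softmax weighting are all assumptions made for illustration.

```python
import math


def fuse_modalities(embeddings, relation_scores):
    """Fuse the available modality embeddings of a single entity.

    embeddings: dict mapping modality name -> list of floats, or None
        when the entity lacks that modality.
    relation_scores: dict mapping modality name -> float; a hypothetical
        relation-specific preference for each modality (higher = more weight).
    Returns (fused_vector, weights) where weights covers only the
    modalities that are present.
    """
    available = [m for m, vec in embeddings.items() if vec is not None]
    if not available:
        raise ValueError("entity has no modality embedding to fuse")

    # Softmax restricted to the available modalities, so a missing
    # modality contributes nothing regardless of its score.
    top = max(relation_scores[m] for m in available)
    exps = {m: math.exp(relation_scores[m] - top) for m in available}
    total = sum(exps.values())
    weights = {m: exps[m] / total for m in available}

    # Weighted sum of the available embeddings, dimension by dimension.
    dim = len(embeddings[available[0]])
    fused = [
        sum(weights[m] * embeddings[m][i] for m in available)
        for i in range(dim)
    ]
    return fused, weights


# Example entity with a missing audio modality (illustrative values only).
entity = {"image": [1.0, 0.0], "text": [0.0, 1.0], "audio": None}
scores = {"image": 0.0, "text": 0.0, "audio": 5.0}
fused, weights = fuse_modalities(entity, scores)
```

Because the softmax runs only over modalities that are present, an entity with a missing modality still receives a well-defined fused representation, which is the situation the abstract describes for imbalanced MMKGs.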


    Published In

    SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval
    July 2024
    3164 pages
    ISBN: 9798400704314
    DOI: 10.1145/3626772
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 11 July 2024

    Author Tags

    1. adversarial learning
    2. knowledge graph completion
    3. multi-modal fusion
    4. multi-modal knowledge graphs

    Qualifiers

    • Research-article

    Conference

    SIGIR 2024

    Acceptance Rates

    Overall Acceptance Rate 792 of 3,983 submissions, 20%

    Article Metrics

    • Total Citations: 0
    • Total Downloads: 172
    • Downloads (Last 12 months): 172
    • Downloads (Last 6 weeks): 86
    Reflects downloads up to 14 Sep 2024
