DOI: 10.1145/3534678.3539459
research-article

Knowledge Enhanced Search Result Diversification

Published: 14 August 2022

Abstract

Search result diversification aims to reduce redundancy and improve subtopic coverage in the results returned for a given query. Most existing approaches measure document diversity mainly from text or pre-trained representations. However, some underlying relationships between the query and the documents are difficult for a model to capture from content alone. Because a knowledge base offers well-defined entities and explicit relationships between them, we exploit this knowledge to model the relationship between documents and the query, and propose KEDIV, a knowledge-enhanced search result diversification approach. Concretely, we build a query-specific relation graph that models the complicated query-document relationship from an entity view. A graph neural network and a node weight adjustment algorithm are then applied to the relation graph to obtain context-aware entity and document representations at each selection step. The diversity features are derived from the updated node representations of the relation graph. In this way, the abundant information carried by entities is used to model document diversity in search result diversification. Experimental results on commonly used datasets show that the proposed approach can outperform state-of-the-art methods.
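The step-wise selection described in the abstract can be illustrated with a small sketch. This is not the KEDIV model: the graph neural network is replaced here by a plain entity-weight decay heuristic, and the function names, the `decay` and `lam` parameters, and the toy documents are all assumptions made for illustration. It only shows the general idea of re-weighting entity nodes after each selection so that documents covering already-seen entities score lower.

```python
from collections import defaultdict


def build_relation_graph(doc_entities):
    """Build the entity -> documents side of a bipartite document-entity graph."""
    ent_docs = defaultdict(set)
    for doc, ents in doc_entities.items():
        for ent in ents:
            ent_docs[ent].add(doc)
    return ent_docs


def greedy_diversify(doc_entities, relevance, k, decay=0.5, lam=0.5):
    """Greedily pick k documents, trading off relevance against entity novelty.

    Hypothetical stand-in for a learned model: each entity node has a weight
    that starts at 1.0 and is multiplied by `decay` every time a selected
    document mentions it.
    """
    ent_docs = build_relation_graph(doc_entities)
    weight = {ent: 1.0 for ent in ent_docs}  # all entities start uncovered
    selected = []
    candidates = set(doc_entities)

    def score(doc):
        ents = doc_entities[doc]
        # Novelty is the mean current weight of the document's entities.
        novelty = sum(weight[e] for e in ents) / len(ents) if ents else 0.0
        return lam * relevance[doc] + (1 - lam) * novelty

    while candidates and len(selected) < k:
        # Sort first so ties are broken deterministically by document id.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
        # Node weight adjustment: entities of the chosen document count less.
        for ent in doc_entities[best]:
            weight[ent] *= decay
    return selected


# Toy data for an ambiguous query such as "jaguar" (invented for this example).
docs = {
    "d1": {"jaguar_car", "engine"},
    "d2": {"jaguar_car", "dealer"},
    "d3": {"jaguar_animal", "habitat"},
}
rel = {"d1": 0.9, "d2": 0.8, "d3": 0.6}
ranking = greedy_diversify(docs, rel, k=2)
print(ranking)
```

With these toy inputs, ranking by relevance alone would place d1 and d2 first. After d1 is selected, the shared entity `jaguar_car` is down-weighted, so d3 is chosen second even though its relevance score is lower.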

Published In

KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
August 2022
5033 pages
ISBN:9781450393850
DOI:10.1145/3534678

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. knowledge base
  2. relation graph
  3. search result diversification

Qualifiers

  • Research-article

Funding Sources

  • Beijing Outstanding Young Scientist Program
  • the Fundamental Research Funds for the Central Universities
  • the Research Funds of Renmin University of China
  • the Outstanding Innovative Talents Cultivation Funded Programs 2021 of Renmin University of China
  • China Unicom Innovation Ecological Cooperation Plan
  • National Natural Science Foundation of China

Conference

KDD '22

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
