DOI: 10.1145/3626772.3657966
short-paper
Open access

Multi-Layer Ranking with Large Language Models for News Source Recommendation

Published: 11 July 2024

Abstract

To find reliable information sources for news events, we introduce a novel task of expert recommendation, which aims to identify trustworthy sources based on their previously quoted statements. To this end, we built a new dataset, NewsQuote, consisting of 23,571 quote-speaker pairs sourced from a collection of news articles. We formulate the recommendation task as the retrieval of experts based on their likelihood of being associated with a given query. We also propose a multi-layer ranking framework that employs Large Language Models to improve recommendation performance. Our results show that an in-context-learning-based LLM ranker and a multi-layer ranking-based filter significantly improve both the predictive quality and the behavioural quality of the recommender system.
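
As a rough illustration of the task formulation only, the sketch below retrieves speakers whose past quotes match a query and then passes the shortlist through a second, pluggable ranking layer. It is not the paper's implementation: the bag-of-words cosine score, the function names `first_layer` and `second_layer`, the `rerank` callback (a stand-in for an LLM ranker) and the toy `PAIRS` data are all assumptions made for this example.

```python
import math
import re
from collections import Counter, defaultdict

# Toy quote-speaker pairs (hypothetical data, not taken from NewsQuote).
PAIRS = [
    ("The vaccine trial showed strong immune response", "Dr. A"),
    ("Interest rates will rise next quarter", "Economist B"),
    ("Vaccine supply remains limited in rural areas", "Minister C"),
    ("The team played well tonight", "Coach D"),
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def first_layer(query, quote_speaker_pairs, k=3):
    """Score each speaker by their best-matching quote and keep the top k."""
    q = Counter(tokenize(query))
    best = defaultdict(float)
    for quote, speaker in quote_speaker_pairs:
        best[speaker] = max(best[speaker], cosine(q, Counter(tokenize(quote))))
    # Sort by descending score; break ties by name for determinism.
    ranked = sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))
    return [speaker for speaker, score in ranked[:k] if score > 0]


def second_layer(query, candidates, rerank):
    """Re-order the shortlist with `rerank`, then filter its output.

    `rerank` is an assumed callback (for instance, a wrapper around an LLM
    prompt). Names it returns that are not in the shortlist, or that repeat,
    are dropped.
    """
    allowed = set(candidates)
    kept = []
    for speaker in rerank(query, candidates):
        if speaker in allowed and speaker not in kept:
            kept.append(speaker)
    return kept
```

Calling `first_layer("vaccine trial results", PAIRS)` shortlists the two speakers with vaccine-related quotes. Any callable that takes the query and the shortlist and returns an ordered list of names can then be supplied as `rerank`.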

    Published In

    SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval
    July 2024
    3164 pages
    ISBN: 9798400704314
    DOI: 10.1145/3626772
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. in-context learning
    2. large language model
    3. recommender system

    Qualifiers

    • Short-paper

    Funding Sources

    • Turing AI Fellowship
    • UK Engineering and Physical Sciences Research Council

    Conference

    SIGIR 2024

    Acceptance Rates

    Overall Acceptance Rate 792 of 3,983 submissions, 20%
