DOI: 10.1145/3589335.3651445
short-paper
Open access

SE-PQA: Personalized Community Question Answering

Published: 13 May 2024

Abstract

Personalization in Information Retrieval has been studied for a long time. Nevertheless, there is still a lack of high-quality, real-world datasets for conducting large-scale experiments and evaluating models for personalized search. This paper contributes to filling this gap by introducing SE-PQA (StackExchange - Personalized Question Answering), a new curated resource for designing and evaluating personalized models for the task of community Question Answering (cQA). The contributed dataset includes more than 1 million queries and 2 million answers, annotated with a rich set of features modeling the social interactions among the users of a popular cQA platform. We describe the characteristics of SE-PQA and detail the features associated with questions and answers. We also provide reproducible baseline methods for the cQA task based on the resource, including deep learning models and personalization approaches. The results of preliminary experiments show that SE-PQA is suitable for training effective cQA models; they also show that personalization remarkably improves the effectiveness of all the methods tested. Furthermore, we show the benefits, in terms of robustness and generalization, of combining data from multiple communities for personalization purposes.

Published In

WWW '24: Companion Proceedings of the ACM Web Conference 2024
May 2024
1928 pages
ISBN: 9798400701726
DOI: 10.1145/3589335
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. personalization
  2. question answering
  3. user model

Qualifiers

  • Short-paper

Funding Sources

  • CINECA

Conference

WWW '24
WWW '24: The ACM Web Conference 2024
May 13 - 17, 2024
Singapore, Singapore

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

Article Metrics

  • Total Citations: 0
  • Total Downloads: 341
  • Downloads (Last 12 months): 341
  • Downloads (Last 6 weeks): 61
Reflects downloads up to 16 Jan 2025
