DOI: 10.1145/2808194.2809468

Learning to Reinforce Search Effectiveness

Published: 27 September 2015

Abstract

Session search is an Information Retrieval (IR) task that handles a series of queries issued for a single search task. In this paper, we propose a novel reinforcement learning (RL) style information retrieval framework and develop a new feedback learning algorithm that models user feedback, including clicks and query reformulations, as reinforcement signals and generates rewards in the RL framework. From a new perspective, we view session search as a cooperative game played between two agents, the user and the search engine. We study the communication between the two agents: they continually exchange opinions on "whether the current stage of search is relevant" and "whether we should explore now." The algorithm infers user feedback models from query logs via an EM algorithm. We evaluate our algorithm on the most recent TREC 2012 to 2014 Session Tracks and compare it with several state-of-the-art session search algorithms. The experimental results demonstrate that our approach is highly effective at improving session search accuracy.
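The abstract only summarizes the method, so a concrete sketch may help. Below is a minimal, hypothetical Python sketch of the general idea of turning clicks and query reformulations into reinforcement signals for a search-engine agent. It is not the authors' algorithm (which additionally infers feedback models with EM from query logs); every name in it (SessionStep, ACTIONS, reward_from_feedback, run_session) is invented for illustration.

```python
# A minimal sketch (not the paper's implementation) of treating session-search
# feedback as reinforcement signals: clicks and query reformulations are mapped
# to scalar rewards, and a tabular value estimate over retrieval actions is
# updated. All names here are hypothetical.

import random
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class SessionStep:
    query: str
    clicked: bool          # did the user click a returned document?
    reformulated: bool     # did the user rewrite the query afterwards?

# Hypothetical retrieval actions the search-engine agent can choose between.
ACTIONS = ["exploit_current_query", "explore_reformulation_terms"]

def reward_from_feedback(step: SessionStep) -> float:
    """Map implicit feedback to a reward: a click suggests the current stage
    of search is relevant; an immediate reformulation suggests the engine
    should explore instead."""
    if step.clicked:
        return 1.0
    if step.reformulated:
        return -0.5
    return 0.0

def run_session(session, q_values, epsilon=0.1, alpha=0.2):
    """One epsilon-greedy learning pass over a logged session."""
    for step in session:
        if random.random() < epsilon:
            action = random.choice(ACTIONS)            # explore
        else:
            action = max(ACTIONS, key=lambda a: q_values[a])  # exploit
        r = reward_from_feedback(step)
        # Incremental value update toward the observed reward.
        q_values[action] += alpha * (r - q_values[action])
    return q_values

if __name__ == "__main__":
    log = [SessionStep("pocono mountains", clicked=False, reformulated=True),
           SessionStep("pocono mountains lodging", clicked=True, reformulated=False)]
    print(dict(run_session(log, defaultdict(float))))
```

In this toy version the feedback-to-reward mapping is hard-coded; the paper's framework instead learns such user feedback models from query logs, and frames the interaction as a two-agent cooperative game rather than a single-agent update.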


    Published In

    ICTIR '15: Proceedings of the 2015 International Conference on The Theory of Information Retrieval
    September 2015, 402 pages
    ISBN: 9781450338332
    DOI: 10.1145/2808194
    Publisher

    Association for Computing Machinery, New York, NY, United States


    Author Tags

    1. dynamic information retrieval modeling
    2. reinforcement learning
    3. session search
    4. stochastic game

    Qualifiers

    • Research-article

    Conference

    ICTIR '15

    Acceptance Rates

    ICTIR '15 Paper Acceptance Rate: 29 of 57 submissions, 51%
    Overall Acceptance Rate: 235 of 527 submissions, 45%
