DOI: 10.1145/1367497.1367509
research-article

Recommending questions using the MDL-based tree cut model

Published: 21 April 2008

Abstract

This paper addresses the problem of question recommendation. Specifically, given a question as a query, we retrieve and rank other questions according to their likelihood of being good recommendations for the queried question. A good recommendation provides alternative aspects around the users' interests. We tackle question recommendation in two steps: first, represent questions as graphs of topic terms; then, rank recommendations on the basis of those graphs. We formalize both steps as tree-cutting problems and employ the MDL (Minimum Description Length) principle to select the best cuts. Experiments were conducted with real questions posted on Yahoo! Answers, covering two domains, 'travel' and 'computers & internet'. The results indicate that the MDL-based tree cut model significantly outperforms the baseline methods of word-based VSM (vector space model) and phrase-based VSM. The results also show that the MDL-based tree cut model is essential to our approach.
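To make the idea of selecting a tree cut by MDL concrete, the following is a minimal sketch and not the paper's implementation. It assumes a commonly used tree cut formulation: a cut is a set of nodes that partitions the leaves, the probability mass of each cut node is spread uniformly over its leaves, and the description length is a parameter cost of (number of cut nodes / 2) * log2(sample size) plus the data cost in bits. The taxonomy, the counts, and the names Node, description_length, and find_mdl_cut are all hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    count: int = 0  # observed frequency; meaningful for leaves only
    children: list = field(default_factory=list)


def leaves(node):
    """Return all leaf nodes below (or equal to) the given node."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]


def description_length(cut, sample_size):
    """Bits needed to encode the model (the cut) plus the data given the model."""
    # Assumed parameter cost: half of log2(sample size) per node in the cut.
    param_bits = 0.5 * len(cut) * math.log2(sample_size)
    data_bits = 0.0
    for cls in cut:
        members = leaves(cls)
        freq = sum(m.count for m in members)
        if freq > 0:
            # The class probability is shared uniformly among its leaves.
            prob = freq / (sample_size * len(members))
            data_bits -= freq * math.log2(prob)
    return param_bits + data_bits


def find_mdl_cut(node, sample_size):
    """Bottom-up search: keep a subtree's root only if it encodes more cheaply."""
    if not node.children:
        return [node]
    lower = [n for child in node.children
             for n in find_mdl_cut(child, sample_size)]
    if description_length([node], sample_size) < description_length(lower, sample_size):
        return [node]
    return lower


# Toy taxonomy with made-up counts (illustration only).
bird = Node("BIRD", children=[Node("swallow", 0), Node("crow", 2),
                              Node("eagle", 2), Node("bird", 4)])
insect = Node("INSECT", children=[Node("bug", 0), Node("bee", 2),
                                  Node("insect", 0)])
root = Node("ANIMAL", children=[bird, insect])

sample_size = sum(leaf.count for leaf in leaves(root))
cut = find_mdl_cut(root, sample_size)
cut_names = [n.name for n in cut]
print(cut_names)  # ['BIRD', 'INSECT']
```

With these toy counts, the cut [BIRD, INSECT] is cheaper to encode than either the single root node or the full set of leaves, which illustrates how MDL trades model size against fit to the data. How the paper builds its trees from questions and topic terms is not specified in the abstract and is not modeled here.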

Published In

WWW '08: Proceedings of the 17th international conference on World Wide Web
April 2008
1326 pages
ISBN: 9781605580852
DOI: 10.1145/1367497

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. minimum description length
  2. query suggestion
  3. question recommendation
  4. tree cut model

Acceptance Rates

Overall acceptance rate: 1,899 of 8,196 submissions (23%)
