research-article

Query dependent ranking using K-nearest neighbor

Authors:

Heung-Yeung Shum

SIGIR '08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval

Pages 115 - 122

https://doi.org/10.1145/1390334.1390356

Published: 20 July 2008

Abstract

Many ranking models have been proposed in information retrieval, and machine learning techniques have recently been applied to ranking model construction as well. Most existing methods ignore the significant differences that exist between queries and rely on a single function to rank documents. In this paper, we argue that it is necessary to employ different ranking models for different queries, and we conduct what we call query-dependent ranking. As the first such attempt, we propose a K-Nearest Neighbor (KNN) method for query-dependent ranking. We first consider an online method that creates a ranking model for a given query from the labeled neighbors of the query in the query feature space and then ranks the documents with respect to the query using the created model. Next, we give two offline approximations of the method, which create the ranking models in advance to make ranking more efficient. We also prove a theoretical result indicating that the approximations are accurate in terms of difference in prediction loss, provided that the learning algorithm used is stable with respect to minor changes in training examples. Our experimental results show that the proposed online and offline methods both outperform the baseline of using a single ranking function.
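The online procedure described in the abstract, and one possible way of building models in advance, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not name the learning algorithm, the distance measure, or the details of the two offline approximations, so the sketch swaps in a ridge-regularised least-squares scorer, Euclidean distance, and a single "reuse the model of the nearest training query" variant. All function names, parameters, and the toy data are hypothetical.

```python
import numpy as np

def nearest_queries(query_vec, train_query_vecs, k):
    """Indices of the k training queries closest to query_vec.
    Assumption: Euclidean distance in the query feature space."""
    dists = np.linalg.norm(train_query_vecs - query_vec, axis=1)
    return np.argsort(dists, kind="stable")[:k]

def fit_ranker(doc_features, relevance):
    """Stand-in learner (assumption): ridge-regularised least squares
    that maps document features to relevance labels."""
    X = np.vstack(doc_features)
    y = np.concatenate(relevance)
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + 1e-3 * np.eye(d), X.T @ y)

def knn_online_rank(query_vec, cand_docs, train_query_vecs,
                    train_docs, train_labels, k):
    """Online variant: train a model on the labeled neighbors of the
    query, then rank the candidate documents with that model."""
    idx = nearest_queries(query_vec, train_query_vecs, k)
    w = fit_ranker([train_docs[i] for i in idx],
                   [train_labels[i] for i in idx])
    return np.argsort(-(cand_docs @ w), kind="stable")

def knn_offline_models(train_query_vecs, train_docs, train_labels, k):
    """Offline step (hypothetical variant): precompute one model per
    training query from that training query's own neighborhood."""
    models = []
    for q in train_query_vecs:
        idx = nearest_queries(q, train_query_vecs, k)
        models.append(fit_ranker([train_docs[i] for i in idx],
                                 [train_labels[i] for i in idx]))
    return models

def knn_offline_rank(query_vec, cand_docs, train_query_vecs, models):
    """At query time, reuse the precomputed model of the nearest
    training query instead of training a new one."""
    nearest = nearest_queries(query_vec, train_query_vecs, 1)[0]
    return np.argsort(-(cand_docs @ models[nearest]), kind="stable")

# Toy data: two groups of training queries that prefer different
# document features, so a single global model would be a compromise.
docs = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
train_query_vecs = np.array([[0.0, 0.0], [0.1, 0.0],
                             [10.0, 10.0], [10.0, 10.1]])
train_docs = [docs, docs, docs, docs]
train_labels = [np.array([2.0, 0.0, 1.0]), np.array([2.0, 0.0, 1.0]),
                np.array([0.0, 2.0, 1.0]), np.array([0.0, 2.0, 1.0])]

online = knn_online_rank(np.array([0.05, 0.0]), docs, train_query_vecs,
                         train_docs, train_labels, k=2)
models = knn_offline_models(train_query_vecs, train_docs, train_labels, k=2)
offline = knn_offline_rank(np.array([0.05, 0.0]), docs,
                           train_query_vecs, models)
print(online.tolist(), offline.tolist())
```

In this sketch the online path pays the training cost at query time, while the offline path only performs a nearest-neighbor lookup and a dot product per document. That trade-off is the efficiency argument the abstract makes for creating the ranking models in advance.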

Index Terms

Query dependent ranking using K-nearest neighbor
1. Information systems
  1. Information retrieval
    1. Retrieval models and ranking

Published In

SIGIR '08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval

July 2008

934 pages

ISBN:9781605581644

DOI:10.1145/1390334

General Chairs:
Tat-Seng Chua (National University of Singapore),
Mun-Kew Leong (National Library Board, Singapore)

Program Chairs:
Syung Hyon Myaeng (Information and Communications University, Korea),
Douglas W. Oard (University of Maryland, College Park, USA),
Fabrizio Sebastiani (Consiglio Nazionale delle Ricerche, Italy)

Copyright © 2008 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

SIGIR '08

SIGIR '08: The 31st Annual International ACM SIGIR Conference

July 20 - 24, 2008

Singapore, Singapore

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Bibliometrics

Total Citations: 100
Total Downloads: 1,523
Downloads (Last 12 months): 33
Downloads (Last 6 weeks): 1

Reflects downloads up to 14 Sep 2024
