DOI: 10.1145/1401890.1401906

Structured learning for non-smooth ranking losses

Published: 24 August 2008

Abstract

Learning to rank from relevance judgments is an active research area. Itemwise score regression, pairwise preference satisfaction, and listwise structured learning are the major techniques in use. Listwise structured learning has recently been applied to optimize important non-decomposable ranking criteria such as AUC (area under the ROC curve) and MAP (mean average precision). Within the max-margin structured learning framework, we propose new, almost-linear-time algorithms to optimize two other criteria widely used to evaluate search systems: MRR (mean reciprocal rank) and NDCG (normalized discounted cumulative gain). We also demonstrate that different ranking criteria may call for different feature maps. Search applications should not be optimized in favor of a single criterion, because they must cater to a variety of queries: for example, MRR is best suited to navigational queries, while NDCG is best suited to informational queries. A key contribution of this paper is to fold multiple ranking loss functions into a single multi-criteria max-margin optimization. The result is a single, robust ranking model whose accuracy is close to the best of learners trained on individual criteria. In fact, experiments on the popular LETOR and TREC data sets show that, contrary to conventional wisdom, a test criterion is often not best served by training with that same criterion.
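
For reference, the two criteria the paper targets have the following standard definitions, reproduced here from common IR usage rather than from the paper's own notation (one popular DCG gain/discount variant is shown):

$$\mathrm{MRR} = \frac{1}{|Q|} \sum_{q \in Q} \frac{1}{\mathrm{rank}_q}, \qquad \mathrm{NDCG@}k = \frac{\mathrm{DCG@}k}{\mathrm{IDCG@}k}, \qquad \mathrm{DCG@}k = \sum_{i=1}^{k} \frac{2^{\mathrm{rel}_i} - 1}{\log_2(i + 1)},$$

where $\mathrm{rank}_q$ is the position of the first relevant result for query $q$, $\mathrm{rel}_i$ is the graded relevance of the document at rank $i$, and $\mathrm{IDCG@}k$ is the $\mathrm{DCG@}k$ of the ideal, relevance-sorted ordering. Both measures depend on sorted ranks, which is what makes them non-smooth, non-decomposable functions of the model scores.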
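
The max-margin structured learning framework referred to above is, in its generic margin-rescaled form (following Tsochantaridis et al., JMLR 2005), sketched below; the paper's specific feature maps and loss-dependent inference routines are not reproduced here:

$$\min_{w,\ \xi \ge 0} \ \frac{1}{2}\|w\|^2 + C \sum_{q} \xi_q \quad \text{s.t.} \quad w^{\top}\big(\phi(x_q, y_q) - \phi(x_q, y)\big) \ \ge\ \Delta(y_q, y) - \xi_q \quad \forall q,\ \forall y \ne y_q,$$

where $x_q$ is the document list for query $q$, $y_q$ is its reference ranking, $\phi$ is a joint feature map, and $\Delta$ is the target ranking loss (e.g., $1 - \mathrm{NDCG}$ or $1 - \mathrm{MRR}$). One natural reading of the multi-criteria combination described in the abstract is to replace $\Delta$ with a weighted sum $\sum_k \lambda_k \Delta_k$ of individual losses; the paper's exact construction may differ.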
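
As a concrete complement, here is a minimal Python sketch of both metrics; the function names and toy data are illustrative rather than the paper's, and the DCG variant matches the formula above.

```python
import math

def mrr(first_relevant_ranks):
    """Mean reciprocal rank over a set of queries.
    Each entry is the 1-based rank of the first relevant result
    for one query, or None if nothing relevant was retrieved."""
    return sum(0.0 if r is None else 1.0 / r
               for r in first_relevant_ranks) / len(first_relevant_ranks)

def dcg(relevances, k):
    """DCG@k with the (2^rel - 1) gain and log2(rank + 1) discount."""
    return sum((2 ** rel - 1) / math.log2(i + 2)
               for i, rel in enumerate(relevances[:k]))

def ndcg(relevances, k):
    """NDCG@k: DCG of the given ordering over DCG of the ideal ordering."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0

# Toy example: graded relevances of the top-5 results for one query,
# and first-relevant ranks for three queries.
print(round(ndcg([3, 2, 3, 0, 1], k=5), 3))  # ~0.957: near-ideal ordering
print(round(mrr([1, None, 3]), 3))           # (1/1 + 0 + 1/3) / 3 = 0.444
```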

Published In

KDD '08: Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
August 2008
1116 pages
ISBN: 9781605581934
DOI: 10.1145/1401890
General Chair: Ying Li
Program Chairs: Bing Liu, Sunita Sarawagi

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. max-margin structured learning to rank
  2. non-decomposable loss functions

Qualifiers

  • Research-article

Conference

KDD '08

Acceptance Rates

KDD '08 paper acceptance rate: 118 of 593 submissions (20%)
Overall acceptance rate: 1,133 of 8,635 submissions (13%)
