DOI: 10.1145/3627673.3680043
Research article · Open access

Confidence-Aware Multi-Field Model Calibration

Published: 21 October 2024

Abstract

Accurately predicting the probabilities of user feedback, such as clicks and conversions, is critical for advertisement ranking and bidding. However, predicted probabilities often mismatch true likelihoods because of rapid shifts in data distributions and intrinsic model biases. Calibration addresses this issue by post-processing model predictions, and field-aware calibration adjusts model outputs for different feature field values to satisfy fine-grained advertising demands. Unfortunately, the observed samples for certain field values can be too scarce to support confident calibration, which may amplify bias and disturb online serving. In this paper, we propose a confidence-aware multi-field calibration method that adaptively adjusts the calibration intensity based on confidence levels derived from sample statistics. It also combines multiple fields for joint model calibration, weighted by their importance, to mitigate the impact of data sparsity in any single field. Extensive offline and online experiments demonstrate the superiority of our method in boosting advertising performance and reducing prediction deviations.
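The abstract describes the mechanism in words only, so the following is a minimal, hypothetical Python sketch of the general idea, not the authors' actual method. It assumes per-field confidence is derived from the Wilson score interval over sample counts (one standard way to turn sample statistics into confidence levels; the paper's exact construction may differ), and that field importance weights are supplied externally. The names wilson_lower_bound, calibrate, and field_weights are illustrative inventions.

import math

def wilson_lower_bound(pos, n, z=1.96):
    # Lower bound of the Wilson score interval for a binomial rate.
    # With many samples the bound sits close to the empirical rate;
    # with few samples it drops toward zero.
    if n == 0:
        return 0.0
    p = pos / n
    denom = 1.0 + z * z / n
    center = p + z * z / (2.0 * n)
    margin = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n))
    return max(0.0, (center - margin) / denom)

def calibrate(pred, field_stats, field_weights):
    # pred          -- raw predicted probability from the model
    # field_stats   -- {field: (positives, samples)} for this sample's
    #                  field values, gathered from recent traffic
    # field_weights -- {field: importance weight}, assumed given
    adjusted, total_w = 0.0, 0.0
    for field, (pos, n) in field_stats.items():
        if n == 0:
            continue
        rate = pos / n
        # Confidence in [0, 1]: ratio of the conservative lower-bound
        # estimate to the empirical rate. Sparse data -> low confidence.
        conf = wilson_lower_bound(pos, n) / rate if rate > 0 else 0.0
        # Calibration intensity scales with confidence, so a sparsely
        # observed field value barely moves the raw prediction.
        w = field_weights.get(field, 1.0)
        adjusted += w * (conf * rate + (1.0 - conf) * pred)
        total_w += w
    return adjusted / total_w if total_w > 0 else pred

For example, with pred = 0.05, field_stats = {"ad_category": (120, 3000), "region": (2, 40)}, and field_weights = {"ad_category": 0.7, "region": 0.3}, the densely observed ad_category value gets confidence of roughly 0.84 while the 40-sample region value gets roughly 0.28, so the sparse field mostly falls back to the raw prediction rather than its noisy empirical rate.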



Published In

CIKM '24: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management
October 2024, 5705 pages
ISBN: 9798400704369
DOI: 10.1145/3627673
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

  1. confidence-aware
  2. model calibration
  3. multi-field
  4. online advertising


Conference

CIKM '24

Acceptance Rates

Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
