DOI: 10.1145/3041021.3054184

Adaptive Online Hyper-Parameters Tuning for Ad Event-Prediction Models

Published: 03 April 2017

Abstract

Yahoo's native advertising (also known as Gemini native) is one of its fastest growing businesses, reaching a run-rate of several hundred million USD over the past year. Driving the Gemini native models used to predict both click probability (pCTR) and conversion probability (pCONV) is OFFSET, a feature-enhanced collaborative-filtering (CF) based event-prediction algorithm. OFFSET is a one-pass algorithm that updates its model for every new batch of logged data using a stochastic gradient descent (SGD) based approach. Like most learning algorithms, OFFSET includes several hyper-parameters that can be tuned to provide the best performance under given system conditions. Since the marketplace environment is highly dynamic and influenced by seasonality and other temporal factors, keeping a single fixed set of hyper-parameters (or configuration) for the learning algorithm is sub-optimal.
In this work we present an online hyper-parameter tuning algorithm, which takes advantage of the system's parallel map-reduce based architecture and strives to adapt the hyper-parameter set to provide the best performance at each specific time interval. Online evaluation via bucket testing of the tuning algorithm showed a significant 4.3% revenue lift over all traffic, and a staggering 8.3% lift over Yahoo Home-Page section traffic. The tuning algorithm has since been pushed into production, tuning both click- and conversion-prediction models, and is generating a hefty estimated revenue lift of 5% yearly for Yahoo Gemini native.
The proposed tuning mechanism can easily be generalized to fit any learning algorithm that continuously learns from incoming streaming data, adapting its hyper-parameters to temporal changes.
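The abstract does not give implementation details, but the general pattern it describes can be illustrated with a minimal Python sketch: several copies of an incremental, SGD-style learner, each carrying its own hyper-parameter set, are advanced over every new batch of logged events, and the best-performing configuration is adopted for the next time interval. All names below (OffsetModel, log_loss_on, tune_online) are hypothetical; the paper's actual OFFSET implementation, candidate-generation scheme, and selection rule may differ.

import copy
import random


class OffsetModel:
    """Stand-in for a one-pass, SGD-based event-prediction model."""

    def __init__(self, step_size, regularization):
        self.step_size = step_size
        self.regularization = regularization

    def update(self, batch):
        """One SGD pass over a batch of logged events (details omitted)."""

    def log_loss_on(self, batch):
        """Predictive loss on a batch, evaluated before training on it."""
        return random.random()  # placeholder for a real metric such as logloss


def tune_online(batches, base_hparams, n_candidates=8):
    """Race perturbed hyper-parameter sets on each batch; keep the winner."""
    best_hparams = dict(base_hparams)
    model = OffsetModel(**best_hparams)

    for batch in batches:
        # "Map" step: each candidate configuration is evaluated on the
        # incoming batch (in production these would run in parallel).
        candidates = [dict(best_hparams)] + [
            {k: v * random.uniform(0.5, 2.0) for k, v in best_hparams.items()}
            for _ in range(n_candidates - 1)
        ]
        scored = []
        for hp in candidates:
            trial = copy.deepcopy(model)
            trial.step_size = hp["step_size"]
            trial.regularization = hp["regularization"]
            loss = trial.log_loss_on(batch)  # test-then-train evaluation
            trial.update(batch)
            scored.append((loss, hp, trial))

        # "Reduce" step: adopt the configuration with the lowest loss as the
        # production model and hyper-parameter set for the next interval.
        _, best_hparams, model = min(scored, key=lambda item: item[0])

    return best_hparams, model

Because new candidates are re-perturbed around the current winner on every batch, the selected hyper-parameters can drift with seasonality and other temporal changes rather than staying fixed at a single offline-tuned configuration.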



    Published In

    WWW '17 Companion: Proceedings of the 26th International Conference on World Wide Web Companion
    April 2017
    1738 pages
    ISBN:9781450349147

    Sponsors

    • IW3C2: International World Wide Web Conference Committee


    Publisher

    International World Wide Web Conferences Steering Committee

    Republic and Canton of Geneva, Switzerland



    Author Tags

    1. ad click-prediction
    2. ad ranking
    3. hyper-parameters tuning
    4. learning
    5. map-reduce
    6. native ads

    Qualifiers

    • Research-article

    Conference

    WWW '17
    Sponsor: IW3C2

    Acceptance Rates

    WWW '17 Companion paper acceptance rate: 164 of 966 submissions (17%)
    Overall acceptance rate: 1,899 of 8,196 submissions (23%)
