DOI: 10.1145/3041021.3054184

Adaptive Online Hyper-Parameters Tuning for Ad Event-Prediction Models

Published: 03 April 2017

Abstract

Yahoo's native advertising (also known as Gemini native) is one of its fastest growing businesses, reaching a run-rate of several hundred million USD over the past year. Driving the Gemini native models used to predict both click probability (pCTR) and conversion probability (pCONV) is OFFSET, a feature-enhanced collaborative-filtering (CF) based event-prediction algorithm. OFFSET is a one-pass algorithm that updates its model for every new batch of logged data using a stochastic gradient descent (SGD) based approach. Like most learning algorithms, OFFSET includes several hyper-parameters that can be tuned to provide the best performance under given system conditions. Since the marketplace environment is highly dynamic and influenced by seasonality and other temporal factors, keeping a single fixed set of hyper-parameters (or configuration) for the learning algorithm is sub-optimal.
In this work we present an online hyper-parameter tuning algorithm, which takes advantage of the system's parallel map-reduce based architecture and strives to adapt the hyper-parameter set to provide the best performance at each specific time interval. Online evaluation via bucket testing of the tuning algorithm showed a significant 4.3% revenue lift over all traffic, and a staggering 8.3% lift over Yahoo Home-Page section traffic. The tuning algorithm has since been pushed into production, tuning both click- and conversion-prediction models, and is generating a hefty estimated revenue lift of 5% yearly for Yahoo Gemini native.
The proposed tuning mechanism can easily be generalized to fit any learning algorithm that continuously learns from incoming streaming data, adapting its hyper-parameters to temporal changes.
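The abstract does not give implementation details, but the general pattern it describes can be illustrated with a minimal Python sketch: several copies of an incremental, SGD-style learner, each carrying its own hyper-parameter set, are advanced over every new batch of logged events, and the best-performing configuration is adopted for the next time interval. All names below (OffsetModel, log_loss_on, tune_online) are hypothetical; the paper's actual OFFSET implementation, candidate-generation scheme, and selection rule may differ.

import copy
import random


class OffsetModel:
    """Stand-in for a one-pass, SGD-based event-prediction model."""

    def __init__(self, step_size, regularization):
        self.step_size = step_size
        self.regularization = regularization

    def update(self, batch):
        """One SGD pass over a batch of logged events (details omitted)."""

    def log_loss_on(self, batch):
        """Predictive loss on a batch, evaluated before training on it."""
        return random.random()  # placeholder for a real metric such as logloss


def tune_online(batches, base_hparams, n_candidates=8):
    """Race perturbed hyper-parameter sets on each batch; keep the winner."""
    best_hparams = dict(base_hparams)
    model = OffsetModel(**best_hparams)

    for batch in batches:
        # "Map" step: each candidate configuration is evaluated on the
        # incoming batch (in production these would run in parallel).
        candidates = [dict(best_hparams)] + [
            {k: v * random.uniform(0.5, 2.0) for k, v in best_hparams.items()}
            for _ in range(n_candidates - 1)
        ]
        scored = []
        for hp in candidates:
            trial = copy.deepcopy(model)
            trial.step_size = hp["step_size"]
            trial.regularization = hp["regularization"]
            loss = trial.log_loss_on(batch)  # test-then-train evaluation
            trial.update(batch)
            scored.append((loss, hp, trial))

        # "Reduce" step: adopt the configuration with the lowest loss as the
        # production model and hyper-parameter set for the next interval.
        _, best_hparams, model = min(scored, key=lambda item: item[0])

    return best_hparams, model

Because new candidates are re-perturbed around the current winner on every batch, the selected hyper-parameters can drift with seasonality and other temporal changes rather than staying fixed at a single offline-tuned configuration.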



    Published In

    WWW '17 Companion: Proceedings of the 26th International Conference on World Wide Web Companion
    April 2017
    1738 pages
    ISBN:9781450349147

    Sponsors

    • IW3C2: International World Wide Web Conference Committee


    Publisher

    International World Wide Web Conferences Steering Committee

    Republic and Canton of Geneva, Switzerland



    Author Tags

    1. ad click-prediction
    2. ad ranking
    3. hyper-parameters tuning
    4. learning
    5. map-reduce
    6. native ads

    Qualifiers

    • Research-article

    Conference

    WWW '17
    Sponsor: IW3C2

    Acceptance Rates

    WWW '17 Companion paper acceptance rate: 164 of 966 submissions (17%)
    Overall acceptance rate: 1,899 of 8,196 submissions (23%)
