DOI: 10.1145/3219819.3219919
research-article

Applying the Delta Method in Metric Analytics: A Practical Guide with Novel Ideas

Published: 19 July 2018

Abstract

During the last decade, the information technology industry has adopted a data-driven culture, relying on online metrics to measure and monitor business performance. Under the setting of big data, the majority of such metrics approximately follow normal distributions, opening up potential opportunities to model them directly without extra model assumptions and solve big data problems via closed-form formulas using distributed algorithms at a fraction of the cost of simulation-based procedures like bootstrap. However, certain attributes of the metrics, such as their corresponding data generating processes and aggregation levels, pose numerous challenges for constructing trustworthy estimation and inference procedures. Motivated by four real-life examples in metric development and analytics for large-scale A/B testing, we provide a practical guide to applying the Delta method, one of the most important tools from the classic statistics literature, to address the aforementioned challenges. We emphasize the central role of the Delta method in metric analytics by highlighting both its classic and novel applications.
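
As a minimal sketch of the kind of closed-form calculation the abstract refers to, the following Python function applies the standard first-order Delta method to a ratio of two sample means. This is a textbook case, not code taken from the paper. The function name `ratio_delta_ci`, the paired-sample input layout, and the default 1.96 normal critical value are assumptions made for illustration.

```python
from math import sqrt


def ratio_delta_ci(x, y, z=1.96):
    """Delta-method interval for mean(x) / mean(y).

    x and y are equal-length sequences of paired observations
    (x[i] and y[i] come from the same unit). Hypothetical helper,
    assumes len(x) >= 2 and mean(y) != 0.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Unbiased sample variances and covariance.
    var_x = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

    ratio = mean_x / mean_y

    # First-order Taylor expansion of g(u, v) = u / v around the means:
    # Var(ratio) ~ (1/n) * [var_x / my^2 - 2 * mx * cov_xy / my^3 + mx^2 * var_y / my^4]
    var_ratio = (
        var_x / mean_y ** 2
        - 2.0 * mean_x * cov_xy / mean_y ** 3
        + mean_x ** 2 * var_y / mean_y ** 4
    ) / n

    # Guard against tiny negative values caused by floating-point rounding.
    se = sqrt(max(var_ratio, 0.0))
    return ratio, se, (ratio - z * se, ratio + z * se)
```

The estimate depends only on means, variances, and one covariance. In principle these can be accumulated per data partition and then combined, which is why a formula of this kind is cheap relative to resampling.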

Supplementary Material

MP4 File (lu_metric_analytics.mp4)

    Published In

    KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
    July 2018
    2925 pages
    ISBN:9781450355520
    DOI:10.1145/3219819
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. a/b testing
    2. big data
    3. distributed algorithm
    4. large sample theory
    5. longitudinal study
    6. online metrics
    7. quantile inference
    8. randomization

    Qualifiers

    • Research-article

    Conference

    KDD '18
    Acceptance Rates

    KDD '18 Paper Acceptance Rate 107 of 983 submissions, 11%;
    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

    Article Metrics

    • Downloads (Last 12 months): 84
    • Downloads (Last 6 weeks): 3
    Reflects downloads up to 09 Jan 2025

    Cited By

    • (2024) Influence of Defense Mechanisms on Sport Burnout: A Multiple Mediation Analysis Effects of Resilience, Stress and Recovery. Sports 12:10 (274). DOI: 10.3390/sports12100274. Online publication date: 11-Oct-2024
    • (2024) Exploring the moderating effects of SIRT1 and gene polymorphisms rs7895833 on the relationship between hemoglobin levels and physical frailty in elderly adults with comorbid chronic diseases: A moderated mediation analysis. F1000Research 12 (510). DOI: 10.12688/f1000research.133517.3. Online publication date: 8-May-2024
    • (2024) Exploring the moderating effects of SIRT1 and gene polymorphisms rs7895833 on the relationship between hemoglobin levels and physical frailty in elderly adults with comorbid chronic diseases: A moderated mediation analysis. F1000Research 12 (510). DOI: 10.12688/f1000research.133517.2. Online publication date: 16-Apr-2024
    • (2024) Metric Decomposition in A/B Tests. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 4885-4895. DOI: 10.1145/3637528.3671556. Online publication date: 25-Aug-2024
    • (2024) "Ask Me Anything": How Comcast Uses LLMs to Assist Agents in Real Time. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2827-2831. DOI: 10.1145/3626772.3661345. Online publication date: 10-Jul-2024
    • (2024) Practical Batch Bayesian Sampling Algorithms for Online Adaptive Traffic Experimentation. Companion Proceedings of the ACM Web Conference 2024, 471-480. DOI: 10.1145/3589335.3648347. Online publication date: 13-May-2024
    • (2023) Quantifying the Effectiveness of Advertising: A Bootstrap Proportion Test for Brand Lift Testing. SSRN Electronic Journal. DOI: 10.2139/ssrn.4466340. Online publication date: 2023
    • (2023) The Value of External Data for Digital Platforms: Evidence from a Field Experiment on Search Suggestions. SSRN Electronic Journal. DOI: 10.2139/ssrn.4452804. Online publication date: 2023
    • (2023) Exploring the moderating effects of SIRT1 protein expression and gene polymorphisms rs7895833 on the relationship between hemoglobin levels and physical frailty in elderly adults with comorbid chronic diseases: A moderated mediation analysis. F1000Research 12 (510). DOI: 10.12688/f1000research.133517.1. Online publication date: 17-May-2023
    • (2023) Detection and Mitigation of Algorithmic Bias via Predictive Parity. Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, 1801-1816. DOI: 10.1145/3593013.3594117. Online publication date: 12-Jun-2023
