DOI: 10.1145/3531146.3533070
FAccT Conference Proceedings · Research article · Public Access

Dynamic Privacy Budget Allocation Improves Data Efficiency of Differentially Private Gradient Descent

Published: 20 June 2022

Abstract

Protecting privacy while maintaining model performance has become increasingly critical in applications that involve sensitive data. A popular framework is differentially private learning, which composes many privatized gradient iterations, each obtained by clipping and noising the gradients. Under a fixed privacy constraint, dynamic policies have been shown to improve the final iterate loss, i.e., the quality of the published model. In this talk, we introduce dynamic techniques for the learning rate, batch size, noise magnitude, and gradient clipping. We also discuss how dynamic policies change the convergence bounds, which gives further insight into their impact.
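The privatized iteration described above (per-example gradient clipping followed by Gaussian noising) can be sketched in a few lines. The decaying noise schedule below is a hypothetical illustration of one kind of dynamic policy, not the specific allocation analyzed in the talk; the function names and the decay rule are assumptions for exposition only.

```python
import numpy as np

def clip_and_noise(per_example_grads, clip_norm, noise_multiplier, rng):
    """One privatized gradient step: clip each per-example gradient to
    clip_norm, average, and add Gaussian noise scaled to the clip norm."""
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))
    mean_grad = np.mean(clipped, axis=0)
    noise_scale = noise_multiplier * clip_norm / len(per_example_grads)
    noise = rng.normal(0.0, noise_scale, size=mean_grad.shape)
    return mean_grad + noise

def decaying_noise_multiplier(sigma0, step, decay=0.01):
    """Illustrative dynamic policy: shrink the noise multiplier over
    iterations, so later (more refined) iterates are perturbed less.
    Any such schedule must still be accounted for in the total budget."""
    return sigma0 / (1.0 + decay * step)
```

For instance, with `noise_multiplier=0` the step reduces to plain clipped-gradient averaging, which makes the clipping behavior easy to check in isolation.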


Cited By

• (2024) Distributed Harmonization: Federated Clustered Batch Effect Adjustment and Generalization. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 5105–5115. DOI: 10.1145/3637528.3671590. Online publication date: 25-Aug-2024.


Published In

FAccT '22: Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency
June 2022, 2351 pages
ISBN: 9781450393522
DOI: 10.1145/3531146

Publisher

Association for Computing Machinery, New York, NY, United States

        Author Tags

        1. machine learning
        2. privacy

        Qualifiers

        • Research-article
        • Research
        • Refereed limited


