DOI: 10.1145/3580305.3599499

Serverless Federated AUPRC Optimization for Multi-Party Collaborative Imbalanced Data Mining

Published: 04 August 2023

Abstract

To address big data challenges, serverless multi-party collaborative training has recently attracted attention in the data mining community, since it removes the server-node bottleneck and thereby cuts communication costs. However, traditional serverless multi-party collaborative training algorithms were mainly designed for balanced data mining tasks and optimize accuracy-oriented objectives (e.g., cross-entropy). In many real-world applications the data distribution is skewed, and classifiers trained to improve accuracy perform poorly on imbalanced data because the model can be significantly biased toward the majority class. The Area Under the Precision-Recall Curve (AUPRC) was therefore introduced as a more effective metric. Although multiple single-machine methods have been designed to train models for AUPRC maximization, algorithms for multi-party collaborative training have never been studied. The change from the single-machine to the multi-party setting poses critical challenges. For example, existing single-machine AUPRC maximization algorithms maintain an inner state for each local data point, so they are not applicable to large-scale multi-party collaborative training because of this per-data-point dependence.
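To make the imbalance issue concrete, here is a minimal sketch (not from the paper) of how a trivial majority-class predictor can look excellent under accuracy yet collapse under AUPRC. It uses scikit-learn's average_precision_score as a standard AUPRC estimate; the 1% positive rate is an illustrative choice:

```python
# Illustrative only: accuracy vs. AUPRC on a highly skewed label distribution.
import numpy as np
from sklearn.metrics import accuracy_score, average_precision_score

rng = np.random.default_rng(0)

# Skewed labels: roughly 1% positives (the minority class we care about).
y_true = (rng.random(10_000) < 0.01).astype(int)

# A degenerate "classifier" that always predicts the majority class
# and assigns every example the same score.
y_pred = np.zeros_like(y_true)
scores = np.zeros(len(y_true), dtype=float)

print(accuracy_score(y_true, y_pred))           # ~0.99: accuracy looks great
print(average_precision_score(y_true, scores))  # ~0.01: AUPRC exposes the failure
```

Under such skew, accuracy is dominated by the majority class, while average precision (a standard estimator of AUPRC) stays near the positive-class prevalence for an uninformative scorer.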
To address this challenge, in this paper we reformulate serverless multi-party collaborative AUPRC maximization as a conditional stochastic optimization problem and propose a new ServerLess biAsed sTochastic gradiEnt (SLATE) algorithm to optimize the AUPRC directly. We then apply a variance-reduction technique and propose the ServerLess biAsed sTochastic gradiEnt with Momentum-based variance reduction (SLATE-M) algorithm, which improves the convergence rate and matches the best theoretical convergence result achieved by single-machine online methods. To the best of our knowledge, this is the first work to solve the multi-party collaborative AUPRC maximization problem. Finally, extensive experiments show the advantages of directly optimizing the AUPRC with distributed learning methods and verify the efficiency of our new algorithms (i.e., SLATE and SLATE-M).
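For context, a conditional stochastic optimization problem (the template the paper reformulates AUPRC maximization into, in the sense of Hu et al., 2020) has the general form

```latex
\min_{\mathbf{w}} \; F(\mathbf{w})
  \;=\; \mathbb{E}_{\xi}\Big[\, f_{\xi}\big(\mathbb{E}_{\eta \mid \xi}\,[\, g_{\eta}(\mathbf{w}; \xi)\,]\big) \Big],
```

where the inner expectation is conditioned on the outer sample. Replacing that inner expectation with a mini-batch estimate yields a biased gradient estimator, which is why "biased" appears in the algorithm names. As a further illustration, the sketch below combines the two ingredients named in the abstract: a STORM-style momentum-based variance-reduced gradient estimator (Cutkosky and Orabona, 2019) and serverless gossip averaging over a ring of workers. It is not the paper's SLATE or SLATE-M; the quadratic loss, ring mixing matrix, and step sizes are all illustrative assumptions:

```python
# Minimal sketch (not the paper's SLATE/SLATE-M): decentralized training with
# a momentum-based variance-reduced (STORM-style) gradient estimator and
# gossip averaging over a ring topology (no central server).
import numpy as np

rng = np.random.default_rng(1)
n_workers, dim = 4, 5
w_star = rng.standard_normal(dim)  # shared optimum of the toy problem

def stoch_grad(w, noise_scale=0.1):
    """Stochastic gradient of 0.5 * ||w - w_star||^2 plus sampling noise."""
    return (w - w_star) + noise_scale * rng.standard_normal(dim)

# Doubly stochastic mixing matrix for a ring: each worker averages with
# its two neighbors only, which is what makes the scheme serverless.
W = np.zeros((n_workers, n_workers))
for i in range(n_workers):
    W[i, i] = 0.5
    W[i, (i - 1) % n_workers] = 0.25
    W[i, (i + 1) % n_workers] = 0.25

w = rng.standard_normal((n_workers, dim))                    # local models
d = np.stack([stoch_grad(w[i]) for i in range(n_workers)])   # local estimators
eta, beta = 0.1, 0.8

for t in range(200):
    w_new = W @ (w - eta * d)  # local step, then gossip with neighbors
    for i in range(n_workers):
        # STORM-style update: fresh gradient plus a momentum correction.
        # (True STORM evaluates both gradients on the same sample; the
        # independent noise here is a simplification.)
        g_new = stoch_grad(w_new[i])
        g_old = stoch_grad(w[i])
        d[i] = g_new + (1 - beta) * (d[i] - g_old)
    w = w_new

print(np.linalg.norm(w.mean(axis=0) - w_star))  # small, noise-limited error
```

The serverless aspect lives entirely in W: each worker exchanges models only with its ring neighbors, so no single node ever aggregates all updates.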

Supplementary Material

MP4 File (promo.mp4)
AUPRC Optimization, Paper: rtfp1155


Published In

KDD '23: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
August 2023
5996 pages
ISBN:9798400701030
DOI:10.1145/3580305

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. auprc
  2. federated learning
  3. imbalanced data
  4. serverless federated learning
  5. stochastic optimization

Qualifiers

  • Research-article

Funding Sources

  • NSF IIS
  • DBI
  • CNS
  • CCF

Conference

KDD '23

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
