DOI: 10.1145/3474369.3486872
research-article

FedV: Privacy-Preserving Federated Learning over Vertically Partitioned Data

Published: 15 November 2021

Abstract

Federated learning (FL) has been proposed to allow multiple parties to collaboratively train machine learning (ML) models while keeping their data private, sharing only model updates. Most existing approaches focus on horizontal FL, while many real scenarios follow a vertically partitioned FL setup, where a complete feature set is formed only when the datasets of all parties are combined and the labels are available to only a single party. Privacy-preserving vertical FL is challenging because no single entity owns the complete sets of labels and features. Existing approaches for vertical FL require multiple peer-to-peer communications among parties, leading to lengthy training times, and are restricted to (approximated) linear models and just two parties. To close this gap, we propose FedV, a framework for secure gradient computation in vertical settings for several widely used ML models such as linear models, logistic regression, and support vector machines. FedV removes the need for peer-to-peer communication among parties by using functional encryption schemes, and it works for larger and changing sets of parties. We empirically demonstrate its applicability to multiple ML models and show reductions of 10%-70% in training time and 80%-90% in data transfer compared with state-of-the-art approaches.
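To make the vertically partitioned setting concrete, the following is a minimal plaintext sketch, not FedV's protocol. It assumes a linear model trained with mean squared error and shows only the arithmetic that a secure scheme would have to protect: the prediction is a sum of per-party inner products, and each party's gradient block needs only the shared residuals and that party's own features. All names (`partial_scores`, `mse_gradients`, the toy data) are hypothetical, and no encryption is performed.

```python
# Plaintext illustration only: no functional encryption, no privacy guarantees.
# It shows how a linear model's gradient decomposes over vertical partitions.

def partial_scores(features, weights):
    """Per-sample inner products w_p . x_p, computed locally by one party."""
    return [sum(w * x for w, x in zip(weights, row)) for row in features]


def mse_gradients(party_features, party_weights, labels):
    """Return one gradient block per party for the mean squared error.

    party_features[p][i] is party p's slice of the features of sample i.
    """
    n = len(labels)
    # Each party contributes only its partial score for every sample.
    scores = [partial_scores(X, w) for X, w in zip(party_features, party_weights)]
    # Summing across parties yields the full prediction w . x per sample.
    preds = [sum(s[i] for s in scores) for i in range(n)]
    residuals = [p - y for p, y in zip(preds, labels)]
    grads = []
    for X in party_features:
        d = len(X[0])
        # Party p's block depends on the residuals and on p's own features only.
        grads.append(
            [2.0 / n * sum(residuals[i] * X[i][j] for i in range(n)) for j in range(d)]
        )
    return grads


# Toy data (assumed): party A holds two features, party B holds one feature.
# The labels are treated as known here; in the abstract's setting a single
# party owns them.
party_a = [[1.0, 2.0], [0.0, 1.0], [3.0, 1.0]]
party_b = [[0.5], [1.5], [2.0]]
labels = [1.0, 0.0, 2.0]
w_a = [0.1, -0.2]
w_b = [0.3]

grads = mse_gradients([party_a, party_b], [w_a, w_b], labels)
print(grads)
```

The concatenated gradient blocks match the gradient computed on the combined dataset. In this sketch the partial scores and residuals are exchanged in the clear; per the abstract, FedV instead uses functional encryption for the gradient computation, so that parties do not need to communicate peer to peer.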

Supplementary Material

MP4 File (AISec21-aisec44.mp4)
In this video, we present our paper accepted at the ACM AISec 2021 Workshop, titled "FedV: Privacy-Preserving Federated Learning over Vertically Partitioned Data." While the bulk of federated learning solutions focus on horizontal federated learning, this talk discusses vertical federated learning, which presents a distinct set of challenges: unlike in horizontal federated learning, a party in a vertical FL scenario cannot train a local model independently. We propose a novel and effective strategy for training machine learning models in vertical FL settings, following the path of crypto-based solutions for vertical FL training. Our proposed method has several advantages: it has a straightforward communication topology, does not require peer-to-peer communication, has a smaller communication payload, is efficient and secure in terms of computation, and supports a variety of machine learning models.

Published In

AISec '21: Proceedings of the 14th ACM Workshop on Artificial Intelligence and Security
November 2021
210 pages
ISBN:9781450386579
DOI:10.1145/3474369
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. functional encryption
  2. privacy-preserving
  3. vertical federated learning

Qualifiers

  • Research-article

Conference

CCS '21
Acceptance Rates

Overall Acceptance Rate 94 of 231 submissions, 41%
