DOI: 10.1145/3538712.3538741
research-article

Data Leakage Mitigation of User-Defined Functions on Secure Personal Data Management Systems

Published: 23 August 2022

Abstract

Personal Data Management Systems (PDMSs) are emerging at a rapid pace, providing individuals with appropriate tools to collect, manage, and share their personal data. At the same time, the emergence of Trusted Execution Environments (TEEs) opens new perspectives for solving the critical and conflicting challenge of securing users’ data while enabling a rich ecosystem of data-driven applications. In this paper, we propose a PDMS architecture that leverages TEEs as a basis for security. Unlike existing solutions, our architecture supports extensive data processing by integrating arbitrary user-defined functions, even those the data owner does not trust. In this context, we focus on aggregate computations over large sets of database objects and provide a first study of how to mitigate the potentially very large data leakage. We introduce the necessary security building blocks and show that an upper bound on data leakage can be guaranteed to the PDMS user. We then propose practical evaluation strategies that keep the potential data leakage minimal at a reasonable performance overhead. Finally, we validate our proposal with an Intel SGX-based PDMS implementation on real data sets.

      Published In

      SSDBM '22: Proceedings of the 34th International Conference on Scientific and Statistical Database Management
      July 2022
      201 pages
      ISBN:9781450396677
      DOI:10.1145/3538712
      Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. Information leakage
      2. Personal Data Management Systems
      3. Trusted Execution Environment
      4. Untrusted Code
      5. User-defined functions

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      SSDBM 2022

      Acceptance Rates

      Overall Acceptance Rate 56 of 146 submissions, 38%
