DOI: 10.1145/3658644.3690272
research-article
Open access

QueryCheetah: Fast Automated Discovery of Attribute Inference Attacks Against Query-Based Systems

Published: 09 December 2024

Abstract

Query-based systems (QBSs) are one of the key approaches for sharing data. QBSs allow analysts to request aggregate information from a private protected dataset. Attacks are a crucial part of ensuring QBSs are truly privacy-preserving. The development and testing of attacks is however very labor-intensive and unable to cope with the increasing complexity of systems. Automated approaches have been shown to be promising but are currently extremely computationally intensive, limiting their applicability in practice. We here propose QueryCheetah, a fast and effective method for automated discovery of privacy attacks against QBSs. We instantiate QueryCheetah on attribute inference attacks and show it to discover stronger attacks than previous methods while being 18 times faster than the state-of-the-art automated approach. We then show how QueryCheetah allows system developers to thoroughly evaluate the privacy risk, including for various attacker strengths and target individuals. We finally show how QueryCheetah can be used out-of-the-box to find attacks in larger syntaxes and workarounds around ad-hoc defenses.
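To make the setting concrete, the following is a minimal sketch of a toy query-based system and a textbook differencing attack that infers one person's sensitive attribute from aggregate counts. It is not QueryCheetah's method, which the abstract does not detail. The `ToyQBS` class, the suppression threshold, the `differencing_attack` helper, and the records are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, int]
Predicate = Callable[[Record], bool]


class ToyQBS:
    """Hypothetical query-based system: answers counting queries only and
    suppresses any query that matches fewer than `threshold` records."""

    def __init__(self, records: List[Record], threshold: int = 3) -> None:
        self._records = records      # private dataset, never returned directly
        self._threshold = threshold  # assumed ad-hoc defense: small-count suppression

    def count(self, predicate: Predicate) -> Optional[int]:
        n = sum(1 for r in self._records if predicate(r))
        return n if n >= self._threshold else None  # None means "suppressed"


def differencing_attack(qbs: ToyQBS, is_target: Predicate) -> Optional[int]:
    """Infer the target's binary `sensitive` value from two large counts."""
    with_target = qbs.count(lambda r: r["sensitive"] == 1)
    without_target = qbs.count(
        lambda r: r["sensitive"] == 1 and not is_target(r)
    )
    if with_target is None or without_target is None:
        return None  # one of the queries was suppressed, attack fails
    # The two query sets differ only by the target, so the gap is their value.
    return with_target - without_target


# Illustrative data: the target is unique on (age, zip).
records = [
    {"age": 34, "zip": 1001, "sensitive": 1},  # target
    {"age": 51, "zip": 1001, "sensitive": 1},
    {"age": 29, "zip": 1002, "sensitive": 0},
    {"age": 45, "zip": 1003, "sensitive": 1},
    {"age": 62, "zip": 1002, "sensitive": 1},
    {"age": 38, "zip": 1003, "sensitive": 0},
]
qbs = ToyQBS(records)
is_target = lambda r: r["age"] == 34 and r["zip"] == 1001

# Asking about the target directly is suppressed, but the difference leaks.
direct = qbs.count(lambda r: is_target(r) and r["sensitive"] == 1)
inferred = differencing_attack(qbs, is_target)
print(direct, inferred)
```

In this toy setup the direct query is refused while the pair of broader queries still reveals the target's value. Hand-crafting such query combinations against real defenses (noise, suppression, query restrictions) is the labor-intensive step that automated discovery aims to replace.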


Published In

CCS '24: Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security
December 2024
5188 pages
ISBN: 9798400706363
DOI: 10.1145/3658644
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. attribute inference attacks
2. automating privacy attacks
3. query-based systems


Acceptance Rates

Overall Acceptance Rate: 1,261 of 6,999 submissions, 18%
