QueryCheetah: Fast Automated Discovery of Attribute Inference Attacks Against Query-Based Systems

Authors:

Bozhidar Stevanoski,

Ana-Maria Cretu,

Yves-Alexandre de Montjoye

CCS '24: Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security

Pages 3451 - 3465

https://doi.org/10.1145/3658644.3690272

Published: 09 December 2024

Abstract

Query-based systems (QBSs) are one of the key approaches for sharing data. QBSs allow analysts to request aggregate information from a private, protected dataset. Attacks are a crucial part of ensuring QBSs are truly privacy-preserving. Developing and testing attacks is, however, very labor-intensive and cannot keep pace with the increasing complexity of systems. Automated approaches have been shown to be promising but are currently extremely computationally intensive, which limits their applicability in practice. Here we propose QueryCheetah, a fast and effective method for the automated discovery of privacy attacks against QBSs. We instantiate QueryCheetah on attribute inference attacks and show that it discovers stronger attacks than previous methods while being 18 times faster than the state-of-the-art automated approach. We then show how QueryCheetah allows system developers to thoroughly evaluate privacy risk, including for various attacker strengths and target individuals. Finally, we show how QueryCheetah can be used out of the box to find attacks in larger syntaxes and workarounds for ad-hoc defenses.

Index Terms

QueryCheetah: Fast Automated Discovery of Attribute Inference Attacks Against Query-Based Systems
1. Security and privacy
  1. Human and societal aspects of security and privacy
    1. Privacy protections
    2. Usability in security and privacy

Published In

CCS '24: Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security

December 2024

5188 pages

ISBN:9798400706363

DOI:10.1145/3658644

General Chairs: Bo Luo (University of Kansas, USA), Xiaojing Liao (Indiana University Bloomington, USA), Jun Xu (University of Utah, USA)

Program Chairs: Engin Kirda (Northeastern University, USA), David Lie (University of Toronto, Canada)

Copyright © 2024 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

SIGSAC: ACM Special Interest Group on Security, Audit, and Control

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Engineering and Physical Sciences Research Council
PETRAS National Centre of Excellence for IoT Systems Cybersecurity

Conference

CCS '24

Sponsor:

SIGSAC

CCS '24: ACM SIGSAC Conference on Computer and Communications Security

October 14 - 18, 2024

Salt Lake City, UT, USA

Acceptance Rates

Overall Acceptance Rate 1,261 of 6,999 submissions, 18%
