QuerySnout: Automating the discovery of attribute inference attacks against query-based systems
Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications …, 2022
Although query-based systems (QBSes) have become one of the main solutions to share data anonymously, building QBSes that robustly protect the privacy of individuals contributing to the dataset is a hard problem. Theoretical solutions relying on differential privacy guarantees are difficult to implement correctly with reasonable accuracy, while ad-hoc solutions might contain unknown vulnerabilities. Evaluating the privacy provided by QBSes must thus be done by evaluating the accuracy of a wide range of privacy attacks. However, existing attacks against QBSes require time and expertise to develop, need to be manually tailored to the specific systems attacked, and are limited in scope. In this paper, we develop QuerySnout, the first method to automatically discover vulnerabilities in query-based systems. QuerySnout takes as input a target record and the QBS as a black box, analyzes its behavior on one or more datasets, and outputs a multiset of queries together with a rule to combine answers to them in order to reveal the sensitive attribute of the target record. QuerySnout uses evolutionary search techniques based on a novel mutation operator to find a multiset of queries likely to lead to an attack, and a machine learning classifier to infer the sensitive attribute from the answers to the selected queries. We showcase the versatility of QuerySnout by applying it to two attack scenarios (assuming access to either the private dataset or to a different dataset from the same distribution), three real-world datasets, and a variety of protection mechanisms. We show that the attacks found by QuerySnout consistently match or outperform, sometimes by a large margin, the best attacks from the literature. We finally show how QuerySnout can be extended to QBSes that require a budget, and apply QuerySnout to a simple QBS based on the Laplace mechanism.
Taken together, our results show how powerful and accurate attacks against QBSes can already be found by an automated system, allowing highly complex QBSes to be tested automatically "at the pressing of a button". We believe this line of research is crucial to improving the robustness of systems providing privacy-preserving access to personal data in theory and in practice.
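The pipeline described in the abstract (search over multisets of queries, then a rule that maps the answers to a guess of the sensitive attribute) can be illustrated with a small toy. The sketch below is not the authors' implementation, and every name and parameter in it is hypothetical: the query encoding (`1` for "equals the target's value", `-1` for "differs", `0` for "no condition"), the bounded integer noise used as protection, the single-condition mutation, and the nearest-centroid rule are all assumptions standing in for the mutation operator and machine learning classifier that the abstract mentions but does not specify.

```python
import random

N_ATTRS = 4                # non-sensitive binary attributes (toy schema, assumed)
TARGET = (1, 0, 1, 1)      # the target's known non-sensitive attributes (assumed)

def make_dataset(rng, secret, size=40):
    """Random binary dataset that contains the target with sensitive bit `secret`."""
    rows = [tuple(rng.randint(0, 1) for _ in range(N_ATTRS + 1)) for _ in range(size - 1)]
    return rows + [TARGET + (secret,)]

def qbs_answer(dataset, query, rng, noise=1):
    """Toy black-box QBS: count of rows matching `query`, plus bounded integer noise.

    query[i] is 1 (attribute == reference value), -1 (!=) or 0 (no condition).
    The reference is the target's value, or the value 1 for the sensitive attribute.
    """
    ref = TARGET + (1,)
    count = sum(
        all(c == 0 or (c == 1) == (row[i] == ref[i]) for i, c in enumerate(query))
        for row in dataset
    )
    return count + rng.randint(-noise, noise)  # stand-in protection, not Laplace noise

def fitness(queries, rng, n_train=40, n_val=40):
    """Validation accuracy of a nearest-centroid rule applied to the QBS answers."""
    def sample(n):
        out = []
        for k in range(n):
            secret = k % 2  # balanced labels
            data = make_dataset(rng, secret)
            out.append(([qbs_answer(data, q, rng) for q in queries], secret))
        return out

    train, val = sample(n_train), sample(n_val)
    centroids = {}
    for label in (0, 1):
        vecs = [v for v, s in train if s == label]
        centroids[label] = [sum(col) / len(vecs) for col in zip(*vecs)]

    def predict(v):
        dist = {lab: sum((a - b) ** 2 for a, b in zip(v, c)) for lab, c in centroids.items()}
        return min(dist, key=dist.get)

    return sum(predict(v) == s for v, s in val) / len(val)

def mutate(queries, rng):
    """Copy the multiset and change one condition of one query."""
    new = [list(q) for q in queries]
    qi = rng.randrange(len(new))
    ai = rng.randrange(N_ATTRS + 1)
    new[qi][ai] = rng.choice([c for c in (-1, 0, 1) if c != new[qi][ai]])
    return [tuple(q) for q in new]

def search(rng, n_queries=4, pop_size=8, generations=8):
    """Evolutionary loop: keep the fitter half, refill with mutated copies."""
    pop = [[tuple(rng.choice((-1, 0, 1)) for _ in range(N_ATTRS + 1))
            for _ in range(n_queries)] for _ in range(pop_size)]
    best, best_fit = None, -1.0
    for _ in range(generations):
        scored = sorted(((fitness(p, rng), p) for p in pop),
                        key=lambda t: t[0], reverse=True)
        if scored[0][0] > best_fit:
            best_fit, best = scored[0]
        elite = [p for _, p in scored[: pop_size // 2]]
        pop = elite + [mutate(rng.choice(elite), rng) for _ in range(pop_size - len(elite))]
    return best, best_fit

if __name__ == "__main__":
    queries, accuracy = search(random.Random(0))
    print("best multiset of queries:", queries)
    print("estimated attack accuracy:", accuracy)
```

The attacker in this toy only ever calls `qbs_answer`, which mirrors the black-box setting of the abstract. Because the fitness is estimated on freshly sampled datasets, the reported accuracy is itself noisy; a more careful version would re-evaluate the final multiset on a held-out set of datasets.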