Exploring the Space of Black-box Attacks on Deep Neural Networks

Bhagoji, Arjun Nitin; He, Warren; Li, Bo; Song, Dawn


arXiv:1712.09491 (cs)

[Submitted on 27 Dec 2017]

Abstract: Existing black-box attacks on deep neural networks (DNNs) have largely focused on transferability, where an adversarial instance generated for a locally trained model can "transfer" to attack other learning models. In this paper, we propose novel Gradient Estimation black-box attacks for adversaries with query access to the target model's class probabilities; these attacks do not rely on transferability. We also propose strategies to decouple the number of queries required to generate each adversarial sample from the dimensionality of the input. An iterative variant of our attack achieves close to 100% adversarial success rates for both targeted and untargeted attacks on DNNs. We carry out extensive experiments for a thorough comparative evaluation of black-box attacks and show that the proposed Gradient Estimation attacks outperform all transferability-based black-box attacks we tested on both the MNIST and CIFAR-10 datasets, achieving adversarial success rates similar to those of well-known, state-of-the-art white-box attacks. We also apply the Gradient Estimation attacks successfully against a real-world Content Moderation classifier hosted by Clarifai. Furthermore, we evaluate black-box attacks against state-of-the-art defenses and show that the Gradient Estimation attacks are very effective even against these defenses.
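The abstract does not spell out the estimator, so the following is only a minimal sketch of how a query-based gradient estimate might look. It assumes two-sided finite differences on the cross-entropy loss and assumes that random grouping of input coordinates is one way to make the query count independent of the input dimension. The toy linear model and the names `make_toy_model`, `estimate_gradient`, and `fgs_step` are hypothetical and are not taken from the paper.

```python
import numpy as np


def softmax(z):
    z = z - z.max()  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()


def make_toy_model(dim, n_classes, seed=0):
    """Hypothetical stand-in for the target: a linear softmax classifier.

    The attacker only sees the returned `query` function, which maps an
    input vector to class probabilities.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(n_classes, dim))

    def query(x):
        return softmax(W @ x)

    return query


def estimate_gradient(query, x, label, n_groups, delta=1e-3, rng=None):
    """Estimate the gradient of -log p[label] with respect to x.

    Assumption: coordinates are split into `n_groups` random groups and each
    group is perturbed jointly, so the cost is 2 * n_groups queries regardless
    of x.size. With n_groups == x.size this is plain coordinate-wise
    two-sided finite differences.
    """
    if rng is None:
        rng = np.random.default_rng(0)

    def loss(v):
        return -np.log(query(v)[label] + 1e-12)

    grad = np.zeros_like(x)
    for idx in np.array_split(rng.permutation(x.size), n_groups):
        v = np.zeros_like(x)
        v[idx] = 1.0
        # Directional derivative along the group indicator, shared by
        # every coordinate in the group.
        grad[idx] = (loss(x + delta * v) - loss(x - delta * v)) / (2 * delta)
    return grad


def fgs_step(query, x, label, eps, n_groups):
    """One untargeted sign-gradient step using the estimated gradient."""
    g = estimate_gradient(query, x, label, n_groups)
    return np.clip(x + eps * np.sign(g), 0.0, 1.0)
```

In this sketch, `fgs_step(query, x, label, eps=0.05, n_groups=20)` on a 20-dimensional input spends 40 queries on the estimate; lowering `n_groups` trades estimate quality for fewer queries. An iterative variant would repeat the step with a smaller `eps` while keeping the total perturbation within a fixed budget.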

Comments:	25 pages, 7 figures, 10 tables
Subjects:	Machine Learning (cs.LG); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:1712.09491 [cs.LG]
	(or arXiv:1712.09491v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1712.09491

Submission history

From: Arjun Nitin Bhagoji
[v1] Wed, 27 Dec 2017 04:39:02 UTC (635 KB)
