DOI: 10.1145/3583780.3614957
Research Article · Public Access

Manipulating Out-Domain Uncertainty Estimation in Deep Neural Networks via Targeted Clean-Label Poisoning

Published: 21 October 2023

Abstract

Robust out-domain uncertainty estimation has attracted growing attention for its capacity to provide adversary-resistant uncertainty estimates on out-domain samples. However, existing work on robust uncertainty estimation focuses mainly on evasion attacks, which occur at test time; the threat of poisoning attacks against uncertainty models remains largely unexplored. Compared to evasion attacks, poisoning attacks do not need to modify test data and are therefore more practical in real-world applications. In this work, we systematically investigate the robustness of state-of-the-art uncertainty estimation algorithms against data poisoning attacks, with the ultimate objective of developing robust uncertainty training methods. In particular, we focus on attacking out-domain uncertainty estimation. The proposed attack corrupts the model's training process so that a fake high-confidence region is established around the targeted out-domain sample, which the model would otherwise have rejected due to low confidence. More critically, our attack is clean-label and targeted: it leaves the poisoned data with clean labels and attacks a specific targeted test sample without degrading the overall model performance. We evaluate the proposed attack on several image benchmark datasets and a real-world application of COVID-19 misinformation detection. Extensive experimental results across different tasks suggest that state-of-the-art uncertainty estimation methods can be extremely vulnerable and easily corrupted by the proposed attack.
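The attack described above (clean-label poisons that carve out a fake high-confidence region around a chosen out-domain sample) follows the general recipe of feature-collision-style clean-label poisoning. The sketch below is a minimal illustration of that recipe, not the paper's exact optimization: `feature_extractor`, `target_ood_x`, and `base_xs` are hypothetical placeholders, and the feature-matching objective, optimizer, and l_inf budget are assumptions.

```python
import torch

def craft_clean_label_poisons(feature_extractor, target_ood_x, base_xs,
                              eps=8 / 255, steps=200, lr=0.01):
    """Hedged sketch of feature-collision-style clean-label poisoning.

    Perturbs clean in-domain base images (their labels are left untouched)
    so that their features move toward the targeted out-domain sample.
    A model later trained on these poisons may then assign high confidence
    to the target, i.e., a fake high-confidence region forms around it.
    All names and hyperparameters here are illustrative assumptions.
    """
    with torch.no_grad():
        target_feat = feature_extractor(target_ood_x)

    delta = torch.zeros_like(base_xs, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)

    for _ in range(steps):
        poison_feat = feature_extractor(base_xs + delta)
        # Pull the poisons' features toward the out-domain target's features.
        loss = (poison_feat - target_feat).pow(2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)  # keep the perturbation small and hard to notice
            delta.copy_((base_xs + delta).clamp(0.0, 1.0) - base_xs)  # stay in valid pixel range

    return (base_xs + delta).detach()
```

In the threat model sketched in the abstract, such poisons would be injected into the victim's training set with their original clean labels; after training on the poisoned data, the targeted out-domain sample is no longer rejected as low-confidence.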

Supplementary Material

MP4 File (full1241-video.mp4)
Presentation video for the CIKM 2023 paper "Manipulating Out-Domain Uncertainty Estimation in Deep Neural Networks via Targeted Clean-Label Poisoning".




    Published In

    CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management
    October 2023
    5508 pages
    ISBN:9798400701245
    DOI:10.1145/3583780

    Publisher

    Association for Computing Machinery

    New York, NY, United States



    Author Tags

    1. out-domain detection
    2. uncertainty estimation


    Conference

    CIKM '23

    Acceptance Rates

    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%


