DOI: 10.1145/3627106.3627178

PSP-Mal: Evading Malware Detection via Prioritized Experience-based Reinforcement Learning with Shapley Prior

Published: 04 December 2023

Abstract

With the widespread application of machine learning techniques in malware detection, researchers have proposed various adversarial attack methods that generate adversarial examples (AEs) of malware, thereby evading detection. Previous studies have shown that the reinforcement learning (RL) framework can enable black-box attacks by performing a sequence of function-preserving operations, producing functional evasive malware samples. However, it is difficult to obtain useful guidance and feedback from the environment for agent training in the black-box scenario, which prevents the RL framework from learning an effective evasion policy. In this paper, we propose the Shapley prior and establish a prior-guidance-based RL framework, named PSP-Mal, to generate AEs against Portable Executable (PE) malware detectors. Our framework improves on existing methods in three aspects: 1) we explore the feature effects of the black-box model by computing Shapley values and propose the Shapley prior to represent the expected impact of each operation; 2) we establish a novel prioritized experience utilization mechanism guided by the Shapley prior within the RL framework; 3) we expand the actions into item-content pairs and use Thompson sampling to choose effective content, which helps to reduce randomness and ensure repeatability. We compare the attack performance of our framework with that of other methods, and the experimental results demonstrate that our algorithm is more effective: the evasion rates of PSP-Mal against LightGBM models trained on EMBER and SOREL-20M reach 76.88% and 72.03%, respectively.
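
The three components named in the abstract can be made concrete with a short sketch. The Python code below is a minimal illustration of one plausible reading, not the authors' implementation: it assumes a LightGBM detector explained with the shap library's TreeExplainer, and every identifier introduced here (feature_groups, PrioritizedBuffer, ContentBandit, and so on) is a hypothetical name chosen for this sketch. It shows (1) aggregating Shapley values into a per-operation prior, (2) a prioritized replay buffer whose new transitions are seeded with priorities from that prior rather than the usual max-priority rule, and (3) Beta-Bernoulli Thompson sampling over candidate contents of an item-content action.

```python
import numpy as np
import shap  # TreeExplainer supports LightGBM models


# (1) Shapley prior: expected impact of each function-preserving operation.
# `feature_groups` maps an operation name to the indices of the features it
# can perturb (a hypothetical grouping introduced for this sketch).
def shapley_prior(model, X_malware, feature_groups):
    explainer = shap.TreeExplainer(model)
    sv = explainer.shap_values(X_malware)        # (n_samples, n_features)
    if isinstance(sv, list):                     # some binary models return a list
        sv = sv[1]
    mean_impact = np.abs(sv).mean(axis=0)        # global per-feature importance
    return {op: float(mean_impact[idx].sum())
            for op, idx in feature_groups.items()}


# (2) Prioritized experience replay seeded by the Shapley prior: a new
# transition enters the buffer with a priority derived from the prior score
# of the action it used, instead of the usual max-priority rule.
class PrioritizedBuffer:
    def __init__(self, prior, alpha=0.6, eps=1e-3):
        self.prior, self.alpha, self.eps = prior, alpha, eps
        self.data, self.prio = [], []

    def add(self, transition):
        self.data.append(transition)
        self.prio.append(self.prior.get(transition["action"], 0.0) + self.eps)

    def sample(self, batch_size, rng=np.random):
        p = np.asarray(self.prio) ** self.alpha
        p /= p.sum()
        idx = rng.choice(len(self.data), size=batch_size, p=p)
        return [self.data[i] for i in idx], idx

    def update(self, idx, td_errors):
        # After learning, priorities follow the usual |TD-error| schedule.
        for i, e in zip(idx, td_errors):
            self.prio[i] = abs(e) + self.eps


# (3) Thompson sampling over the content part of an item-content action,
# e.g. which section payload or imported name to inject: one
# Beta-Bernoulli arm per candidate content.
class ContentBandit:
    def __init__(self, n_contents):
        self.a = np.ones(n_contents)   # evasions + 1
        self.b = np.ones(n_contents)   # detections + 1

    def choose(self, rng=np.random):
        return int(np.argmax(rng.beta(self.a, self.b)))

    def update(self, k, evaded):
        self.a[k] += evaded            # evaded is 1 if the AE slipped past
        self.b[k] += 1 - evaded        # the detector, else 0
```

In this reading, the Shapley prior biases which transitions the agent replays early in training, before TD errors are informative, and the bandit concentrates content selection on empirically effective payloads, which is one way the framework could "reduce randomness and ensure repeatability" as the abstract claims.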


Published In

ACSAC '23: Proceedings of the 39th Annual Computer Security Applications Conference, December 2023, 836 pages. ISBN: 9798400708862. DOI: 10.1145/3627106.

Publisher: Association for Computing Machinery, New York, NY, United States.

Author Tags

1. Shapley value
2. adversarial example
3. evasion attack
4. malware detection
5. prioritized experience replay
6. reinforcement learning

Qualifiers

• Research-article
• Refereed limited

Conference

ACSAC '23. Overall acceptance rate: 104 of 497 submissions (21%).
