DOI: 10.1145/3600160.3600176

Real-Time Defensive Strategy Selection via Deep Reinforcement Learning

Published: 29 August 2023

Abstract

As computer networks face increasingly sophisticated attacks, there is a need for adaptive defensive systems that can select appropriate countermeasures to thwart them. Training defensive agents with Deep Reinforcement Learning is a promising avenue for meeting this demand. In this paper we describe a simulated computer-network environment in which we conduct attacks and train defensive agents that employ Moving Target Defense and deception strategies. We train an attacking agent, using Proximal Policy Optimization, to learn a policy that extracts sensitive network data from the environment as quickly as possible. We then train a defending agent to prevent the attacker from reaching its objective. Our results demonstrate that the defender learns a policy that inhibits the attacker.
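The abstract describes learning a defensive strategy-selection policy by trial and error against an attacker. As a minimal, self-contained sketch of that underlying idea (not the paper's actual PPO setup or simulated network), the toy loop below has a defender learn, via epsilon-greedy value estimation, which of two countermeasures pays off. The action names, success probabilities, and rewards are all hypothetical stand-ins.

```python
import random

# Hypothetical one-step defensive environment: "honeypot" (deception)
# traps the attacker with probability 0.6; "shuffle" (moving target
# defense) merely delays it. Reward: +10 for a trap, -1 otherwise.
def step(action, rng):
    if action == "honeypot" and rng.random() < 0.6:
        return 10.0
    return -1.0

def train_defender(episodes=2000, epsilon=0.1, alpha=0.05, seed=0):
    rng = random.Random(seed)
    actions = ["shuffle", "honeypot"]
    q = {a: 0.0 for a in actions}  # single-state action-value estimates
    for _ in range(episodes):
        # epsilon-greedy strategy selection: mostly exploit, sometimes explore
        if rng.random() < epsilon:
            a = rng.choice(actions)
        else:
            a = max(actions, key=q.get)
        r = step(a, rng)
        q[a] += alpha * (r - q[a])  # incremental mean update toward reward
    return q
```

After training, the defender's value estimate for the deceptive action exceeds that of the delaying action, so the greedy policy selects it; the paper's approach replaces this tabular update with a PPO-trained neural policy over a much richer network state.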



Published In

ARES '23: Proceedings of the 18th International Conference on Availability, Reliability and Security
August 2023, 1440 pages
ISBN: 9798400707728
DOI: 10.1145/3600160

Publisher

Association for Computing Machinery, New York, NY, United States


          Author Tags

          1. Deception
          2. Deep Reinforcement Learning
          3. Moving Target Defense
          4. Network Security

          Qualifiers

          • Research-article
          • Research
          • Refereed limited


          Acceptance Rates

          Overall Acceptance Rate 228 of 451 submissions, 51%

