research-article

Malware classification with LSTM and GRU language models and a character-level CNN

Authors:

Ben Athiwaratkun,

Jack W. Stokes

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Pages 2482 - 2486

https://doi.org/10.1109/ICASSP.2017.7952603

Published: 05 March 2017

Abstract

Malicious software, or malware, continues to be a problem for computer users, corporations, and governments. Previous research [1] has explored training file-based malware classifiers using a two-stage approach. In the first stage, a malware language model learns the feature representation, which is then input to a second-stage malware classifier. In Pascanu et al. [1], the language model is either a standard recurrent neural network (RNN) or an echo state network (ESN). In this work, we propose several new malware classification architectures, which include a long short-term memory (LSTM) language model and a gated recurrent unit (GRU) language model. We also propose using an attention mechanism similar to [12] from the machine translation literature, in addition to the temporal max pooling used in [1], as an alternative way to construct the file representation from neural features. Finally, we propose a new single-stage malware classifier based on a character-level convolutional neural network (CNN). Results show that the LSTM with temporal max pooling and logistic regression offers a 31.3% improvement in the true positive rate compared to the best system in [1] at a false positive rate of 1%.
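The two ways of building a file representation named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the toy hidden states, and the `score_vector` parameter are hypothetical, and the attention scoring is a simplified dot product rather than the formulation in [12]. It only shows how a variable-length sequence of hidden states can be reduced to one fixed-length vector, either by an element-wise maximum over time or by a softmax-weighted average.

```python
import math


def temporal_max_pool(hidden_states):
    """Element-wise maximum over time steps: one value per hidden unit."""
    return [max(column) for column in zip(*hidden_states)]


def attention_pool(hidden_states, score_vector):
    """Softmax-weighted average of the hidden states.

    score_vector is a hypothetical learned parameter; each time step is
    scored by a plain dot product (an assumption made for brevity).
    """
    scores = [
        sum(h_d * v_d for h_d, v_d in zip(h, score_vector))
        for h in hidden_states
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(hidden_states[0])
    pooled = [
        sum(w * h[d] for w, h in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return pooled, weights


# Toy hidden states: 3 time steps, 2 hidden units (made-up values).
states = [[0.1, -0.5], [0.9, 0.2], [-0.3, 0.7]]
max_repr = temporal_max_pool(states)
att_repr, att_weights = attention_pool(states, [1.0, 0.0])
```

In the two-stage setting described above, a vector such as `max_repr` or `att_repr` would be the input to the second-stage classifier, for example logistic regression.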

7. References

[1]

Razvan Pascanu, Jack W. Stokes, Hermineh Sanossian, Mady Marinescu, and Anil Thomas, “Malware classification with recurrent networks,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2015, pp. 1916–1920.

[2]

Microsoft Malware Protection Center, “Submit a sample,” https://www.microsoft.com/en-us/security/portal/submission/submit.aspx/, 2016.

[3]

George E. Dahl, Jack W. Stokes, Li Deng, and Dong Yu, “Large-scale malware classification using random projections and neural networks,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2013.

[4]

Joshua Saxe and Konstantin Berlin, “Deep neural network based malware detection using two dimensional binary program features,” arXiv preprint arXiv:1508.03096v2, 2015.

[5]

Wenyi Huang and Jack W. Stokes, “Mtnet: A multi-task neural network for dynamic malware classification,” in Proceedings of Detection of Intrusions and Malware, and Vulnerability Assessment (DIMVA), 2016, pp. 399–418.

[6]

Seonhee Seok and Howon Kim, “Visualized malware classification based-on convolutional neural network,” in Proceedings of Korea Institutes of Information Security and Cryptology, 2016, pp. 197–208.

[7]

T. Mikolov, M. Karafiat, L. Burget, J. Cernocky, and S. Khudanpur, “Recurrent neural network based language model,” in Proceedings of Interspeech, 2010.

[8]

H. Jaeger and H. Haas, “Harnessing nonlinearity: Predicting chaotic systems and saving energy in wireless communication,” in Science, 2004.

[9]

Junyoung Chung, Caglar Gulcehre, KyungHyun Cho, and Yoshua Bengio, “Empirical evaluation of gated recurrent neural networks on sequence modeling,” in NIPS 2014 Deep Learning and Representation Learning Workshop, 2014.

[10]

Sepp Hochreiter and Jurgen Schmidhuber, “Long short-term memory,” Neural Computation, 1997, pp. 1735–1780.

[11]

Kyunghyun Cho, Bart van Merrienboer, Dzmitry Bahdanau, and Yoshua Bengio, “On the properties of neural machine translation: Encoder-decoder approaches,” in Proceedings of the Workshop on Syntax, Semantics and Structure in Statistical Translation (SSST), 2014.

[12]

Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio, “Neural machine translation by jointly learning to align and translate,” in Proceedings of the International Conference on Learning Representations (ICLR), 2015.

[13]

Y. Bengio, N. Boulanger-Lewandowski, and R. Pascanu, “Advances in optimizing recurrent networks,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2013.

[14]

Mike Schuster and Kuldip K. Paliwal, “Bidirectional recurrent neural networks,” IEEE Transactions on Signal Processing, vol. 45, pp. 2673–2681, November 1997.


[15]

Keras Development Team, “Keras: Deep learning library for Theano and TensorFlow,” https://keras.io/, 2016.

[16]

Vinod Nair and Geoffrey E. Hinton, “Rectified linear units improve restricted Boltzmann machines,” in Proceedings of the International Conference on Machine Learning (ICML), 2010, pp. 807–814.

[17]

Xiang Zhang, Junbo Zhao, and Yann LeCun, “Character-level convolutional networks for text classification,” in Advances in Neural Information Processing Systems (NIPS), C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett, Eds., pp. 649–657. Curran Associates, Inc., 2015.

[18]

N. Idika and A.P. Mathur, “A survey of malware detection techniques,” Tech. Rep., Purdue Univ., February 2007.

[19]

Theano Development Team, “Theano: A Python framework for fast computation of mathematical expressions,” arXiv e-prints, vol. abs/1605.02688, May 2016.

[20]

Jeffrey O. Kephart, “A biologically inspired immune system for computers,” in Artificial Life IV: Proceedings of the Fourth International Workshop on the Synthesis and Simulation of Living Systems. 1994, pp. 130–139, MIT Press.

[21]

M.G. Schultz, Eleazar Eskin, E. Zadok, and S. Stolfo, “Data mining methods of detection of new malicious executables,” in Proceedings of the 2001 IEEE Symposium on Security and Privacy, 2001, pp. 38–49.

[22]

J.Z. Kolter and M.A. Maloof, “Learning to detect and classify malicious executables in the wild,” in Journal of Machine Learning Research, 2006, pp. 2721–2744.

[23]

Wenke Lee, Salvatore J. Stolfo, and Kui W. Mok, “A data mining framework for building intrusion detection models,” in Proceedings of the IEEE Symposium on Security and Privacy (SP), 1999, pp. 120–132.



Published In


2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Mar 2017

6527 pages

Copyright © 2017.

Publisher

IEEE Press


