research-article

A novel combinatorial optimization based feature selection method for network intrusion detection

Published: 01 March 2021

Abstract

Advances in communication technologies and ubiquitous access to a wide array of services have created many challenges. The growing number of cyberattacks shows that current security solutions and technologies do not provide effective safeguards against modern attacks. Intrusion is one of the main issues: it has become widespread and can compromise the security of a network of any size. Intrusion Detection / Prevention Systems (IDS / IPS) are used to monitor, inspect and possibly block attacks. However, traditional intrusion detection techniques, such as signature-based or anomaly-based (network behavior) approaches, have many weaknesses. Advances in machine learning algorithms, data mining and soft computing techniques have shown potential for use in IDS. All of these technologies, especially machine learning algorithms, have to deal with the high dimensionality of network traffic data: high-dimensional data is sparse in hyperspace, which restricts the scaling and generalization capabilities of many algorithms. Moreover, the magnitude of the problem grows exponentially when an IDS needs to make decisions in a real-time environment. One way to tackle this issue is to use feature selection techniques to reduce the dimensionality of the data. Feature selection is the process of selecting an optimal subset of features from a large feature set in order to improve classification accuracy and performance and to reduce the cost of extracting features. In this paper, we propose a wrapper-based feature selection method called 'Tabu Search - Random Forest (TS-RF)'. Tabu search is used as the search method, while random forest is used as the learning algorithm for Network Intrusion Detection Systems (NIDS). The proposed model is tested on the UNSW-NB15 dataset, and the obtained results are compared with other feature selection approaches. Results show that TS-RF improves classification accuracy while simultaneously reducing the number of features and the false positive rate.
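The wrapper idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' TS-RF implementation: the function name `tabu_search_select`, the single-feature flip neighbourhood, the tabu tenure, the aspiration rule and the random starting subset are all assumptions made for illustration. The `score` callable stands in for the learning algorithm. In the paper's setting it would be a random forest evaluated on the candidate subset, while here a made-up `toy_score` keeps the example dependency-free.

```python
from collections import deque
from typing import Callable, FrozenSet, Tuple
import random


def tabu_search_select(
    n_features: int,
    score: Callable[[FrozenSet[int]], float],
    n_iter: int = 50,
    tabu_tenure: int = 3,
    seed: int = 0,
) -> Tuple[FrozenSet[int], float]:
    """Search feature subsets using single-feature flips and a short tabu list."""
    rng = random.Random(seed)
    # Hypothetical initialisation: a random subset of the features.
    current = frozenset(i for i in range(n_features) if rng.random() < 0.5)
    best, best_score = current, score(current)
    tabu = deque(maxlen=tabu_tenure)  # indices of recently flipped features

    for _ in range(n_iter):
        candidates = []
        for i in range(n_features):
            neighbour = current ^ {i}  # add feature i if absent, drop it if present
            s = score(neighbour)
            # Aspiration: a tabu move is still allowed if it beats the best so far.
            if i not in tabu or s > best_score:
                candidates.append((s, i, neighbour))
        if not candidates:
            break
        # Take the best admissible neighbour, even if it is worse than `current`;
        # accepting worsening moves is what lets the search leave local optima.
        s, i, current = max(candidates, key=lambda c: (c[0], -c[1]))
        tabu.append(i)
        if s > best_score:
            best, best_score = current, s
    return best, best_score


# Made-up stand-in for the wrapper's evaluator (assumption, not from the paper):
# reward three "informative" features and lightly penalise subset size.
INFORMATIVE = frozenset({1, 4, 7})


def toy_score(subset: FrozenSet[int]) -> float:
    return len(subset & INFORMATIVE) - 0.1 * len(subset)


selected, value = tabu_search_select(n_features=10, score=toy_score)
print(sorted(selected), round(value, 2))
```

To move closer to the setting in the abstract, `toy_score` could be replaced by a function that trains a random forest on the selected columns and returns a validation metric. Because each call then involves model training, caching scores per subset would likely be worthwhile.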


Published In

Computers and Security, Volume 102, Issue C
Mar 2021
414 pages

Publisher

Elsevier Advanced Technology Publications
United Kingdom

Author Tags

1. Intrusion detection
2. Machine learning
3. Feature selection
4. Metaheuristics
