research-article

Log Analysis For Network Attack Detection Using Deep Learning Models

Authors:

Huu Phong Pham,

Viet Hung Nguyen,

Minh Hieu NguyenAuthors Info & Claims

SOICT '23: Proceedings of the 12th International Symposium on Information and Communication Technology

Pages 731 - 738

https://doi.org/10.1145/3628797.3628943

Published: 07 December 2023 Publication History

Abstract

System logs play a vital role in upholding information security by capturing events to address potential risks. Numerous research initiatives have harnessed log data to create machine learning models geared towards spotting unusual activities within systems. In this pragmatic study, we introduce an innovative approach to detecting anomalies in log data, employing a three-step process encompassing preprocessing, advanced natural language processing (NLP) utilizing BERT, and a custom 1D-CNN classification model. During the preprocessing phase, we tokenize the data and eliminate non-essential elements, while BERT enriches log message representations. Our Sliding Window and Overlapping Mechanism ensures consistent input dimensions. The 1D-CNN model extracts temporal features for robust anomaly detection. Empirical findings on HFDS, BGL, Spirit, and Thunderbird datasets illustrate that our method outperforms prior approaches in identifying network attacks.

References

[1]

M. Du, F. Li, G. Zheng, and V. Srikumar, “Deeplog: Anomaly detection and diagnosis from system logs through deep learning,” in Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. ACM, 2017, pp. 1285–1298.

Digital Library

[2]

Weibin Meng, Ying Liu, Yichen Zhu, Shenglin Zhang, Dan Pei, Yuqing Liu, Yihao Chen, Ruizhi Zhang, Shimin Tao, Pei Sun, 2019. LogAnomaly: Unsupervised Detection of Sequential and Quantitative Anomalies in Unstructured Logs. In IJCAI, Vol. 7. 4739–4745.

[3]

Amir Farzad and T Aaron Gulliver. 2020. Unsupervised log message anomaly detection. ICT Express 6, 3 (2020), 229–237.

[4]

Xu Zhang, Yong Xu, Qingwei Lin, Bo Qiao, Hongyu Zhang, Yingnong Dang, Chunyu Xie, Xinsheng Yang, Qian Cheng, Ze Li, 2019. Robust log-based anomaly detection on unstable log data. In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 807–817.

Digital Library

[5]

Weibin Meng, Ying Liu, Yichen Zhu, Shenglin Zhang, Dan Pei, Yuqing Liu, Yihao Chen, Ruizhi Zhang, Shimin Tao, Pei Sun, 2019. LogAnomaly: Unsupervised Detection of Sequential and Quantitative Anomalies in Unstructured Logs. In IJCAI, Vol. 7. 4739–4745.

[6]

Siyang Lu, Xiang Wei, Yandong Li, and Liqiang Wang. 2018. Detecting anomaly in big data system logs using convolutional neural network. In 2018 IEEE 16th Intl Conf on Dependable, Autonomic and Secure Computing, 16th Intl Conf on Pervasive Intelligence and Computing, 4th Intl Conf on Big Data Intelligence and Computing and Cyber Science and Technology Congress (DASC/PiCom/DataCom/CyberSciTech). IEEE, 151–158.

[7]

V.H. L. a. H. Zhang, "Log-based Anomaly Detection Without Log Parsing," arXiv:2108.01955v3, 2021.O’

[8]

A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, “Attention is all you need,” arXiv preprint arXiv:1706.03762, 2017.

[9]

A. J. W. K. Meliboev Azizjon, "1D CNN based network intrusion detection with normalization on imbalanced data," 2020

[10]

M. Schuster and K. Nakajima, “Japanese and korean voice search,” in 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2012, pp. 5149–5152.

[11]

Y. Wu, M. Schuster, Z. Chen, Q. V. Le, M. Norouzi, W. Macherey, M. Krikun, Y. Cao, Q. Gao, K. Macherey, “Google's neural machine translation system: Bridging the gap between human and machine translation,” arXiv preprint arXiv:1609.08144, 2016.

[12]

J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “Bert: Pre-training of deep bidirectional transformers for language understanding,” arXiv preprint arXiv:1810.04805, 2018.

[13]

V. Sanh, L. Debut, J. Chaumond, and T. Wolf, “Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter,” arXiv preprint arXiv:1910.01108, 2019.

[14]

K. Clark, M.-T. Luong, Q. V. Le, and C. D. Manning, “Electra: Pretraining text encoders as discriminators rather than generators,” arXiv preprint arXiv:2003.10555, 2020.

[15]

Q. Le and T. Mikolov, “Distributed representations of sentences and documents,” in International conference on machine learning. PMLR, 2014, pp. 1188–1196.

[16]

J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “Bert: Pre-training of deep bidirectional transformers for language understanding,” arXiv preprint arXiv:1810.04805, 2018.

[17]

(2021) Bert pretrained models. [Online]. Available: https://github.com/google-research/bert

[18]

(2021) Loghub. [Online]. Available: https://github.com/logpai/loghub

[19]

Y. Wu, M. Schuster, Z. Chen, Q. V. Le, M. Norouzi, W. Macherey, M. Krikun, Y. Cao, Q. Gao, K. Macherey, “Google's neural machine translation system: Bridging the gap between human and machine translation,” arXiv preprint arXiv:1609.08144, 2016.

[20]

W. Xu, L. Huang, A. Fox, D. Patterson, and M. I. Jordan, “Detecting large-scale system problems by mining console logs,” in Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles, 2009, pp. 117–132.

Digital Library

[21]

A. Oliner and J. Stearley, “What supercomputers say: A study of five system logs,” in 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN’07). IEEE, 2007, pp. 575– 584.

Digital Library

[22]

S. He, J. Zhu, P. He, and M. R. Lyu, “Loghub: a large collection of system log datasets towards automated log analytics,” arXiv preprint arXiv:2008.06448, 2020.

Index Terms

Log Analysis For Network Attack Detection Using Deep Learning Models
1. Security and privacy
  1. Intrusion/anomaly detection and malware mitigation
    1. Intrusion detection systems

Recommendations

Log-based anomaly detection with deep learning: how far are we?
ICSE '22: Proceedings of the 44th International Conference on Software Engineering

Software-intensive systems produce logs for troubleshooting purposes. Recently, many deep learning models have been proposed to automatically detect system anomalies based on log data. These models typically claim very high detection accuracy. For ...
Deep Learning or Classical Machine Learning? An Empirical Study on Log-Based Anomaly Detection
ICSE '24: Proceedings of the IEEE/ACM 46th International Conference on Software Engineering

While deep learning (DL) has emerged as a powerful technique, its benefits must be carefully considered in relation to computational costs. Specifically, although DL methods have achieved strong performance in log anomaly detection, they often require ...
Robust log-based anomaly detection on unstable log data
ESEC/FSE 2019: Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering

Logs are widely used by large and complex software-intensive systems for troubleshooting. There have been a lot of studies on log-based anomaly detection. To detect the anomalies, the existing methods mainly construct a detection model using log event ...

Comments

Information & Contributors

Information

Published In

cover image ACM Other conferences

SOICT '23: Proceedings of the 12th International Symposium on Information and Communication Technology

December 2023

1058 pages

ISBN:9798400708916

DOI:10.1145/3628797

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 07 December 2023

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Qualifiers

Research-article
Research
Refereed limited

Conference

SOICT 2023

SOICT 2023: The 12th International Symposium on Information and Communication Technology

December 7 - 8, 2023

Ho Chi Minh, Vietnam

Acceptance Rates

Overall Acceptance Rate 147 of 318 submissions, 46%

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

0
Total Citations
76
Total Downloads

Downloads (Last 12 months)76
Downloads (Last 6 weeks)4

Reflects downloads up to 24 Dec 2024

Other Metrics

View Author Metrics

Citations

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

HTML Format

View this article in HTML Format.

Media

Figures

Other

Tables

View Table of Contents