udCATS: A Comprehensive Unsupervised Deep Learning Framework for Detecting Collective Anomalies in Time Series
Truong Son Pham, Viet Hung Nguyen, Anh Thang Le, Van Duong Bui
DOI: http://dx.doi.org/10.15439/2022R04
Citation: Proceedings of the 2022 Seventh International Conference on Research in Intelligent and Computing in Engineering, Vu Dinh Khoa, Shivani Agarwal, Gloria Jeanette Rincon Aponte, Nguyen Thi Hong Nga, Vijender Kumar Solanki, Ewa Ziemba (eds). ACSIS, Vol. 33, pages 201–206 (2022)
Abstract. Anomaly detection has recently attracted considerable attention from the research community and is widely applied in industrial areas such as information security, finance, banking, and insurance. Data in these fields are mostly time series, so time series anomaly detection plays an essential role in these applications. Many authors have therefore addressed the problem of collective anomaly detection in time series, proposing approaches that range from classical methods such as Isolation Forests to modern deep learning networks such as Autoencoders. However, a comprehensive framework for handling this problem is still lacking. In this work, we first propose an Attention-based Bidirectional LSTM Autoencoder (Att-BiLSTM-AE) as an anomaly detection model. We then develop udCATS, a comprehensive unsupervised deep learning framework for detecting collective anomalies in time series, which is the main contribution of this paper. Our experiments show that the Att-BiLSTM-AE outperforms other detection models, and that using it within the udCATS framework increases detection accuracy.
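As a rough illustration of how an autoencoder-style detector can flag a collective anomaly, the sketch below scores sliding windows of a series by their reconstruction error. It is not the udCATS pipeline or the Att-BiLSTM-AE: a linear PCA projection stands in for the trained autoencoder, and the synthetic series, the window width, the number of components, and the mean-plus-three-standard-deviations threshold are all assumptions made for this example.

```python
import numpy as np

WIDTH = 50          # assumed window width (one period of the synthetic signal)
N_COMPONENTS = 2    # assumed size of the latent representation


def sliding_windows(series, width):
    """Cut a 1-D series into overlapping windows with stride 1."""
    starts = range(len(series) - width + 1)
    return np.stack([series[s:s + width] for s in starts])


def fit_linear_autoencoder(windows, n_components):
    """Stand-in for a trained autoencoder: a PCA projection (not Att-BiLSTM-AE)."""
    mean = windows.mean(axis=0)
    _, _, vt = np.linalg.svd(windows - mean, full_matrices=False)
    return mean, vt[:n_components]


def reconstruction_errors(windows, mean, components):
    """Mean squared error between each window and its reconstruction."""
    centered = windows - mean
    reconstructed = centered @ components.T @ components
    return np.mean((centered - reconstructed) ** 2, axis=1)


# Synthetic data (assumption): a noisy sine wave as "normal" behaviour.
rng = np.random.default_rng(0)
t = np.arange(1200)
train = np.sin(2 * np.pi * t / 50) + 0.05 * rng.standard_normal(t.size)
test = np.sin(2 * np.pi * t / 50) + 0.05 * rng.standard_normal(t.size)

# Inject a collective anomaly: 50 consecutive points oscillate five times faster.
test[600:650] = np.sin(2 * np.pi * t[600:650] / 10)

# Fit on anomaly-free data, then derive a threshold from the training errors.
mean, components = fit_linear_autoencoder(sliding_windows(train, WIDTH), N_COMPONENTS)
train_errors = reconstruction_errors(sliding_windows(train, WIDTH), mean, components)
threshold = train_errors.mean() + 3 * train_errors.std()

# Score the test series; with stride 1 the window index equals its start position.
errors = reconstruction_errors(sliding_windows(test, WIDTH), mean, components)
flagged = errors > threshold

# Mark every point covered by at least one flagged window.
point_mask = np.zeros(test.size, dtype=bool)
for start in np.flatnonzero(flagged):
    point_mask[start:start + WIDTH] = True
```

The individual points of the injected segment stay within the normal value range, so only the window-level score exposes them. That is the sense in which the anomaly is collective rather than point-wise.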