DOI: 10.1145/3629527.3651844
short-paper
Open access

Grammar-Based Anomaly Detection of Microservice Systems Execution Traces

Published: 07 May 2024

Abstract

Microservice architectures are a widely adopted architectural pattern for large-scale applications. Given the wide adoption of these systems, several works have proposed detecting performance anomalies by analysing execution traces. However, most of the proposed approaches rely on machine learning (ML) algorithms to detect anomalies. While ML methods may be effective in detecting anomalies, the training and deployment of these systems has been shown to be less efficient in terms of time, computational resources, and energy required.
In this paper, we propose a novel approach based on context-free grammars for anomaly detection in microservice system execution traces. We employ the SAX encoding to transform execution traces into strings. Then, we select the strings that encode anomalies and, for each possible anomaly, build a context-free grammar using the Sequitur grammar induction algorithm. We test our approach on two real-world datasets and compare it with a Logistic Regression classifier. We show that our approach is more efficient in terms of training time (~15 seconds), with a minimal loss in effectiveness of ~5% compared to the Logistic Regression baseline.
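The pipeline described above (numeric trace, then SAX string, then grammar) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the four-letter alphabet, and the sample latency values are assumptions made for illustration, and the grammar step uses a simplified offline pair-replacement scheme in place of Sequitur, which is an incremental algorithm with additional constraints.

```python
from bisect import bisect_right
from collections import Counter
from statistics import NormalDist, mean, pstdev


def sax_encode(series, n_segments, alphabet="abcd"):
    """Encode a numeric series as a SAX word (one letter per segment)."""
    n = len(series)
    if not 0 < n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(series)")
    # 1. Z-normalise the series (a constant series maps to all zeros).
    mu, sigma = mean(series), pstdev(series)
    z = [(x - mu) / sigma if sigma else 0.0 for x in series]
    # 2. Piecewise aggregate approximation: mean of each segment.
    bounds = [i * n // n_segments for i in range(n_segments + 1)]
    paa = [mean(z[bounds[i]:bounds[i + 1]]) for i in range(n_segments)]
    # 3. Map each segment mean to a letter using breakpoints that split
    #    the standard normal distribution into equiprobable regions.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(i / k) for i in range(1, k)]
    return "".join(alphabet[bisect_right(breakpoints, v)] for v in paa)


def induce_grammar(word):
    """Simplified grammar induction (NOT Sequitur): repeatedly replace
    the most frequent adjacent pair of symbols with a new rule."""
    seq, rules = list(word), {}
    while True:
        counts = Counter(zip(seq, seq[1:]))
        if not counts:
            break
        pair, freq = counts.most_common(1)[0]
        if freq < 2:
            break
        name = f"R{len(rules) + 1}"
        rules[name] = pair
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(name)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules


def expand(symbols, rules):
    """Expand a sequence of symbols back into the original string."""
    return "".join(expand(rules[s], rules) if s in rules else s for s in symbols)


if __name__ == "__main__":
    # Hypothetical latency samples (ms) from one trace.
    latencies = [12, 11, 13, 95, 98, 12, 11, 13, 96, 97]
    word = sax_encode(latencies, n_segments=5)
    start, rules = induce_grammar(word)
    print(word, start, rules)
```

In a setup like the one the abstract outlines, one such grammar would be built per anomaly type from the strings labelled with that anomaly. How a new trace is then matched against those grammars is not specified here and is left out of the sketch.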

Published In

ICPE '24 Companion: Companion of the 15th ACM/SPEC International Conference on Performance Engineering
May 2024
305 pages
ISBN:9798400704451
DOI:10.1145/3629527
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Badges

  • Best Short Paper

Author Tags

  1. anomaly detection
  2. context-free grammar
  3. execution traces
  4. micro service system

Qualifiers

  • Short-paper

Conference

ICPE '24

Acceptance Rates

Overall Acceptance Rate 252 of 851 submissions, 30%
