DOI: 10.1145/3690407.3690515
research-article

An AIOps Approach to Data Cloud Based on Large Language Models

Published: 24 October 2024

Abstract

To overcome the inefficiency, high error rate, and poor scalability of the traditional operations model, and to ensure that the data cloud delivers efficient and stable quality of service, this study explores AIOps strategies for data clouds and proposes an AIOps approach based on a large language model. Through a comprehensive analysis of requirements and capabilities, the approach builds a system architecture comprising an operations big data platform, an intelligent decision-making platform, and a tool platform, and details the workflow and functional modules of that architecture. The approach enables the data cloud to operate safely and efficiently in an “unattended” setting, providing solid support for the continuous and stable operation of the data cloud platform.


    Published In

    CAIBDA '24: Proceedings of the 2024 4th International Conference on Artificial Intelligence, Big Data and Algorithms
    June 2024
    1206 pages
    ISBN:9798400710247
    DOI:10.1145/3690407
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. AIOps
    2. Data Cloud
    3. Large Language Models
    4. Quality of Service (QOS)

    Qualifiers

    • Research-article

    Conference

    CAIBDA 2024
