CN112052151A - Fault root cause analysis method, device, equipment and storage medium - Google Patents
- Publication number
- CN112052151A, CN202011072717.9A
- Authority
- CN
- China
- Prior art keywords
- analyzed
- root cause
- fault root
- fault
- component
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3409—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/1734—Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2465—Query processing support for facilitating data mining operations in structured databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/26—Visual data mining; Browsing structured data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/253—Fusion techniques of extracted features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/084—Backpropagation, e.g. using gradient descent
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Databases & Information Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Biomedical Technology (AREA)
- Computing Systems (AREA)
- Biophysics (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Fuzzy Systems (AREA)
- Probability & Statistics with Applications (AREA)
- Computer Hardware Design (AREA)
- Quality & Reliability (AREA)
- Debugging And Monitoring (AREA)
Abstract
The application discloses a fault root cause analysis method, device, equipment and storage medium. The method comprises: obtaining original time sequence information of a plurality of indexes to be analyzed corresponding to a component set to be analyzed; determining implicit sequence pattern features based on the original time sequence information; acquiring an alarm log of each component in the component set to be analyzed within a first preset time range; determining alarm log text features corresponding to the alarm logs of each component within the first preset time range; performing, based on a root cause association probability analysis model and according to the alarm log text features and the implicit sequence pattern features, fault root cause association probability analysis on the components in the component set to be analyzed, to obtain fault root cause association probabilities among those components; and determining the fault root cause association relationship among the components according to the fault root cause association probabilities. With this technical scheme, the fault root cause association relationship among components can be determined efficiently and accurately during fault detection, which improves the reliability of fault root cause analysis.
Description
Technical Field
The application relates to the technical field of operation and maintenance management, in particular to a fault root cause analysis method, device, equipment and storage medium.
Background
With the continuous progress of digital transformation, the data indexes and calling relations of various systems have become more and more complex. A system is often composed of a large number of components such as servers, and a fault can cause huge losses once it occurs. Therefore, in addition to rapid detection, fault root cause analysis is also needed, so that similar faults are prevented from recurring and the losses caused by faults are reduced.
In the prior art, fault root cause analysis often relies on manually specified rules or accumulated experience to construct a decision tree or establish a knowledge graph. This approach has low flexibility, depends heavily on manual work, is inefficient and error-prone, and consumes considerable time and manpower whenever the rules need to be updated. A more reliable and efficient scheme is therefore needed.
Disclosure of Invention
In order to solve the problems in the prior art, the application provides a fault root cause analysis method, device, equipment and storage medium. The technical scheme is as follows:
One aspect of the present application provides a fault root cause analysis method, including:
acquiring original time sequence information of a plurality of indexes to be analyzed corresponding to a component set to be analyzed, wherein the plurality of indexes to be analyzed comprise the indexes to be analyzed corresponding to each component in the component set to be analyzed;
determining implicit sequence pattern features based on the original time sequence information of the plurality of indexes to be analyzed;
acquiring an alarm log of each component in the component set to be analyzed within a first preset time range;
determining alarm log text features corresponding to the alarm logs of each component within the first preset time range;
performing, based on a root cause association probability analysis model and according to the alarm log text features and the implicit sequence pattern features, fault root cause association probability analysis on the components in the component set to be analyzed, to obtain fault root cause association probabilities among the components in the component set to be analyzed;
and determining the fault root cause association relationship among the components in the component set to be analyzed according to the fault root cause association probabilities among those components.
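As a minimal sketch of the last step, the snippet below turns pairwise association probabilities into association relationships. The decision rule is not stated in this passage, so the fixed threshold, the function name `determine_relations`, and the example probabilities are all assumptions made for illustration; they are not the output of the model described in the application.

```python
def determine_relations(prob, threshold=0.5):
    """Keep component pairs whose association probability reaches the threshold.

    prob maps a (component, component) pair to a probability in [0, 1].
    The threshold rule is an assumption; the source does not specify it here.
    """
    relations = [
        (first, second, p)
        for (first, second), p in prob.items()
        if first != second and p >= threshold
    ]
    # Strongest associations first.
    return sorted(relations, key=lambda r: -r[2])


# Hypothetical probabilities standing in for the model output.
example_prob = {
    ("A", "B"): 0.91,
    ("B", "A"): 0.12,
    ("A", "C"): 0.40,
    ("C", "B"): 0.77,
}
relations = determine_relations(example_prob, threshold=0.5)
```

A different rule, such as keeping the top-k pairs per component, would fit the same interface.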
Another aspect of the present application provides a fault root cause analysis apparatus, including:
an original time sequence information acquisition module, configured to acquire original time sequence information of a plurality of indexes to be analyzed corresponding to a component set to be analyzed, wherein the plurality of indexes to be analyzed comprise the indexes to be analyzed corresponding to each component in the component set to be analyzed;
an implicit sequence pattern feature determining module, configured to determine implicit sequence pattern features based on the original time sequence information of the plurality of indexes to be analyzed;
an alarm log acquisition module, configured to acquire an alarm log of each component in the component set to be analyzed within a first preset time range;
a text feature determining module, configured to determine alarm log text features corresponding to the alarm logs of each component within the first preset time range;
a fault root cause association probability analysis module, configured to perform, based on a root cause association probability analysis model and according to the alarm log text features and the implicit sequence pattern features, fault root cause association probability analysis on the components in the component set to be analyzed, to obtain fault root cause association probabilities among the components in the component set to be analyzed;
and a fault root cause association relationship determining module, configured to determine the fault root cause association relationship among the components in the component set to be analyzed according to the fault root cause association probabilities among those components.
Another aspect of the present application provides a fault root cause analysis device, which includes a processor and a memory, where at least one instruction or at least one program is stored in the memory, and the at least one instruction or the at least one program is loaded and executed by the processor to implement the fault root cause analysis method as described above.
Another aspect of the present application provides a computer-readable storage medium, in which at least one instruction or at least one program is stored, and the at least one instruction or the at least one program is loaded and executed by a processor to implement the fault root cause analysis method as described above.
The fault root cause analysis method, the fault root cause analysis device, the fault root cause analysis equipment and the storage medium have the following technical effects:
In the present application, implicit sequence pattern features can be determined by acquiring original time sequence information of a plurality of indexes to be analyzed corresponding to the component set to be analyzed. An alarm log of each component in the component set to be analyzed is acquired within a first preset time range to determine the corresponding alarm log text features, which accommodates dynamically changing operation and maintenance requirements. Then, based on a root cause association probability analysis model and according to the alarm log text features and the implicit sequence pattern features, fault root cause association probability analysis is performed on the components in the component set to be analyzed, so that the fault root cause association probabilities among those components can be obtained quickly and accurately. Finally, the fault root cause association relationship among the components in the component set to be analyzed is determined according to these fault root cause association probabilities. With the technical scheme provided by the embodiments of this specification, the fault root cause association relationship among components can be determined quickly and accurately, which further improves the reliability of fault root cause analysis.
Additional aspects and advantages of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.
Drawings
In order to more clearly illustrate the technical solutions and advantages of the embodiments of the present application or the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present application, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a schematic diagram of an application environment provided by an embodiment of the present application;
FIG. 2 is a schematic flow chart of a fault root cause analysis method provided in an embodiment of the present application;
FIG. 3 is a schematic flow chart diagram of another fault root cause analysis method provided in the embodiments of the present application;
FIG. 4 is a schematic flow chart illustrating another fault root cause analysis method provided in the embodiments of the present application;
FIG. 5 is a schematic structural diagram of a root cause association probability analysis model according to an embodiment of the present disclosure;
FIG. 6 is a schematic flow chart illustrating another fault root cause analysis method provided in the embodiments of the present application;
FIG. 7 is a schematic structural diagram of another root cause relevance probability analysis model provided in the embodiments of the present application;
FIG. 8 is a schematic flow chart diagram illustrating another fault root cause analysis method provided in the embodiments of the present application;
FIG. 9 is a schematic diagram of a fault root cause analysis device according to an embodiment of the present application;
FIG. 10 is a hardware structure block diagram of a server for implementing a fault root cause analysis method according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. The described embodiments are only a part of the embodiments of the present application, not all of them. All other embodiments that can be derived by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present application. Examples of the embodiments are illustrated in the accompanying drawings, in which like reference numerals refer to the same or similar elements, or to elements having the same or similar function, throughout.
It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or server that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Referring to fig. 1, fig. 1 is a schematic diagram of an application environment provided by the present application, and as shown in fig. 1, the application environment may include a root cause analysis server 01 and a plurality of service components 02.
In this embodiment of the present specification, the root cause analysis server 01 may be configured to perform fault root cause analysis by combining data from the plurality of service components 02. Optionally, the root cause analysis server 01 may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a CDN (Content Delivery Network), and big data and artificial intelligence platforms.
In this embodiment, the plurality of service components 02 may generate operation data, alarm logs and the like, so that the root cause analysis server 01 can obtain the data required for fault root cause analysis. In one embodiment, the plurality of service components 02 may include servers that implement different functions; each may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a CDN (Content Delivery Network), and big data and artificial intelligence platforms. In practical applications, the service components 02 may further include, but are not limited to, terminal devices such as smart phones, tablet computers, notebook computers, desktop computers, smart speakers and smart watches, as well as network devices, firewalls and the like.
In the embodiment of the present specification, the root cause analysis server 01 and the plurality of service components 02 may be directly or indirectly connected through wired or wireless communication, and the present application is not limited herein.
FIG. 2 is a flow chart of a fault root cause analysis method provided in an embodiment of the present application. The present specification provides the method operation steps as described in the embodiment or the flow chart, but more or fewer operation steps may be included based on conventional or non-inventive labor. The order of steps recited in the embodiments is merely one of many possible execution orders and does not represent the only order of execution. In practice, a system or server product may execute the steps sequentially or in parallel (for example, in a parallel-processor or multi-threaded environment) according to the embodiments or the methods shown in the figures. Specifically, as shown in FIG. 2, the method may include:
S201: acquiring original time sequence information of a plurality of indexes to be analyzed corresponding to the component set to be analyzed.
In an embodiment of the present specification, the set of components to be analyzed includes at least two components. In particular, the components of the set of components to be analyzed may be configured in association with actual root cause analysis requirements for a fault, and in one particular embodiment, the set of components to be analyzed may include a component that failed in an exception event and at least one component that may be associated with the failed component.
In this embodiment of the present specification, fault root cause analysis may include analyzing whether a fault correlation exists among several preset components, so that similar faults can be prevented from recurring. In combination with the actual fault root cause analysis requirements, the several preset components may be taken as the component set to be analyzed, and fault root cause association probability analysis may then be performed based on a root cause association probability analysis model to determine the fault root cause association relationship among the components in the component set to be analyzed. This makes it easier for operation and maintenance personnel to carry out the corresponding maintenance afterwards.
Specifically, the components may include, but are not limited to, terminal devices, servers for implementing different functions, network devices, firewalls, and the like; the index may be used to characterize relevant operational information of the corresponding component, and specifically, the index may include, but is not limited to, average response time, average throughput rate, number of requests, error rate, health, and processing time.
In an embodiment of the present specification, the plurality of indexes to be analyzed include the indexes to be analyzed corresponding to each component in the component set to be analyzed. Because each component may correspond to a plurality of indexes, a subset of all the indexes corresponding to a component may be selected, according to the actual fault root cause analysis requirements, as the indexes to be analyzed for that component. For example, suppose the component set to be analyzed includes component A, component B and component C. Then 3 of the indexes corresponding to component A may be taken as the indexes to be analyzed for component A, 5 of the indexes corresponding to component B as the indexes to be analyzed for component B, and 2 of the indexes corresponding to component C as the indexes to be analyzed for component C. These 10 indexes together form the plurality of indexes to be analyzed corresponding to the component set to be analyzed.
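The 3 + 5 + 2 selection in the example above can be sketched as a simple mapping from component to chosen index names. The particular indexes assigned to each component below are hypothetical; the text only fixes how many are chosen per component.

```python
# Hypothetical per-component selection: 3 indexes for A, 5 for B, 2 for C.
selected = {
    "A": ["avg_response_time", "avg_throughput", "error_rate"],
    "B": ["avg_response_time", "avg_throughput", "request_count",
          "error_rate", "health"],
    "C": ["request_count", "processing_time"],
}


def flatten_indexes(selected):
    """Return every (component, index) pair as one flat list."""
    return [
        (component, index)
        for component, names in selected.items()
        for index in names
    ]


indexes_to_analyze = flatten_indexes(selected)
print(len(indexes_to_analyze))  # 10 indexes for the whole component set
```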
In an embodiment of the present specification, the original time sequence information of each index to be analyzed represents how the value of that index varies with time. In an embodiment, the original time sequence information may include a two-dimensional curve that varies continuously with time, or a plurality of point values at discrete times. For example, when the index to be analyzed is the average throughput rate of component A, its original time sequence information may be a two-dimensional curve with time on the abscissa and the value of the average throughput rate on the ordinate; the value of the index at each time and its variation trend can be obtained from this information. In practical applications, the original time sequence information of an index to be analyzed may be acquired for any time period according to the actual fault root cause analysis requirements, which makes the approach flexible.
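For the discrete case, the original time sequence information can be held as a list of (time, value) samples. The sketch below reads a value at a given time and derives the trend between consecutive samples. The helper names, the "latest sample at or before t" lookup rule, and the sample values are assumptions for illustration.

```python
from bisect import bisect_right


def value_at(series, t):
    """Latest sampled value at or before time t, or None if t precedes all samples."""
    times = [ts for ts, _ in series]
    i = bisect_right(times, t)
    return series[i - 1][1] if i else None


def trends(series):
    """Trend of each sample relative to the previous one."""
    out = []
    for (_, prev), (t, cur) in zip(series, series[1:]):
        if cur > prev:
            label = "increase"
        elif cur < prev:
            label = "decrease"
        else:
            label = "flat"
        out.append((t, label))
    return out


# Hypothetical average throughput of component A, sampled hourly.
throughput_a = [(0, 120.0), (1, 135.0), (2, 128.0), (3, 128.0)]
print(value_at(throughput_a, 1.5))  # 135.0
print(trends(throughput_a))
```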
S203: determining implicit sequence pattern features based on the original time sequence information of the plurality of indexes to be analyzed.
In a specific embodiment, as shown in fig. 3, the determining the implicit sequence pattern feature based on the raw timing information of the plurality of indicators to be analyzed may include:
S301: determining an index time sequence ascending and descending sequence within a second preset time range according to the original time sequence information of the plurality of indexes to be analyzed.
In this embodiment, an index time sequence ascending and descending sequence may include a plurality of index change identifiers, and the index change identifiers may represent changes of corresponding indexes to be analyzed.
In one embodiment, when the original time sequence information includes a two-dimensional curve that changes continuously over time, the increase or decrease of the original time sequence information may be determined by locating the change nodes of the curve. A change node is a point at which the trend of the curve reverses: the curve is increasing before the node and decreasing after it, or decreasing before the node and increasing after it. Specifically, the second preset time range may include a plurality of preset continuous time periods, which may be determined according to the actual fault root cause analysis requirements. Determining the index time sequence ascending and descending sequence within the second preset time range according to the original time sequence information of the plurality of indexes to be analyzed may include:
determining a corresponding index time sequence ascending and descending sequence for each preset continuous time period, based on the order in which the change nodes of the original time sequence information of the plurality of indexes to be analyzed appear in that period, and taking the index time sequence ascending and descending sequences corresponding to all the preset continuous time periods as the index time sequence ascending and descending sequences within the second preset time range.
In a specific embodiment, the second preset time range may include 3 preset continuous time periods: 20:00 to 23:00 on July 9, 20:00 to 23:00 on July 10, and 20:00 to 23:00 on July 11. The indexes to be analyzed include index a, index b, index c and index d. Suppose that, in the preset continuous time period from 20:00 to 23:00 on July 9, a change node first appears on the curve of index b, after which the curve turns to increasing (b increase); a change node then appears on the curve of index c, after which the curve turns to increasing (c increase); a change node then appears on the curve of index a, after which the curve turns to decreasing (a decrease); and a change node finally appears on the curve of index d, after which the curve turns to increasing (d increase). The index time sequence ascending and descending sequence determined for this period is then 'b increase-c increase-a decrease-d increase', which contains the index change identifiers b increase, c increase, a decrease and d increase. The index time sequence ascending and descending sequences corresponding to the other 2 preset continuous time periods can be determined in the same way.
The corresponding index time sequence ascending and descending sequence is determined according to the appearance sequence of the change nodes of the original time sequence information of the indexes to be analyzed in each preset continuous time period, so that whether potential causal relationships exist among the changes of the indexes or not is determined, the follow-up fault root cause analysis is facilitated according to needs, and the reliability and the comprehensiveness of the fault root cause analysis are improved.
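The change-node approach can be sketched for sampled curves as follows. This is only an illustration under stated assumptions: curves are given as (time, value) samples, a change node is taken to be the sample where the sign of the slope flips, and the sample data (minutes after the start of the period) are invented so that the result reproduces the 'b increase-c increase-a decrease-d increase' example.

```python
def change_nodes(series):
    """Return (time, new_trend) for every sample at which the trend reverses."""
    nodes = []
    prev_sign = 0
    for (t0, v0), (_, v1) in zip(series, series[1:]):
        sign = (v1 > v0) - (v1 < v0)
        if sign == 0:
            continue  # flat segment: the trend has not reversed
        if prev_sign != 0 and sign != prev_sign:
            nodes.append((t0, "increase" if sign > 0 else "decrease"))
        prev_sign = sign
    return nodes


def rise_fall_sequence(series_by_index):
    """Order the change nodes of all indexes by time and join their identifiers."""
    events = []
    for name, series in series_by_index.items():
        for t, trend in change_nodes(series):
            events.append((t, name, trend))
    events.sort()
    return "-".join(f"{name} {trend}" for _, name, trend in events)


# Hypothetical samples for one preset continuous time period.
series_by_index = {
    "a": [(0, 5), (60, 6), (120, 7), (150, 6), (180, 5)],
    "b": [(0, 9), (30, 8), (60, 9), (180, 10)],
    "c": [(0, 4), (60, 3), (90, 4), (180, 5)],
    "d": [(0, 7), (90, 6), (150, 5), (170, 6), (180, 7)],
}
sequence = rise_fall_sequence(series_by_index)
print(sequence)  # b increase-c increase-a decrease-d increase
```

Running the same function once per preset continuous time period yields one sequence per period.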
In the foregoing embodiment, the increase or decrease of the original time sequence information is determined by locating the change nodes of a curve, and the corresponding index time sequence ascending and descending sequence is determined based on the order in which the change nodes of the original time sequence information of the plurality of indexes to be analyzed appear in each preset continuous time period. In another embodiment provided by the present specification, the index time sequence ascending and descending sequence within the second preset time range may instead be determined by setting a plurality of time intervals on the original time sequence information and comparing the value of each index to be analyzed in one time interval with its value in the preceding time interval, so as to determine whether the value increases or decreases in each time interval. In this embodiment, specifically, as shown in FIG. 4, determining the index time sequence ascending and descending sequence within the second preset time range according to the original time sequence information of the plurality of indexes to be analyzed may include:
S401: determining the time sequence ascending and descending information of the plurality of indexes to be analyzed according to the original time sequence information of the plurality of indexes to be analyzed.
Specifically, the determining the timing ascending and descending information of the plurality of indexes to be analyzed according to the original timing information of the plurality of indexes to be analyzed may include:
1) setting a plurality of time nodes, and taking the span between every two adjacent time nodes as a time interval;
2) determining, for each index to be analyzed, the increase or decrease of its value in each time interval according to its original time sequence information;
In practical applications, the increase or decrease of the value of an index to be analyzed in a time interval can be determined by comparing its value in that time interval with its value in the preceding time interval.
3) marking the time sequence ascending and descending according to the increase or decrease of the value of each index to be analyzed in each time interval, to obtain the time sequence ascending and descending information of that index, and combining the time sequence ascending and descending information of the plurality of indexes to be analyzed.
For example, if the value of index a in the 1:00-2:00 interval is determined to be higher than its value in the 0:00-1:00 interval, the index change identifier of index a for the 1:00-2:00 interval may be marked as an increase.
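Steps 1) to 3) can be sketched as below. The text does not say how the value of an index "in a time interval" is summarized, so taking the mean of the samples in the interval is an assumption, as are the function name and the sample data. The first interval gets no mark in this sketch because its predecessor lies outside the supplied time nodes.

```python
def interval_marks(name, samples, nodes):
    """Mark each interval as an increase or decrease relative to the previous one.

    samples: list of (time, value); nodes: sorted time nodes.
    The per-interval value is the mean of its samples (an assumed summary).
    """
    means = []
    for start, end in zip(nodes, nodes[1:]):
        vals = [v for t, v in samples if start <= t < end]
        means.append(sum(vals) / len(vals) if vals else None)

    marks = {}
    for i in range(1, len(means)):
        prev, cur = means[i - 1], means[i]
        if prev is None or cur is None or cur == prev:
            continue  # nothing to compare, or no change
        label = "increase" if cur > prev else "decrease"
        marks[(nodes[i], nodes[i + 1])] = f"{name} {label}"
    return marks


# Hypothetical half-hourly samples of index a; one time node per hour.
samples_a = [(0.0, 10), (0.5, 12), (1.0, 14), (1.5, 16), (2.0, 9), (2.5, 11)]
marks = interval_marks("a", samples_a, nodes=[0, 1, 2, 3])
print(marks)  # {(1, 2): 'a increase', (2, 3): 'a decrease'}
```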
In a specific embodiment, a time node may be set every hour, and the plurality of indexes to be analyzed include index a, index b, index c, index d, index e, and index f. Taking 0:00 to 12:00 on July 1, 2020 as an example, the time sequence ascending and descending information of the plurality of indexes to be analyzed may be as shown in Table 1:
Time interval | Index a | Index b | Index c | Index d | Index e | Index f |
0:00-1:00 | a decrease | b increase | c increase | d increase | e decrease | f increase |
1:00-2:00 | a increase | b decrease | c increase | d decrease | e increase | f decrease |
2:00-3:00 | a decrease | b increase | c decrease | d increase | e increase | f decrease |
3:00-4:00 | a decrease | b decrease | c increase | d decrease | e decrease | f decrease |
4:00-5:00 | a increase | b increase | c increase | d increase | e increase | f increase |
5:00-6:00 | a increase | b decrease | c increase | d decrease | e decrease | f increase |
6:00-7:00 | a increase | b increase | c decrease | d decrease | e increase | f decrease |
7:00-8:00 | a increase | b decrease | c decrease | d increase | e increase | f increase |
8:00-9:00 | a increase | b increase | c increase | d decrease | e increase | f decrease |
9:00-10:00 | a decrease | b increase | c increase | d decrease | e decrease | f increase |
10:00-11:00 | a decrease | b decrease | c increase | d increase | e increase | f increase |
11:00-12:00 | a increase | b decrease | c decrease | d decrease | e decrease | f increase |
TABLE 1
The timing ascending and descending information of the indexes to be analyzed on other dates can be similar to the form of the table 1, and is not described herein again.
S403: constructing an index time sequence ascending and descending sequence within a second preset time range according to the time sequence ascending and descending information of the plurality of indexes to be analyzed.
In practical applications, the second preset time range may be set according to the actual fault root cause analysis requirements. In one embodiment, the second preset time range may be the same time interval on different dates; in another embodiment, it may be different time intervals on the same day, which is not limited in this application. Taking the same time interval on different dates as an example, an index time sequence ascending and descending sequence corresponding to the 8:00-9:00 interval on July 1, 2020 and one corresponding to the 8:00-9:00 interval on July 2, 2020 can be constructed from the generated time sequence ascending and descending information of the plurality of indexes to be analyzed, as shown in Table 2:
Date | Time interval | Index time sequence ascending and descending sequence |
20200701 | 8:00-9:00 | a increase, b increase, c increase, d decrease, e increase, f decrease |
20200702 | 8:00-9:00 | a decrease, b increase, c increase, d decrease, e decrease, f decrease |
TABLE 2
Taking the index time sequence ascending and descending sequence "a increase, b increase, c increase, d decrease, e increase, f decrease" as an example, it can be understood that in this interval the increase of a is accompanied by the increase of b, the increase of c is accompanied by the decrease of d, and the increase of e is accompanied by the decrease of f. The time sequence ascending and descending information of the plurality of indexes to be analyzed is determined according to their original time sequence information, and the index time sequence ascending and descending sequence within the second preset time range is then constructed from it. This makes it possible to collect a large amount of index change information and to determine whether potential correlations exist among the changes of the indexes, which supports the subsequent fault root cause analysis and improves its reliability.
S303: performing sequence pattern mining according to the index time sequence ascending and descending sequence to obtain an implicit sequence pattern.
In this embodiment of the present specification, sequence pattern mining may be performed on the index time sequence ascending and descending sequence by using the PrefixSpan (Prefix-Projected Pattern Growth) algorithm to obtain an implicit sequence pattern. Specifically, performing sequence pattern mining according to the index time sequence ascending and descending sequence to obtain the implicit sequence pattern may include the following steps:
1) determining the frequency number of each index change identifier in the index time sequence ascending and descending sequence;
in particular, the frequency may represent the number of occurrences of the index change identifier in the entire index timing ascending and descending sequence.
Taking the above table 2 as an example, there are 2 index timing ascending and descending sequences, namely, "a increases-b increases-c increases-d decreases-e increases-f decreases" and "a decreases-b increases-c increases-d decreases-e decreases-f decreases", and the frequency of each index change identifier in the above index timing ascending and descending sequences is determined as shown in table 3:
Index change identifier | a increase | a decrease | b increase | c increase | d decrease | e increase | e decrease | f decrease |
Frequency | 1 | 1 | 2 | 2 | 2 | 1 | 1 | 2 |
TABLE 3
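The counts in Table 3 can be reproduced with a short Python sketch. The names `sequences` and `identifier_frequencies` are illustrative; counting each identifier at most once per sequence is an assumption that coincides with the total number of occurrences here, because no identifier repeats within a sequence.

```python
from collections import Counter

# The two index time sequence ascending and descending sequences of Table 2.
sequences = [
    ["a increase", "b increase", "c increase", "d decrease", "e increase", "f decrease"],
    ["a decrease", "b increase", "c increase", "d decrease", "e decrease", "f decrease"],
]

def identifier_frequencies(seqs):
    """Count in how many sequences each index change identifier occurs."""
    counts = Counter()
    for seq in seqs:
        counts.update(set(seq))  # assumption: one count per sequence
    return counts

freq = identifier_frequencies(sequences)
for identifier in sorted(freq):
    print(identifier, freq[identifier])
```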
2) determining the index change identifiers meeting a preset minimum support threshold based on the frequency of each index change identifier, taking each index change identifier meeting the preset minimum support threshold as a one-item prefix, and determining the corresponding suffix.
In an embodiment of the present disclosure, the preset minimum support threshold may be set according to an actual application requirement, and in an embodiment, the preset minimum support threshold may be determined according to the following formula:
min_sup=a×n
where min_sup represents the preset minimum support threshold, n represents the number of days included in the second preset time range, and a represents the minimum support rate. The minimum support rate may be determined according to actual application requirements; for example, it may be adjusted, including downward, in line with the number of index time sequence ascending and descending sequences. The preset minimum support threshold represents a requirement on how frequently data must occur. For example, with a value of 0.5, target data meets the requirement when its frequency of occurrence among all data is higher than 0.5: if there are 10 index time sequence ascending and descending sequences, a target element is determined to meet the preset minimum support threshold when it occurs in more than 5 of them.
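As a worked illustration of the formula, the following Python sketch evaluates min_sup and applies the "more than" comparison used in the example above. The function names are hypothetical, and the sketch assumes one index time sequence ascending and descending sequence per day, so that n also equals the number of sequences.

```python
def min_support(rate, n):
    """min_sup = a * n, with a the minimum support rate and n the number of days."""
    return rate * n

def meets_min_support(count, rate, n):
    # Strictly greater, following the "occurs in more than 5 of 10" example.
    return count > min_support(rate, n)

print(min_support(0.5, 10))           # 5.0
print(meets_min_support(6, 0.5, 10))  # True
print(meets_min_support(5, 0.5, 10))  # False
```

With the two sequences of Table 2 and a rate of 0.5, min_sup is 1, so only identifiers with frequency 2 qualify.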
Referring to table 4, when the predetermined minimum support threshold is 0.5, the prefix and the corresponding suffix determined in step 2) are shown in table 4:
TABLE 4
3) determining, in the suffixes corresponding to each one-item prefix, the single items meeting the preset minimum support threshold, combining each such single item with the corresponding one-item prefix to obtain two-item prefixes, and continuing to determine the suffixes corresponding to the two-item prefixes.
Referring to Table 5, when the preset minimum support threshold is 0.5, the two-item prefixes and corresponding suffixes determined in step 3) are shown in Table 5:
TABLE 5
4) similarly, determining, in the suffixes corresponding to each i-item prefix, the single items meeting the preset minimum support threshold, combining each such single item with the corresponding i-item prefix to obtain (i+1)-item prefixes, and determining the suffixes corresponding to the (i+1)-item prefixes, where i is an integer greater than 1;
and repeating step 4) until the longest prefix sequence is mined, and taking the longest prefix sequence as the implicit sequence pattern.
Referring to Table 6 and Table 7, when the preset minimum support threshold is 0.5, the determined three-item prefixes and corresponding suffixes are shown in Table 6, and the four-item prefixes and corresponding suffixes are shown in Table 7:
TABLE 6
Four-item prefix | Corresponding suffix |
b increase, c increase, d decrease, f decrease | None |
TABLE 7
At this point, the longest prefix sequence mined is "b increase, c increase, d decrease, f decrease"; that is, the implicit sequence pattern obtained by performing sequence pattern mining on the index time sequence ascending and descending sequences shown in Table 2 is "b increase, c increase, d decrease, f decrease". An index time sequence ascending and descending sequence within the second preset time range is determined according to the original time sequence information of the plurality of indexes to be analyzed, and sequence pattern mining is performed on it to obtain the implicit sequence pattern. The implicit sequence pattern may be a rule implicit in the changes of the plurality of indexes to be analyzed, such as an association relationship or a causal relationship among the changes of several indexes. The implicit sequence pattern can subsequently be feature-encoded and combined with the alarm logs of the components for fault root cause analysis, which improves the reliability of the fault root cause analysis. Because the data of each index is continuously updated over time, the implicit sequence pattern also changes continuously, and indexes that were not associated in a past period may become associated later. The second preset time range can therefore be adjusted as required to mine the latest implicit sequence pattern in real time, which provides high flexibility and improves the timeliness of the fault root cause analysis.
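The prefix-growth procedure of steps 1) to 4) can be sketched in Python as follows. This is a simplified illustration that assumes each sequence element is a single index change identifier; the function name `prefixspan` and its parameters are hypothetical and are not taken from the specification.

```python
def prefixspan(sequences, min_count):
    """Return (pattern, support) pairs whose support is greater than min_count."""
    results = []

    def mine(prefix, projected):
        # Count in how many projected suffixes each single item occurs.
        counts = {}
        for suffix in projected:
            for item in set(suffix):
                counts[item] = counts.get(item, 0) + 1
        for item in sorted(counts):
            support = counts[item]
            if support > min_count:
                pattern = prefix + [item]
                results.append((pattern, support))
                # Project: keep what follows the first occurrence of the item.
                new_projected = [s[s.index(item) + 1:] for s in projected if item in s]
                mine(pattern, new_projected)

    mine([], sequences)
    return results

# The two sequences of Table 2; min_count = 0.5 * 2 = 1.
table2 = [
    ["a increase", "b increase", "c increase", "d decrease", "e increase", "f decrease"],
    ["a decrease", "b increase", "c increase", "d decrease", "e decrease", "f decrease"],
]
patterns = prefixspan(table2, min_count=1)
longest = max(patterns, key=lambda p: len(p[0]))[0]
print(longest)  # ['b increase', 'c increase', 'd decrease', 'f decrease']
```

On this input the sketch yields the same longest prefix sequence as Table 7, since only b increase, c increase, d decrease, and f decrease occur in both sequences.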
S307: performing feature encoding on the implicit sequence pattern to obtain the implicit sequence pattern feature.
In the embodiment of the present specification, one-hot encoding may be performed on the implicit sequence pattern to obtain the implicit sequence pattern feature.
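A minimal Python sketch of one-hot encoding an implicit sequence pattern is given below. The vocabulary (every index paired with "increase" and "decrease") and the choice to emit one vector per pattern element are assumptions; the specification does not fix the encoding layout.

```python
def one_hot_encode(pattern, vocabulary):
    """Encode each identifier of the pattern as a one-hot vector over the vocabulary."""
    index = {token: i for i, token in enumerate(vocabulary)}
    vectors = []
    for token in pattern:
        vec = [0] * len(vocabulary)
        vec[index[token]] = 1
        vectors.append(vec)
    return vectors

# Assumed vocabulary: a increase, a decrease, b increase, ..., f decrease.
vocabulary = [f"{name} {change}" for name in "abcdef" for change in ("increase", "decrease")]
pattern = ["b increase", "c increase", "d decrease", "f decrease"]
features = one_hot_encode(pattern, vocabulary)
print(features[0])  # [0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
```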
S205: acquiring an alarm log of each component in the component set to be analyzed within a first preset time range.
Specifically, the first preset time range may be set according to actual fault root cause analysis requirements; in a specific embodiment, the first preset time range may span from one hour before the occurrence time of the fault to one hour after it. For example, the component set to be analyzed includes component A, component B, and component C, where component A generates 4 alarm logs, component B generates 3 alarm logs, and component C generates 3 alarm logs within the first preset time range. These 10 alarm logs may be obtained, and the alarm log text feature corresponding to each alarm log is then determined.
Alarm logs are semi-structured data that are generated in real time and rich in information. By acquiring the alarm log of each component in the component set to be analyzed within the first preset time range, fault root cause analysis can subsequently be performed in combination with the implicit sequence pattern, which improves the reliability of the fault root cause analysis.
S207: determining the alarm log text feature corresponding to the alarm log of each component within the first preset time range.
In an embodiment of the present specification, determining the alarm log text feature corresponding to the alarm log of each component within the first preset time range may include:
performing text vectorization on each alarm log respectively to obtain the corresponding alarm log text feature.
In a specific embodiment, the text vectorizing is performed on each alarm log respectively, and obtaining the text feature of the corresponding alarm log may include:
1) obtaining a word vector corresponding to each word in the alarm log based on a preset word vector model;
in practical applications, the preset word vector model may include a Word2vec word vector model. It should be noted that, when the text of the alarm log is of a preset text type, for example Chinese, word segmentation needs to be performed on the alarm log before the word vector corresponding to each word in the alarm log is obtained based on the preset word vector model.
2) Calculating the characteristic weight corresponding to each word in the alarm log;
alarm logs contain many format words that exist only to unify the alarm specification, and these words appear in many alarm logs. To reduce their influence on the text vectorization feature representation of an alarm log, the feature weight corresponding to each word in the alarm log needs to be calculated. If a word appears frequently in one alarm log and rarely in other alarm logs, the word has distinguishing power for that alarm log and helps to tell it apart from the others. In a specific embodiment, the TF-IDF (term frequency-inverse document frequency) method may be used to calculate the feature weight corresponding to each word in the alarm log, based on the following formula:
TF-IDF value = Term Frequency (TF) × Inverse Document Frequency (IDF)
Specifically, in the calculation formula of the Inverse Document Frequency (IDF), the base of the logarithmic function can be set according to the actual application requirement. The TF-IDF value described above may characterize the corresponding feature weight of the word.
3) performing weighted summation based on the word vector corresponding to each word in the alarm log and the corresponding feature weight to obtain the alarm log text feature corresponding to the alarm log.
The word vector corresponding to each word in the alarm log is obtained based on the preset word vector model, the feature weight corresponding to each word is calculated, and the word vectors are summed with their feature weights to obtain the alarm log text feature. This reduces the influence of irrelevant words on the alarm log text feature and gives words with distinguishing power a correspondingly larger weight, so that alarm log text features better suited to fault root cause analysis are obtained and the accuracy of the fault root cause analysis is improved.
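Steps 1) to 3) can be sketched in Python as follows. The toy two-dimensional `word_vectors` stand in for the output of a trained word vector model, and the specific TF and IDF definitions (relative count, and logarithm of the number of logs divided by the number of logs containing the word) are assumptions, since the specification leaves the exact formula and logarithm base open.

```python
import math
from collections import Counter

def tfidf_weights(log_words, all_logs, base=math.e):
    """TF-IDF weight per word; log_words must be one of the logs in all_logs."""
    counts = Counter(log_words)
    weights = {}
    for word, count in counts.items():
        tf = count / len(log_words)
        doc_freq = sum(1 for other in all_logs if word in other)
        idf = math.log(len(all_logs) / doc_freq, base)  # assumed IDF variant
        weights[word] = tf * idf
    return weights

def alarm_log_feature(log_words, all_logs, word_vectors):
    """Weighted sum of word vectors, using TF-IDF values as weights."""
    weights = tfidf_weights(log_words, all_logs)
    dim = len(next(iter(word_vectors.values())))
    feature = [0.0] * dim
    for word, weight in weights.items():
        for j, component in enumerate(word_vectors[word]):
            feature[j] += weight * component
    return feature

# Hypothetical, already segmented alarm logs and made-up word vectors.
logs = [["ALERT", "disk", "full"], ["ALERT", "cpu", "high"], ["ALERT", "disk", "slow"]]
word_vectors = {
    "ALERT": [1.0, 1.0], "disk": [1.0, 0.0], "full": [0.0, 1.0],
    "cpu": [0.5, 0.5], "high": [0.2, 0.8], "slow": [0.8, 0.2],
}
weights = tfidf_weights(logs[0], logs)
feature = alarm_log_feature(logs[0], logs, word_vectors)
print(weights["ALERT"])  # 0.0, the format word shared by every log gets no weight
```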
S209: based on a root cause association probability analysis model, performing fault root cause association probability analysis on the components in the component set to be analyzed according to the alarm log text features and the implicit sequence pattern feature to obtain the fault root cause association probability among the components in the component set to be analyzed.
In an embodiment of the present specification, as shown in fig. 5, the root cause association probability analysis model may include a correlation mining module 510, a feature fusion layer 520, a feedforward layer 530, and a classification layer 540.
As shown in fig. 6, performing, based on the root cause association probability analysis model, fault root cause association probability analysis on the components in the component set to be analyzed according to the alarm log text features and the implicit sequence pattern feature to obtain the fault root cause association probability among the components in the component set to be analyzed may include:
S601: performing correlation mining on the alarm log text features based on the correlation mining module to obtain alarm log correlation features;
alarm logs that appear within a specified time range are often strongly correlated, and this correlation is extremely important for fault root cause analysis, so it is necessary to perform correlation mining on the alarm log text features based on the correlation mining module.
In this embodiment of the present specification, the correlation mining module may include a Transformer model (a model based on the self-attention mechanism). In practical applications, the correlation mining module may be part of the root cause association probability analysis model, or may be an independent neural network cascaded with the root cause association probability analysis model. Compared with a CNN (Convolutional Neural Network), the Transformer model captures global information better; compared with an RNN (Recurrent Neural Network), the Transformer model trains faster and more efficiently, because the self-attention mechanism allows computation to be parallelized. In a specific embodiment, the Transformer model may include, but is not limited to, a multi-head self-attention module, a summation and normalization module, and a feedforward module. The multi-head self-attention module may consist of a plurality of self-attention units that have the same structure but different weight matrices, so that each self-attention unit can focus on different features. The Transformer model can thus attend to more features and avoid focusing on only part of them, which facilitates more comprehensive correlation mining of the alarm log text features, yields more accurate alarm log correlation features, and improves the accuracy of the fault root cause analysis.
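To make the self-attention idea concrete, the following Python sketch computes single-head scaled dot-product self-attention over a few alarm log text features. It is a deliberately reduced illustration, not the model of this embodiment: the query, key, and value projections are assumed to be identity matrices, and the multi-head, summation and normalization, and feedforward parts are omitted. The feature values are made up.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(features):
    """Scaled dot-product self-attention with identity Q/K/V projections (assumption)."""
    dim = len(features[0])
    outputs = []
    for query in features:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in features
        ]
        weights = softmax(scores)
        # Each output is a weighted mix of all input features.
        outputs.append([
            sum(w * value[j] for w, value in zip(weights, features))
            for j in range(dim)
        ])
    return outputs

# Hypothetical text features of three alarm logs; the first two are identical.
log_features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
attended = self_attention(log_features)
print(attended)
```

Each output row mixes information from every alarm log in the window, with similar logs weighted more heavily, which is the sense in which self-attention captures correlation among logs.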
S603: performing feature fusion on the alarm log correlation feature and the implicit sequence mode feature based on the feature fusion layer to obtain a target fusion feature;
in this embodiment of the present specification, the feature fusion layer can perform deep feature extraction on the alarm log correlation feature and the implicit sequence pattern feature to implement feature fusion. In one embodiment, the feature fusion layer may include a GRU (Gated Recurrent Unit) layer; compared with an LSTM (Long Short-Term Memory) network, a GRU has fewer parameters and can still process sequence information well. In another embodiment, the feature fusion layer may include a plurality of cascaded feedforward layers, which can also effectively process and fuse the alarm log correlation feature and the implicit sequence pattern feature; the application is not limited in this respect.
Referring to fig. 7, when the correlation mining module includes a Transformer model 5101 and the feature fusion layer includes a GRU layer 5102, the structure of the root cause correlation probability analysis model is as shown in fig. 7.
S605: performing feature processing on the target fusion feature based on the feedforward layer to obtain a processed target fusion feature;
in the embodiments of the present specification, the feature processing of the target fusion feature based on the above-mentioned feedforward layer may include, but is not limited to, feature extraction and weight configuration of the target fusion feature.
S607: calculating the fault root cause association probability from the processed target fusion feature based on the classification layer to obtain the fault root cause association probability among the components in the component set to be analyzed.
In this embodiment of the present specification, the classification layer may calculate the fault root cause association probability from the processed target fusion feature and output the fault root cause association probability among the components in the component set to be analyzed, where the fault root cause association probability may represent the probability that a fault root cause association exists among those components.
In the embodiment of the present specification, the loss function of the root cause correlation probability analysis model may include, but is not limited to, cross entropy loss and hinge loss.
Based on the root cause association probability analysis model, fault root cause association probability analysis is performed on the components in the component set to be analyzed according to the alarm log text features and the implicit sequence pattern feature to obtain the fault root cause association probability among those components. In this way the fault root cause association probability among the components can be determined accurately and efficiently, which greatly reduces dependence on manual work and reduces resource consumption.
S211: determining the fault root cause association relationship among the components in the component set to be analyzed according to the fault root cause association probability among the components in the component set to be analyzed.
Specifically, the fault root cause association relationship among the components in the component set to be analyzed may include: a fault root cause association exists among the components in the component set to be analyzed, or no fault root cause association exists among them.
Referring to fig. 8, in an embodiment of the present specification, determining a fault root association relationship between components in the component set to be analyzed according to the fault root association probability between the components in the component set to be analyzed may include:
S801: when the fault root cause association probability meets a preset condition, determining that the fault root cause association relationship among the components in the component set to be analyzed is that a fault root cause association exists.
In a specific embodiment, the fault root cause association probability meeting the preset condition may include the fault root cause association probability being greater than a preset threshold. The preset threshold may be determined according to actual fault root cause analysis requirements; for example, it may be 50% or 80%. Accordingly, when the fault root cause association probability does not meet the preset condition (for example, when it is less than or equal to the preset threshold), it may be determined that the fault root cause association relationship among the components in the component set to be analyzed is that no fault root cause association exists.
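The threshold rule can be written as a small Python sketch. The function name, the return strings, and the sample probabilities are illustrative assumptions; only the "greater than the preset threshold" comparison comes from the text above.

```python
def association_relationship(probability, threshold=0.5):
    """Map a fault root cause association probability to a relationship label."""
    if probability > threshold:
        return "fault root cause association"
    # Less than or equal to the threshold: no association.
    return "no fault root cause association"

# Hypothetical model outputs for two component sets.
print(association_relationship(0.83, threshold=0.8))  # fault root cause association
print(association_relationship(0.50, threshold=0.5))  # no fault root cause association
```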
Determining the fault root cause association relationship among the components in the component set to be analyzed according to the fault root cause association probability among them helps operation and maintenance personnel trace the source of a fault and determine its related factors, so that they can perform the corresponding maintenance, avoid recurrence of similar faults, and reduce the loss caused by faults.
In an embodiment of the present specification, a method for training a root cause association probability analysis model may also be included, as follows:
1) acquiring sample implicit sequence mode characteristics corresponding to a plurality of sample component sets marked with fault root cause association probabilities among components and corresponding sample alarm log text characteristics;
in the embodiments of the present specification, the sample component set as the training sample may include a sample component set in which a fault root cause association exists between components (i.e., the probability of the fault root cause association between the components is high), and a sample component set in which a fault root cause association does not exist between components (i.e., the probability of the fault root cause association between the components is low).
Specifically, the obtaining of the text feature of the sample alarm log corresponding to the sample component set labeled with the association probability of the fault root cause between the components may include:
obtaining a sample alarm log of each component in the sample component set within a third preset time range, where the third preset time range can be determined according to actual application requirements; and performing text vectorization on each sample alarm log to obtain the corresponding sample alarm log text feature. The specific process is similar to that of S207; reference may be made to the related description of S207, which is not repeated here. The specific process of obtaining the sample implicit sequence pattern features corresponding to the plurality of sample component sets labeled with the fault root cause association probability among components is similar to that of S201 to S203; reference may be made to the related description of S201 to S203, which is not repeated here.
2) training a preset neural network model for fault root cause association probability analysis based on the sample implicit sequence pattern features corresponding to the plurality of sample component sets labeled with the fault root cause association probability among components and the corresponding sample alarm log text features, and adjusting the model parameters of the preset neural network model during training until the preset neural network model meets a preset convergence condition, to obtain the root cause association probability analysis model.
By utilizing the sample implicit sequence mode characteristics corresponding to the sample component sets marked with the fault root cause association probability among the components and the corresponding sample alarm log text characteristics, the preset neural network model is trained for fault root cause association probability analysis, a more reliable fault root cause association probability model is obtained, and the reliability of the fault root cause analysis is improved.
As can be seen from the technical solutions provided in the embodiments of the present specification, original time sequence information of a plurality of indexes to be analyzed corresponding to a component set to be analyzed is obtained, and an implicit sequence pattern feature is determined based on that information. Determining the implicit sequence pattern feature may include determining an index time sequence ascending and descending sequence within a second preset time range according to the original time sequence information of the plurality of indexes to be analyzed, and performing sequence pattern mining according to the index time sequence ascending and descending sequence to obtain an implicit sequence pattern. The implicit sequence pattern may be a rule implicit in the changes of the plurality of indexes to be analyzed, such as an association relationship or a causal relationship among the changes of several indexes; fault root cause analysis can subsequently be performed by combining it with the alarm log of each component, which improves the reliability of the fault root cause analysis. Because the data of each index is continuously updated over time, the implicit sequence pattern also changes continuously, and indexes that were not associated in a past period may become associated later; the second preset time range can therefore be adjusted as required to mine the latest implicit sequence pattern in real time, which provides high flexibility and improves the timeliness of the fault root cause analysis. Feature encoding is then performed on the implicit sequence pattern to obtain the implicit sequence pattern feature, an alarm log of each component in the component set to be analyzed within a first preset time range is acquired, and the alarm log text feature corresponding to the alarm log of each component within the first preset time range is determined. Next, based on a root cause association probability analysis model, fault root cause association probability analysis is performed on the components in the component set to be analyzed according to the alarm log text features and the implicit sequence pattern feature to obtain the fault root cause association probability among those components; in this way the fault root cause association probability among a plurality of components can be determined accurately and efficiently, which greatly reduces dependence on manual work and reduces resource consumption. Finally, the fault root cause association relationship among the components in the component set to be analyzed is determined according to the fault root cause association probability, which helps operation and maintenance personnel trace the source of a fault and determine its related factors, so that they can perform the corresponding maintenance, avoid recurrence of similar faults, and reduce the loss caused by faults.
An embodiment of the present application further provides a fault root cause analysis device, as shown in fig. 9, the device may include:
an original timing information obtaining module 910, configured to obtain original timing information of a plurality of to-be-analyzed indexes corresponding to a to-be-analyzed component set, where the plurality of to-be-analyzed indexes include to-be-analyzed indexes corresponding to each component in the to-be-analyzed component set;
an implicit sequence pattern feature determining module 920, configured to determine an implicit sequence pattern feature based on the original timing information of the multiple to-be-analyzed indicators;
an alarm log obtaining module 930, configured to obtain an alarm log of each component in the component set to be analyzed within a first preset time range;
a text feature determining module 940, configured to determine a text feature of the alarm log corresponding to the alarm log of each component within a first preset time range;
a fault root cause association probability analysis module 950, configured to perform, based on a root cause association probability analysis model, fault root cause association probability analysis on the components in the component set to be analyzed according to the alarm log text features and the implicit sequence pattern feature, so as to obtain the fault root cause association probability among the components in the component set to be analyzed;
a fault root cause association relationship determining module 960, configured to determine the fault root cause association relationship among the components in the component set to be analyzed according to the fault root cause association probability among the components in the component set to be analyzed.
In some embodiments, the root cause association probability analysis model may include:
a correlation mining module, a feature fusion layer, a feed-forward layer, and a classification layer.
When the root cause association probability analysis model includes a correlation mining module, a feature fusion layer, a feed-forward layer, and a classification layer, the fault root cause association probability analysis module 950 may include:
a correlation mining unit, configured to perform correlation mining on the alarm log text features based on the correlation mining module to obtain alarm log correlation features;
a feature fusion unit, configured to perform feature fusion on the alarm log correlation features and the implicit sequence pattern features based on the feature fusion layer to obtain target fusion features;
a feature processing unit, configured to perform feature processing on the target fusion features based on the feed-forward layer to obtain processed target fusion features;
and a fault root cause association probability determination unit, configured to calculate the fault root cause association probability from the processed target fusion features based on the classification layer, to obtain the fault root cause association probabilities among the components in the component set to be analyzed.
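The four stages named above (correlation mining, feature fusion, feed-forward processing, classification) can be illustrated with a minimal forward-pass sketch. The text here does not fix the internals of any stage, so everything concrete below is an assumption made for illustration: correlation mining is stood in for by dot-product self-attention with mean pooling, fusion by concatenation, the feed-forward layer by one affine map with ReLU, and the classification layer by a sigmoid. All function names and weights are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def correlation_mining(log_features: List[Vector]) -> Vector:
    """Assumed stand-in for the correlation mining module: dot-product
    self-attention across the per-component alarm log text features,
    followed by mean pooling over components."""
    pooled = [0.0] * len(log_features[0])
    for query in log_features:
        scores = [sum(q * k for q, k in zip(query, key)) for key in log_features]
        peak = max(scores)  # subtract the maximum for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        for w, value in zip(weights, log_features):
            for i, v in enumerate(value):
                pooled[i] += (w / total) * v / len(log_features)
    return pooled


def feature_fusion(log_corr: Vector, pattern_features: Vector) -> Vector:
    """Feature fusion layer, assumed here to be plain concatenation."""
    return log_corr + pattern_features


def feed_forward(x: Vector, weights: List[Vector], bias: Vector) -> Vector:
    """Feed-forward layer: one affine map followed by ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


def classify(x: Vector, weights: Vector, bias: float) -> float:
    """Classification layer: sigmoid of a linear score, read as a probability."""
    score = sum(w * v for w, v in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-score))


def association_probability(log_features: List[Vector], pattern_features: Vector,
                            ff_w: List[Vector], ff_b: Vector,
                            cls_w: Vector, cls_b: float) -> float:
    """Chain the four stages in the order described in the text."""
    fused = feature_fusion(correlation_mining(log_features), pattern_features)
    return classify(feed_forward(fused, ff_w, ff_b), cls_w, cls_b)
```

In a trained model the weights would come from the training procedure described elsewhere in the specification; here they are simply passed in as arguments.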
In some embodiments, the implicit sequence pattern feature determining module 920 described above may include:
an index time sequence ascending and descending sequence determining unit, configured to determine an index time sequence ascending and descending sequence within a second preset time range according to the original time sequence information of the plurality of indexes to be analyzed;
a sequence pattern mining unit, configured to perform sequence pattern mining on the index time sequence ascending and descending sequence to obtain an implicit sequence pattern;
and a feature coding unit, configured to perform feature coding on the implicit sequence pattern to obtain the implicit sequence pattern features.
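The mining and coding steps performed by the last two units can be sketched as follows. This is a deliberately simplified illustration, not the mining algorithm of the application: it only mines ordered pairs of events (length-2 patterns) by counting in how many sequences one event precedes another, and it encodes the result as a multi-hot vector over a fixed pattern vocabulary. The event labels such as `cpu_up`, the `min_support` parameter, and both function names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Tuple

Pattern = Tuple[str, str]


def mine_ordered_pairs(sequences: List[List[str]], min_support: int) -> Dict[Pattern, int]:
    """Return every ordered pair (a, b), a occurring before b, whose support
    (number of sequences containing it) is at least min_support."""
    support: Counter = Counter()
    for seq in sequences:
        # combinations() preserves input order, so (a, b) means a came before b.
        seen = {(a, b) for a, b in combinations(seq, 2) if a != b}
        support.update(seen)  # each sequence contributes at most once per pair
    return {pattern: n for pattern, n in support.items() if n >= min_support}


def encode_patterns(patterns: Dict[Pattern, int], vocabulary: List[Pattern]) -> List[float]:
    """Multi-hot feature coding: 1.0 where a vocabulary pattern was mined."""
    return [1.0 if pattern in patterns else 0.0 for pattern in vocabulary]


sequences = [
    ["cpu_up", "latency_up", "errors_up"],
    ["cpu_up", "errors_up"],
    ["latency_up", "cpu_up", "errors_up"],
]
mined = mine_ordered_pairs(sequences, min_support=2)
vocabulary = [("cpu_up", "errors_up"), ("latency_up", "errors_up"), ("errors_up", "cpu_up")]
features = encode_patterns(mined, vocabulary)
```

A full sequential pattern miner would also handle longer patterns and gaps; the pair-counting version is used here only because it keeps the support computation easy to follow.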
In some embodiments, the index time sequence ascending and descending sequence determining unit may include:
a time sequence ascending and descending information determining unit, configured to determine time sequence ascending and descending information of the plurality of indexes to be analyzed according to the original time sequence information of the plurality of indexes to be analyzed;
and a time sequence ascending and descending sequence constructing unit, configured to construct the index time sequence ascending and descending sequence within the second preset time range according to the time sequence ascending and descending information of the plurality of indexes to be analyzed.
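A minimal sketch of the two steps above, under stated assumptions: the ascending and descending information is taken to be the sign of the difference between consecutive samples, the second preset time range is modeled as a half-open window of step indices, and events from all indexes are merged in time order (ties broken alphabetically). The metric names, the `tolerance` parameter, and the `_up`/`_down` labels are hypothetical.

```python
from typing import Dict, List, Tuple


def ascending_descending_info(series: List[float], tolerance: float = 0.0) -> List[int]:
    """+1 for a rise, -1 for a fall, 0 for no change between consecutive samples."""
    info = []
    for previous, current in zip(series, series[1:]):
        delta = current - previous
        info.append(1 if delta > tolerance else -1 if delta < -tolerance else 0)
    return info


def build_sequence(metrics: Dict[str, List[float]], start: int, end: int,
                   tolerance: float = 0.0) -> List[str]:
    """Merge the rise/fall events of every index whose step lies in [start, end)."""
    events: List[Tuple[int, str]] = []
    for name, series in metrics.items():
        for step, direction in enumerate(ascending_descending_info(series, tolerance)):
            if direction != 0 and start <= step < end:
                suffix = "up" if direction > 0 else "down"
                events.append((step, f"{name}_{suffix}"))
    events.sort()  # order by time step, then by label for simultaneous events
    return [label for _, label in events]


metrics = {"cpu": [10.0, 50.0, 50.0, 20.0], "latency": [5.0, 5.0, 9.0, 9.0]}
full_window = build_sequence(metrics, start=0, end=3)
late_window = build_sequence(metrics, start=1, end=3)
```

Shrinking or shifting the window, as in `late_window`, is how this sketch models adjusting the second preset time range to look only at recent behavior.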
In some embodiments, the apparatus may further comprise:
a sample data acquisition unit, configured to acquire sample implicit sequence pattern features and corresponding sample alarm log text features for a plurality of sample component sets labeled with fault root cause association probabilities among their components;
and a model training unit, configured to train a preset neural network model for fault root cause association probability analysis based on the sample implicit sequence pattern features and the corresponding sample alarm log text features of the plurality of labeled sample component sets, adjusting the model parameters of the preset neural network model during the training until the preset neural network model meets a preset convergence condition, so as to obtain the fault root cause association probability analysis model.
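The training procedure described above (adjust parameters until a preset convergence condition is met) can be sketched with a much smaller model than the one in the application. As an assumption made purely for illustration, the preset neural network model is replaced by a single logistic unit, the loss is cross-entropy against the labeled association probability, and the convergence condition is that the loss changes by less than `tol` between epochs or `max_epochs` is reached. All names and hyperparameters are hypothetical.

```python
import math
from typing import List, Tuple

Sample = Tuple[List[float], float]  # (fused sample features, labeled probability)


def predict(w: List[float], b: float, x: List[float]) -> float:
    """Sigmoid of a linear score."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def train(samples: List[Sample], lr: float = 0.5, tol: float = 1e-6,
          max_epochs: int = 5000) -> Tuple[List[float], float]:
    """Full-batch gradient descent, stopping on the assumed convergence condition."""
    dim = len(samples[0][0])
    w, b = [0.0] * dim, 0.0
    previous_loss = float("inf")
    n = len(samples)
    for _ in range(max_epochs):
        grad_w, grad_b, loss = [0.0] * dim, 0.0, 0.0
        for x, y in samples:
            p = predict(w, b, x)
            p_safe = min(max(p, 1e-12), 1.0 - 1e-12)  # keep log() finite
            loss -= y * math.log(p_safe) + (1.0 - y) * math.log(1.0 - p_safe)
            for i, xi in enumerate(x):
                grad_w[i] += (p - y) * xi
            grad_b += p - y
        # Adjust the model parameters along the negative averaged gradient.
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
        loss /= n
        if abs(previous_loss - loss) < tol:  # assumed convergence condition
            break
        previous_loss = loss
    return w, b


samples = [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0)]
weights, bias = train(samples)
```

With a real multi-layer model the update step would be handled by a deep learning framework; the loop structure (forward pass, loss, parameter adjustment, convergence check) is the part this sketch is meant to show.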
In some embodiments, the sample data acquisition unit may include:
a sample alarm log acquisition unit, configured to acquire a sample alarm log of each component in each sample component set within a third preset time range;
and a text vectorization unit, configured to perform text vectorization on each sample alarm log to obtain the corresponding sample alarm log text features.
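Text vectorization of an alarm log can take many forms, and the passage above does not name one. The sketch below assumes the simplest option, a bag-of-words term count over a vocabulary built from the sample logs; the tokenizer regex, the example log lines, and the function names are all hypothetical.

```python
import re
from collections import Counter
from typing import Dict, List


def tokenize(log_line: str) -> List[str]:
    """Lowercase the line and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9_]+", log_line.lower())


def build_vocabulary(logs: List[str]) -> Dict[str, int]:
    """Assign each distinct token an index in order of first appearance."""
    vocabulary: Dict[str, int] = {}
    for line in logs:
        for token in tokenize(line):
            vocabulary.setdefault(token, len(vocabulary))
    return vocabulary


def vectorize(log_line: str, vocabulary: Dict[str, int]) -> List[float]:
    """Term-count vector; tokens outside the vocabulary are ignored."""
    vector = [0.0] * len(vocabulary)
    for token, count in Counter(tokenize(log_line)).items():
        if token in vocabulary:
            vector[vocabulary[token]] = float(count)
    return vector


sample_logs = ["Disk full on node1", "disk latency high"]
vocabulary = build_vocabulary(sample_logs)
text_features = [vectorize(line, vocabulary) for line in sample_logs]
```

A learned embedding could be substituted for `vectorize` without changing how the resulting text features are consumed downstream.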
In some embodiments, the fault root cause association relation determining module 960 may include:
a fault root cause association determining unit, configured to determine that the fault root cause association relation among the components in the component set to be analyzed is a fault root cause association when the fault root cause association probability meets a preset condition.
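The preset condition is not specified in this passage. One plausible reading, assumed here only for illustration, is a probability threshold: component pairs whose fault root cause association probability is at least the threshold are reported as associated. The threshold value, the pairwise dictionary layout, and the component names are hypothetical.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def associated_pairs(probabilities: Dict[Pair, float], threshold: float = 0.5) -> List[Pair]:
    """Return the component pairs whose probability meets the assumed condition."""
    return sorted(pair for pair, p in probabilities.items() if p >= threshold)


probabilities = {
    ("database", "api_gateway"): 0.91,
    ("cache", "api_gateway"): 0.32,
    ("database", "cache"): 0.50,
}
related = associated_pairs(probabilities, threshold=0.5)
```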
The device in the device embodiment and the method embodiments are based on the same inventive concept.
The embodiment of the present application provides a computer device, which includes a processor and a memory, where at least one instruction or at least one program is stored in the memory, and the at least one instruction or the at least one program is loaded and executed by the processor to implement the fault root cause analysis method provided by the above method embodiment.
The memory may be used to store software programs and modules, and the processor executes various functional applications and data processing by running the software programs and modules stored in the memory. The memory may mainly comprise a program storage area and a data storage area, wherein the program storage area may store an operating system, application programs needed by functions, and the like; the data storage area may store data created according to use of the apparatus, and the like. Further, the memory may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device. Accordingly, the memory may also include a memory controller to provide the processor with access to the memory.
The method embodiments provided in the embodiments of the present application may be executed in a mobile terminal, a computer terminal, a server, or a similar computing device; that is, the computer device may include a mobile terminal, a computer terminal, a server, or a similar computing device. The server may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a CDN, and big data and artificial intelligence platforms. The terminal may be, but is not limited to, a smart phone, a tablet computer, a laptop computer, a desktop computer, a smart speaker, a smart watch, and the like. Taking running on a server as an example, fig. 10 is a hardware structure block diagram of a server for implementing the fault root cause analysis method according to the embodiment of the present application. As shown in fig. 10, the server 1000 may vary considerably depending on its configuration or performance, and may include one or more Central Processing Units (CPUs) 1010 (the processor 1010 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 1030 for storing data, and one or more storage media 1020 (e.g., one or more mass storage devices) storing applications 1023 or data 1022. The memory 1030 and the storage media 1020 may be transient or persistent storage. The program stored in the storage medium 1020 may include one or more modules, each of which may include a series of instruction operations for the server.
Still further, the central processor 1010 may be configured to communicate with the storage medium 1020 and execute a series of instruction operations in the storage medium 1020 on the server 1000. The server 1000 may also include one or more power supplies 1060, one or more wired or wireless network interfaces 1050, one or more input-output interfaces 1040, and/or one or more operating systems 1021, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, and so on.
The processor 1010 may be an integrated circuit chip having signal processing capabilities, such as a general purpose processor, a Digital Signal Processor (DSP), or another programmable logic device, discrete gate or transistor logic device, discrete hardware component, or the like, wherein the general purpose processor may be a microprocessor or any conventional processor.
The input-output interface 1040 may be used to receive or transmit data via a network. Specific examples of the network may include a wireless network provided by a communication provider of the server 1000. In one example, the input-output interface 1040 includes a network adapter (Network Interface Controller, NIC) that may be connected to other network devices via a base station to communicate with the internet. In another example, the input-output interface 1040 may be a Radio Frequency (RF) module, which is used for communicating with the internet wirelessly.
The operating system 1021 may include system programs for handling various basic system services and performing hardware related tasks, such as framework layer, core library layer, driver layer, etc., for implementing various underlying services and handling hardware based tasks.
It will be understood by those skilled in the art that the structure shown in fig. 10 is merely illustrative and is not intended to limit the structure of the electronic device. For example, server 1000 may also include more or fewer components than shown in FIG. 10, or have a different configuration than shown in FIG. 10.
Embodiments of the present application further provide a computer-readable storage medium, which may be disposed in a server to store at least one instruction or at least one program for implementing a fault root cause analysis method according to the method embodiments, where the at least one instruction or the at least one program is loaded and executed by the processor to implement the fault root cause analysis method according to the method embodiments.
Optionally, in this embodiment, the storage medium may be located in at least one network server of a plurality of network servers of a computer network. Optionally, in this embodiment, the storage medium may include, but is not limited to: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic or optical disk, and other media capable of storing program code.
As can be seen from the embodiments of the fault root cause analysis method, device, equipment, or storage medium provided in the present application, original time sequence information of a plurality of indexes to be analyzed corresponding to the component set to be analyzed is obtained, and implicit sequence pattern features are determined based on that information. Determining the implicit sequence pattern features may include determining an index time sequence ascending and descending sequence within a second preset time range according to the original time sequence information of the plurality of indexes to be analyzed, and performing sequence pattern mining on that sequence to obtain an implicit sequence pattern. The implicit sequence pattern may be a rule implicit in the changes of the plurality of indexes to be analyzed, for example an association relation or a causal relation among the changes of several indexes; performing the subsequent fault root cause analysis in combination with the alarm log of each component improves the reliability of the analysis. Because the data of each index is continuously updated over time, the implicit sequence pattern also changes continuously, and an index that was not associated during a past period may become associated later; the second preset time range can therefore be adjusted as required so that the latest implicit sequence pattern is mined in real time, which gives high flexibility and improves the timeliness of fault root cause analysis. Next, feature coding is performed on the implicit sequence pattern to obtain the implicit sequence pattern features, an alarm log of each component in the component set to be analyzed within a first preset time range is acquired, and alarm log text features corresponding to the alarm log of each component within the first preset time range are determined. Then, based on a root cause association probability analysis model, fault root cause association probability analysis is performed on the components in the component set to be analyzed according to the alarm log text features and the implicit sequence pattern features, to obtain the fault root cause association probabilities among the components in the component set to be analyzed, so that the fault root cause association probabilities among several components can be determined accurately and efficiently, dependence on manual work is greatly reduced, and resource consumption is reduced. Finally, the fault root cause association relation among the components in the component set to be analyzed is determined according to the fault root cause association probabilities among those components, which helps operation and maintenance personnel trace the source of a fault and determine its related factors, so that they can subsequently perform corresponding maintenance, avoid the recurrence of similar faults, and reduce the loss caused by faults.
It should be noted that the order of the embodiments of the present application is for description only and does not indicate the relative merits of the embodiments. Specific embodiments of the present specification have been described above. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus, device and storage medium embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference may be made to some descriptions of the method embodiments for relevant points.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.
Claims (10)
1. A fault root cause analysis method, the method comprising:
acquiring original time sequence information of a plurality of indexes to be analyzed corresponding to a component set to be analyzed, wherein the plurality of indexes to be analyzed comprise indexes to be analyzed corresponding to each component in the component set to be analyzed;
determining implicit sequence pattern features based on the original time sequence information of the plurality of indexes to be analyzed;
acquiring an alarm log of each component in the component set to be analyzed within a first preset time range;
determining alarm log text features corresponding to the alarm log of each component within the first preset time range;
based on a root cause association probability analysis model, performing fault root cause association probability analysis on the components in the component set to be analyzed according to the alarm log text features and the implicit sequence pattern features, to obtain fault root cause association probabilities among the components in the component set to be analyzed;
and determining the fault root cause association relation among the components in the component set to be analyzed according to the fault root cause association probabilities among the components in the component set to be analyzed.
2. The method of claim 1, wherein the root cause association probability analysis model comprises a correlation mining module, a feature fusion layer, a feed-forward layer, and a classification layer;
the performing, based on the root cause association probability analysis model, fault root cause association probability analysis on the components in the component set to be analyzed according to the alarm log text features and the implicit sequence pattern features, to obtain the fault root cause association probabilities among the components in the component set to be analyzed, comprises:
performing correlation mining on the alarm log text features based on the correlation mining module to obtain alarm log correlation features;
performing feature fusion on the alarm log correlation features and the implicit sequence pattern features based on the feature fusion layer to obtain target fusion features;
performing feature processing on the target fusion features based on the feed-forward layer to obtain processed target fusion features;
and calculating the fault root cause association probability from the processed target fusion features based on the classification layer, to obtain the fault root cause association probabilities among the components in the component set to be analyzed.
3. The method of claim 1, wherein the determining implicit sequence pattern features based on the original time sequence information of the plurality of indexes to be analyzed comprises:
determining an index time sequence ascending and descending sequence within a second preset time range according to the original time sequence information of the plurality of indexes to be analyzed;
performing sequence pattern mining on the index time sequence ascending and descending sequence to obtain an implicit sequence pattern;
and performing feature coding on the implicit sequence pattern to obtain the implicit sequence pattern features.
4. The method of claim 3, wherein the determining the index timing ascending and descending sequence within a second preset time range according to the original timing information of the plurality of indexes to be analyzed comprises:
determining time sequence ascending and descending information of the plurality of indexes to be analyzed according to the original time sequence information of the plurality of indexes to be analyzed;
and constructing the index time sequence ascending and descending sequence within the second preset time range according to the time sequence ascending and descending information of the plurality of indexes to be analyzed.
5. The method of claim 1, further comprising:
acquiring sample implicit sequence pattern features and corresponding sample alarm log text features for a plurality of sample component sets labeled with fault root cause association probabilities among their components;
and training a preset neural network model for fault root cause association probability analysis based on the sample implicit sequence pattern features and the corresponding sample alarm log text features of the plurality of labeled sample component sets, adjusting model parameters of the preset neural network model during the training until the preset neural network model meets a preset convergence condition, to obtain the fault root cause association probability analysis model.
6. The method of claim 5, wherein the acquiring sample alarm log text features corresponding to a plurality of sample component sets labeled with fault root cause association probabilities among their components comprises:
acquiring a sample alarm log of each component in each sample component set within a third preset time range;
and performing text vectorization on each sample alarm log to obtain the corresponding sample alarm log text features.
7. The method according to claim 1, wherein the determining the fault root cause association relation among the components in the component set to be analyzed according to the fault root cause association probabilities among the components in the component set to be analyzed comprises:
when the fault root cause association probability meets a preset condition, determining that the fault root cause association relation among the components in the component set to be analyzed is a fault root cause association.
8. A fault root cause analysis apparatus, the apparatus comprising:
an original time sequence information acquisition module, configured to acquire original time sequence information of a plurality of indexes to be analyzed corresponding to a component set to be analyzed, wherein the plurality of indexes to be analyzed comprise indexes to be analyzed corresponding to each component in the component set to be analyzed;
an implicit sequence pattern feature determining module, configured to determine implicit sequence pattern features based on the original time sequence information of the plurality of indexes to be analyzed;
an alarm log acquisition module, configured to acquire an alarm log of each component in the component set to be analyzed within a first preset time range;
a text feature determining module, configured to determine alarm log text features corresponding to the alarm log of each component within the first preset time range;
a fault root cause association probability analysis module, configured to perform, based on a root cause association probability analysis model, fault root cause association probability analysis on the components in the component set to be analyzed according to the alarm log text features and the implicit sequence pattern features, to obtain fault root cause association probabilities among the components in the component set to be analyzed;
and a fault root cause association relation determining module, configured to determine the fault root cause association relation among the components in the component set to be analyzed according to the fault root cause association probabilities among the components in the component set to be analyzed.
9. A fault root cause analysis device comprising a processor and a memory, wherein at least one instruction or at least one program is stored in the memory, and the at least one instruction or the at least one program is loaded and executed by the processor to implement the fault root cause analysis method according to any one of claims 1 to 7.
10. A computer-readable storage medium, wherein at least one instruction or at least one program is stored in the storage medium, and the at least one instruction or the at least one program is loaded and executed by a processor to implement the fault root cause analysis method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011072717.9A CN112052151B (en) | 2020-10-09 | 2020-10-09 | Fault root cause analysis method, device, equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112052151A true CN112052151A (en) | 2020-12-08 |
CN112052151B CN112052151B (en) | 2022-02-18 |
Family ID: 73605513
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011072717.9A Active CN112052151B (en) | 2020-10-09 | 2020-10-09 | Fault root cause analysis method, device, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112052151B (en) |
Cited By (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112699005A (en) * | 2020-12-30 | 2021-04-23 | 网宿科技股份有限公司 | Server hardware fault monitoring method, electronic equipment and storage medium |
CN112799929A (en) * | 2021-01-29 | 2021-05-14 | 中国工商银行股份有限公司 | Root cause analysis method and system for alarm log |
CN112804079A (en) * | 2020-12-10 | 2021-05-14 | 北京浪潮数据技术有限公司 | Cloud computing platform alarm analysis method, device, equipment and storage medium |
CN112799868A (en) * | 2021-02-08 | 2021-05-14 | 腾讯科技(深圳)有限公司 | Root cause determination method and device, computer equipment and storage medium |
CN112905479A (en) * | 2021-03-17 | 2021-06-04 | 中通天鸿(北京)通信科技股份有限公司 | Cloud platform based alarm accident root cause optimal path determination method and system |
CN112905371A (en) * | 2021-01-28 | 2021-06-04 | 清华大学 | Software change checking method and device based on heterogeneous multi-source data anomaly detection |
CN113177584A (en) * | 2021-04-19 | 2021-07-27 | 合肥工业大学 | Zero sample learning-based composite fault diagnosis method |
CN113240139A (en) * | 2021-06-03 | 2021-08-10 | 南京中兴新软件有限责任公司 | Alarm cause and effect evaluation method, fault root cause positioning method and electronic equipment |
CN113255780A (en) * | 2021-05-28 | 2021-08-13 | 润联软件系统(深圳)有限公司 | Reduction gearbox fault prediction method and device, computer equipment and storage medium |
CN113552856A (en) * | 2021-09-22 | 2021-10-26 | 成都数之联科技有限公司 | Process parameter root factor positioning method and related device |
CN113569083A (en) * | 2021-06-17 | 2021-10-29 | 南京大学 | Intelligent sound box local end digital evidence obtaining system and method based on data traceability model |
CN113590451A (en) * | 2021-09-29 | 2021-11-02 | 阿里云计算有限公司 | Root cause positioning method, operation and maintenance server and storage medium |
CN113640699A (en) * | 2021-10-14 | 2021-11-12 | 南京国铁电气有限责任公司 | Fault judgment method, system and equipment for microcomputer control type alternating current and direct current power supply system |
CN113821408A (en) * | 2021-09-23 | 2021-12-21 | 中国建设银行股份有限公司 | Server alarm processing method and related equipment |
CN113821418A (en) * | 2021-06-24 | 2021-12-21 | 腾讯科技(深圳)有限公司 | Fault tracking analysis method and device, storage medium and electronic equipment |
CN113872814A (en) * | 2021-09-29 | 2021-12-31 | 北京金山云网络技术有限公司 | Information processing method, device and system for content distribution network |
CN114490303A (en) * | 2022-04-07 | 2022-05-13 | 阿里巴巴达摩院(杭州)科技有限公司 | Fault root cause determination method and device and cloud equipment |
CN114629776A (en) * | 2020-12-11 | 2022-06-14 | 中国联合网络通信集团有限公司 | Fault analysis method and device based on graph model |
WO2023011618A1 (en) * | 2021-08-06 | 2023-02-09 | International Business Machines Corporation | Predicting root cause of alert using recurrent neural network |
CN115878421A (en) * | 2022-12-09 | 2023-03-31 | 国网湖北省电力有限公司信息通信公司 | Data center equipment-level fault prediction method, system and medium based on log time sequence correlation characteristic mining |
CN117093407A (en) * | 2023-10-19 | 2023-11-21 | 北京凡得科技有限公司 | Improved S-learner-based flow anomaly cascade root cause analysis method and system |
CN117527523A (en) * | 2023-11-23 | 2024-02-06 | 广东堡塔安全技术有限公司 | Cloud computing-based server security monitoring system |
CN117656846A (en) * | 2024-02-01 | 2024-03-08 | 临沂大学 | Dynamic storage method for automobile electric drive fault data |
Citations (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090049338A1 (en) * | 2007-08-16 | 2009-02-19 | Gm Global Technology Operations, Inc. | Root cause diagnostics using temporal data mining |
US20140129876A1 (en) * | 2012-11-05 | 2014-05-08 | Cisco Technology, Inc. | Root cause analysis in a sensor-actuator fabric of a connected environment |
US20150032746A1 (en) * | 2013-07-26 | 2015-01-29 | Genesys Telecommunications Laboratories, Inc. | System and method for discovering and exploring concepts and root causes of events |
CN105812177A (en) * | 2016-03-08 | 2016-07-27 | 华为技术有限公司 | Network fault processing method and processing apparatus |
CN105893380A (en) * | 2014-12-11 | 2016-08-24 | 成都网安科技发展有限公司 | Improved text classification characteristic selection method |
CN107301119A (en) * | 2017-06-28 | 2017-10-27 | 北京优特捷信息技术有限公司 | The method and device of IT failure root cause analysis is carried out using timing dependence |
CN109358602A (en) * | 2018-10-23 | 2019-02-19 | 山东中创软件商用中间件股份有限公司 | A kind of failure analysis methods, device and relevant device |
CN109687999A (en) * | 2018-12-11 | 2019-04-26 | 山东中创软件商用中间件股份有限公司 | A kind of association analysis method of alarm failure, device and equipment |
CN110147387A (en) * | 2019-05-08 | 2019-08-20 | 腾讯科技(上海)有限公司 | A kind of root cause analysis method, apparatus, equipment and storage medium |
US20190320329A1 (en) * | 2017-01-26 | 2019-10-17 | Telefonaktiebolaget Lm Ericsson (Publ) | System and Method for Analyzing Network Performance Data |
US20190324831A1 (en) * | 2017-03-28 | 2019-10-24 | Xiaohui Gu | System and Method for Online Unsupervised Event Pattern Extraction and Holistic Root Cause Analysis for Distributed Systems |
CN110609759A (en) * | 2018-06-15 | 2019-12-24 | 华为技术有限公司 | Fault root cause analysis method and device |
CN111191230A (en) * | 2019-12-27 | 2020-05-22 | 国网天津市电力公司 | Fast network attack backtracking mining method based on convolutional neural network and application |
CN111552609A (en) * | 2020-04-12 | 2020-08-18 | 西安电子科技大学 | Abnormal state detection method, system, storage medium, program and server |
CN111726248A (en) * | 2020-05-29 | 2020-09-29 | 北京宝兰德软件股份有限公司 | Alarm root cause positioning method and device |
CN111722952A (en) * | 2020-05-25 | 2020-09-29 | 中国建设银行股份有限公司 | Fault analysis method, system, equipment and storage medium of business system |
Patent Citations (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090049338A1 (en) * | 2007-08-16 | 2009-02-19 | Gm Global Technology Operations, Inc. | Root cause diagnostics using temporal data mining |
US20140129876A1 (en) * | 2012-11-05 | 2014-05-08 | Cisco Technology, Inc. | Root cause analysis in a sensor-actuator fabric of a connected environment |
US20150032746A1 (en) * | 2013-07-26 | 2015-01-29 | Genesys Telecommunications Laboratories, Inc. | System and method for discovering and exploring concepts and root causes of events |
CN105893380A (en) * | 2014-12-11 | 2016-08-24 | 成都网安科技发展有限公司 | Improved text classification characteristic selection method |
CN105812177A (en) * | 2016-03-08 | 2016-07-27 | 华为技术有限公司 | Network fault processing method and processing apparatus |
US20190320329A1 (en) * | 2017-01-26 | 2019-10-17 | Telefonaktiebolaget Lm Ericsson (Publ) | System and Method for Analyzing Network Performance Data |
US20190324831A1 (en) * | 2017-03-28 | 2019-10-24 | Xiaohui Gu | System and Method for Online Unsupervised Event Pattern Extraction and Holistic Root Cause Analysis for Distributed Systems |
CN107301119A (en) * | 2017-06-28 | 2017-10-27 | 北京优特捷信息技术有限公司 | Method and device for IT fault root cause analysis using temporal correlation |
CN110609759A (en) * | 2018-06-15 | 2019-12-24 | 华为技术有限公司 | Fault root cause analysis method and device |
CN109358602A (en) * | 2018-10-23 | 2019-02-19 | 山东中创软件商用中间件股份有限公司 | Failure analysis method, device and related equipment |
CN109687999A (en) * | 2018-12-11 | 2019-04-26 | 山东中创软件商用中间件股份有限公司 | Association analysis method, device and equipment for alarm faults |
CN110147387A (en) * | 2019-05-08 | 2019-08-20 | 腾讯科技(上海)有限公司 | Root cause analysis method, apparatus, equipment and storage medium |
CN111191230A (en) * | 2019-12-27 | 2020-05-22 | 国网天津市电力公司 | Fast network attack backtracking mining method based on convolutional neural network and application |
CN111552609A (en) * | 2020-04-12 | 2020-08-18 | 西安电子科技大学 | Abnormal state detection method, system, storage medium, program and server |
CN111722952A (en) * | 2020-05-25 | 2020-09-29 | 中国建设银行股份有限公司 | Fault analysis method, system, equipment and storage medium of business system |
CN111726248A (en) * | 2020-05-29 | 2020-09-29 | 北京宝兰德软件股份有限公司 | Alarm root cause positioning method and device |
Non-Patent Citations (2)
Title |
---|
IULIA GABRIELA CARJEU et al.: "Clustering IT Events around Common Root Causes", 2014 IEEE INTERNATIONAL CONFERENCE ON SERVICES COMPUTING * |
JIA Tong et al.: "Survey of Fault Diagnosis for Distributed Software Systems Based on Log Data", Journal of Software * |
Cited By (39)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112804079A (en) * | 2020-12-10 | 2021-05-14 | 北京浪潮数据技术有限公司 | Cloud computing platform alarm analysis method, device, equipment and storage medium |
CN112804079B (en) * | 2020-12-10 | 2023-04-07 | 北京浪潮数据技术有限公司 | Alarm analysis method, device, equipment and storage medium for cloud computing platform |
CN114629776A (en) * | 2020-12-11 | 2022-06-14 | 中国联合网络通信集团有限公司 | Fault analysis method and device based on graph model |
CN112699005A (en) * | 2020-12-30 | 2021-04-23 | 网宿科技股份有限公司 | Server hardware fault monitoring method, electronic equipment and storage medium |
CN112905371A (en) * | 2021-01-28 | 2021-06-04 | 清华大学 | Software change checking method and device based on heterogeneous multi-source data anomaly detection |
CN112905371B (en) * | 2021-01-28 | 2022-05-20 | 清华大学 | Software change checking method and device based on heterogeneous multi-source data anomaly detection |
CN112799929A (en) * | 2021-01-29 | 2021-05-14 | 中国工商银行股份有限公司 | Root cause analysis method and system for alarm log |
CN112799868A (en) * | 2021-02-08 | 2021-05-14 | 腾讯科技(深圳)有限公司 | Root cause determination method and device, computer equipment and storage medium |
CN112799868B (en) * | 2021-02-08 | 2023-01-24 | 腾讯科技(深圳)有限公司 | Root cause determination method and device, computer equipment and storage medium |
CN112905479B (en) * | 2021-03-17 | 2024-05-10 | 中通天鸿(北京)通信科技股份有限公司 | Cloud platform-based method and system for determining optimal path of alarm accident root cause |
CN112905479A (en) * | 2021-03-17 | 2021-06-04 | 中通天鸿(北京)通信科技股份有限公司 | Cloud platform based alarm accident root cause optimal path determination method and system |
CN113177584B (en) * | 2021-04-19 | 2022-10-28 | 合肥工业大学 | Compound fault diagnosis method based on zero sample learning |
CN113177584A (en) * | 2021-04-19 | 2021-07-27 | 合肥工业大学 | Zero sample learning-based composite fault diagnosis method |
CN113255780A (en) * | 2021-05-28 | 2021-08-13 | 润联软件系统(深圳)有限公司 | Reduction gearbox fault prediction method and device, computer equipment and storage medium |
CN113255780B (en) * | 2021-05-28 | 2024-05-03 | 润联智能科技股份有限公司 | Reduction gearbox fault prediction method and device, computer equipment and storage medium |
CN113240139B (en) * | 2021-06-03 | 2023-09-26 | 南京中兴新软件有限责任公司 | Alarm cause and effect evaluation method, fault root cause positioning method and electronic equipment |
CN113240139A (en) * | 2021-06-03 | 2021-08-10 | 南京中兴新软件有限责任公司 | Alarm cause and effect evaluation method, fault root cause positioning method and electronic equipment |
CN113569083A (en) * | 2021-06-17 | 2021-10-29 | 南京大学 | Intelligent sound box local end digital evidence obtaining system and method based on data traceability model |
CN113569083B (en) * | 2021-06-17 | 2023-11-03 | 南京大学 | Intelligent sound box local digital evidence obtaining system and method based on data tracing model |
CN113821418A (en) * | 2021-06-24 | 2021-12-21 | 腾讯科技(深圳)有限公司 | Fault tracking analysis method and device, storage medium and electronic equipment |
CN113821418B (en) * | 2021-06-24 | 2024-05-14 | 腾讯科技(深圳)有限公司 | Fault root cause analysis method and device, storage medium and electronic equipment |
US11928009B2 (en) | 2021-08-06 | 2024-03-12 | International Business Machines Corporation | Predicting a root cause of an alert using a recurrent neural network |
WO2023011618A1 (en) * | 2021-08-06 | 2023-02-09 | International Business Machines Corporation | Predicting root cause of alert using recurrent neural network |
CN113552856A (en) * | 2021-09-22 | 2021-10-26 | 成都数之联科技有限公司 | Process parameter root factor positioning method and related device |
CN113552856B (en) * | 2021-09-22 | 2021-12-10 | 成都数之联科技有限公司 | Process parameter root factor positioning method and related device |
CN113821408A (en) * | 2021-09-23 | 2021-12-21 | 中国建设银行股份有限公司 | Server alarm processing method and related equipment |
CN113872814A (en) * | 2021-09-29 | 2021-12-31 | 北京金山云网络技术有限公司 | Information processing method, device and system for content distribution network |
CN113590451A (en) * | 2021-09-29 | 2021-11-02 | 阿里云计算有限公司 | Root cause positioning method, operation and maintenance server and storage medium |
CN113640699B (en) * | 2021-10-14 | 2021-12-24 | 南京国铁电气有限责任公司 | Fault judgment method, system and equipment for microcomputer control type alternating current and direct current power supply system |
CN113640699A (en) * | 2021-10-14 | 2021-11-12 | 南京国铁电气有限责任公司 | Fault judgment method, system and equipment for microcomputer control type alternating current and direct current power supply system |
CN114490303B (en) * | 2022-04-07 | 2022-07-12 | 阿里巴巴达摩院(杭州)科技有限公司 | Fault root cause determination method and device and cloud equipment |
CN114490303A (en) * | 2022-04-07 | 2022-05-13 | 阿里巴巴达摩院(杭州)科技有限公司 | Fault root cause determination method and device and cloud equipment |
CN115878421B (en) * | 2022-12-09 | 2023-11-14 | 国网湖北省电力有限公司信息通信公司 | Data center equipment level fault prediction method, system and medium |
CN115878421A (en) * | 2022-12-09 | 2023-03-31 | 国网湖北省电力有限公司信息通信公司 | Data center equipment-level fault prediction method, system and medium based on log time sequence correlation characteristic mining |
CN117093407A (en) * | 2023-10-19 | 2023-11-21 | 北京凡得科技有限公司 | Improved S-learner-based flow anomaly cascade root cause analysis method and system |
CN117093407B (en) * | 2023-10-19 | 2024-03-19 | 北京凡得科技有限公司 | Improved S-learner-based flow anomaly cascade root cause analysis method and system |
CN117527523A (en) * | 2023-11-23 | 2024-02-06 | 广东堡塔安全技术有限公司 | Cloud computing-based server security monitoring system |
CN117656846A (en) * | 2024-02-01 | 2024-03-08 | 临沂大学 | Dynamic storage method for automobile electric drive fault data |
CN117656846B (en) * | 2024-02-01 | 2024-04-19 | 临沂大学 | Dynamic storage method for automobile electric drive fault data |
Also Published As
Publication number | Publication date |
---|---|
CN112052151B (en) | 2022-02-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112052151B (en) | Fault root cause analysis method, device, equipment and storage medium | |
US10127301B2 (en) | Method and system for implementing efficient classification and exploration of data | |
CN109871311B (en) | Method and device for recommending test cases | |
US8266149B2 (en) | Clustering with similarity-adjusted entropy | |
CN111914159B (en) | Information recommendation method and terminal | |
CN113626241B (en) | Abnormality processing method, device, equipment and storage medium for application program | |
CN109471783B (en) | Method and device for predicting task operation parameters | |
US20140365827A1 (en) | Architecture for end-to-end testing of long-running, multi-stage asynchronous data processing services | |
CN111104242A (en) | Method and device for processing abnormal logs of operating system based on deep learning | |
EP3356951B1 (en) | Managing a database of patterns used to identify subsequences in logs | |
KR101850993B1 (en) | Method and apparatus for extracting keyword based on cluster | |
CN113821418B (en) | Fault root cause analysis method and device, storage medium and electronic equipment | |
CN113204621A (en) | Document storage method, document retrieval method, device, equipment and storage medium | |
CN112800197A (en) | Method and device for determining target fault information | |
US20210081441A1 (en) | Automatic feature extraction from unstructured log data utilizing term frequency scores | |
CN116795977A (en) | Data processing method, apparatus, device and computer readable storage medium | |
CN117312825A (en) | Target behavior detection method and device, electronic equipment and storage medium | |
CN114418226B (en) | Fault analysis method and device for power communication system | |
US20200110815A1 (en) | Multi contextual clustering | |
CN108733707B (en) | Method and device for determining stability of search function | |
CN112364185B (en) | Method and device for determining characteristics of multimedia resources, electronic equipment and storage medium | |
CN111831536A (en) | Automatic testing method and device | |
US11244007B2 (en) | Automatic adaption of a search configuration | |
CN112579673A (en) | Multi-source data processing method and device | |
CN112381167A (en) | Method for training task classification model, and task classification method and device | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||