Article

Remaining Useful Life Prediction with Similarity Fusion of Multi-Parameter and Multi-Sample Based on the Vibration Signals of Diesel Generator Gearbox

1
School of Reliability and Systems Engineering, Beihang University, Beijing 100191, China
2
Nanjing Research Institute of Electronics Technology, Nanjing 210039, China
*
Author to whom correspondence should be addressed.
Submission received: 23 July 2019 / Revised: 24 August 2019 / Accepted: 1 September 2019 / Published: 3 September 2019
(This article belongs to the Special Issue Entropy, Nonlinear Dynamics and Complexity)

Abstract

The prediction of electrical machines’ Remaining Useful Life (RUL) can facilitate making electrical machine maintenance policies, which is important for improving their security and extending their life span. This paper proposes an RUL prediction model with similarity fusion of multi-parameter and multi-sample. Firstly, based on the time domain and frequency domain extraction of vibration signals, the performance damage indicator system of a gearbox is established to select the optimal damage indicators for RUL prediction. Low-pass filtering based on approximate entropy variance (Aev) is introduced in this process because of its stability. Secondly, this paper constructs Dynamic Time Warping Distance (DTWD) as a similarity measurement function, which belongs to the nonlinear dynamic programming algorithm. It performed better than the traditional Euclidean distance. Thirdly, based on DTWD, similarity fusion of multi-parameter and multi-sample methods is proposed here to achieve RUL prediction. Next, the performance evaluation indicator Q is adopted to evaluate the RUL prediction accuracy of different methods. Finally, the proposed method is verified by experiments, and the Multivariable Support Vector Machine (MSVM) and Principal Component Analysis (PCA) are introduced for comparative studies. The results show that the Mean Absolute Percentage Error (MAPE) of the similarity fusion of multi-parameter and multi-sample methods proposed here is below 14%, which is lower than MSVM’s and PCA’s. Additionally, the RUL prediction based on the DTWD function in multi-sample similarity fusion exhibits the best accuracy.

1. Introduction

As a nonlinear dynamical system, a diesel generator’s safe and smooth running is essential to the reliability of systems. The gearbox is a core part of a diesel generator, directly determining its performance. Remaining Useful Life (RUL) prediction can detect faults early and estimate the downtime of diesel generator components, further helping operators to arrange a reasonable maintenance schedule and save operating costs.
Vibration signal analysis is one of the most widely used methods of condition monitoring. Vibration monitoring generally involves arranging sensors at important locations, using the data acquisition card to obtain signals, and finally using the computer to calculate and analyze the data. This article aims at analyzing the degradation trend of machines and predicts their RUL with the vibration signal collected from the sensors online or offline. In this way, the RUL of a diesel generator is achieved during condition monitoring.
One of the main tasks of condition-based maintenance (CBM) is to predict a machine’s RUL [1]. For maintenance decisions, RUL prediction matters more than fault diagnosis [2]. The RUL is predicted from the data and the continuous degradation trend recorded by the condition detection system. It forecasts potential degradation even after current faults have been cleared, providing a direct reference for CBM. In Figure 1, $S_a$ and $S_b$ represent the degree of performance degradation of machines a and b, and $f_c$ marks the level at which a machine is incapable of working. The functional degradation of a and b is at the same level at $t_{i-1}$. At $t_i$, a’s health level is higher than b’s, indicating that a is healthier. After $t_i$, however, a’s function degrades faster than b’s, so a’s RUL is shorter and any planned maintenance must be performed on a first [3].
RUL is defined as the time span from the present moment to the end of the useful life [4], expressed as $l_k = t_{EoL} - t_k$, where $t_{EoL}$ is the end of life, $t_k$ is the present moment, and $l_k$ is the remaining life at $t_k$.
The primary mission of RUL prediction is to monitor the useful time left before the system loses its working capability according to condition detection information. Based on time series analysis, the accuracy of prediction is the primary factor considered in the choice of prediction method. The existing methods are based on physical models, statistical data, and artificial intelligence [5], as described in the following:
(1) RUL prediction methods based on physical models reflect the life-cycle degradation process of the system by establishing a mathematical model based on the failure mechanism [5]. As a typical physical model, the Paris-Erdogan model (PE) is widely used for RUL prediction. Frank et al. [6] used PE to predict the RUL of two types of pipelines, 80 and 100. Hu et al. [7] used Norton’s law to describe the creep of a turbine and combined Kalman filtering (KF) and particle filtering (PF) to predict RUL. However, methods based on physical models require a deep understanding and a sufficiently accurate judgment of the failure mechanism to ensure the accuracy of the RUL estimate.
(2) RUL prediction methods based on statistical data fit the observational data to a random coefficient model or a stochastic process model. These methods are widely applied because many off-the-shelf statistical models can be fitted to the data, for instance random coefficient models, autoregressive models, Gamma process models, inverse Gaussian processes, Markov models, and proportional hazards models. However, autoregressive models rely heavily on high-quality historical data and are not well suited to RUL prediction under complex operating conditions. Wiener models and Gamma process models are limited by the Markov assumption, under which the future state depends only on the current state and not on past states, so they are not applicable to some practical situations.
(3) RUL prediction methods based on artificial intelligence concentrate on learning the degradation pattern of the system from observations. Common AI techniques include the artificial neural network (ANN), neural fuzzy (NF), support vector machine/relevance vector machine (SVM/RVM), K-nearest neighbor (KNN) and Gaussian process regression (GPR). Hussain et al. [8] extracted a health index from the vibration signal and established the RUL prediction model with an adaptive neural fuzzy inference system and nonlinear autoregression. NF excels in RUL prediction because it combines expert knowledge with an intelligent ANN, but it needs high-quality data sources. Many kinds of SVM are used for machine RUL prediction, such as one-class SVM and multi-class SVM [9], and Squares-SVM [10]. However, SVM provides only a point estimate and not a probability distribution over the possible outcomes. To make up for this shortcoming, RVM was proposed, which has the same functional form as SVM but provides a full probability distribution over all possible outcomes [11]. Nevertheless, these methods focus on training with data rather than on analysing the mechanism of mechanical failure. The structure and parameters of an ANN need to be set manually, which leads to low generalization ability. Selecting the kernel function of SVM/RVM for different objects is a major challenge. The calculation process of GPR is complex and time-consuming.
It can be seen from the above that each of the three approaches to RUL prediction has its own limitations. Among the many available methods, data-driven prediction based on a similarity measure has the advantage of avoiding the construction of complex functional degradation models. Therefore, this paper studies RUL prediction based on statistical data from the perspective of similarity measures. Similarity-based RUL prediction was first proposed in 2012 and has been shown to be a very effective approach [12,13,14,15,16]. However, it has not yet become widespread. The basic idea is that products with similar degradation processes have a similar service life [3]. The RUL of the test sample is determined by observing the similarity between the performance degradation trajectory of the test sample and those of reference samples whose life-cycle degradation process is known.
The literature on similarity-based RUL prediction is limited, but it has verified the validity of the “similarity” idea. You et al. [12] conducted an experiment to predict the RUL of a welding spot under vibration. They argued that if the asset under study is more similar to reference sample “A”, then “A” should play a more important role in the RUL estimation of the asset under study. Eker [13] tested similarity-based prediction on data from Virkler’s fatigue crack propagation, a drilling degradation data set, and the slide chair degradation of a turnout system. Zhang [14] put forward a method to predict the RUL of a mechanical system based on the similarity of phase space trajectories and found that the results approximated the actual RUL very closely. Xiong [15] built a one-dimensional damage indicator from an aero engine’s multiple parameters by means of linear regression and obtained the RUL after matching the test engine data to the model base. In the same way, Moghaddass [16] adopted principal component analysis to integrate a turbine engine’s multiple parameters and used the first principal component to describe the system degradation process.
It can be concluded from the literature review that similarity-based RUL prediction methods so far are almost always built on a single parameter. The latest research only integrates multiple parameters into a one-dimensional parameter first, and then compares the similarity of performance degradation curves with statistical or AI methods. No research has addressed the combined impact of multiple samples and the multiple parameters of those samples on RUL prediction. However, performance degradation or malfunction may result from a multitude of causes. Multiple parameters that view the process from different perspectives may therefore reflect the running process more comprehensively [17]. Especially for a complex system, a single parameter describes far less of the various forms of degradation than multiple parameters do.
Therefore, this paper proposes an RUL prediction method based on the similarity fusion of multiple damage indicators and samples. In contrast to the more traditional methods, the method of multi-parameter and multi-sample similarity fusion estimates RUL by referring to multiple parameters and samples.
The process can be divided into five parts. First, Section 2.1 and Section 2.2 introduce the time and frequency domain features extracted from a vibration signal that are applied as damage indicators, together with the approximate entropy variance method used for low-pass fuzzy filtering of the time domain features. The method used to evaluate the parameters and select the most significant performance damage indicators for RUL prediction is also discussed. Second, Section 2.3 introduces the principles of similarity-based RUL prediction and defines its four core elements: time window D, similarity measurement function S(.), weight function w(.), and performance evaluation indicator Q. Third, Section 2.4 introduces the Dynamic Time Warping Distance (DTWD) as the similarity measurement function S(.), used here for the first time to assess the similarity of degradation trajectory patterns. Fourth, in Section 2.5, Section 2.6 and Section 2.7, the RUL prediction model based on the similarity fusion of multi-parameter and multi-sample methods is established according to combinations of different performance damage indicators. Finally, Section 3 studies a type of heavy high-speed diesel generator produced by the China Shipbuilding Industry Corporation (CSIC) and validates the proposed RUL prediction method with experimental results. The proposed method is also compared with the mature methods of Multivariable Support Vector Machine (MSVM) and Principal Component Analysis (PCA) in Section 3.3 and Section 3.4.

2. Methodology

First, we select the most significant performance damage indicators, which serve as the input of the RUL estimation, from various time and frequency domain features. Then, we define the four core elements of similarity-based RUL prediction: time window D, similarity measurement function S(.), weight function w(.), and performance evaluation indicator Q. Next, the similarity measurement function S(.), the most important of these elements, is established with DTWD; the details of DTWD are given in Section 2.4. Finally, the RUL prediction model based on the similarity fusion of multi-parameter and multi-sample methods is established.

2.1. The Damage Indicators

The various time domain and frequency domain features extracted from the vibration signal are used as damage indicators in the following RUL prediction. We also discuss the method used to define, for each individual gearbox under study, a subset of the most significant damage indicators to be applied for RUL prediction for that particular gearbox. The time domain features of the vibration signal effectively reflect the performance degradation of the gearbox [18]. As shown in Table A1 of Appendix A, we use 10 time domain features as damage indicators [19]. Further, the Fourier transform is applied to convert the vibration signal into its frequency spectrum representation [20]. We use 15 frequency domain features as damage indicators [21]; see Table A2 in Appendix A.
Because the time series of the various damage indicators are noisy, they must be smoothed by low-pass filtering before they can be correctly compared with the reference samples. Fuzzy filtering is a low-pass filtering method based on fuzzy set theory, which can adjust the filter structure adaptively based on the features of the signal [22]. A large number of studies have shown that this method is easy to implement and has a good filtering effect, which makes it very suitable for engineering applications.
For the time domain features, this paper proposes low-pass filtering based on approximate entropy variance (Aev). The time series of the various damage indicators are rather noisy, so low-pass filtering is applied to smooth them [23]. Approximate entropy [24] is chosen because it is suitable for describing dynamic noise with a small amount of data, is strongly robust to observation noise, and makes the dynamic system easy to reconstruct. Approximate entropy variance is a statistic that measures the complexity of a time series. It remains statistically stable even with a small amount of data and under noise interference, and the variance describes the stability of the time series. Approximate entropy (Ae) is defined as follows. For a time series $\{u(i)\}$ $(i = 1, 2, \ldots, N)$, $x_m(i)$ denotes the $m$ consecutive values of $u$ starting at point $i$:
$$Ae(m, r) = \lim_{N \to \infty} \left[ \varphi^{m}(r) - \varphi^{m+1}(r) \right] \tag{1}$$
where:
$$\varphi^{m}(r) = (N - m + 1)^{-1} \sum_{i=1}^{N-m+1} \left[ \sum_{j=1}^{N-m+1} H\{ r - d_m[x_m(i), x_m(j)] \} \Big/ (N - m + 1) \right] \tag{2}$$
$$d_m[x_m(i), x_m(j)] = \max_{0 \le k \le m-1} |u(i+k) - u(j+k)|, \qquad x_m(i) = [u(i), u(i+1), \ldots, u(i+m-1)] \tag{3}$$
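Here $H(\cdot)$ is the Heaviside function. After Ae is calculated, Aev is defined as the variance of the Ae values:
$$Aev = \sum_{i=1}^{N} \left( Ae_i - \overline{Ae} \right)^2 \Big/ N \tag{4}$$
where $\overline{Ae}$ is the mean of the $Ae_i$ values.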
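As a rough illustration of Equations (1) to (4), the following sketch estimates approximate entropy for a finite series and then takes its variance over windows. Several choices here are assumptions and not taken from the paper: the function names, the default values of `m` and `r`, the use of non-overlapping windows for Aev, and the logarithm of the inner average, which follows the commonly used definition of approximate entropy.

```python
import numpy as np


def approximate_entropy(u, m=2, r=0.2):
    """Finite-N estimate of Ae(m, r) for a 1-D series u."""
    u = np.asarray(u, dtype=float)
    n = len(u)

    def phi(dim):
        count = n - dim + 1
        # Rows are the embedded vectors x_dim(i) = [u(i), ..., u(i+dim-1)].
        x = np.array([u[i:i + dim] for i in range(count)])
        # Maximum-coordinate distance d[x(i), x(j)] for every pair (i, j).
        d = np.max(np.abs(x[:, None, :] - x[None, :, :]), axis=2)
        # Fraction of vectors within tolerance r (Heaviside step of r - d).
        c = np.mean(d <= r, axis=1)
        # Assumption: logarithm of the inner average, as in the usual definition.
        return np.mean(np.log(c))

    return phi(m) - phi(m + 1)


def approximate_entropy_variance(u, window, m=2, r=0.2):
    """Variance of Ae over consecutive non-overlapping windows (assumed scheme)."""
    u = np.asarray(u, dtype=float)
    values = [approximate_entropy(u[s:s + window], m, r)
              for s in range(0, len(u) - window + 1, window)]
    return float(np.var(values))  # population variance, i.e. division by N
```

A regular signal such as a sine wave should give a smaller value than white noise of the same length, which is the property that makes the statistic useful as a complexity measure.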
Low-pass filtering then decomposes the damage indicator signal into a trend part and a noise part:
$$X(t_k) = X_T(t_k) + X_R(t_k) \tag{5}$$
where $X(t_k)$ is the value of the performance damage indicator at time $t_k$, $X_T(t_k)$ is the trend term, $X_R(t_k)$ is the noise term, and $k = 1, 2, \ldots, N$, with $N$ the number of discrete observations made within the measurement time interval. The weighting filter and fuzzy filtering membership function are defined as $u'(x_{n-k}) = f(Ae, n-k)$ according to [24]; the range of $u'(x_{n-k})$ is [0, 1], and $f$ is set to the normal distribution function. Thus $X_T(t_k)$ is retained while $X_R(t_k)$ is removed.
To smooth the frequency domain damage indicators over time, simple moving average filtering is applied. Moving average filtering reduces random noise while retaining the step response of the signal [25]. First, the damage indicators are decomposed into two parts as in Equation (5); then the average value over a sub-interval is calculated as the predicted value of the next sub-interval, and the window moves forward in turn. $\hat{X}(t_j)$, the trend part of the damage indicator obtained by moving average filtering, is defined as the average value of the $n$ preceding data points:
$$\hat{X}(t_j) = \frac{1}{n} \sum_{i=j-n}^{j-1} X(t_i), \qquad j = n+1, n+2, \ldots, N+1 \tag{6}$$
Figure 2 shows, as an example, the full signal together with its trend for one of the time/frequency domain indicators. The desired output is obtained by filtering.
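The moving average of Equation (6) can be sketched in a few lines. The function name and the cumulative-sum implementation are illustrative choices, not part of the paper; the output contains one value for each $j = n+1, \ldots, N+1$.

```python
import numpy as np


def trailing_moving_average(x, n):
    """Average of the n samples preceding each position j = n+1, ..., N+1."""
    x = np.asarray(x, dtype=float)
    if not 1 <= n <= len(x):
        raise ValueError("n must be between 1 and len(x)")
    # Prefix sums with a leading zero: c[j] = x[0] + ... + x[j-1].
    c = np.concatenate(([0.0], np.cumsum(x)))
    # Difference of prefix sums gives the sum over each window of n samples.
    return (c[n:] - c[:-n]) / n
```

For a series of length N the result has N - n + 1 entries, so the smoothed trend is shorter than the raw indicator by n - 1 samples.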

2.2. Defining a Subset of Most Significant Damage Indicators

To define an asset-dependent subset of the most significant damage indicators for RUL prediction, so-called significance indicators have been defined [26]. With these significance indicators, each of the twenty-five damage indicators is evaluated and given a score from 0 to 1 that measures how significant the parameter is for the RUL prediction of the asset under study. Three significance indicators are used: Correlation, Monotonicity and Robustness. They are defined and explained in the following.
The Correlation indicator $r$ measures the correlation of a damage indicator with time over the whole time span of the vibration measurements, i.e., it is the normalized slope of the trend of the damage indicator over time: $r = (\sigma_t / \sigma_X)\, b$, where $X$ is the damage indicator, $\sigma_X$ is the standard deviation of $X$, $\sigma_t$ is the standard deviation of the time variable $t$, and $b$ is the slope of the regression line found by linear regression when viewing $X$ as a function of $t$.
The Monotonicity indicator reflects the unidirectional trend of the time domain and frequency domain features. The larger the value of Monotonicity, the greater the slope of the parameter and the more intuitive and obvious the trend of performance degradation. If a parameter rises and falls recurrently during the degradation process, this may merely be a cyclic change as the machine vibrates; such a parameter does not change in a definite direction as performance degrades.
The Robustness indicator reflects the tolerability of damage indicators for outliers. Robustness measures whether the degradation parameter is capable of resisting random interference [27]. If a parameter is sensitive to external disturbance, it does not contain valuable information even if it fluctuates wildly.
The equations used to compute each of the significance indicators are given in Appendix B.
This study proposes a combination function W of the three indicators above as a “ruler” to select several optimal parameters for the subsequent RUL prediction:
$$\max_{X \in \Omega} W = \omega_1 \mathrm{Corr}(X) + \omega_2 \mathrm{Mon}(X) + \omega_3 \mathrm{Rob}(X), \qquad \text{with } \omega_i > 0, \; \sum_i \omega_i = 1, \; i = 1, 2, 3 \tag{7}$$
In this equation, W is the combination function, with values in the range [0, 1]; $\Omega$ represents the set of candidate damage indicators; and $\omega_i$ represents the weight of each indicator. Parameters with a larger value of W should be selected for effective RUL prediction. $\omega_i$ is determined from two sources. Subjectively, because a damage indicator is used to describe the performance degradation trajectory over time, Mon should take the largest weight, which is in line with the similarity-based prediction method. $\omega_i$ is therefore subjectively assigned a value, denoted as the prior weighting $a_i$. Objectively, finding the optimal combination of the chosen damage indicators is in essence a constrained optimization problem. We solve the model with AMPL, input the permutations and combinations of the three indicators’ weights (each weight is adjusted between 0.2 and 0.8), and determine the posterior weighting $b_i$ according to the results. Finally, $\omega_i$ is determined from both the prior weighting $a_i$ and the posterior weighting $b_i$, and the more significant damage indicators can be chosen for the subsequent RUL estimation.
$$\omega_i = \alpha a_i + (1 - \alpha) b_i, \qquad (0 \le \alpha \le 1) \tag{8}$$
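The selection rule of Equations (7) and (8) can be illustrated as follows. This is only a sketch: the function names are hypothetical, the Corr, Mon and Rob scores are assumed to be precomputed values in [0, 1] (their formulas are in Appendix B and are not reproduced here), and the example weights are made up for illustration.

```python
import numpy as np


def blend_weights(prior, posterior, alpha):
    """Equation (8): omega_i = alpha * a_i + (1 - alpha) * b_i."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    a = np.asarray(prior, dtype=float)
    b = np.asarray(posterior, dtype=float)
    return alpha * a + (1.0 - alpha) * b


def combined_score(corr, mon, rob, weights):
    """Equation (7): W = w1 * Corr + w2 * Mon + w3 * Rob for each candidate."""
    w = np.asarray(weights, dtype=float)
    return (w[0] * np.asarray(corr, dtype=float)
            + w[1] * np.asarray(mon, dtype=float)
            + w[2] * np.asarray(rob, dtype=float))


def select_indicators(scores, k):
    """Indices of the k candidates with the largest W, best first."""
    return np.argsort(scores)[::-1][:k]
```

If both the prior and the posterior weights sum to 1, the blended weights also sum to 1 for any admissible value of alpha, so the constraint in Equation (7) is preserved.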

2.3. Similarity-Based RUL Prediction

As Figure 3 shows, the concept of the similarity-based RUL prediction method is that assets whose damage indicators behave similarly have similar RUL values [28]. By comparing the damage indicator time series of an asset with the corresponding historical reference time series, the RUL of the asset can be predicted. It is assumed that the assets for which reference indicator curves are available are of the same or a closely related type of product or system as the asset under study, and have operated under more or less similar operating environments and conditions.
The blue curve represents the time series of one of the damage indicators for a reference gearbox, while the red curve is the time series of the same indicator for a gearbox in use for which we wish to make an RUL prediction. The similarity concept states that we should find the part of the blue curve that is most similar to the red curve, which is called the “optimal match”. Once an optimal match has been established, the RUL of the gearbox of interest is estimated as the length of the time interval of the blue curve that lies to the right of the red curve. Here we always assume that the final available measurement point of any reference curve corresponds to the end of the useful life of the reference gearbox.
To apply the similarity prediction method, its four core elements need to be defined. These are the time window D, the similarity measure function S(.), a weight function w(.), and the performance evaluation indicator Q. The time window D refers to the time interval of similarity between the test sample and reference samples, shown as the data block length marked as yellow in Figure 3. The similarity measurement function S(.) quantifies the similarity of the degradation trajectory of the test sample and reference samples. This paper will establish the DTWD-based nonlinear dynamic programming algorithm as S(.) which will be explained in Section 2.4. The weight function w(.) concerns the similarity between the test sample and reference samples, and it gives different weights to different reference samples and different parameters in line with their contributions. The performance evaluation indicator Q is used to describe differences between the RUL estimated value and its actual value, which helps to find the optimal method through comparing different RUL prediction methods. We borrowed 5 indicators as the performance evaluation indicator Q, which are shown in Appendix B (2).
Similarity-based RUL prediction follows four steps:
(1) Define the time window D to be used for each of the damage indicators related to an asset. The red curve is the time window D of the test sample and the blue curve is the life-cycle degradation state of the reference sample. The right boundary of D is the current observation point of the test sample, i.e., the present state of the asset under study.
(2) Define a similarity measure function S(.) through which the similarity or closeness between two time series is defined. The DTWD algorithm is established as the similarity measure function S(.) in order to find the part of a given reference asset that is most similar to the time window D, so that one similarity distance is obtained. Suppose the H most significant damage indicators are selected and L reference assets are compared with the asset under study, so that each reference asset contains H damage indicators. Then H*L similarity distances between each damage indicator of the asset under study and the corresponding damage indicator of each reference asset are obtained by the DTWD algorithm.
(3) Based on the idea that “the more similar the two time series are, the larger the weight value is”, a weighted summation is made over these H*L similarity distances. That is, the H*L similarity distances are normalized and then assigned different weights, with closer distances given greater weights. The weight functions w(.) for multi-parameter and multi-sample fusion are given in Equations (12) and (14) and in Equations (16) and (18), respectively.
(4) For the RUL values associated with the different parameters or different samples, a weighted average is used to obtain the RUL estimate of the test sample, based on the corresponding weights calculated in step (3).

2.4. Similarity Measurement Function S(.): Dynamic Time Warping Distance (DTWD)

DTWD is based on nonlinear dynamic programming; it is an algorithm that combines warping of the time dimension with distance optimization [29]. DTWD has been widely used in text data matching, voice information processing and other fields in recent years. Compared with the traditional Euclidean distance, it shows better recognition accuracy and robustness in time series applications. DTWD can compress and bend time series so that the overall distance between two sequences becomes smaller. The DTWD of two time series is defined as the minimum distance between the two series obtained by warping the time dimension. When calculating the distance between series A and B, the traditional Euclidean distance compares A and B at the same time points, whereas DTWD compares points of A and B that need not lie at the same time point in order to obtain the shortest distance. For example, suppose that A = {2,5,2,5,2,3} and B = {0,3,6,0,6,0}. The traditional Euclidean distance is then calculated point by point as 2 + 2 + 4 + 5 + 4 + 3 = 20, and DTWD is calculated as shown in Figure 4. The gray elements from the upper left corner to the lower right corner form the dynamic time warping path. The lower right corner element “12” is the cumulative distance Dtwd(A,B) = 12.
DTWD is calculated as follows. Let $A = (a_1, a_2, \ldots, a_l)$ and $B = (b_1, b_2, \ldots, b_j, \ldots, b_k)$ be two time series, where $l$ and $k$ are the sequence lengths of A and B, respectively. The DTWD algorithm first aligns the two time series and establishes an $l \times k$ matrix D whose $(i, j)$-th entry is $d(a_i, b_j)$, the distance between points $a_i$ and $b_j$ of the two time series.
In matrix D, $P = (q_1, q_2, \ldots, q_n, \ldots, q_N)$ denotes the dynamic time warping path of time series A and B, where $q_n$ represents the distance between A and B at the $n$-th point of the path. Path P must satisfy the following four constraints:
(1) Boundedness: $\max(l, k) \le N \le l + k - 1$;
(2) Boundary conditions: $q_1 = D(1, 1)$ and $q_N = D(l, k)$, that is, the dynamic warping path must start and end at opposite corners of the matrix;
(3) Continuity: for $q_n = (a, b)$ and $q_{n-1} = (a', b')$, the conditions $a - a' \le 1$ and $b - b' \le 1$ must be met;
(4) Monotonicity: for $q_n = (a, b)$ and $q_{n-1} = (a', b')$, $a - a' \ge 0$ and $b - b' \ge 0$ must hold, and $a - a' = 0$ and $b - b' = 0$ cannot both occur; that is, the line segments representing the dynamic warping path cannot intersect each other.
For small-scale data, an exhaustive search can be used to find an optimal dynamic time warping path. For large-scale data, the optimal dynamic time warping path can be obtained with dynamic programming, using a recursive search that builds on the locally optimal solution from point (1,1) to point (i,j). Using $D_{twd}(A, B)$ to denote the DTWD between time series A and time series B, the computation is
$$\begin{cases} D_{twd}(A, B) = d(a_1, b_1) + \min \begin{cases} D_{twd}(A, \mathrm{rest}(B)) \\ D_{twd}(\mathrm{rest}(A), B) \\ D_{twd}(\mathrm{rest}(A), \mathrm{rest}(B)) \end{cases} \\ d(a, b) = \lVert a - b \rVert_p \end{cases} \tag{9}$$
In the equation, $p$ denotes the norm, $\mathrm{rest}(A) = \{a_2, a_3, \ldots, a_l\}$, and $\mathrm{rest}(B) = \{b_2, b_3, \ldots, b_k\}$. As Equation (9) shows, $d(a_1, b_1)$ is the distance between the first points of the two time series; the shortest warping path through the remaining points (i.e., rest(A) and rest(B)) of the two series is then searched recursively. The pseudocode of the DTWD algorithm is given in Appendix B (3).
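The recursion in Equation (9) is usually evaluated bottom-up with a cumulative cost matrix. The sketch below is one such implementation and is not the pseudocode of Appendix B; the function name is illustrative, and the point distance is taken as the absolute difference, which is what any p-norm reduces to for scalar samples. It also backtracks one optimal warping path so that the four path constraints can be checked.

```python
import numpy as np


def dtw(a, b):
    """Return (DTWD, warping path) for two 1-D series, using |a_i - b_j| as d."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    l, k = len(a), len(b)
    cost = np.abs(a[:, None] - b[None, :])   # point distances d(a_i, b_j)
    acc = np.full((l, k), np.inf)            # cumulative distances
    for i in range(l):
        for j in range(k):
            if i == 0 and j == 0:
                best = 0.0
            else:
                best = min(acc[i - 1, j] if i > 0 else np.inf,
                           acc[i, j - 1] if j > 0 else np.inf,
                           acc[i - 1, j - 1] if i > 0 and j > 0 else np.inf)
            acc[i, j] = cost[i, j] + best
    # Backtrack from the last cell to the first, always taking the cheapest
    # admissible predecessor (diagonal, up, or left).
    i, j = l - 1, k - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        steps = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
        steps = [(p, q) for p, q in steps if p >= 0 and q >= 0]
        i, j = min(steps, key=lambda s: acc[s])
        path.append((i, j))
    return float(acc[-1, -1]), path[::-1]
```

Applied to the example series A = {2,5,2,5,2,3} and B = {0,3,6,0,6,0}, this returns a cumulative distance of 12, compared with 20 for the point-by-point distance.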

2.5. RUL Estimation by Multi-Parameter Fusion

Multi-parameter similarity fusion focuses on the impact of different parameters on the RUL estimation of the asset under study. As in the four steps of Section 2.3, suppose the H most significant damage indicators are selected and L reference assets are compared with the asset under study, so that each asset contains H damage indicators. Then H*L similarity distances between each damage indicator of the asset under study and the corresponding damage indicator of each reference asset are obtained by the DTWD algorithm. First, following the weighting idea in step (3) of Section 2.3, weights are assigned to these H*L similarity distances. Second, for each damage indicator Hi, a weighted summation is made over the results for Hi from the L reference assets. This is called the “first fusion” and must be repeated H times because there are H damage indicators in total; it yields H fused values. Third, a weighted summation is made over these H fused values, again based on the weighting idea in step (3). This is called the “second fusion” and yields a single value, called the “RUL value”. Finally, the RUL is estimated by finding the time point corresponding to the “RUL value”.
The mathematical calculation proceeds as follows.
For a diesel generator gearbox, let X be the asset under study (called the “test sample”), and suppose H performance damage indicators have been obtained with the method in Section 2.2. Let $Y_l$ be the l-th reference sample, where $l$ $(l = 1, 2, \ldots, L)$ is the label of the reference sample and L is the number of reference samples. The idea of multi-parameter similarity fusion is shown in Figure 5.
In Figure 5, $y_h^l$ denotes the time series of the h-th damage indicator of the l-th reference gearbox, with l = 1,…, L, and L the total number of reference gearboxes. Further, $U_{h^*}^{l}$ represents the RUL estimate given by the h-th damage indicator of the l-th reference sample, and $U_{h^*}$ represents the RUL value estimated by the h-th damage indicator after the first fusion.
(1) Calculate the optimal similarity distance between each damage indicator of the asset under study and the corresponding damage indicator of each reference asset, together with the associated RUL estimate $U_{h^*}^{l}$. This step is run H*L times and yields a total of H*L similarity distances. Let $S_{h^*}^{l}$ denote the optimal similarity distance by DTWD between the h-th damage indicator of the l-th reference sample and the h-th damage indicator of the test sample. $S_{h^*}^{l}$ and $U_{h^*}^{l}$ are calculated as follows:
$$S_{h^*}^{l} = \min_{\Delta} D_{twd}\left( x_{h^*},\; y_{h^*}^{l}(N - D - \Delta) \right) \tag{10}$$
$$U_{h^*}^{l} = \arg\min_{\Delta} D_{twd}\left( x_{h^*},\; y_{h^*}^{l}(N - D - \Delta) \right) \tag{11}$$
As Figure 3 shows, the red block must be matched to the part of the blue curve to which it is most similar (with respect to a certain measure), that is, an “optimal match” must be found. Only when Dtwd attains its minimum can we conclude that the time point of the reference sample aligned with the right boundary of D reflects the RUL of the test sample. Once the minimum distance $S_{h^*}^{l}$ is determined, $U_{h^*}^{l}$ is determined as well.
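The optimal-match search can be sketched as a sliding window over the reference curve. The code below is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the candidate segments are assumed to have the same length D as the test window, the RUL is counted in samples as the number of reference points to the right of the matched segment, and a compact DTW with absolute-difference point distance is used.

```python
import numpy as np


def dtw_distance(a, b):
    """Cumulative DTW distance between two 1-D series (|a_i - b_j| as d)."""
    l, k = len(a), len(b)
    acc = np.full((l + 1, k + 1), np.inf)  # padded with an infinite border
    acc[0, 0] = 0.0
    for i in range(1, l + 1):
        for j in range(1, k + 1):
            step = min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
            acc[i, j] = abs(a[i - 1] - b[j - 1]) + step
    return float(acc[l, k])


def match_window(x_window, y_ref):
    """Return (S, U): minimum DTW distance and the RUL implied by the best match."""
    x = np.asarray(x_window, dtype=float)
    y = np.asarray(y_ref, dtype=float)
    d_len, n = len(x), len(y)
    best_dist, best_rul = np.inf, None
    for start in range(n - d_len + 1):
        dist = dtw_distance(x, y[start:start + d_len])
        if dist < best_dist:
            # Reference samples remaining after the right edge of the match.
            best_dist, best_rul = dist, n - (start + d_len)
    return best_dist, best_rul
```

With a reference that runs to failure over 100 samples and a test window identical to reference samples 40 to 59, the best match has distance 0 and leaves 40 reference samples to the right, which is taken as the RUL estimate.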
(2) First fusion: $w_{h^*}^{l}$ represents the weight of $U_{h^*}^{l}$. Equation (12) is established as the weight function w(·) for the first fusion, following the idea that “the smaller the distance between the two time series is, the larger the weight value of the parameter is”. $U_{h^*}$ can then be obtained as shown in Figure 6. $S_{h^*}^{l}$ and $U_{h^*}^{l}$ have been calculated in Equations (10) and (11).

$$w_{h^*}^{l} = \frac{\sum_{l=1}^{L} S_{h^*}^{l}}{\sum_{l=1}^{L}\left(\sum_{l=1}^{L} S_{h^*}^{l} \,/\, S_{h^*}^{l}\right)\cdot S_{h^*}^{l}} \tag{12}$$

$$U_{h^*} = \sum_{l=1}^{L} w_{h^*}^{l}\, U_{h^*}^{l} \tag{13}$$
(3) Second fusion: After obtaining all H values $U_{h^*}$, let $w_{h}$ represent the weight of $U_{h^*}$. Equation (14) is established as the weight function w(·) for the second fusion.

$$w_{h} = \frac{\sum_{h=1}^{H}\sum_{l=1}^{L} S_{h^*}^{l}}{\sum_{h=1}^{H}\left(\dfrac{\sum_{h=1}^{H}\sum_{l=1}^{L} S_{h^*}^{l}}{\sum_{l=1}^{L} S_{h^*}^{l}}\right)\cdot \sum_{l=1}^{L} S_{h^*}^{l}} \tag{14}$$

$$U = \sum_{h=1}^{H} w_{h}\, U_{h^*} \tag{15}$$
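The two fusion stages of Equations (12) to (15) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the weight functions reduce to weights proportional to the reciprocal of the distance (which is how Equations (12) and (14) read once the summation indices are kept apart from the free index), and the names `inverse_distance_weights` and `multi_parameter_fusion`, as well as the numbers in the usage example, are hypothetical.

```python
def inverse_distance_weights(distances):
    """Return weights proportional to 1/distance, normalised to sum to 1.

    Assumes every distance is strictly positive; an exact match (distance 0)
    would need special handling that is not covered in this sketch.
    """
    inverse = [1.0 / s for s in distances]
    total = sum(inverse)
    return [v / total for v in inverse]


def multi_parameter_fusion(S, U):
    """Fuse RUL estimates indicator by indicator, then across indicators.

    S[h][l] and U[h][l] hold the similarity distance and the RUL estimate for
    damage indicator h and reference sample l (hypothetical data layout).
    """
    # First fusion: for each indicator, combine the L reference samples.
    fused_per_indicator = []
    for s_row, u_row in zip(S, U):
        weights = inverse_distance_weights(s_row)
        fused_per_indicator.append(sum(w * u for w, u in zip(weights, u_row)))

    # Second fusion: weight each indicator by the reciprocal of its summed distance.
    indicator_weights = inverse_distance_weights([sum(s_row) for s_row in S])
    return sum(w * u for w, u in zip(indicator_weights, fused_per_indicator))


# Illustrative numbers only: H = 2 indicators, L = 2 reference samples.
S = [[1.0, 3.0], [2.0, 2.0]]
U = [[100.0, 120.0], [90.0, 110.0]]
print(multi_parameter_fusion(S, U))
```

In this toy case the first indicator trusts its closer reference sample three times as much as the farther one, while the second indicator averages its two references equally.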

2.6. RUL Estimation by Multi-Sample Fusion

Compared with multi-parameter similarity fusion, multi-sample similarity fusion focuses more on the similarity between the reference assets and the asset under study than on the similarity among different parameters. As in Section 2.5, suppose the H most significant damage indicators are selected and L reference assets are compared with the asset under study, so that each asset contains H damage indicators. Then H × L similarity distances, one between each damage indicator of the asset under study and the corresponding damage indicator of each reference asset, can be obtained by the DTWD algorithm. First, following the weighting idea in step (3) of Section 2.3, weights are assigned to these H × L similarity distances. Second, for each reference sample l, the results of its H damage indicators are combined by weighted summation; this is called the “first fusion” and is repeated L times, once per reference sample. The first fusion yields L fused values. Third, these L values are combined again by weighted summation, based on the weighting idea in step (3); this is called the “second fusion”, and it yields a single value called the “RUL value”. Finally, the RUL is estimated by finding the time point that corresponds to the “RUL value”.
The calculation proceeds as follows:
(1) Repeat step (1) of Section 2.5 for multi-parameter similarity fusion.

(2) First fusion: $w_{h^*}^{l}$ represents the weight of $U_{h^*}^{l}$. Equation (16) is established as the weight function w(·) for the first fusion, and $U^{l}$ can then be obtained as shown in Figure 7. Unlike multi-parameter fusion, each reference sample is treated as a “unit”: the H damage indicators of a given reference sample are fused first within that unit.

$$w_{h^*}^{l} = \frac{\sum_{h=1}^{H} S_{h^*}^{l}}{\sum_{h=1}^{H}\left(\sum_{h=1}^{H} S_{h^*}^{l} \,/\, S_{h^*}^{l}\right)\cdot S_{h^*}^{l}} \tag{16}$$

$$U^{l} = \sum_{h=1}^{H} w_{h^*}^{l}\, U_{h^*}^{l} \tag{17}$$

(3) Second fusion: For the obtained $U^{l}$, Equation (18) is used as the weight function w(·), and a weighted summation integrates the L reference samples into the final RUL value U:

$$w^{l} = \frac{\sum_{h=1}^{H}\sum_{l=1}^{L} S_{h^*}^{l}}{\sum_{l=1}^{L}\left(\dfrac{\sum_{h=1}^{H}\sum_{l=1}^{L} S_{h^*}^{l}}{\sum_{h=1}^{H} S_{h^*}^{l}}\right)\cdot \sum_{h=1}^{H} S_{h^*}^{l}} \tag{18}$$

$$U = \sum_{l=1}^{L} w^{l}\, U^{l} \tag{19}$$
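The multi-sample variant differs from the multi-parameter one only in the order of fusion: indicators are fused within each reference sample first, and the reference samples are fused second. The sketch below illustrates this under the same assumption of reciprocal-distance weights; the function names and the numbers are hypothetical and are not taken from the paper.

```python
def inverse_distance_weights(distances):
    """Return weights proportional to 1/distance, normalised to sum to 1.

    Assumes strictly positive distances.
    """
    inverse = [1.0 / s for s in distances]
    total = sum(inverse)
    return [v / total for v in inverse]


def multi_sample_fusion(S, U):
    """Fuse RUL estimates sample by sample, then across samples.

    S[h][l] and U[h][l] hold the similarity distance and the RUL estimate for
    damage indicator h and reference sample l (hypothetical data layout).
    """
    # Transpose so that each row collects the H indicators of one reference sample.
    S_by_sample = list(zip(*S))
    U_by_sample = list(zip(*U))

    # First fusion: within each reference sample, combine its H indicators.
    fused_per_sample = []
    for s_col, u_col in zip(S_by_sample, U_by_sample):
        weights = inverse_distance_weights(s_col)
        fused_per_sample.append(sum(w * u for w, u in zip(weights, u_col)))

    # Second fusion: weight each sample by the reciprocal of its summed distance.
    sample_weights = inverse_distance_weights([sum(s_col) for s_col in S_by_sample])
    return sum(w * u for w, u in zip(sample_weights, fused_per_sample))


# Illustrative numbers only: H = 2 indicators, L = 2 reference samples.
S = [[1.0, 3.0], [2.0, 2.0]]
U = [[100.0, 120.0], [90.0, 110.0]]
print(multi_sample_fusion(S, U))
```

With the same toy inputs, the two fusion orders generally give different final values, which is why the two estimates are compared afterwards rather than assumed to agree.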

2.7. Combining the Two Estimates into One

After obtaining the two RUL estimates with the two methods, it would be feasible to perform a “third fusion” that combines the two estimates into one. This paper takes a different approach. As Figure 7 shows, the performance evaluation indicator Q is established to compare the estimation results of the two methods, and the better RUL estimation result is selected for the diesel generator gearbox.
In addition, for a mechanical system, we apply both methods and then choose the more suitable result. The performance evaluation indicator Q, a set of indices such as the deviation of the estimate (see Appendix B), is used to judge which result is better. The two methods carry out the fusion from different perspectives and each takes the influencing factors into account comprehensively, so there is no need to fuse the results of the two methods.

3. Experimental Results and Comparative Analysis

In this paper, the RUL of a diesel generator gearbox is studied by analyzing the vibration signals of the gearbox shell surface, as shown in Figure 8. The data come from the High Stress Accelerated Life Test of a certain type of heavy high-speed vessel diesel manufactured by the China Shipbuilding Industry Corporation (CSIC) and were collected by the accelerometers on the gearbox monoblock. The number of teeth of the drive pinion is 17, and the number of teeth of the driven bull gear is 75. The input shaft bearing has a pitch diameter of 60 mm, a rolling element diameter of 19.05 mm, and six steel balls; the output shaft bearing has a diameter of 95 mm, a rolling element diameter of 22.25 mm, and eight steel balls. The data were recorded every 5 or 10 min at a sampling rate of 20 kHz. Four sets of diesel generator gearbox data were recorded during the life-cycle degradation process, as listed in Table 1. GU1, GU2, GU3, and GU4 all belong to the same type of component of the system and have similar working environments and operating conditions.
Figure 9 depicts the whole vibration signal over a gearbox life cycle. The amplitude of the vibration signal increases gradually until the gearbox fails to work properly.

3.1. Parameters System: Gearbox Performance Degradation Data

Following the theory in Section 2.1 and Section 2.2, the evaluation results of the 25 damage indicators are shown in Table 2. According to the calculation result of Equation (7), the weights are ω1 = 0.2, ω2 = 0.5, and ω3 = 0.3, respectively. According to Section 2.2, the six damage indicators with the largest W values (Fp9, Fp13, Fs4, Fp3, Fs2, and Fp1, in descending order of W) are selected to construct the performance damage indicator system of the diesel generator gearbox. They are the inputs of the two RUL estimation methods. Figure 10 shows the life-cycle trajectories of Fp9, Fp13, Fs4, Fp3, Fs2, and Fp1.
After establishing the gearbox performance damage indicator system [Fp9, Fp13, Fs4, Fp3, Fs2, Fp1], the performance damage indicator data set of the four samples (GU1 to GU4) is calculated. Figure 11 indicates that the curves of the same performance damage indicator from different samples behave similarly. This shows that the gearbox follows a similar degradation trajectory under similar running states and environments, which provides strong practical evidence for the subsequent RUL prediction based on multi-parameter and multi-sample similarity fusion. On the other hand, the differences in Fp13, Fp3, and Fp1 across samples reflect the different performance degradation trajectories of the four samples. Selecting samples with different performance degradation makes the experimental verification more convincing. In addition, within a single sample, Fp13, Fp3, and Fp1 have similar degradation trajectories and similar amplitude ranges. This indicates that these three parameters do reflect the performance degradation and should be selected for RUL prediction.
Considering the running time and data features, this study sets Sample GU1 as the test sample and GU2, GU3, and GU4 as reference samples to prove the validity of multi-parameter and multi-sample similarity fusion.

3.2. RUL Prediction Results

(1) Results based on multi-parameter similarity fusion
In this study, prediction on the diesel generator data starts at the 200 h point, with a time window D of 30. The details of the RUL prediction results based on multi-parameter similarity fusion with Euclidean distance and with DTWD are shown in Table A3 and Table A4 of Appendix B. Figure 12 and Figure 13 show the relative error between the actual values and predicted values of RUL.
In RUL prediction based on multi-parameter similarity fusion with DTWD, the relative error between the predicted values and the actual values ranges from −0.88% to −95.82%. Except for a few points with large errors, the relative errors of most of the predicted values are below 30%, and the predictions are more accurate than those obtained with the traditional Euclidean distance.
(2) Results based on multi-sample similarity fusion
The RUL estimation values based on multi-sample similarity fusion during the life-cycle degradation process are shown in Table A5 and Table A6 of Appendix B. Figure 14 and Figure 15 show the relative error between the actual values of RUL and the values predicted with Euclidean distance and with DTWD.
With RUL prediction based on multi-sample similarity fusion, the relative error between the predicted values and the actual values ranges from −0.35% to −76%. Except for a few points with large errors, the overall relative error is kept below 30%, which is a better prediction accuracy than that of the RUL prediction based on multi-parameter similarity fusion.

3.3. Comparative Analysis with Single-Parameter RUL Prediction

Unlike single-parameter similarity methods, multi-parameter similarity fusion combines the results predicted by multiple parameters. To verify the validity and rationality of the model, a performance degradation curve is established for each parameter of each reference sample. The calculation follows the single-parameter similarity prediction method, and the weights of the different reference samples are calculated as above. The test data set contains the six performance damage indicators of Sample GU1: Fp9, Fp13, Fs4, Fp3, Fs2, and Fp1. They are compared with the same indicators of Samples GU2, GU3, and GU4 to determine the RUL.
In this study, Principal Component Analysis (PCA) is used to integrate the elements of the performance degradation index system [30]. The first principal component PCA-1 and the second principal component PCA-2 are extracted, and each is used for RUL prediction with the single-parameter RUL prediction method [31]. This paper takes the life-cycle data set of Sample GU1 as an example. PCA of its six performance damage indicators gives a KMO value of 0.748, which is higher than 0.5 and indicates that the six parameters are suitable for dimensionality reduction.
The curves of the first and second principal components of the performance damage indicator system for the life-cycle data of Sample GU1 are shown in Figure 16.
The single-parameter RUL prediction results of the first principal component PCA-1 and the second principal component PCA-2 are shown in Table 3.
The table shows significant differences in the prediction performance of the six parameters in the diesel generator gearbox performance degradation index system. The prediction accuracy of Fp13 and Fs4 is higher than that of the other parameters, whereas the prediction based on PCA-1, the first principal component of all six parameters, is more accurate than that of any single parameter. Overall, the single-parameter similarity RUL prediction performs worse. In summary, the multi-parameter fusion-based RUL prediction method proposed in this study has clear advantages and is effective.

3.4. Comparative Analysis with AI-Based RUL Method: MSVM

Research on RUL prediction based on artificial intelligence has also been developed, for example with Bayesian methods and deep learning methods. This paper uses the multivariable support vector machine (MSVM) for comparative analysis. MSVM takes into account the interactions and constraints between multiple variables and extracts as much potential information as possible from small-sample data. According to Section 3.1, Fp9, Fp13, Fs4, Fp3, Fs2, and Fp1 are selected as the inputs of MSVM, and a regression function is constructed:
$$f(x) = (w \cdot x) + b, \quad (w \in \mathbb{R}^{n},\ b \in \mathbb{R}) \tag{20}$$

w and b are obtained by solving the following optimization problem:

$$\min\ \frac{1}{2}\lVert w \rVert^{2} + C\sum_{i=1}^{n}\left(\xi_i + \xi_i^{*}\right) \quad \text{with: } \begin{cases} w \cdot x_i + b - y_i \le \xi_i + \varepsilon \\ y_i - w \cdot x_i - b \le \xi_i^{*} + \varepsilon \\ \xi_i,\ \xi_i^{*} \ge 0, \quad i = 1, 2, \ldots, n \end{cases} \tag{21}$$

C is a penalty factor, $\xi_i$ and $\xi_i^{*}$ are slack variables, and ε is the insensitivity parameter. When the data set shows a nonlinear relationship, a kernel function is introduced into the SVM to map the original data into a high-dimensional feature space. Common choices are the Radial Basis Function (RBF) kernel and the polynomial kernel; the RBF kernel is as follows:

$$K(x_i, x_j) = \exp\left(-\frac{\lVert x_i - x_j \rVert^{2}}{2p^{2}}\right) \tag{22}$$

p is the parameter of the RBF kernel. The Lagrangian function is introduced to transform the optimization problem into a convex quadratic programming problem, where $\alpha_i$ and $\alpha_i^{*}$ are Lagrangian multipliers:

$$\max\ W(\alpha_i, \alpha_i^{*}) = -\frac{1}{2}\sum_{i,j=1}^{n}(\alpha_i - \alpha_i^{*})(\alpha_j - \alpha_j^{*})K(x_i, x_j) - \varepsilon\sum_{i=1}^{n}(\alpha_i + \alpha_i^{*}) + \sum_{i=1}^{n} y_i(\alpha_i - \alpha_i^{*}) \quad \text{with: } \sum_{i=1}^{n}(\alpha_i - \alpha_i^{*}) = 0;\ 0 \le \alpha_i \le C;\ 0 \le \alpha_i^{*} \le C\ (i = 1, 2, \ldots, n) \tag{23}$$
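As a small illustration of the RBF kernel used in the dual problem, the sketch below evaluates the kernel for pairs of feature vectors and builds the kernel (Gram) matrix that a quadratic programming solver would take as input. It is not the MSVM implementation used in the paper; the function names, the sample vectors, and the value of p are assumptions made for the example.

```python
from math import exp


def rbf_kernel(x_i, x_j, p):
    """RBF kernel exp(-||x_i - x_j||^2 / (2 p^2)) for two equal-length vectors."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return exp(-squared_distance / (2.0 * p ** 2))


def gram_matrix(samples, p):
    """Kernel matrix K[i][j] = rbf_kernel(samples[i], samples[j], p)."""
    return [[rbf_kernel(a, b, p) for b in samples] for a in samples]


# Hypothetical two-dimensional feature vectors and kernel parameter.
samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
K = gram_matrix(samples, p=1.0)
for row in K:
    print(row)
```

The matrix is symmetric with ones on the diagonal, and entries shrink toward zero as the feature vectors move apart; a smaller p makes that decay faster.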
The calculation results of the comparative analysis are shown in Table 4.
Table 4 indicates that, in terms of the average relative error, the prediction accuracy of multi-sample similarity fusion is higher than that of multi-parameter similarity fusion, and the MAPE values of both methods are lower than that of MSVM, which validates the effectiveness of the proposed method compared with the AI-based method. In addition, the proposed DTWD-based algorithm performs better than the traditional Euclidean distance.
In parameter similarity fusion, the RUL values predicted by the same performance damage indicator are integrated to calculate the RUL of the test sample, whereas in sample similarity fusion, the RUL values of the samples are integrated on the basis of the performance damage indicators carried by each sample.
The multi-parameter and multi-sample methods are similar in calculation but differ in some respects. Multi-parameter similarity fusion depends more on how the parameters reflect the performance degradation process, while multi-sample similarity fusion relies on sample data whose life-cycle trajectory is similar during gearbox operation. The more similar the test sample is to the reference samples in terms of operating methods, conditions, and load environments, the larger the weight value obtained, and the closer the predicted RUL is to the actual value. Experimental results of the comparison are shown in Figure 17.

3.5. Limitations and Future Work

This paper proposes an RUL prediction model based on multi-parameter and multi-sample fusion and verifies its effectiveness by analyzing a certain type of heavy high-speed diesel generator manufactured by an affiliate of CSIC. The results show that the proposed method is superior to previous studies in terms of prediction accuracy. However, several limitations remain. First, this paper verifies the proposed model with the diesel generator gearbox as an example; further effort should be devoted to testing a broader range of gearbox equipment and rotating machinery in general. Second, this study does not classify the types of malfunction at the end of life, nor does it identify the degradation trend at different stages. Future research can focus more on RUL under different malfunctions, on grouping and decomposing the performance degradation process to identify the running stages of test samples, and on refining the RUL prediction problems and models. Third, the research is conducted on the vibration signal of the diesel generator gearbox. To develop a more comprehensive RUL prediction method, future research should incorporate more data sources, such as performance parameters and environmental parameters.

4. Conclusions

This paper takes a certain type of heavy high-speed diesel generator as the study case. First, the diesel generator performance damage indicator system is established by extracting time and frequency domain features of the original vibration signal and applying fuzzy filtering based on approximate entropy variance. Next, this paper analyses the four core elements of similarity-based RUL prediction and establishes DTWD as the similarity measurement function. Then, we propose the methods of multi-parameter similarity fusion and multi-sample similarity fusion and carry out a performance comparison of the two. The experimental results show that the MAPE values of the two proposed RUL prediction methods are below 14%, which is lower than those of MSVM and PCA. This validates the effectiveness of the proposed method for predicting the RUL. The RUL prediction based on the dynamic time warping distance (DTWD) function in sample similarity fusion has the best accuracy, with an error below 10%. The similarity-based RUL prediction method avoids the need to establish a system degradation model and is simple and practical. Moreover, it makes full use of the information provided by vibration signals, considers multiple parameters that can reflect performance degradation, and conducts a comparative analysis of multiple samples. The experimental results show that the predictions are stable.
In summary, the innovations of this article are mainly as follows:
(1) We put forward the idea of similarity fusion with multi-parameter and multi-sample methods and established the RUL prediction model. The performance degradation process is multi-dimensional and multifaceted. Multi-parameter similarity fusion takes full account of multiple parameters of the vibration signals and of the whole performance degradation process, so a more comprehensive and accurate prediction is achieved. In contrast, multi-sample similarity fusion considers multiple samples with life-cycle degradation. By integrating the RUL prediction values calculated from the damage indicators carried by those samples, we improve the stability and credibility of the RUL prediction: the MAPE is reduced to less than 14%, the MSE to less than 220, and the MADM to less than 13.
(2) The DTWD-based nonlinear dynamic programming algorithm is established as the distance measure of similarity in RUL prediction. In the time series analysis, it performs better than the traditional Euclidean distance: the average relative error of DTWD is 17% lower than that of the Euclidean distance.
(3) After time domain and frequency domain feature extraction, we proposed approximate entropy variance (Aev) for low-pass filtering to remove signal noise.

Author Contributions

This manuscript was written by X.X., under the supervision of S.Z. and W.C. The modeling, data analysis, and software process were executed by S.Q. and X.P. Y.X. was responsible for the data acquisition and model design.

Funding

This work is supported by the National Natural Science Foundation of China (Grant No. 71971013 & 71871003 & 71501007) and the Fundamental Research Funds for the Central Universities (YWF-19-BJ-J-330). The study is also sponsored by the Aviation Science Foundation of China (2017ZG51081), the Technical Research Foundation (JSZL2016601A004) and Civil Aircraft Science Research Fund (MJ-2017-J-92).

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

In Table A1, Fss represents the value of the s-th time domain damage indicator Fs, and xi represents the amplitude of the vibration signal collected by the gearbox sensor within a certain period (i = 1, 2, …, N0), where N0 is the number of data points collected within the period.
Table A1. Time domain damage indicators of the gearbox vibration signal.

| Damage Indicator | Feature Symbol | Equation | Implication |
|---|---|---|---|
| Average value | $F_{s1}$ | $F_{s1} = \frac{1}{N_0}\sum_{i=1}^{N_0} x_i$ | Average energy value of gearbox vibration within a certain period |
| Mean square | $F_{s2}$ | $F_{s2} = \frac{1}{N_0}\sum_{i=1}^{N_0} x_i^{2}$ | Better manifests the performance degradation trend of gear and bearing [32,33,34] |
| Mean-square amplitude | $F_{s3}$ | $F_{s3} = \left[\frac{1}{N_0}\sum_{i=1}^{N_0} \sqrt{\lvert x_i \rvert}\right]^{2}$ | Sensitive to larger amplitude changes [35] |
| Absolute average | $F_{s4}$ | $F_{s4} = \frac{1}{N_0}\sum_{i=1}^{N_0} \lvert x_i \rvert$ | Takes the absolute value before averaging, which avoids positive-negative offset |
| Skewness index | $F_{s5}$ | $F_{s5} = \frac{1}{N_0}\sum_{i=1}^{N_0} x_i^{3}$ | Measures the asymmetry of vibration signals |
| Waveform index | $F_{s6}$ | $F_{s6} = \dfrac{\sqrt{\frac{1}{N_0}\sum_{i=1}^{N_0} x_i^{2}}}{F_{s4}}$ | Represents the deviation and inclination of the present vibration signal relative to a sine wave |
| Pulsatility index | $F_{s7}$ | $F_{s7} = \dfrac{\max(x_i)}{\frac{1}{N_0}\sum_{i=1}^{N_0} \lvert x_i \rvert}$ | Manifests the stability and destruction level of the gearbox's degradation and malfunction [36] |
| Kurtosis index | $F_{s8}$ | $F_{s8} = \dfrac{\frac{1}{N_0}\sum_{i=1}^{N_0} x_i^{4}}{\left(\sqrt{\frac{1}{N_0}\sum_{i=1}^{N_0} x_i^{2}}\right)^{4}}$ | Measures the “bending and arching” level of the vibration signal |
| Peak-peak value | $F_{s9}$ | $F_{s9} = \max(x_i) - \min(x_i)$ | Reflects impact vibration resulting from malfunction |
| Margin index | $F_{s10}$ | $F_{s10} = \dfrac{\max(x_i)}{F_{s3}}$ | Reflects the abrasion level of gear and bearing [35] |
In Table A2, the spectrum of the original signal xi collected within a certain period is represented as sj, where j = 1, 2, …, J. J is the number of spectral lines in the spectrum, fj represents the frequency value of the j-th line, and Fpk below represents the value of the k-th frequency domain damage indicator Fp.
Table A2. Frequency domain features of the vibration signal.

| Feature Symbol | Equation | Feature Symbol | Equation | Feature Symbol | Equation |
|---|---|---|---|---|---|
| $F_{p1}$ | $\dfrac{\sum_{j=1}^{J} \bar{s}_j s_j}{2\pi \sum_{j=1}^{J} s_j^{2}}$ | $F_{p2}$ | $\dfrac{\sum_{j=1}^{J} \bar{s}_j^{2}}{4\pi^{2} \sum_{j=1}^{J} s_j^{2}}$ | $F_{p3}$ | $F_{p2} - F_{p1}^{2}$ |
| $F_{p4}$ | $\dfrac{\sum_{j=1}^{J} s_j}{J}$ | $F_{p5}$ | $\dfrac{\sum_{j=1}^{J} (s_j - F_{p4})^{2}}{J - 1}$ | $F_{p6}$ | $\dfrac{\sum_{j=1}^{J} (s_j - F_{p4})^{3}}{J\left(\sqrt{F_{p5}}\right)^{3}}$ |
| $F_{p7}$ | $\dfrac{\sum_{j=1}^{J} (s_j - F_{p4})^{4}}{J F_{p5}^{2}}$ | $F_{p8}$ | $\dfrac{\sum_{j=1}^{J} f_j s_j}{\sum_{j=1}^{J} s_j}$ | $F_{p9}$ | $\sqrt{\dfrac{\sum_{j=1}^{J} (f_j - F_{p8})^{2} s_j}{J}}$ |
| $F_{p10}$ | $\sqrt{\dfrac{\sum_{j=1}^{J} f_j^{2} s_j}{\sum_{j=1}^{J} s_j}}$ | $F_{p11}$ | $\sqrt{\dfrac{\sum_{j=1}^{J} f_j^{4} s_j}{\sum_{j=1}^{J} f_j^{2} s_j}}$ | $F_{p12}$ | $\dfrac{\sum_{j=1}^{J} f_j^{2} s_j}{\sqrt{\sum_{j=1}^{J} s_j \sum_{j=1}^{J} f_j^{4} s_j}}$ |
| $F_{p13}$ | $\dfrac{F_{p9}}{F_{p8}}$ | $F_{p14}$ | $\dfrac{\sum_{j=1}^{J} (f_j - F_{p8})^{3} s_j}{J F_{p9}^{3}}$ | $F_{p15}$ | $\dfrac{\sum_{j=1}^{J} (f_j - F_{p8})^{4} s_j}{J F_{p9}^{4}}$ |

Appendix B

(1) Calculation of Corr, Mon and Rob
$$\mathrm{Corr}(X, T) = \frac{\left| K\sum_{k} X_T(t_k)\, t_k - \sum_{k} X_T(t_k) \sum_{k} t_k \right|}{\sqrt{\left[ K\sum_{k} X_T(t_k)^{2} - \left(\sum_{k} X_T(t_k)\right)^{2} \right]\left[ K\sum_{k} t_k^{2} - \left(\sum_{k} t_k\right)^{2} \right]}}$$

$$\mathrm{Mon}(X) = \frac{1}{K - 1}\left| \sum_{k} \delta\big(X_T(t_{k+1}) - X_T(t_k)\big) - \sum_{k} \delta\big(X_T(t_k) - X_T(t_{k+1})\big) \right|$$

$$\mathrm{Rob}(X) = \frac{1}{K}\sum_{k} \exp\left(-\left| \frac{X_R(t_k)}{X(t_k)} \right|\right)$$

In these equations, K represents the total number of points in the time series and δ(·) represents the unit step function: when its argument is larger than 0, the value of δ(·) is 1; otherwise, the value of δ(·) is 0. All three metrics take values in the range [0, 1] and are positively correlated with the performance of the time domain and frequency domain features.
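The three metrics can be computed directly from a feature series once it has been split into a trend part and a residual part. The sketch below is only an illustration under that assumption: the decomposition itself is not shown, the function names `corr`, `mon`, and `rob` are chosen for this example, and the input values are made up.

```python
from math import exp, sqrt


def corr(trend, times):
    """Absolute linear correlation between the trend values and time."""
    k = len(times)
    s_xt = sum(x * t for x, t in zip(trend, times))
    s_x, s_t = sum(trend), sum(times)
    s_xx = sum(x * x for x in trend)
    s_tt = sum(t * t for t in times)
    denominator = sqrt((k * s_xx - s_x ** 2) * (k * s_tt - s_t ** 2))
    return abs(k * s_xt - s_x * s_t) / denominator


def mon(trend):
    """Monotonicity: |number of increases - number of decreases| / (K - 1)."""
    increases = sum(1 for a, b in zip(trend, trend[1:]) if b - a > 0)
    decreases = sum(1 for a, b in zip(trend, trend[1:]) if a - b > 0)
    return abs(increases - decreases) / (len(trend) - 1)


def rob(residual, series):
    """Robustness: mean of exp(-|residual / value|); assumes nonzero values."""
    return sum(exp(-abs(r / x)) for r, x in zip(residual, series)) / len(series)


# Made-up example: a perfectly linear trend with no residual.
times = [0.0, 1.0, 2.0, 3.0]
trend = [1.0, 2.0, 3.0, 4.0]
print(corr(trend, times), mon(trend), rob([0.0] * 4, trend))
```

A feature that rises steadily with time and carries little noise scores close to 1 on all three metrics, which is the behavior the indicator selection rewards.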
(2) The performance evaluation indicator Q
Mean Absolute Error (MAE): B is the start time of the test sample's prediction, E is the end time of prediction, and i is the time point of prediction. Δ(i) represents the difference between the predicted value and actual value of the i-th prediction. The smaller the MAE is, the higher the prediction accuracy is.

$$\mathrm{MAE} = \frac{\sum_{i=B}^{E} \lvert \Delta(i) \rvert}{E - B + 1}$$

Mean Squared Error (MSE):

$$\mathrm{MSE} = \frac{1}{E - B + 1}\sum_{i=1}^{E - B + 1} \Delta(i)^{2}$$

Mean Absolute Percentage Error (MAPE): The concept of relative error is introduced in this paper to account for the difference between the predicted value and actual value.

$$\mathrm{MAPE} = \frac{1}{E - B + 1}\sum_{i=1}^{E - B + 1} \left| \frac{100\,\Delta(i)}{U(i)} \right|$$

Error Standard Deviation (ESD): This reflects the fluctuation of the error value. The smaller the value is, the more stable the prediction is.

$$M = \frac{1}{E - B + 1}\sum_{i=1}^{E - B + 1} \Delta(i)$$

$$\mathrm{ESD} = \sqrt{\frac{\sum_{i=1}^{E - B + 1} \big(\Delta(i) - M\big)^{2}}{E - B}}$$

Mean Absolute Deviation from Median (MADM): Me denotes the median of the error value. MADM reflects the degree to which the error value deviates from the median value, and applies to cases where the error value does not follow a normal distribution.

$$\mathrm{MADM} = \frac{1}{E - B + 1}\sum_{i=1}^{E - B + 1} \lvert \Delta(i) - Me \rvert$$
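The five indices can be computed together from paired lists of actual and predicted RUL values. The following is a sketch under two assumptions that the text does not spell out: the error Δ(i) is taken as actual minus predicted (the sign convention that matches the error columns of Tables A3 to A6), and U(i) in the MAPE is taken as the actual RUL. The function name and the three data points are hypothetical.

```python
from math import sqrt
from statistics import median


def error_metrics(actual, predicted):
    """Return MAE, MSE, MAPE (%), ESD and MADM for paired RUL values."""
    errors = [a - p for a, p in zip(actual, predicted)]  # assumed sign convention
    n = len(errors)
    mean_error = sum(errors) / n
    median_error = median(errors)
    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "MSE": sum(e * e for e in errors) / n,
        # Relative to the actual RUL (an assumption); actual values must be nonzero.
        "MAPE": sum(abs(100.0 * e / a) for e, a in zip(errors, actual)) / n,
        # Sample standard deviation of the error, with n - 1 in the denominator.
        "ESD": sqrt(sum((e - mean_error) ** 2 for e in errors) / (n - 1)),
        "MADM": sum(abs(e - median_error) for e in errors) / n,
    }


# Made-up values for illustration only.
metrics = error_metrics(actual=[100.0, 80.0, 60.0], predicted=[90.0, 85.0, 60.0])
for name, value in metrics.items():
    print(name, round(value, 4))
```

Because MAPE divides by the actual RUL, the same absolute error weighs much more heavily near the end of life, which is consistent with the large relative errors at small RUL values in the appendix tables.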
(3) The pseudocode of DTWD (Algorithm A1)
Algorithm A1: Main: calculate DTWD
Input: time series A (array [1..l]), B (array [1..k])
Output: DTWD
(1) Matrix D = A^T B.
(2) Set constraint conditions:
    boundedness, boundary conditions, continuity, and monotonicity
(3) Dtwd := 0
(4) FOR i := 1 : l DO
      DTWD[i, 0] := ∞
(5) FOR j := 1 : k DO
      DTWD[0, j] := ∞
(6) DTWD[0, 0] := 0
(7) FOR i := 1 : l DO
    {  FOR j := 1 : k DO
       {  d(i, j) = ‖i − j‖_p
          cost := d(A[i], B[j])
          DTWD[i, j] := cost + minimum(DTWD[i − 1, j], DTWD[i, j − 1], DTWD[i − 1, j − 1])
       }
    }
(8) RETURN DTWD[l, k]
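Algorithm A1 translates almost line by line into Python. The sketch below implements the cumulative-cost recursion for scalar time series, using the absolute difference as the local distance, and then uses it for the window-matching step of Section 2.5: a window from the test sample is slid along a reference series, and the points remaining after the best match are taken as the RUL estimate. The fixed window length, the function names `dtwd` and `best_match`, and the sample data are assumptions made for illustration.

```python
from math import inf


def dtwd(a, b):
    """Dynamic time warping distance between two scalar sequences."""
    l, k = len(a), len(b)
    # (l+1) x (k+1) cumulative-cost table; row 0 and column 0 act as borders.
    table = [[inf] * (k + 1) for _ in range(l + 1)]
    table[0][0] = 0.0
    for i in range(1, l + 1):
        for j in range(1, k + 1):
            cost = abs(a[i - 1] - b[j - 1])  # local distance between two points
            table[i][j] = cost + min(table[i - 1][j],      # advance in a only
                                     table[i][j - 1],      # advance in b only
                                     table[i - 1][j - 1])  # advance in both
    return table[l][k]


def best_match(test_window, reference, dist=dtwd):
    """Slide a window over `reference`.

    Returns (S, U): the minimum distance found and the number of reference
    points that remain after the right edge of the best-matching window.
    """
    d, n = len(test_window), len(reference)
    best_s, best_u = inf, None
    for start in range(n - d + 1):
        s = dist(test_window, reference[start:start + d])
        if s < best_s:
            best_s, best_u = s, n - (start + d)
    return best_s, best_u


# Made-up degradation curve for one reference sample and a 3-point test window.
reference = [0.0, 0.1, 0.2, 0.4, 0.7, 1.1, 1.6, 2.2, 2.9, 3.7]
S, U = best_match([0.4, 0.7, 1.1], reference)
print(S, U)
```

Unlike the Euclidean distance, `dtwd` also accepts sequences of different lengths, so the same search could compare a test window with reference segments that degrade faster or slower.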
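The details of the RUL prediction results based on multi-parameter similarity fusion with Euclidean distance and with DTWD are given in Table A3 and Table A4.
Table A3. RUL prediction results based on multi-parameter similarity fusion with Euclidean distance.

| No. | Actual RUL | Predicted RUL | Error | Relative Error (%) | No. | Actual RUL | Predicted RUL | Error | Relative Error (%) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 267 | 348 | −81 | −30.35 | 28 | 132 | 128 | 4 | 2.97 |
| 2 | 262 | 238 | 24 | 9.17 | 29 | 127 | 137 | −10 | −7.50 |
| 3 | 257 | 351 | −94 | −36.54 | 30 | 122 | 135 | −13 | −10.33 |
| 4 | 252 | 235 | 17 | 6.91 | 31 | 117 | 128 | −11 | −9.35 |
| 5 | 247 | 236 | 12 | 4.67 | 32 | 112 | 115 | −3 | −2.90 |
| 6 | 242 | 227 | 15 | 6.39 | 33 | 107 | 122 | −15 | −13.69 |
| 7 | 237 | 223 | 14 | 6.05 | 34 | 102 | 129 | −27 | −26.31 |
| 8 | 232 | 218 | 14 | 6.03 | 35 | 97 | 116 | −19 | −19.45 |
| 9 | 227 | 208 | 19 | 8.36 | 36 | 92 | 111 | −19 | −20.36 |
| 10 | 222 | 209 | 13 | 5.83 | 37 | 87 | 113 | −26 | −30.06 |
| 11 | 217 | 207 | 10 | 4.73 | 38 | 82 | 103 | −21 | −25.65 |
| 12 | 212 | 203 | 9 | 4.12 | 39 | 77 | 94 | −17 | −21.66 |
| 13 | 207 | 166 | 42 | 20.08 | 40 | 72 | 96 | −24 | −33.46 |
| 14 | 202 | 167 | 36 | 17.61 | 41 | 67 | 92 | −25 | −37.67 |
| 15 | 197 | 163 | 34 | 17.20 | 42 | 62 | 97 | −35 | −55.75 |
| 16 | 192 | 161 | 31 | 16.34 | 43 | 57 | 90 | −33 | −57.49 |
| 17 | 187 | 161 | 26 | 14.14 | 44 | 52 | 97 | −45 | −86.60 |
| 18 | 182 | 156 | 26 | 14.06 | 45 | 47 | 95 | −48 | −101.49 |
| 19 | 177 | 154 | 23 | 12.77 | 46 | 42 | 71 | −29 | −68.44 |
| 20 | 172 | 152 | 20 | 11.77 | 47 | 37 | 71 | −34 | −91.10 |
| 21 | 167 | 149 | 18 | 10.65 | 48 | 32 | 65 | −33 | −102.14 |
| 22 | 162 | 150 | 12 | 7.35 | 49 | 27 | 63 | −36 | −131.98 |
| 23 | 157 | 147 | 10 | 6.16 | 50 | 22 | 51 | −29 | −130.16 |
| 24 | 152 | 142 | 10 | 6.36 | 51 | 17 | 39 | −22 | −127.27 |
| 25 | 147 | 143 | 4 | 2.66 | 52 | 12 | 15 | −3 | −23.33 |
| 26 | 142 | 143 | −1 | −0.31 | 53 | 7 | 15 | −8 | −109.38 |
| 27 | 137 | 143 | −81 | −4.34 | | | | | |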
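Table A4. RUL prediction results based on multi-parameter similarity fusion with DTWD.

| No. | Actual RUL | Predicted RUL | Error | Relative Error (%) | No. | Actual RUL | Predicted RUL | Error | Relative Error (%) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 267 | 239 | 28 | 10.36 | 28 | 132 | 127 | 5 | 3.95 |
| 2 | 262 | 242 | 20 | 7.67 | 29 | 127 | 135 | −8 | −6.38 |
| 3 | 257 | 240 | 17 | 6.52 | 30 | 122 | 134 | −11 | −9.38 |
| 4 | 252 | 237 | 15 | 6.13 | 31 | 117 | 126 | −9 | −7.55 |
| 5 | 247 | 232 | 15 | 6.13 | 32 | 112 | 122 | −10 | −8.69 |
| 6 | 242 | 232 | 10 | 4.28 | 33 | 107 | 121 | −14 | −12.84 |
| 7 | 237 | 222 | 15 | 6.52 | 34 | 102 | 110 | −8 | −7.56 |
| 8 | 232 | 215 | 17 | 7.22 | 35 | 97 | 120 | −23 | −23.53 |
| 9 | 227 | 214 | 13 | 5.73 | 36 | 92 | 111 | −18 | −19.95 |
| 10 | 222 | 215 | 7 | 3.05 | 37 | 87 | 100 | −13 | −14.76 |
| 11 | 217 | 213 | 4 | 1.85 | 38 | 82 | 98 | −16 | −19.31 |
| 12 | 212 | 210 | 2 | 1.00 | 39 | 77 | 92 | −15 | −19.27 |
| 13 | 207 | 204 | 4 | 1.72 | 40 | 72 | 86 | −14 | −19.21 |
| 14 | 202 | 206 | −4 | −1.81 | 41 | 67 | 81 | −14 | −20.64 |
| 15 | 197 | 205 | −8 | −3.89 | 42 | 62 | 98 | −35 | −56.96 |
| 16 | 192 | 201 | −9 | −4.66 | 43 | 57 | 95 | −38 | −66.05 |
| 17 | 187 | 202 | −15 | −8.12 | 44 | 52 | 94 | −42 | −79.67 |
| 18 | 182 | 202 | −20 | −11.01 | 45 | 47 | 92 | −45 | −95.82 |
| 19 | 177 | 150 | 27 | 15.23 | 46 | 42 | 45 | −3 | −7.55 |
| 20 | 172 | 149 | 23 | 13.36 | 47 | 37 | 25 | 12 | 31.81 |
| 21 | 167 | 191 | −24 | −14.59 | 48 | 32 | 42 | −9 | −29.06 |
| 22 | 162 | 148 | 14 | 8.44 | 49 | 27 | 25 | 2 | 7.94 |
| 23 | 157 | 143 | 14 | 9.00 | 50 | 22 | 30 | −8 | −36.89 |
| 24 | 152 | 141 | 11 | 7.10 | 51 | 17 | 30 | −12 | −71.91 |
| 25 | 147 | 138 | 9 | 5.94 | 52 | 12 | 16 | −4 | −28.81 |
| 26 | 142 | 138 | 5 | 3.19 | 53 | 7 | 5 | 2 | 27.88 |
| 27 | 137 | 138 | −1 | −0.88 | | | | | |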
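The details of the RUL prediction results based on multi-sample similarity fusion with Euclidean distance and with DTWD are given in Table A5 and Table A6.
Table A5. RUL prediction results based on multi-sample similarity fusion with Euclidean distance.

| No. | Actual RUL | Predicted RUL | Error | Relative Error (%) | No. | Actual RUL | Predicted RUL | Error | Relative Error (%) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 267 | 221 | 46 | 17.14 | 28 | 132 | 121 | 11 | 8.09 |
| 2 | 262 | 221 | 41 | 15.72 | 29 | 127 | 120 | 7 | 5.37 |
| 3 | 257 | 220 | 37 | 14.34 | 30 | 122 | 123 | 0 | −0.38 |
| 4 | 252 | 222 | 30 | 11.89 | 31 | 117 | 119 | −2 | −1.60 |
| 5 | 247 | 221 | 26 | 10.43 | 32 | 112 | 105 | 8 | 6.75 |
| 6 | 242 | 205 | 37 | 15.25 | 33 | 107 | 93 | 14 | 13.36 |
| 7 | 237 | 204 | 33 | 13.78 | 34 | 102 | 91 | 11 | 10.94 |
| 8 | 232 | 199 | 33 | 14.07 | 35 | 97 | 97 | 0 | −0.13 |
| 9 | 227 | 192 | 35 | 15.33 | 36 | 92 | 93 | −1 | −1.37 |
| 10 | 222 | 189 | 33 | 15.03 | 37 | 87 | 85 | 2 | 2.84 |
| 11 | 217 | 188 | 30 | 13.60 | 38 | 82 | 86 | −3 | −4.12 |
| 12 | 212 | 163 | 49 | 23.18 | 39 | 77 | 73 | 5 | 6.01 |
| 13 | 207 | 157 | 50 | 24.30 | 40 | 72 | 76 | −4 | −5.06 |
| 14 | 202 | 157 | 45 | 22.16 | 41 | 67 | 71 | −4 | −5.64 |
| 15 | 197 | 153 | 44 | 22.27 | 42 | 62 | 61 | 1 | 1.98 |
| 16 | 192 | 150 | 42 | 22.11 | 43 | 57 | 74 | −17 | −28.88 |
| 17 | 187 | 155 | 32 | 17.17 | 44 | 52 | 76 | −24 | −46.44 |
| 18 | 182 | 151 | 31 | 17.17 | 45 | 47 | 70 | −23 | −49.52 |
| 19 | 177 | 140 | 37 | 20.85 | 46 | 42 | 74 | −31 | −74.72 |
| 20 | 172 | 141 | 32 | 18.35 | 47 | 37 | 58 | −21 | −57.02 |
| 21 | 167 | 144 | 23 | 13.97 | 48 | 32 | 59 | −27 | −84.10 |
| 22 | 162 | 140 | 22 | 13.35 | 49 | 27 | 54 | −27 | −98.48 |
| 23 | 157 | 137 | 20 | 12.60 | 50 | 22 | 35 | −13 | −57.95 |
| 24 | 152 | 130 | 22 | 14.43 | 51 | 17 | 26 | −9 | −51.51 |
| 25 | 147 | 128 | 19 | 12.79 | 52 | 12 | 21 | −9 | −72.66 |
| 26 | 142 | 127 | 15 | 10.83 | 53 | 7 | 8 | −1 | −11.67 |
| 27 | 137 | 128 | 10 | 6.95 | | | | | |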
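Table A6. RUL prediction results based on multi-sample similarity fusion with DTWD.

| No. | Actual RUL | Predicted RUL | Error | Relative Error (%) | No. | Actual RUL | Predicted RUL | Error | Relative Error (%) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 267 | 284 | −17 | −6.40 | 28 | 132 | 133 | 0 | −0.35 |
| 2 | 262 | 263 | −1 | −0.47 | 29 | 127 | 137 | −10 | −8.07 |
| 3 | 257 | 272 | −15 | −5.79 | 30 | 122 | 136 | −14 | −11.36 |
| 4 | 252 | 224 | 28 | 11.04 | 31 | 117 | 132 | −15 | −12.51 |
| 5 | 247 | 224 | 23 | 9.34 | 32 | 112 | 116 | −4 | −3.29 |
| 6 | 242 | 216 | 26 | 10.58 | 33 | 107 | 113 | −6 | −5.30 |
| 7 | 237 | 215 | 22 | 9.19 | 34 | 102 | 111 | −9 | −8.33 |
| 8 | 232 | 213 | 19 | 8.40 | 35 | 97 | 115 | −18 | −18.84 |
| 9 | 227 | 207 | 20 | 8.90 | 36 | 92 | 109 | −17 | −17.94 |
| 10 | 222 | 211 | 11 | 4.89 | 37 | 87 | 109 | −21 | −24.59 |
| 11 | 217 | 205 | 12 | 5.65 | 38 | 82 | 101 | −18 | −22.37 |
| 12 | 212 | 200 | 12 | 5.59 | 39 | 77 | 91 | −14 | −18.31 |
| 13 | 207 | 191 | 16 | 7.64 | 40 | 72 | 91 | −19 | −26.62 |
| 14 | 202 | 189 | 13 | 6.46 | 41 | 67 | 90 | −23 | −34.15 |
| 15 | 197 | 187 | 10 | 5.09 | 42 | 62 | 92 | −30 | −48.85 |
| 16 | 192 | 186 | 6 | 3.38 | 43 | 57 | 80 | −23 | −39.99 |
| 17 | 187 | 185 | 2 | 0.87 | 44 | 52 | 76 | −24 | −45.74 |
| 18 | 182 | 182 | 0 | 0.26 | 45 | 47 | 55 | −8 | −16.65 |
| 19 | 177 | 155 | 22 | 12.63 | 46 | 42 | 55 | −13 | −30.48 |
| 20 | 172 | 154 | 18 | 10.52 | 47 | 37 | 40 | −3 | −7.66 |
| 21 | 167 | 162 | 5 | 3.18 | 48 | 32 | 40 | −8 | −24.40 |
| 22 | 162 | 152 | 10 | 6.42 | 49 | 27 | 35 | −8 | −28.88 |
| 23 | 157 | 149 | 9 | 5.43 | 50 | 22 | 39 | −17 | −76.00 |
| 24 | 152 | 146 | 7 | 4.34 | 51 | 17 | 18 | −1 | −4.89 |
| 25 | 147 | 144 | 3 | 1.80 | 52 | 12 | 10 | 2 | 17.78 |
| 26 | 142 | 143 | −1 | −0.61 | 53 | 7 | 5 | 2 | 30.21 |
| 27 | 137 | 142 | −5 | −3.83 | | | | | |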

References

  1. Wang, Z.Q.; Hu, C.H.; Si, X.S.; Enrico, Z. Remaining useful life prediction of degrading systems subjected to imperfect maintenance: Application to draught fans. Mech. Syst. Sig. Process. 2018, 100, 802–813. [Google Scholar] [CrossRef]
  2. Martinez-Morales, J.D.; Palacios-Hernandez, E.R.; Campos-Delgado, D.U. Multiple-fault diagnosis in induction motors through support vector machine classification at variable operating conditions. Electr. Eng. 2018, 100, 59–73. [Google Scholar] [CrossRef]
  3. Wang, T. Trajectory Similarity Based Prediction for Remaining Useful Life Estimation. Ph.D. Thesis, University of Cincinnati, Cincinnati, OH, USA, 2010. [Google Scholar]
  4. Sikorska, J.Z.; Hodkiewicz, M.; Ma, L. Prognostic modelling options for remaining useful life estimation by industry. Mech. Syst. Sig. Process. 2011, 25, 1803–1836. [Google Scholar] [CrossRef]
  5. Lei, Y.; Li, N.; Guo, L.; Li, N.; Yan, T.; Lin, J. Machinery health prognostics: A systematic review from data acquisition to RUL prediction. Mech. Syst. Sig. Process. 2018, 104, 799–834. [Google Scholar] [CrossRef]
  6. Frank, A.X.P.H.; Pinter, G. Numerical Assessment of PE 80 and PE 100 Pipe Lifetime Based on Paris—Erdogan Formula. Macromol. Symp. 2012, 311, 112–121. [Google Scholar] [CrossRef]
  7. Hu, Y.; Baraldi, P.; Maio, F.D.; Enrico, Z. Online Performance Assessment Method for a Model-Based Prognostic Approach. IEEE Trans. Reliab. 2016, 65, 718–735. [Google Scholar] [CrossRef]
  8. Hussain, S.; Gabbar, H.A. Vibration analysis and time series prediction for wind turbine gearbox prognostics. Int. J. Progn. Health Manag. 2013, 4, 718–735. [Google Scholar]
  9. Aroussi, M.E. Bearings prognostic using Mixture of Gaussians Hidden Markov Model and Support Vector Machine. Int. J. Netw. Secur. Appl. 2013, 5, 4. [Google Scholar]
  10. Dong, S.; Luo, T. Bearing degradation process prediction based on the PCA and optimized LS-SVM model. Measurement 2013, 46, 3143–3152. [Google Scholar] [CrossRef]
  11. Wang, P.; Youn, B.D.; Hu, C. A generic probabilistic framework for structural health prognostics and uncertainty management. Mech. Syst. Sig. Process. 2012, 28, 622–637. [Google Scholar] [CrossRef]
  12. You, M.Y.; Meng, G. A generalized similarity measure for similarity-based residual life prediction. Proc. Inst. Mech. Eng. Part E J. Process. Mech. Eng. 2011, 225, 151–160. [Google Scholar] [CrossRef]
  13. Mosallam, A.; Medjaher, K.; Zerhouni, N. Data-driven prognostic method based on Bayesian approaches for direct remaining useful life prediction. J. Intell. Manuf. 2016, 27, 1037–1048. [Google Scholar] [CrossRef]
  14. Zhang, Q.; Tse, W.T.; Wan, X.; Xu, G. Remaining useful life estimation for mechanical systems based on similarity of phase space trajectory. Expert Syst. Appl. 2015, 42, 2353–2360. [Google Scholar] [CrossRef]
  15. Xiong, X.; Yang, H.; Cheng, N.; Li, Q. Remaining Useful Life Prognostics of Aircraft Engines Based on Damage Propagation Modeling and Data Analysis. In Proceedings of the International Symposium on Computational Intelligence and Design, Hangzhou, China, 12–13 December 2015. [Google Scholar]
  16. Moghaddass, R.; Zuo, M.J. An integrated framework for online diagnostic and prognostic health monitoring using a multistate deterioration process. Reliab. Eng. Syst. Saf. 2014, 124, 92–104. [Google Scholar] [CrossRef]
  17. Xiao, Y.; Zhang, R.; Zhang, Q. Permutation flow shop scheduling with order acceptance and weighted tardiness. Appl. Math. Comput. 2015, 270, 312–333. [Google Scholar] [CrossRef]
  18. Epinette, J.; Jolles-Haeberli, B.M. Comparative Results from a National Joint Registry Hip Data Set of a New Cross-Linked Annealed Polyethylene vs. Both Conventional Polyethylene and Ceramic Bearings. J. Arthropl. 2016, 31, 1483–1491. [Google Scholar] [CrossRef] [PubMed]
  19. Liu, Y.; He, B.; Liu, F.; Lu, S.; Zhao, Y.; Zhao, J. Remaining Useful Life Prediction of Rolling Bearings Using PSR, JADE, and Extreme Learning Machine. Math. Prob. Eng. 2016, 2016, 13. [Google Scholar] [CrossRef]
  20. Candan, Ç.; Ozaktas, H.M. Sampling and series expansion theorems for fractional Fourier and other transforms. Sig. Process. 2003, 83, 2455–2457. [Google Scholar] [CrossRef] [Green Version]
  21. Lei, Y.; He, Z.; Zi, Y. A new approach to intelligent fault diagnosis of rotating machinery. Expert Syst. Appl. 2008, 35, 1593–1600. [Google Scholar] [CrossRef]
  22. Zhou, S.; Qian, S.; Chang, W.; Xiao, Y.; Cheng, Y. A Novel Bearing Multi-Fault Diagnosis Approach Based on Weighted Permutation Entropy and an Improved SVM Ensemble Classifier. Sensors 2018, 18, 6. [Google Scholar] [CrossRef]
  23. Zheng, J.; Pan, H.; Yang, S.; Cheng, J. Generalized composite multiscale permutation entropy and Laplacian score based rolling bearing fault diagnosis. Mech. Syst. Sig. Process. 2018, 99, 229–243. [Google Scholar] [CrossRef]
  24. Huang, J.; Pei, W.; Cao, D.; Yu, S. Fuzzy Filter Based on Approximate Entropy. J. Data Acquis. Process. 1998, 13, 140–143. [Google Scholar]
  25. Josue, E.; Guillermo, V.; Francisco, R.; Lopez, E.; Gerardo, S.; Jose, A. Efficient predictive vibration control of a building-like structure. Asian J. Control 2018. [Google Scholar] [CrossRef]
  26. Ahmad, W.; Khan, S.A.; Islam, M.M.; Kim, J.M. A reliable technique for remaining useful life estimation of rolling element bearings using dynamic regression models. Reliab. Eng. Syst. Saf. 2018, 184, 67–76. [Google Scholar] [CrossRef]
  27. Tanzi, M.; Pereira, T.; Van Strien, S. Robustness of ergodic properties of non-autonomous piecewise expanding maps. Ergod. Theory Dyn. Syst. 2017, 39, 32. [Google Scholar] [CrossRef]
  28. Gu, M.; Chen, Y. Two improvements of similarity-based residual life prediction methods. J. Intell. Manuf. 2019, 30, 303–315. [Google Scholar] [CrossRef]
  29. Aghabozorgi, S.; Shirkhorshidi, A.S.; Wah, T.Y. Time-series clustering—A decade review. Inf. Syst. 2015, 53, 16–38. [Google Scholar] [CrossRef]
  30. Santos-Ruiz, J.R.; Bermúdez, F.R.; López-Estrada, V.; Puig, L.; Torres, J.A. Diagnosis of Fluid Leaks in Pipelines Using Dynamic PCA. Int. Fed. Autom. Control 2018, 51, 373–380. [Google Scholar] [CrossRef]
  31. Jia, X.; Jin, C.; Buzza, M.; Wang, W. Wind turbine performance degradation assessment based on a novel similarity metric for machine performance curves. Renew. Energy 2016, 99, 1191–1201. [Google Scholar] [CrossRef]
  32. Yao, B.; Su, J.; Wu, L.; Guan, Y. Modified Local Linear Embedding Algorithm for Rolling Element Bearing Fault Diagnosis. Appl. Sci. 2017, 7, 1178. [Google Scholar] [CrossRef]
  33. Zhang, B.; Zhang, L.; Xu, J. Degradation Feature Selection for Remaining Useful Life Prediction of Rolling Element Bearings. Qual. Reliab. Eng. Int. 2016, 32, 547–554. [Google Scholar] [CrossRef]
  34. Wu, B.; Li, W.; Qiu, M.Q. Remaining Useful Life Prediction of Bearing with Vibration Signals Based on a Novel Indicator. Shock Vibr. 2017, 10. [Google Scholar] [CrossRef]
  35. Vladimiroff, T. Ab initio and density functional calculations of mean-square amplitudes of vibration for benzene and cubane. J. Mol. Struct. Theochem. 2000, 507, 111–118. [Google Scholar] [CrossRef]
  36. Ji, M.; Guo, H.J.; Zhang, Y.D.; Li, T.; Gao, L. Hierarchic Analysis Method to Evaluate Rock Burst Risk. Math. Probl. Eng. 2015, 2015, 8. [Google Scholar] [CrossRef]
Figure 1. Performance of the system.
Figure 2. Comparison of the Fs10 curves before and after filtering.
Figure 3. Principle of similarity-based Remaining Useful Life (RUL) prediction.
Figure 4. The dynamic time warping path.
Figure 5. Similarity fusion of multi-parameter.
Figure 6. Similarity fusion of multi-sample.
Figure 7. Combination of the two methods.
Figure 8. Typical diesel engine (the gearbox is inside the black frame).
Figure 9. Vibration signal diagram in the life-cycle degradation process.
Figure 10. Life-cycle diagrams of six selected performance damage indicators.
Figure 11. Curves of the six performance damage indicators from different samples.
Figure 12. Actual values and predicted values of RUL based on multi-parameter similarity fusion with Euclidean distance.
Figure 13. Actual values and predicted values of RUL based on multi-parameter similarity fusion with Dynamic Time Warping Distance (DTWD).
Figure 14. Actual values and predicted values of RUL based on multi-sample similarity fusion with Euclidean distance.
Figure 15. Actual values and predicted values of RUL based on multi-sample similarity fusion with DTWD.
Figure 16. First- and second-order principal component of Sample GU’s performance damage indicator system.
Figure 17. Comparison of four RUL prediction methods.
Table 1. Vibration data sets for the diesel generator gearbox.
Gearbox No. | Service Time (h) | Record Interval (min)
GU1 | 467 | 5 or 10
GU2 | 390 | 5 or 10
GU3 | 410 | 5 or 10
GU4 | 408 | 5 or 10
Table 2. Evaluation results of time-domain and frequency-domain damage indicators.
Damage Indicator | Corr | Mon | Rob | W | Ranking
Fs1 | 0.0723 | 0.0060 | 0.3936 | 0.13554 | 25
Fs2 | 0.7374 | 0.0641 | 0.9418 | 0.46207 | 5
Fs3 | 0.5224 | 0.0675 | 0.9084 | 0.41075 | 13
Fs4 | 0.7452 | 0.0675 | 0.9514 | 0.46821 | 3
Fs5 | 0.7440 | 0.0051 | 0.4542 | 0.28761 | 22
Fs6 | 0.5463 | 0.0009 | 0.9797 | 0.40362 | 15
Fs7 | 0.5228 | 0.0009 | 0.8678 | 0.36535 | 18
Fs8 | 0.6418 | 0.0239 | 0.8301 | 0.38934 | 17
Fs9 | 0.7128 | 0.0474 | 0.8501 | 0.42129 | 12
Fs10 | 0.8328 | 0.0248 | 0.8696 | 0.43984 | 9
Fp1 | 0.9135 | 0.0085 | 0.9071 | 0.45908 | 6
Fp2 | 0.9135 | 0.0085 | 0.9071 | 0.45908 | 7
Fp3 | 0.9105 | 0.0077 | 0.9222 | 0.46261 | 4
Fp4 | 0.6379 | 0.0496 | 0.9491 | 0.43711 | 10
Fp5 | 0.5585 | 0.0641 | 0.8725 | 0.4055 | 14
Fp6 | 0.9134 | 0.0051 | 0.9016 | 0.45571 | 8
Fp7 | 0.1305 | 0.0436 | 0.5069 | 0.19997 | 24
Fp8 | 0.1091 | 0.0017 | 0.9956 | 0.32135 | 20
Fp9 | 0.8201 | 0.0513 | 0.9696 | 0.48055 | 1
Fp10 | 0.4615 | 0.0188 | 0.9928 | 0.39954 | 16
Fp11 | 0.5650 | 0.0239 | 0.9917 | 0.42246 | 11
Fp12 | 0.0447 | 0.0265 | 0.9950 | 0.32069 | 21
Fp13 | 0.8896 | 0.0615 | 0.8985 | 0.47822 | 2
Fp14 | 0.4418 | 0.0094 | 0.5156 | 0.24774 | 23
Fp15 | 0.3534 | 0.0581 | 0.8413 | 0.35212 | 19
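The W column of Table 2 is numerically consistent with a fixed weighted sum of the three criteria, W = 0.2 Corr + 0.5 Mon + 0.3 Rob. These weights are inferred here from the tabulated rows and should be read as an assumption, not as a quotation of the selection rule. The Python sketch below recomputes W for a few rows copied from Table 2 and ranks them; the names `WEIGHTS`, `ROWS`, `weighted_score`, and `rank_indicators` are illustrative only.

```python
# Weights inferred from the rows of Table 2 (an assumption, not a quoted rule).
WEIGHTS = {"corr": 0.2, "mon": 0.5, "rob": 0.3}

# A few rows copied from Table 2: (Corr, Mon, Rob, tabulated W).
ROWS = {
    "Fp9": (0.8201, 0.0513, 0.9696, 0.48055),
    "Fp13": (0.8896, 0.0615, 0.8985, 0.47822),
    "Fs4": (0.7452, 0.0675, 0.9514, 0.46821),
    "Fs1": (0.0723, 0.0060, 0.3936, 0.13554),
}


def weighted_score(corr, mon, rob, weights=WEIGHTS):
    """Combine the three criteria into one score as a weighted sum."""
    return weights["corr"] * corr + weights["mon"] * mon + weights["rob"] * rob


def rank_indicators(rows):
    """Return indicator names sorted from highest to lowest weighted score."""
    scores = {name: weighted_score(c, m, r) for name, (c, m, r, _) in rows.items()}
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    for name, (c, m, r, w_table) in ROWS.items():
        # Print the recomputed score next to the tabulated value for comparison.
        print(name, round(weighted_score(c, m, r), 5), w_table)
    print(rank_indicators(ROWS))
```

Under this assumed weighting, monotonicity carries the largest weight, which would explain why indicators with a high Corr but a very low Mon (for example Fs5) rank poorly.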
Table 3. Comparison with single-parameter Remaining Useful Life (RUL) prediction.
Performance evaluation indicators Q: MAE, MSE, MAPE, ESD, MADM.
Parameter | MAE | MSE | MAPE | ESD | MADM
Fp9 | 23 | 751.86 | 36.51% | 24.78 | 21.10
Fp13 | 20 | 709.71 | 20.30% | 26.45 | 19.97
Fs4 | 21 | 676.81 | 21.35% | 21.98 | 18.92
Fp3 | 41 | 4142.53 | 79.74% | 63.75 | 40.72
Fs2 | 23 | 1014.17 | 40.64% | 25.22 | 19.34
Fp1 | 38 | 3023.67 | 69.93% | 54.82 | 38.20
PCA-1 | 13 | 241.22 | 18.79% | 15.39 | 13.05
PCA-2 | 16 | 490.60 | 22.69% | 21.90 | 16.02
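For readers who want to reproduce indicators of the kind listed under Q, the following Python sketch computes the five quantities from paired actual and predicted RUL values. The definitions used are the common ones and are assumptions here: ESD is taken as the (population) standard deviation of the prediction error, and MADM as the mean absolute deviation of the error from its median. The function name `prediction_metrics` and the sample values in the usage line are hypothetical and are not data from the experiments.

```python
import statistics


def prediction_metrics(actual, predicted):
    """Return MAE, MSE, MAPE (%), ESD and MADM for paired RUL values.

    ESD and MADM follow assumed definitions (see the lead-in text).
    MAPE divides by the actual value, so every actual RUL must be non-zero.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and equal in length")
    errors = [p - a for a, p in zip(actual, predicted)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mape = 100.0 * sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n
    esd = statistics.pstdev(errors)          # assumed: population std of the error
    median_error = statistics.median(errors)
    madm = sum(abs(e - median_error) for e in errors) / n  # assumed definition
    return {"MAE": mae, "MSE": mse, "MAPE": mape, "ESD": esd, "MADM": madm}


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(prediction_metrics([100, 80, 60, 40], [110, 70, 66, 36]))
```

Because MAPE normalizes by the actual RUL, it grows quickly near end of life, where the actual RUL is small; this is one reason it is usually reported alongside the absolute indicators.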
Table 4. Results of different prediction methods.
Performance evaluation indicators Q: MAE, MSE, MAPE, ESD, MADM.
RUL Prediction Method | Similarity Measure | MAE | MSE | MAPE | ESD | MADM
Multi-parameter similarity fusion | Euclidean distance | 23 | 808.81 | 30.90% | 27.78 | 22.56
Multi-parameter similarity fusion | DTWD function | 14 | 292.11 | 17.15% | 16.82 | 13.73
Multi-sample similarity fusion | Euclidean distance | 22 | 684.30 | 21.43% | 22.41 | 19.35
Multi-sample similarity fusion | DTWD function | 12 | 219.37 | 14.00% | 14.76 | 12.39
MSVM | Euclidean distance | 19 | 351.02 | 23.34% | 17.83 | 12.24
MSVM | DTWD function | 17 | 287.34 | 20.14% | 16.21 | 11.45
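To make the two similarity measures compared in Table 4 concrete, the Python sketch below contrasts the pointwise Euclidean distance with a textbook dynamic time warping (DTW) distance computed by dynamic programming. It is a minimal illustration under stated assumptions (absolute difference as the local cost, no warping window, no normalization) and is not necessarily the exact DTWD function used to produce the results above; the function names are illustrative.

```python
import math


def euclidean_distance(a, b):
    """Pointwise Euclidean distance; both sequences must have equal length."""
    if len(a) != len(b):
        raise ValueError("Euclidean distance needs sequences of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(a, b):
    """Textbook DTW: minimum cumulative |a_i - b_j| cost over all warping paths."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the best cumulative cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances while b repeats a point
                cost[i][j - 1],      # b advances while a repeats a point
                cost[i - 1][j - 1],  # both advance together
            )
    return cost[n][m]


if __name__ == "__main__":
    # Two degradation-like curves with the same shape, one delayed by a step.
    reference = [0, 1, 2, 3, 3]
    delayed = [0, 0, 1, 2, 3]
    print(euclidean_distance(reference, delayed), dtw_distance(reference, delayed))
```

The example shows why a warping-based measure can suit degradation curves: two samples that follow the same trend at slightly different times are penalized by the Euclidean distance but can be aligned at low cost by DTW.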

Zhou, S.; Xu, X.; Xiao, Y.; Chang, W.; Qian, S.; Pan, X. Remaining Useful Life Prediction with Similarity Fusion of Multi-Parameter and Multi-Sample Based on the Vibration Signals of Diesel Generator Gearbox. Entropy 2019, 21, 861. https://doi.org/10.3390/e21090861
