TL;DR
This paper introduces a deep learning framework using CNNs and Hilbert-Huang transform for accurate, online prediction of remaining useful life in bearings, improving reliability in industrial systems.
Contribution
It presents a novel CNN-based method combined with HHT for automatic RUL prediction, reducing feature extraction efforts and enhancing adaptability across conditions.
Findings
Outperforms existing RUL prediction methods in experiments.
Successfully transfers across different bearing operating conditions.
Demonstrates high accuracy and robustness in RUL estimation.
Abstract
In industrial applications, nearly half the failures of motors are caused by the degradation of rolling element bearings (REBs). Therefore, accurately estimating the remaining useful life (RUL) for REBs is of crucial importance to ensure the reliability and safety of mechanical systems. To tackle this challenge, model-based approaches are often limited by the complexity of mathematical modeling. Conventional data-driven approaches, on the other hand, require massive efforts to extract the degradation features and construct a health index. In this paper, a novel online data-driven framework is proposed to exploit the adoption of deep convolutional neural networks (CNN) in predicting the RUL of bearings. More concretely, the raw vibrations of training bearings are first processed using the Hilbert-Huang transform (HHT) and a novel nonlinear degradation indicator is constructed as the label…

| Description | Symbol | Expression |
|---|---|---|
| Inner race frequency | | |
| Outer race frequency | | |
| Ball frequency | | |

| Physical parameter | Value |
|---|---|
| Number of balls of the bearing | 13 |
| Ball diameter of the bearing | 3.5 mm |
| Pitch diameter of the bearing | 25.6 mm |
| Contact angle of the bearing | |
| Rotation frequency, bearing1 | 1800 r/min |
| Rotation frequency, bearing2 | 1600 r/min |
| Maximum dynamic load (F), bearing1 | 4000 N |
| Maximum dynamic load (F), bearing2 | 4200 N |

| Layer | Filters | Kernel size/Stride | Output size |
|---|---|---|---|
| Input | … | … | 1x2560x1 |
| Conv1 | 64 | 1x100/50 | 1x50x64 |
| Maxpooling1 | … | 1x2/2 | 1x25x64 |
| Conv2 | 64 | 1x2/1 | 1x24x64 |
| Maxpooling2 | … | 1x2/2 | 1x12x64 |
| Flatten | … | … | 1x768 |
| FC1 | … | … | 1x100 |
| Output | … | … | 1x1 |

| Method | DEI | CNN | SVR |
|---|---|---|---|
| C1 | - | ✓ | ✓ |
| C2 | ✓ | - | ✓ |
| Proposed method | ✓ | ✓ | ✓ |

| Methods | Bearing1_4 (real RUL = 339 s): predicted RUL | Bearing1_4: Er | Bearing1_4: ETA | Bearing1_5 (real RUL = 1610 s): predicted RUL | Bearing1_5: Er | Bearing1_5: ETA | Bearing1_6 (real RUL = 1460 s): predicted RUL | Bearing1_6: Er | Bearing1_6: ETA |
|---|---|---|---|---|---|---|---|---|---|
| Proposed approach | 340 s | -0.29% | 0.96 | 1500 s | 6.83% | 0.79 | 1480 s | -1.37% | 0.83 |
| C1 | 30 s | 91.15% | 0.04 | 820 s | 49.07% | 0.18 | 1181 s | 19.11% | 0.52 |
| C2 | N/A (0 s) | N/A (100%) | 0.03 | 1140 s | 29.19% | 0.36 | 1080 s | 26.02% | 0.41 |

| Methods | Er, Bearing1_4 | Er, Bearing1_5 | Er, Bearing1_6 | Er, Bearing2_4 | Er, Bearing2_6 | Average score | MAE | NRMSE |
|---|---|---|---|---|---|---|---|---|
| Proposed approach | -0.29% | 6.83% | -1.37% | 5.75% | 1.55% | 0.87 | 46.2 | 0.05 |
| FCAMN [23] | 21.95% | -15.22% | -5.74% | - | - | 0.35 | 696.7 | 0.13 |
| Multi-scale CNN [24] | 10.69% | -148.20% | -21.51% | - | - | 0.25 | 1003.3 | 0.50 |
| CWT-CNN [34] | 20.35% | 11.18% | 34.93% | -1.44% | -42.64% | 0.46 | 265.8 | 0.29 |
| RNN [35] | 62.07% | -22.98% | 21.23% | -19.42% | -13.95% | 0.17 | 586.0 | 0.57 |
| Particle Filtering [36] | 5.60% | 100.00% | 28.08% | 8.63% | 58.91% | 0.42 | 583.8 | 1.29 |
| FFT+Ratio [37] | 80.00% | 9.00% | -5.00% | 10% | 49% | 0.44 | 252 | 0.32 |
| LSTM [38] | 38.69% | -99.40% | -120.07% | 19.81% | 17.87% | 0.26 | 996.2 | 0.57 |
| SOM [39] | -20.94% | -278.26% | 19.18% | 51.80% | -20.93% | 0.16 | 1164.2 | 1.02 |
A deep learning-based remaining useful life prediction approach for bearings
Cheng Cheng, Guijun Ma, Yong Zhang, Mingyang Sun, Fei Teng, Han Ding, and Ye Yuan
This work was supported by the National Natural Science Foundation of China [Grant number 91748112] and by the Primary Research & Development Plan of Jiangsu Province [Grant number BE2017002]. C. Cheng and Y. Yuan are with the Key Laboratory of Image Processing and Intelligent Control, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan, 430074, China. Y. Yuan, G. Ma and H. Ding are with the State Key Laboratory of Digital Manufacturing Equipment and Technology, Huazhong University of Science and Technology, Wuhan 430074, China. G. Ma and H. Ding are with the School of Mechanical Science and Engineering, Huazhong University of Science and Technology, Wuhan 430074, China. Y. Zhang is with School of Information Science and Engineering, Wuhan University of Science and Technology, Wuhan 430081, China. M. Sun is with the College of Control Science and Engineering, Zhejiang University, Hangzhou, 310007, China. F. Teng is with the Department of Electrical & Electronic Engineering, Imperial College London, London, SW7 2AZ, UK. For correspondence, contact Prof. Ye Yuan ([email protected]).
Abstract
In industrial applications, nearly half the failures of motors are caused by the degradation of rolling element bearings (REBs). Therefore, accurately estimating the remaining useful life (RUL) for REBs is of crucial importance to ensure the reliability and safety of mechanical systems. To tackle this challenge, model-based approaches are limited by the complexity of mathematical modeling. Conventional data-driven approaches, on the other hand, require massive efforts to extract the degradation features and construct the health index. In this paper, a novel data-driven framework is proposed to exploit the adoption of deep convolutional neural networks (CNN) in predicting the RULs of bearings. More concretely, raw vibrations of training bearings are first processed using the Hilbert-Huang transform to construct a novel nonlinear degradation energy indicator which can be used as the training label. The CNN is then employed to identify the hidden pattern between the extracted degradation energy indicator and the raw vibrations of training bearings, which makes it possible to estimate the degradation of the test bearings automatically. Finally, the RULs of the test bearings are predicted using an ε-support vector regression model. The superior performance of the proposed RUL estimation framework, compared with the state-of-the-art approaches, is demonstrated through the experimental results. The generality of the proposed CNN model is also validated by performance tests on other bearings under different operating conditions.
Index Terms:
Remaining useful life estimation, rolling bearings, Hilbert-Huang transform, convolutional neural networks.
Nomenclature
CNN
Convolutional neural network.
DEI
Degradation energy indicator.
EMD
Empirical mode decomposition.
ETA
Exponential transformed accuracy.
HHT
Hilbert-Huang transform.
IMF
Intrinsic mode function.
MAE
Mean absolute error.
NRMSE
Normalized root mean square error.
MSE
Mean square error.
REB
Rolling element bearing.
RUL
Remaining useful life.
SVR
Support vector regression.
FT or
Failure threshold.
Length of historical units of training bearing.
Length of historical units of test bearing.
No. of predicted units of test bearing.
Sensor measurement signal of -th unit.
Sampling time.
Number of measurements in -th unit.
Time interval between two recording phases.
DEI of training bearing.
Normalized DEI of training bearing.
Estimated DEI of test bearing.
Predicted DEI of test bearing.
Total number of layers.
Weights in -th convolutional layer.
Bias in -th convolutional layer.
Predicted RUL of test bearing.
Real RUL of test bearing.
Number of test bearings.
Relative percentage error of prediction.
Average score of prediction.
Training set for SVR.
The set of real numbers.
The set of positive integers.
I Introduction
Rolling element bearings (REBs), which constrain relative motion while reducing friction between moving parts, are among the most widely used elements in industrial machinery. Prognostics and health management of bearings is important for the safety, reliability, and effectiveness of mechanical systems [1, 2]. The literature [3] shows that nearly half of motor failures are related to the degradation of bearings. As such, estimating the remaining useful life (RUL) (i.e., time-to-failure prognostics) of bearings has attracted a great deal of attention in recent years [4]. RUL prediction helps users monitor the condition of the bearings and provides an estimate of the time left before a failure occurs. Compared with fault diagnosis, which has been well investigated over the past few decades [5], the problem of RUL prediction studied in this paper is a relatively new and challenging topic because of the large uncertainties in the environment and operating conditions.
In general, RUL prediction approaches can be categorized into model-based and data-driven approaches. Model-based approaches aim to build a physical model to represent the degradation of the rolling bearing [6]. Li et al. [7] predicted the defect growth on a bearing unit using Paris’s law for fatigue. However, it is difficult to construct a precise physical degradation model because of the sensitivity of the model parameters and noisy operating environments. This limits the practical application of model-based approaches. Data-driven approaches, on the other hand, benefit from extensive expertise in signal processing and machine learning [8], and infer the degradation process of bearings without knowledge of the physics of degradation failure. The prognostic framework of data-driven approaches mainly consists of three stages: 1) features are extracted from noisy sensory signals, which helps to build the health indicator for learning the system degradation behavior; 2) degradation models are trained on the training bearing using statistical or machine learning techniques; and 3) the degradation indicator of the test bearings is estimated with the model trained in the second stage. Then, the unknown degradation process can be predicted by applying regression techniques (e.g., ε-SVR).
To extract features from raw signals, time-domain, frequency-domain, and time-frequency domain analyses are commonly adopted. Among them, time-frequency analysis has been found to be the most effective because of its ability to characterize transient signals over both time and frequency [9]. Well-known time-frequency techniques for extracting bearing features include the short-time Fourier transform [10], wavelets [11], the Wigner-Ville distribution [12], and the Hilbert-Huang transform (HHT) [13]. The short-time Fourier transform is limited by its time-frequency resolution; for instance, low frequencies are difficult to identify with short windows. Wavelets and the Wigner-Ville distribution provide richer pictures than the short-time Fourier transform; however, their effectiveness depends on estimating the Hurst parameter and on the quality of the analyzed signal. HHT shows better computational efficiency and resolution than other time-frequency analyses. It uses empirical mode decomposition (EMD) and the Hilbert transform (HT) to decompose the original vibration signal into a number of intrinsic mode functions (IMFs) at various frequency scales. The frequency components of each IMF are related to both the sampling frequency and the signal itself, which makes HHT a self-adaptive signal processing technique well suited to non-stationary signals. Wu et al. [14] analyzed the time-to-failure prognostics of REBs by extracting ten statistical features using time and frequency analysis and eleven IMF features using HHT time-frequency analysis. The gear fault identification method proposed in [15] is based on the HHT, and the first six IMFs are selected as inputs to SOM neural networks for fault diagnosis.
Data-driven RUL prediction approaches are mainly based on statistical and machine learning techniques, such as artificial neural networks (ANN) [16], fuzzy logic systems [17], and auto-regressive (AR) models [18]. The computational cost of ANNs is relatively high in terms of optimizing the weights of the model. AR models require a precise trend in the historical observations, and fuzzy logic systems require high-quality training data. Recently, deep learning has emerged in research and industry, and has outperformed other machine learning techniques in speech recognition and image recognition tasks [19]. Deep learning models are good at discovering high-level abstractions from labeled data using the back-propagation algorithm [20]. Specifically, they learn feature representations automatically rather than relying on hand-crafted features designed from experience. As the best-known model in deep learning, the CNN has in recent years dominated recognition and detection problems in computer vision. It is distinguished by three characteristics, namely local connections, shared weights, and local pooling [21, 22]. The first two characteristics mean that the CNN requires fewer parameters to detect local information than a multilayer perceptron, while the last characteristic gives the network shift invariance. In this work, a 1-D CNN, which has been applied with great success to speech recognition and document reading tasks, is employed to learn the latent space of the input time-series vibration signals. Few attempts have been made to address the prediction problem using CNN-based models [23, 24, 25, 26]. This paper exploits the CNN technique in estimating the RUL of bearings, as a prognosis problem, to learn the nonlinear degradation behavior from raw vibration data and an extracted label.
Instead of using the CNN technique to perform the time-series prediction, the main function of the CNN model in this paper is to reveal the hidden dependencies between the vibration data and the DEI of the training bearing, which makes full use of the advantages of CNN in automatic feature extraction.
In this work, we propose a data-driven framework for predicting the RUL of REBs by applying the HHT, CNN, and ε-support vector regression (ε-SVR). The raw vibration signals collected from sensors are processed by the HHT method, and a novel time-series degradation indicator, i.e., the DEI, is constructed. Subsequently, a CNN model is trained to learn the mapping from the raw vibration input to the DEI label on the training bearings, and is used to predict the DEIs of the test bearings. Then, an ε-SVR model is introduced so that the evolution of the degradation can be forecast until bearing failure. The effectiveness of the proposed framework for RUL prediction is validated on an experimental platform (i.e., PRONOSTIA). Much lower RUL prediction errors are achieved compared with eight existing approaches from previous papers and two test methods designed in this paper, indicating the superior performance of the proposed method.
This work makes the following contributions: 1) The proposed method extracts a novel nonlinear degradation energy indicator (DEI) (see Fig. 1, compared with the linear time degradation indicator) to describe the degradation trend of the training bearing, according to the characteristic frequencies of the bearing components; 2) The proposed CNN architecture is general and robust under similar operating conditions: it can be transferred to another bearing under a different operating condition and obtain good prediction results without changing the CNN hyper-parameters or the number of layers; 3) The proposed DEI is an integrated indicator based on the maximum vibration level among the different bearing components, which accounts for all possible defects on the rolling element bearing. This is a more realistic indicator because localized defects are not artificially seeded in advance in real industrial applications, meaning that all types of defects have to be considered; and 4) The CNN scales the indicators of the training and test bearings into a consistent latent space. Thus, training and testing can share the same failure threshold (FT), i.e., the maximum value of the indicator for the training bearing.
The outline of this paper is as follows. Section II presents the proposed RUL prediction framework with technical details. Section III reports experimental results obtained from bearing degradation tests. The performance of the proposed framework is validated, and the results show improved accuracy in predicting the RUL compared with eight state-of-the-art approaches and two test methods designed in this paper. Section IV summarizes the paper and discusses future work.
II Degradation indicator training and RUL prediction algorithm
The overall framework for RUL prediction can be decomposed into three parts, and its schematic is shown in Fig. 2. The key challenges of this work are to obtain the DEI that represents the degradation behaviour, to establish a CNN model that maps the raw vibration signal to the DEI, and to construct an ε-SVR to predict the RUL. The degradation feature extraction, the CNN model, and the ε-SVR forecasting model are derived in Section II-A, Section II-B, and Section II-C, respectively.
II-A Degradation indicator extraction
To begin with, for a training bearing, it is assumed that the raw vibration signal till the end of lifetime with historical units have been acquired. The sensory vibration signal , with sampling time , is measured at each historical unit for , where is the number of measurements recorded in each historical unit.
EMD is a self-adaptive method that is normally applied to analyze non-stationary and nonlinear signals. It decomposes the raw vibration data into a number of IMFs, which represent the natural oscillation modes from fast to slow oscillations.
For the -th unit, the -th mode () of IMF, IMF for , is calculated iteratively associated with the iteration number .
First let and initialize for by:
[TABLE]
Define IMF for only if the IMF meets the following two conditions:
(I) The number of extrema and the number of zero crossings of the IMF differ by at most one;
(II) Along the time axis, the mean of the upper and lower envelopes of the IMF is zero everywhere.
Otherwise, update the IMF for through the following iteration procedure with until the IMF satisfies both aforementioned conditions (I) and (II):
[TABLE]
where are the mean values of the upper envelope and the lower envelope of .
Once obtaining all IMFs, the analytical form of each IMF can be written as:
[TABLE]
where i is the imaginary part of the . is the Hilbert transformation of by convolution with function , given as:
[TABLE]
By this means, we can calculate the instantaneous amplitude and phase
[TABLE]
Accordingly, it is easy to derive the instantaneous frequency
[TABLE]
for .
Then, the Hilbert Spectrum of is obtained by
[TABLE]
The marginal Hilbert spectrum (MHS) can be written as:
[TABLE]
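For orientation, the steps above correspond to standard Hilbert spectral analysis, which can be written as follows. The symbol names here ($c_k(t)$ for the $k$-th IMF, $K$ for the number of IMFs, $T$ for the record length) are assumptions made for this sketch, since the original expressions are not reproduced in this text; the paper's own notation and normalization may differ.

```latex
\begin{align}
z_k(t) &= c_k(t) + i\,\hat{c}_k(t), \qquad
\hat{c}_k(t) = \frac{1}{\pi}\,\mathrm{P.V.}\!\int_{-\infty}^{\infty}\frac{c_k(\tau)}{t-\tau}\,d\tau,\\
a_k(t) &= \sqrt{c_k^2(t)+\hat{c}_k^2(t)}, \qquad
\theta_k(t) = \arctan\frac{\hat{c}_k(t)}{c_k(t)},\\
\omega_k(t) &= \frac{d\theta_k(t)}{dt},\\
H(\omega,t) &= \operatorname{Re}\sum_{k=1}^{K} a_k(t)\,\exp\!\Big(i\int \omega_k(t)\,dt\Big),\\
h(\omega) &= \int_{0}^{T} H(\omega,t)\,dt .
\end{align}
```

Here $z_k$ is the analytic signal of an IMF, $a_k$ and $\theta_k$ are its instantaneous amplitude and phase, $\omega_k$ is the instantaneous frequency, $H(\omega,t)$ is the Hilbert spectrum, and $h(\omega)$ is the marginal Hilbert spectrum.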
The characteristic frequencies of the bearing components depend on the geometry of the bearing and its rotation speed. Their expressions are given in Table I in terms of the number of balls, the rotation frequency, the contact angle, the ball diameter, and the pitch diameter.
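The expressions of Table I are not legible in this text. As a hedged reference, the standard kinematic defect frequencies of a rolling bearing take the form below, where $n$, $f_r$, $\alpha$, $d$, and $D$ are assumed symbol names for the number of balls, rotation frequency, contact angle, ball diameter, and pitch diameter. The ball frequency is written in its ball-defect form (twice the ball spin frequency); this choice is an inference, made because it is the form that comes close to the 215.4 Hz quoted in Section III-B when the contact angle is assumed to be zero.

```latex
\begin{align}
f_i &= \frac{n}{2}\,f_r\left(1+\frac{d}{D}\cos\alpha\right),\\
f_o &= \frac{n}{2}\,f_r\left(1-\frac{d}{D}\cos\alpha\right),\\
f_b &= \frac{D}{d}\,f_r\left(1-\Big(\frac{d}{D}\cos\alpha\Big)^{2}\right).
\end{align}
```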
Given the characteristic frequencies of the different components, the DEI at each historical unit is defined as the maximum of the MHS values evaluated at the inner race, outer race, and ball frequencies:
[TABLE]
The extracted DEI $\mathbf{L}=[L_{1},\ldots,L_{N}]$ is normalized before training:
[TABLE]
for each historical unit, where a small positive constant is used to prevent the label from being exactly 0 or 1. Thus, the normalized DEI is
[TABLE]
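The label construction described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the example MHS values, and the exact placement of the small constant `eps` in the min-max normalization are all assumptions, chosen only so that every label stays strictly inside (0, 1).

```python
def dei_from_mhs(mhs_at_inner, mhs_at_outer, mhs_at_ball):
    """DEI of one historical unit: the largest MHS value among the
    three characteristic frequencies (inner race, outer race, ball)."""
    return max(mhs_at_inner, mhs_at_outer, mhs_at_ball)


def normalize_dei(dei, eps=1e-6):
    """Min-max normalization with a small offset (assumed form) so that
    the smallest label is above 0 and the largest is below 1."""
    lo, hi = min(dei), max(dei)
    span = hi - lo
    return [(v - lo + eps) / (span + 2.0 * eps) for v in dei]


# Hypothetical MHS values for three historical units.
raw = [
    dei_from_mhs(0.8, 1.1, 0.9),
    dei_from_mhs(1.5, 1.2, 1.4),
    dei_from_mhs(3.0, 4.2, 2.5),
]
normalized = normalize_dei(raw)
print(raw, normalized)
```

Keeping the labels strictly inside (0, 1) matches the open output range of a sigmoid output unit.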
II-B DEI pattern learning
In this work, layers with repeated components are stacked in a CNN architecture, including convolutional layers, pooling layers, fully connected layers, and a regression layer [21].
A convolutional layer contains organized patches; each patch is computed by composing the features of the previous layer through a filter bank according to the following equation:
[TABLE]
where the left-hand side denotes the output of a unit in a given convolutional layer, computed from the corresponding sub-vector of the previous layer, whose length equals the kernel size of that layer. The connecting weights and the bias are those of the same layer, and the operator denotes convolution. For the first layer, the input sub-vector is taken from the raw vibration data. The number of neurons in a convolutional layer follows from the number of neurons in the previous layer, the kernel size, and the stride of the layer.
An activation function is applied after each convolutional layer. Among the various activation functions, the rectified linear unit (ReLU) is chosen as the nonlinear activation to mitigate the vanishing gradient problem, which may significantly increase the training time or even prevent convergence.
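As a minimal single-filter sketch of the convolution and activation just described (plain Python, no padding, illustrative names and values only), each output unit is the weighted sum of one input sub-vector plus a bias, followed by ReLU. As in most CNN libraries, the "convolution" here is computed without flipping the kernel.

```python
def conv1d_valid(signal, kernel, bias=0.0, stride=1):
    """Slide the kernel over the signal without padding.
    Output length is (len(signal) - len(kernel)) // stride + 1."""
    k = len(kernel)
    out = []
    for start in range(0, len(signal) - k + 1, stride):
        window = signal[start:start + k]
        out.append(sum(w * x for w, x in zip(kernel, window)) + bias)
    return out


def relu(values):
    """Rectified linear unit applied element-wise."""
    return [max(0.0, v) for v in values]


# Hypothetical input and a first-difference kernel.
signal = [0.0, 1.0, -2.0, 3.0, -1.0, 0.5]
features = relu(conv1d_valid(signal, kernel=[-1.0, 1.0], bias=0.0, stride=1))
print(features)
```

With an input of 2560 samples, a kernel of 100, and a stride of 50, the same function returns 50 values, which agrees with the Conv1 row of Table III.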
A pooling layer is then used as a nonlinear down-sampling layer to extract the maximum feature values in each patch of the input data. Its function is to save computation time, reduce the number of parameters of the model, and control overfitting. More specifically, pooling transforms small windows into single values by taking the maximum or the average. Consequently, a small shift of a feature within a window leaves the pooled value largely unchanged, which gives the CNN its shift invariance property. The max-pooling layer is selected in this work as it is an algorithmic choice to ensure the generalization of neural networks [29], which is given by:
[TABLE]
where is the pooling size, and is the stride in max pooling layer.
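The max-pooling operation can be illustrated with a short sketch; the function name and the example values are illustrative assumptions rather than part of the original text.

```python
def max_pool1d(values, pool_size=2, stride=2):
    """Keep the maximum of each window of length pool_size,
    moving the window by stride samples each time."""
    return [
        max(values[i:i + pool_size])
        for i in range(0, len(values) - pool_size + 1, stride)
    ]


# Hypothetical feature map with five values; the trailing value is dropped
# because it does not fill a complete window.
pooled = max_pool1d([1.0, 0.0, 5.0, 0.0, 1.5], pool_size=2, stride=2)
print(pooled)
```

Applied to a feature map of length 50 with pool size 2 and stride 2, this yields 25 values, as in the Maxpooling1 row of Table III.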
Fully connected layer and regression layer, like a classic ANN network, take the results of the convolution and max-pooling processes and use them to generate a predicted label. Since we use a normalized DEI as the label for learning, the sigmoid function with an output value between (0,1) is applied to the last layer for normalized output. Hereby, mean square error (MSE) function is used to compute the loss with the expression:
[TABLE]
where the proposed CNN model minimizes the loss function between the ground-truth DEI label and the predicted label. Algorithm 1 outlines the proposed CNN modeling procedure.
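The output activation and the loss can be sketched as below. The logits and labels are made-up numbers used only to show that sigmoid outputs stay in (0, 1) and that the MSE is the average squared difference between labels and predictions.

```python
import math


def sigmoid(x):
    """Squash a real value into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def mse(labels, predictions):
    """Mean square error between label and prediction sequences."""
    return sum((y - p) ** 2 for y, p in zip(labels, predictions)) / len(labels)


# Hypothetical pre-activation outputs of the regression layer.
logits = [-2.0, 0.0, 2.0]
predictions = [sigmoid(z) for z in logits]

# Hypothetical normalized DEI labels for the same three units.
loss = mse([0.1, 0.5, 0.9], predictions)
print(predictions, loss)
```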
II-C RUL prediction
With the trained CNN model, the estimated DEI of a new test bearing can be generated automatically from the vibration signals of its historical units. Then, to predict the RUL, an ε-SVR forecasting model [31] is used to predict the upcoming degradation from the estimated DEI by a sliding window method. The forecasting process contains three sub-steps:
1. Extract training features from the estimated DEI over a sliding window. The schematic of this step is illustrated in Fig. 4. The estimated DEI is decomposed into overlapping windows with a given sampling window size and sliding size. Each window yields a training feature for ε-SVR that consists of the mean value and the variance of the window. The training set for ε-SVR is obtained by pairing each feature with the DEI value that immediately follows the corresponding sampling window.
2. ε-SVR modeling is described in Algorithm 2. At the application level, two parameters (the distance limit ε and the penalty parameter C) can be set manually when training the prediction model. A radial basis function (RBF) kernel is needed to train a nonlinear model.
3. The SVR model learned in Algorithm 2 is then used to predict the RUL by the sliding window method (with the same window size and sliding size as in step 1). Since the test bearing undergoes the same operating condition as the training bearing, it is reasonable to set the FT of the test bearing equal to the last DEI value of the training bearing. The first prediction is computed from the last window of the estimated DEI, and the subsequent predicted DEI values are obtained by shifting the sampling window. This consequently gives the predicted RUL as the number of predicted units multiplied by the time interval between two recording phases.
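The three sub-steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' code: `toy_predict` is a hypothetical stand-in for a trained ε-SVR, the population variance is used because the text does not say which variance estimator is meant, and the 10 s interval in the last line is an example value only.

```python
from statistics import mean, pvariance


def window_features(dei, window, step=1):
    """Pair the (mean, variance) of each window with the DEI value
    that immediately follows that window."""
    features, targets = [], []
    for start in range(0, len(dei) - window, step):
        chunk = dei[start:start + window]
        features.append((mean(chunk), pvariance(chunk)))
        targets.append(dei[start + window])
    return features, targets


def forecast_until_threshold(dei, predict, threshold, window, max_steps=1000):
    """Roll the window forward on its own predictions until the
    predicted DEI reaches the failure threshold (or max_steps)."""
    history = list(dei)
    predicted = []
    for _ in range(max_steps):
        chunk = history[-window:]
        nxt = predict((mean(chunk), pvariance(chunk)))
        predicted.append(nxt)
        history.append(nxt)
        if nxt >= threshold:
            break
    return predicted


def toy_predict(feature):
    """Hypothetical regressor: window mean plus a constant drift."""
    return feature[0] + 0.05


estimated = [0.10, 0.12, 0.15, 0.19, 0.24, 0.30]  # made-up estimated DEI
X, y = window_features(estimated, window=3)
future = forecast_until_threshold(estimated, toy_predict, threshold=0.5, window=3)
rul_seconds = len(future) * 10.0  # assumed example interval of 10 s per unit
print(len(X), y, len(future), rul_seconds)
```

In practice `toy_predict` would be replaced by the prediction function of a regressor fitted on `X` and `y`.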
III Experiments
III-A Data description
The proposed RUL prediction framework is validated on an experimental platform named PRONOSTIA (see Fig. 5). This platform consists of three parts: a rotating part, a loading part, and a measurement part. The rotation of the test bearing is driven by the low-speed shaft, whose rotating torque is transmitted from an AC motor. A radial force generated by the loading part is applied on the external ring of the test bearing. Since this external radial force exceeds the bearing’s allowable dynamic load (4000 N), the degradation is accelerated so that the degradation process can be observed within a relatively short time of a few hours. During the experiments, two high-frequency accelerometers (Type DYTRAN 3035B) are placed orthogonally on the external race of the test bearing to acquire the horizontal and vertical vibrations, respectively. The accelerometer bandwidth is 0.5 to 10 kHz (±5%), and its natural/resonant frequency is 45 kHz. In this work, we extract our degradation labels from the horizontal vibrations.
Bearing1 is chosen to validate the proposed framework. More specifically, the training set bearing1_2 is used for extracting the DEI and training the CNN model. The test sets bearing1_4, bearing1_5, and bearing1_6 are then used for estimating their DEIs and predicting the RULs with the ε-SVR forecasting model. Results for Bearing2 under a different rotational frequency and external dynamic load are also provided and compared. The geometry parameters and the operating conditions of bearing1 and bearing2 are listed in Table II. The sampling frequency of the vibration sensor is 25.6 kHz. 0.1 s accelerometer vibration signals are recorded at a fixed time interval s. Therefore, each recording phase contains 2560 measurements (25.6 kHz × 0.1 s). A more detailed description of the data set, bearings, and sensors can be found in [32].
III-B Degradation indicator extraction
The DEI of bearing1_2 is extracted by substituting its outer ring frequency (168 Hz), inner ring frequency (221 Hz), and ball frequency (215.4 Hz) into Eq. (9). The evolution of the final extracted DEI of bearing1_2 is shown in Fig. 6(a). The time evolutions of the three intermediate features of Eq. (9) are shown in Fig. 7. It can be observed that the value at each time point in Fig. 6(a) is the maximum of these three features at that time.
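As a plausibility check of the frequencies quoted above, the short sketch below evaluates the standard kinematic bearing frequency formulas with the Table II geometry for bearing1. Two assumptions are made here and are not stated in this text: the contact angle is taken as zero, and the ball frequency is taken in its ball-defect form (twice the ball spin frequency). Under these assumptions the results come out near 221.7 Hz, 168.3 Hz, and 215.3 Hz, close to the quoted values.

```python
import math


def bearing_frequencies(n_balls, ball_d_mm, pitch_d_mm, speed_rpm,
                        contact_angle_deg=0.0):
    """Return (inner race, outer race, ball) frequencies in Hz using the
    standard kinematic formulas. The contact angle default of 0 degrees
    is an assumption for this sketch."""
    f_r = speed_rpm / 60.0  # shaft rotation frequency in Hz
    ratio = (ball_d_mm / pitch_d_mm) * math.cos(math.radians(contact_angle_deg))
    f_inner = 0.5 * n_balls * f_r * (1.0 + ratio)
    f_outer = 0.5 * n_balls * f_r * (1.0 - ratio)
    # Ball-defect form: twice the ball spin frequency (assumed here).
    f_ball = (pitch_d_mm / ball_d_mm) * f_r * (1.0 - ratio ** 2)
    return f_inner, f_outer, f_ball


# Table II values for bearing1: 13 balls, 3.5 mm, 25.6 mm, 1800 r/min.
f_i, f_o, f_b = bearing_frequencies(13, 3.5, 25.6, 1800)
print(f"inner={f_i:.1f} Hz, outer={f_o:.1f} Hz, ball={f_b:.1f} Hz")
```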
III-C Degradation indicator estimation
We use the horizontal vibration signal of bearing1_2 as the input of the CNN, and the normalized DEI over its historical units is used as the label. The normalized DEI is shown in Fig. 6(b). A relatively simple CNN architecture is designed to improve the robustness of the network. As shown in Fig. 3, our finalized CNN model consists of six layers: two convolutional layers (Conv1 and Conv2), two max-pooling layers (Maxpooling1 and Maxpooling2), one fully connected layer (FC1), and one regression layer for output. Before model training, Adam is chosen as the optimizer with a small learning rate of 0.00001, as this value gives faster loss convergence than larger or smaller learning rates for the CNN training process. The activation function in the output layer is the sigmoid function, while the ReLU function is used in the previous layers. The parameters of the proposed CNN model are summarized in Table III. The convolutional window sizes (kernel sizes) of the two convolutional layers are set to a large value of 100 and a small value of 2, respectively. The kernel size of Conv1 is relatively large in order to extract more features from the raw vibration signal for greater expressive power, while a small kernel size is selected for Conv2 to prevent overfitting. Hyper-parameters are obtained after 1000 iterations of training.
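The output sizes in Table III can be reproduced with the usual no-padding length formula, as in the sketch below. The assumption that no padding is used is inferred from the table values rather than stated in the text, and the helper name `out_len` is illustrative.

```python
def out_len(n_in, kernel, stride):
    """Output length of a 1-D convolution or pooling layer without padding."""
    return (n_in - kernel) // stride + 1


n_input = 2560                      # samples per recording (25.6 kHz x 0.1 s)
conv1 = out_len(n_input, 100, 50)   # Conv1: kernel 100, stride 50
pool1 = out_len(conv1, 2, 2)        # Maxpooling1: size 2, stride 2
conv2 = out_len(pool1, 2, 1)        # Conv2: kernel 2, stride 1
pool2 = out_len(conv2, 2, 2)        # Maxpooling2: size 2, stride 2
flat = pool2 * 64                   # Flatten: 64 filters per position

print(conv1, pool1, conv2, pool2, flat)
```

The resulting lengths (50, 25, 24, 12) and the flattened size (768) agree with the corresponding rows of Table III.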
The estimated DEIs output by the CNN model are shown in Fig. 8. In Fig. 8(a), the estimated DEI of the training bearing1_2 shows a time evolution similar to that of the DEI label in Fig. 6(b). The final estimated DEI value of bearing1_2 is defined as the failure threshold for the test bearings in Fig. 8(b)-(d).
III-D RUL prediction
As presented in Section III-C, the estimated DEIs have been obtained from the trained CNN model. However, the DEIs of the test sets shown in Fig. 8(b)-(d) do not reach their fault limit, so a regression algorithm is needed to extend the estimated DEI to the end of life of each test bearing. An ε-SVR method is used to predict the upcoming degradation process of the test bearings. We train it on the predicted DEI of bearing1_2. The sampling window size and the moving size in Fig. 4 are set to 50 and 1, respectively. The kernel function is an RBF, and the penalty parameter C of the error term is chosen as 5.09 by grid search and cross-validation using GridSearchCV [33]. We forecast the DEI for the next 1000 steps based on the existing DEI, and use the maximum value of the predicted DEI of bearing1_2 as the threshold that determines the termination time of the test sets.
Fig. 9(b)-(d) show the predicted RULs of the test bearings until failure occurs. Red lines represent the predicted evolution of the bearings’ degradation behavior using the ε-SVR method. Each RUL is calculated as the difference between the time at which the DEI reaches the failure threshold and the time of the last known point of the test bearing. The predicted RUL is 340 s for bearing1_4, and 1500 s and 1480 s for bearing1_5 and bearing1_6, respectively.
III-E Comparison and discussion
To assess the accuracy of the proposed method and compare it with existing approaches, two metrics are commonly adopted: 1) the relative percentage error, which is given by Eq. (15); and 2) the exponential transformed accuracy (ETA) proposed in IEEE PHM 2012 [32]. ETA is an assessment index that distinguishes the severity of underestimating and overestimating the RUL. Clearly, underestimation (i.e., early warning) is preferred to overestimation (i.e., warning after damage) in order to prevent more severe damage to the bearing. The formulas are expressed in Eq. (16).
[TABLE]
[TABLE]
where is the real RUL for the test bearing. A higher $|Er\%|$ means a worse RUL prediction result. On the other hand, the ETA value varies from 0 to 1, and a higher score means a better RUL prediction result. In this work, is the common choice of most previous literature, thus it will be used for comparison.
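The two per-bearing metrics can be sketched as below. The formulas follow the scoring convention of the IEEE PHM 2012 challenge as commonly stated (percentage error relative to the real RUL, with overestimates penalized more steeply than underestimates); since the equations are not legible in this text, treat the exact form as an assumption. With the real and predicted RULs listed in Table V, it gives the same rounded values as the table for the proposed approach.

```python
import math


def percent_error(real_rul, predicted_rul):
    """Relative percentage error; positive when the RUL is underestimated."""
    return 100.0 * (real_rul - predicted_rul) / real_rul


def eta(er_percent):
    """Exponential transformed accuracy in (0, 1]. Overestimates
    (er <= 0) decay faster than underestimates (er > 0)."""
    if er_percent <= 0:
        return math.exp(-math.log(0.5) * (er_percent / 5.0))
    return math.exp(math.log(0.5) * (er_percent / 20.0))


# Real and predicted RULs (in seconds) of the proposed approach, from Table V.
cases = [("Bearing1_4", 339, 340), ("Bearing1_5", 1610, 1500), ("Bearing1_6", 1460, 1480)]
for name, real, pred in cases:
    er = percent_error(real, pred)
    print(f"{name}: Er = {er:.2f}%, ETA = {eta(er):.2f}")
```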
In addition to the two metrics that evaluate the prediction performance for a specific bearing, three more assessment metrics, namely the average score, the mean absolute error (MAE), and the normalized root mean square error (NRMSE), are used to make a comprehensive comparison of different methods. They are expressed as
[TABLE]
[TABLE]
[TABLE]
where is the number of test bearings.
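As a hedged illustration of the average score, the sketch below averages the ETA values over the five percentage errors listed for the proposed approach in the comparison table. The ETA definition is the PHM 2012 convention assumed here, and the function names are illustrative. The NRMSE is not sketched because its normalization cannot be recovered from this text.

```python
import math


def eta(er_percent):
    """Exponential transformed accuracy (assumed PHM 2012 form)."""
    if er_percent <= 0:
        return math.exp(-math.log(0.5) * (er_percent / 5.0))
    return math.exp(math.log(0.5) * (er_percent / 20.0))


def average_score(errors_percent):
    """Mean ETA over all test bearings."""
    return sum(eta(e) for e in errors_percent) / len(errors_percent)


# Percentage errors of the proposed approach for the five test bearings.
proposed_errors = [-0.29, 6.83, -1.37, 5.75, 1.55]
score = average_score(proposed_errors)
print(f"average score = {score:.2f}")
```

Rounded to two decimals, this gives 0.87, which matches the average score reported for the proposed approach.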
To verify the benefits of the DEI and CNN techniques for RUL prediction, we also develop two other test methods for comparison (see Table IV). The proposed method and the test methods are denoted and explained as follows:
1. C1: CNN and ε-SVR: This test method uses a conventional linear time degradation label for CNN training rather than the nonlinear DEI. In this way, we can illustrate the impact of the DEI on RUL prediction.
2. C2: DEI and ε-SVR: Without training the CNN model for feature extraction, the DEI in this test method is extracted manually, followed by the ε-SVR for the RUL prediction. Note that computing a DEI for a new bearing requires high computational power and a longer time. In addition, the FT of each test bearing has to be pre-defined manually, which increases the uncertainty of the RUL prediction under different working conditions. In this way, we can illustrate the impact of CNN modeling on the prediction of the final RUL.
3. Proposed method: DEI-based CNN and ε-SVR: This is the proposed framework, which integrates DEI extraction, CNN, and ε-SVR into one framework.
The numerical prediction errors of the test bearings for the proposed approach and the tested methods are listed in Table V. Our approach achieves $Er\%$ values of -0.29%, 6.83%, and -1.37% for bearing1_4, bearing1_5, and bearing1_6, respectively, which are much smaller in magnitude than those of the C1 and C2 methods. C1 trains the CNN model with a linear time degradation label. Its prediction errors exceed 19% for the test bearings and reach 91.15% for bearing1_4, indicating that the time degradation label is less effective than the DEI for CNN training. C2 extracts the degradation indicator of the test bearings and defines the FTs manually. Since the test bearings bearing1_4, bearing1_5, and bearing1_6 operate under the same working conditions as bearing1_2, we use the maximum and minimum values of bearing1_2 to normalize the DEIs extracted by the C2 method. With the same working conditions and the same normalization parameters, the FT of the proposed method can reasonably be used in C2 as well. To evaluate the impact of CNN modeling on the final RUL estimate, Fig. 10 compares, for the test bearings, the DEIs extracted using the HHT with those calculated by the trained CNN model. Without the CNN modeling procedure, one of the main drawbacks of the C2 method is its long computation time (2 s for each sampling period), which limits its practical application in industry. Moreover, Fig. 10(a) shows that, owing to uncertainties and a large amount of noise, the DEI extracted by the C2 method exceeds the FT before the actual failure time, resulting in an $Er\%$ of 100%. Similarly, in Fig. 10(c), the DEI extracted by the C2 method is much noisier than that of the proposed method. At 16470 s, the magnitude of DEI:C2 is already close to the FT, which might lead to a waste of resources because the RUL is severely underestimated. The comparison results in Table V demonstrate the benefits of using the DEI and the CNN in estimating the RULs of REBs.
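The effect of noise on a threshold-based indicator can be illustrated with a short sketch. It applies min-max normalization with parameters taken from a reference bearing, then looks for the first time at which the normalized indicator reaches the FT. The function names, the FT value of 0.9, and all sample values are hypothetical. The reading that an indicator already above the FT corresponds to a predicted RUL of zero is an interpretation of the 100% error reported for C2, not a statement from the paper.

```python
def normalize(dei, ref_min, ref_max):
    """Min-max normalize an indicator using a reference bearing's extrema."""
    span = ref_max - ref_min
    return [(x - ref_min) / span for x in dei]


def first_crossing(times, dei_norm, ft):
    """Return the first time at which the indicator reaches the FT, or None."""
    for t, x in zip(times, dei_norm):
        if x >= ft:
            return t
    return None


def percent_error(actual_rul, predicted_rul):
    """Relative percentage error; positive when the RUL is underestimated."""
    return 100.0 * (actual_rul - predicted_rul) / actual_rul


# Hypothetical data: the same degradation trend with and without a noise spike.
times = [0, 10, 20, 30, 40]          # seconds
smooth_raw = [1.4, 2.2, 3.0, 3.8, 4.4]
noisy_raw = [1.4, 2.6, 4.8, 3.4, 4.5]
ref_min, ref_max, ft = 1.0, 5.0, 0.9  # reference extrema and threshold (assumed)

smooth_hit = first_crossing(times, normalize(smooth_raw, ref_min, ref_max), ft)
noisy_hit = first_crossing(times, normalize(noisy_raw, ref_min, ref_max), ft)
print(smooth_hit)  # None: the smooth indicator stays below the FT
print(noisy_hit)   # 20: a noise spike triggers a premature crossing

# If the FT is already reached at prediction time, the predicted RUL is 0,
# which gives an error of 100% regardless of the true remaining life.
print(percent_error(actual_rul=100.0, predicted_rul=0.0))  # 100.0
```

In this toy case the smoother indicator never triggers a false alarm, while a single spike in the noisy one does, which mirrors the behavior described for Fig. 10(a).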
To demonstrate the generality of the proposed RUL estimation framework, other test bearings (i.e., bearing2_4 and bearing2_6) that operate under a different external load and rotational speed are analyzed. Bearing2_2 is used for training, and the trained CNN model is obtained without changing any hyperparameters or the model architecture in Table III. We only fine-tune the -SVR forecasting model by changing the penalty parameter C from 5.09 to 7.09, resulting in $Er\%$ values of 5.75% and 1.55% for bearing2_4 and bearing2_6, respectively. The good RUL prediction results for bearings under different operating conditions indicate the repeatability and robustness of the proposed method with respect to the hyperparameters and the architecture of the CNN.
To further verify the proposed approach, the numerical RUL prediction errors of the proposed method and eight published methods are compared in Table VI. The published approaches include a recurrent neural network based health indicator method [35], the method proposed by the winner of the IEEE PHM 2012 prognostic challenge [37], a convolutional long short-term memory (LSTM) network method [38], and a self-organizing map (SOM) method, among others. In addition, recent CNN-based approaches, including the frozen convolution and activated memory network (FCAMN) [23], the multi-scale CNN [24], and the continuous wavelet transform CNN (CWT-CNN) [34], are also compared. The comparison in Table VI confirms that our approach outperforms the referenced methods, with the highest average score and the smallest MAE = 46.2 and NRMSE = 0.05. In particular, an $Er\%$ of -0.29% is achieved for bearing1_4, which corresponds to an absolute time error of 1 s. This result benefits from the good nonlinear degradation indicator extracted using the HHT method. In addition, the CNN is a powerful tool for discovering the hidden pattern between the extracted degradation indicator and the underlying bearing system, which further increases the accuracy of the predicted RUL.
The experimental results show that the proposed data-driven RUL estimation approach has much smaller prediction errors than both the tested methods in this work and the methods published in previous studies.
IV Conclusion
In this paper, a data-driven framework for RUL prediction of rolling element bearings is presented using the HHT method, a CNN model, and an -SVR forecasting model. A nonlinear degradation indicator, the DEI, is first extracted from the raw vibration signals using the HHT method and is used as the training label. A CNN model is then trained to discover the hidden pattern between the extracted DEI and the raw vibration data of the training bearing. In this way, predicted DEIs are obtained automatically when the trained CNN model is applied to the test bearings. Finally, the RULs of the test bearings are obtained using an -SVR forecasting model. An experimental platform that allows the accelerated degradation process of bearings to be observed is employed to validate the proposed framework. The proposed framework achieves much smaller RUL prediction errors than previously published approaches.
Future work includes the application of the proposed framework to a wider range of case studies on experimental data in other applications [40], and the investigation of other potential degradation labels to achieve even higher accuracy in estimating RUL.
References
- [1] T. A. Harris, Rolling bearing analysis. John Wiley and Sons, 2001.
- [2] Y. Yuan, H.-T. Zhang, Y. Wu, T. Zhu, and H. Ding, “Bayesian learning-based model-predictive vibration control for thin-walled workpiece machining processes,” IEEE/ASME Transactions on Mechatronics, vol. 22, no. 1, pp. 509–520, 2017.
- [3] A. Heng, S. Zhang, A. C. Tan, and J. Mathew, “Rotating machinery prognostics: State of the art, challenges and opportunities,” Mechanical Systems and Signal Processing, vol. 23, no. 3, pp. 724–739, 2009.
- [4] L. R. Rodrigues, “Remaining useful life prediction for multiple-component systems based on a system-level performance indicator,” IEEE/ASME Transactions on Mechatronics, vol. 23, no. 1, pp. 141–150, 2018.
- [5] I. V. de Bessa, R. M. Palhares, M. F. S. V. D’Angelo, and J. E. Chaves Filho, “Data-driven fault detection and isolation scheme for a wind turbine benchmark,” Renewable Energy, vol. 87, pp. 634–645, 2016.
- [6] Z.-Q. Wang, C.-H. Hu, and H.-D. Fan, “Real-time remaining useful life prediction for a nonlinear degrading system in service: Application to bearing data,” IEEE/ASME Transactions on Mechatronics, vol. 23, no. 1, pp. 211–222, 2018.
- [7] Y. Li, T. Kurfess, and S. Liang, “Stochastic prognostics for rolling element bearings,” Mechanical Systems and Signal Processing, vol. 14, no. 5, pp. 747–762, 2000.
- [8] X.-S. Si, W. Wang, C.-H. Hu, and D.-H. Zhou, “Remaining useful life estimation–a review on the statistical data driven approaches,” European Journal of Operational Research, vol. 213, no. 1, pp. 1–14, 2011.
