Traffic Forecasting of Back Servers Based on ARIMA-LSTM-CF Hybrid Model

Yao, Erzhuang; Zhang, Lanjie; Li, Xuehua; Yun, Xiang

doi:10.1007/s44196-023-00232-7


Research Article
Open access
Published: 28 April 2023

Volume 16, article number 65, (2023)

International Journal of Computational Intelligence Systems


Erzhuang Yao¹, Lanjie Zhang¹, Xuehua Li¹ & Xiang Yun²


Abstract

Accurate server traffic prediction can help enterprises formulate network resource allocation strategies in advance and reduce the probability of network congestion. Traditional prediction models ignore the distinctive data characteristics of server traffic that could be used to optimize the model, so they often cannot deliver the long-term, high-precision forecasts that server traffic prediction requires. To solve this problem, this paper establishes a hybrid model, ARIMA-LSTM-CF, which combines the advantages of linear and nonlinear models with the periodic fluctuation characteristics of server traffic data obtained from banks. In addition, this paper uses an optimized K-means clustering method to separate the traffic data of workdays and non-workdays. The results show that the new hybrid model performs better than the single ARIMA and LSTM models in predicting the long-term trend of server traffic: RMSE (root mean square error) and MAE (mean absolute error) are reduced by 50%, and the R² score reaches 0.64. The model can therefore effectively extract the data characteristics of server traffic and has accurate and stable long-term prediction ability.


1 Introduction

With the continuous expansion of large Internet enterprises and bank servers, the number of service users continues to grow, resulting in a surge in traffic data. To ensure network quality, enterprises and banks must accurately plan their network traffic and server resources [1, 2]. Traditional server resource allocation and scheduling react only after the network is already congested, which is inefficient. Traffic prediction can forecast the trend of traffic by analyzing the historical data of server equipment, and can analyze network conditions in real time to support dynamic network adjustment and allocation decisions. Traffic prediction not only helps to meet the future needs of users by reallocating resources, but also helps to build a more efficient network, improve the utilization of server resources, and effectively reduce the probability of network failure. Therefore, establishing an accurate traffic forecasting model has important theoretical significance and application value for enterprises and banks.

Server equipment data belong to the typical financial time series data, which is characterized by high signal-to-noise ratio, instability and nonlinearity. The methods in this field include various time series models, namely traditional statistical methods, machine learning algorithms and deep learning (neural networks). Traditional network prediction models include the historical average (HA) model, the autoregressive (AR) model, the autoregressive moving average (ARMA) model, the autoregressive integrated moving average (ARIMA) model, etc. [8, 9]. Although linear statistical models are simpler and easier to interpret theoretically for time series prediction problems, they cannot capture the complex nonlinear patterns hidden in traffic data. With the development of artificial intelligence and the improvement of modern computing power, deep neural networks, alone or combined with linear models, are now widely used in Internet traffic prediction and related problems. Common models include the recurrent neural network (RNN), convolutional neural network (CNN), artificial neural network (ANN), long short-term memory (LSTM), gated recurrent unit (GRU), etc. [13,14,15,16]. In recent years, many time series prediction algorithms for this scenario have been studied [17,18,19].

ARIMA model was established by Box and Jenkins as a classic time series prediction algorithm [3]. ARIMA model is widely used to find linear relationships in stationary data and has achieved good results in many fields, such as power [4], agriculture [5, 6], meteorology [7], and traffic prediction. Laner et al. [8] simply fitted the ARMA model to remotely related network traffic; Guo et al. [9] proposed to use the multiplicative seasonal difference integration ARIMA model in combination with data characteristics to predict mobile communication traffic. Raimundo Milton et al. [10] considered the volatility of financial data and fused the wavelet model and SVR model to improve the prediction effect of financial time series. Yang et al. [11] proposed a modified ARIMA algorithm based on SA (simulated annealing). By integrating the linear model ARIMA, the nonlinear model BPNN (back propagation neural network), and the optimization algorithm SA, the linear and nonlinear features of historical network data are fully studied, and the prediction effect of the prediction algorithm model is effectively enhanced.

LSTM model is an optimization model based on the RNN network structure proposed in 1997 [12]. The LSTM algorithm can capture nonlinear trends and correlation, to deal with long-term dependence problems well. The results [13,14,15] show that LSTM outperforms traditional decline curves for time series prediction problems affected by multiple factors. In addition, Li et al. [16] proposed the integrated prediction model of CNN and LSTM, which fully combines the ability of CNN to effectively extract air quality features and the ability of LSTM to capture long-term trends. Zheng et al. [17] built a Conv LSTM model and used it for traffic prediction. This model converts matrix multiplication of each LSTM gate into convolution calculation, to capture the time and space characteristics of traffic; Hachem et al. [18] combined LSTM with fast Fourier transform (FFT), to reduce the complexity and prediction time of data sets to obtain higher prediction accuracy; Qu [19] and others believe that the traffic data have the characteristics of randomness and unbalanced distribution, and built a new end-to-end hybrid model M-B-LSTM. This model constructs an online self-learning network as the data reflection layer to learn and balance the statistical distribution of traffic, reduce the distribution imbalance and over-fitting problems in the network learning process, and achieve better short-term prediction. Kasun Bandara et al. [20] built a partition-based prediction algorithm model LSTM-MS net for multivariate periodic time series data. The network model obtains key patterns and structures shared by time series sets through global training of the LSTM network.

However, there are many problems in a single machine learning model, such as slow convergence, the influence of outliers, local minima, inability, etc. [24]. These problems make it difficult for these models to capture the composite features of time-series traffic data. To improve forecasting performance, some hybrid models that take advantage of each component model are proposed to deeply analyze the characteristics of data. Li et al. [21] constructed the ARIMA-BPNN hybrid model using the linear prediction result of ARIMA prediction time series and the nonlinear prediction result of BPNN prediction, thus improving the prediction accuracy of stock time series data. Liu et al. [22] proposed a new mixed model named CSSAP, which integrates ARIMA and LSTM to better fit the relationship between linear and nonlinear. The prediction results show that under different scenarios, the RMSE of the CSSAP algorithm increased by 6–66%, and the MAE increased by 4–71%. Ji et al. [23] used the ARIMA model to predict the linear data part of futures price, then computed ARIMA residual term M, then used the CNN model to get the trend of residual, and finally used the LSTM model to predict the long-term trend of residual. Comparing the single model with the mixed model, it can be concluded that the ARIMA-CNN-LSTM model has the smallest MAPE and RMSE. Fan et al. [24] considered the impact of nonlinear fluctuation caused by manual operation on oil well production and proposed to input the switch data into the ARIMA model as additional input data, and then input the residual term generated by the ARIMA model into the LSTM model to improve the forecast accuracy of the oil well production. Wang et al. [25] combined CNN’s time expansion and LSTM's long-term memory advantages. The attention mechanism is introduced at the LSTM side to give sufficient attention to key information, so that the model can focus on learning more important data features and improve the prediction performance. 
Zheng H et al. [26] proposed a bi-directional LSTM (Bi-LSTM) module to extract the daily and weekly periodic characteristics, so as to capture the change trend of traffic flow in the front and back directions. Lotfi Hachemi et al. [18] improved LSTM by combining them with filters [especially Fast Fourier Transform (FFT)] to better extract the characteristics of the time series to be predicted, while reducing the time complexity.

Single models such as ARIMA and LSTM can capture only the linear or only the nonlinear characteristics of data. Although the hybrid ARIMA-LSTM model can capture both linear and nonlinear features, it maintains high accuracy only in short-term prediction. When applied to long-term prediction, it produces a poorly fitting curve, which does not meet the requirements of high accuracy and long horizons in server traffic prediction. The main reason why long-term prediction is difficult to make accurate is that errors easily accumulate over time. Thus, based on the ARIMA-LSTM model, this paper proposes a new hybrid model named ARIMA-LSTM-CF, which feeds the traffic data features of the bank server into the LSTM model as an additional input to optimize the prediction of long-term traffic data. The ARIMA-LSTM-CF model uses the optimized clustering algorithm (SDK means++) to extract the workday traffic data from the data set and uses the ARIMA model to predict the linear trend in the extracted data. Then, to better capture the nonlinear characteristics, the residual term produced by the ARIMA model and the historical time-of-day mean data, which reflect the cyclic fluctuation features (CF) of the extracted data, are used as the input data of the LSTM model (named the LSTM-CF model). Finally, the ARIMA-LSTM-CF prediction is obtained by adding the linear and nonlinear prediction results. The main contributions of this study cover two aspects. In theory, the ARIMA-LSTM model can be optimized by clustering the workload data and analyzing its unique data characteristics. In practical application, the model helps banks and other enterprises formulate network resource allocation strategies in advance to reduce the probability of network congestion, thereby improving the staff and user experience. ARIMA-LSTM-CF is most suitable when the predicted time series has obvious cyclic fluctuation characteristics.

In Sect. 2, we introduce the dataset and data features used in this paper. In Sect. 3, we describe the ARIMA-LSTM-CF model structure in detail. In Sect. 4, the linear and nonlinear prediction results from the ARIMA model and LSTM model as well as the coupling prediction results from the ARIMA-LSTM-CF are analyzed. Then the prediction accuracies obtained from these models are compared and discussed using different prediction indexes. In Sect. 5, we present the conclusions of this paper.

2 Data and Preprocessing

The traffic data set used in this article was collected from a bank server device and is shown in Fig. 1. It contains 6240 observations from Dec 11, 2021, to Dec 23, 2021. The sampling interval is 3 min, so 480 observations are collected per day and 6240 over the 13 days. According to their characteristics, the data can be classified into workdays and non-workdays. Non-workdays mainly include weekends, holidays, and equipment maintenance days. Figure 1 shows that the data features of workdays and non-workdays are different.
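The sampling arithmetic above can be checked with a short sketch. The bank trace itself is confidential, so the series below is placeholder data, and the helper name `split_into_days` is a hypothetical one introduced only for illustration.

```python
SAMPLES_PER_DAY = 24 * 60 // 3  # one sample every 3 minutes


def split_into_days(series, samples_per_day=SAMPLES_PER_DAY):
    """Cut a flat list of samples into consecutive whole days."""
    if len(series) % samples_per_day != 0:
        raise ValueError("series length is not a whole number of days")
    return [series[i:i + samples_per_day]
            for i in range(0, len(series), samples_per_day)]


# Placeholder values standing in for the confidential traffic trace.
series = [0.0] * (13 * SAMPLES_PER_DAY)
days = split_into_days(series)
print(len(series), len(days), len(days[0]))
```

Working with one list per day makes the later steps (clustering whole days, averaging each time slot across days) straightforward.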

2.1 Data Preprocessing

The collected data show that the traffic data are typical time series data, including both linear and nonlinear components, with obvious multi-period characteristics. The relatively stable days are weekends or holidays, while the most volatile days are workdays. By analyzing the shape of the traffic data, we find that the non-workday data always remain close to zero, while the workday data peak once in the morning and once in the afternoon, showing an M-shaped feature. Because workdays and non-workdays have such different characteristics, a model cannot accurately predict a workday when the preceding day is a non-workday, so distinguishing the two enables the model to better capture the characteristics of traffic data. Thus, we first divide the time series into a workday data set and a non-workday data set according to the data features, using the SDK means++ algorithm provided in reference [27].

The SDK means++ algorithm can be used for time-series data clustering by utilizing dynamic time warping (DTW) as the distance measure. However, because the algorithm clusters the data based on the largest sum of distances and the Davies–Bouldin index (DBI), it is inevitably sensitive to amplitude, and data features with small amplitude are ignored in clustering. We therefore apply Z-score standardization to remove the scale of the collected traffic data before clustering. The Z-score standardization formula is as follows:

$$\begin{array}{*{20}c} {x^{*} = \frac{{x - \overline{x} }}{\sigma }} \\ \end{array}$$

(1)

where $\overline{x}$ is the data mean and $\sigma$ is the data standard deviation. The traffic data after standardization are shown in Fig. 2. We set the number of clusters to 2 according to the data classes and used DTW as the distance measure. Figure 3 gives the clustering result. The SDK means++ algorithm divides the original data well into workdays and non-workdays. The original data are re-spliced according to the clustering results, as shown in Fig. 4. The workday data set is given in the upper part of Fig. 4, and the non-workday data set is given in the lower part.
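As an illustration of the two ingredients named above, the following minimal sketch implements Z-score standardization as in Eq. (1) and a textbook DTW distance. It is not the SDK means++ algorithm of reference [27]; the function names, the use of the population standard deviation, and the absolute-difference local cost are assumptions made for the example.

```python
import math


def z_score(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        return [0.0] * n  # constant series: nothing to scale
    return [(v - mean) / std for v in values]


def dtw_distance(a, b):
    """Classic dynamic time warping with absolute difference as local cost."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]


z = z_score([1.0, 2.0, 3.0])
same_shape = dtw_distance([0, 1, 2], [0, 1, 1, 2])   # warping absorbs the repeat
shifted = dtw_distance([0, 0], [1, 1])
```

A clustering routine would call `dtw_distance` on pairs of standardized daily series; the quadratic cost per pair is acceptable for 13 days of 480 samples.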

2.2 Data Feature

Figure 4 shows the details of data characteristics of workdays and non-workdays, respectively. It can be seen from Fig. 4 that the data set on non-workdays is stable at the position approaching zero, indicating that the equipment generates less traffic on non-workdays. On the other hand, the data set on weekdays has obvious data fluctuation, and the amplitude reaches a high peak in the morning and afternoon, indicating that the equipment generates more traffic on weekdays. Therefore, the follow-up experiments mainly analyzed the characteristics of the weekday dataset.

It can be seen from Fig. 5 that the characteristics of workday data differ between periods of the day. From 00:00 to 08:00, the data are stable and almost zero, which indicates that, apart from the necessary background programs, there is no manual operation or external service consuming traffic during this period. From 08:00 to 11:00, the traffic of the device gradually climbs to the first peak, which means that the bank devices have begun to work. From 11:00 to 13:00, the traffic gradually decreases but does not reach zero, indicating that the business volume falls while the device is still working. From 13:00 to 16:00, the traffic gradually climbs to the second peak but does not exceed the morning peak. From 16:00 to 20:00, the traffic decreases linearly, and after 20:00 it remains stable near zero. As a whole, the workday data present an obvious M-shaped feature.

3 Model

3.1 ARIMA

ARIMA model prediction generally takes four steps. (1) Check the stationarity of the data and determine the differencing order d; (2) perform autocorrelation and partial autocorrelation analysis of the time series; (3) estimate the values of p and q according to the tailing and cut-off behavior of the ACF and PACF; (4) check whether the model residual is white noise by examining the model inspection table or the ACF/PACF diagram of the residual. The model formula is then obtained from the model parameter table.

The p-order autoregressive process AR(p) is obtained by weighting the past values of the series itself. The specific formula is as follows.

$$Y_{t} = \gamma_{1} Y_{t - 1} + \gamma_{2} Y_{t - 2} + \cdots + \gamma_{p} Y_{t - p} + e_{t}$$

(2)

where ${Y}_{t}$ is the current data value, $\gamma_{1}, \ldots, \gamma_{p}$ are the autoregressive coefficients, and $e_{t}$ is the error term.

The q-order moving average process MA(q) is obtained by weighting the white noise $e_{t}$. The specific formula is as follows.

$$Y_{t} = e_{t} + \theta_{1} e_{t - 1} + \cdots + \theta_{q} e_{t - q}$$

(3)

The specific formula of the ARMA(p, q) model obtained by combining AR(p) and MA(q) is as follows.

$$Y_{t} = \gamma_{1} Y_{t - 1} + \gamma_{2} Y_{t - 2} + \cdots + \gamma_{p} Y_{t - p} + e_{t} + \theta_{1} e_{t - 1} + \cdots + \theta_{q} e_{t - q}$$

(4)

Finally, the ARIMA model formula is used for prediction.
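To make the recursion concrete, here is a minimal sketch of first-order differencing followed by a one-step ARMA(p, q) forecast. The coefficients and data are made-up illustrative values, not the fitted parameters of the paper, and the helper names are hypothetical. In practice the coefficients would come from a fitting routine.

```python
def difference(series):
    """First-order differencing (d = 1): y_t - y_{t-1}."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]


def arma_one_step(y_hist, e_hist, gamma, theta):
    """One-step ARMA(p, q) forecast with the future shock set to its mean, zero.

    y_hist and e_hist hold past values and past residuals, most recent last;
    they must contain at least len(gamma) and len(theta) entries respectively.
    """
    ar_part = sum(g * y_hist[-1 - k] for k, g in enumerate(gamma))
    ma_part = sum(th * e_hist[-1 - k] for k, th in enumerate(theta))
    return ar_part + ma_part


raw = [10.0, 12.0, 15.0, 14.0]          # illustrative levels
diffed = difference(raw)                 # [2.0, 3.0, -1.0]
step = arma_one_step(diffed, e_hist=[0.1], gamma=[0.5, 0.2], theta=[0.3])
next_level = raw[-1] + step              # undo the differencing
```

With d = 1 the model forecasts the change in traffic, so the forecast level is recovered by adding the predicted change to the last observed value.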

3.2 LSTM

Recurrent neural networks (RNNs) are widely used in research fields related to sequential data, such as text, audio, and video [28]. However, an RNN composed of sigmoid or tanh units cannot effectively mine the relevant information in the input when the relevant inputs are far apart in time. Long short-term memory (LSTM) [12] deals with this long-term dependence problem by introducing gate functions into the cell structure. Since the LSTM model was proposed, it has performed well in language modeling, sentiment analysis, stock market prediction, machine translation, and other application fields [29], making LSTM a research focus of deep learning. Compared with the hidden layer of an RNN, LSTM adds three control gates and a cell state. The specific network structure of LSTM at time t is shown in Fig. 6.

The LSTM network architecture can be described by formulas (5)–(9). First, the input ${x}_{t}$ to the cell at time $t$ is combined with the hidden state ${h}_{t-1}$ from time $t-1$, and the result is then processed by the forget gate, the input gate, the input node, and the output gate. The specific formulas are as follows.

$$f_{t} = \sigma \left( {W_{f} x_{t} + U_{f} h_{t - 1} + b_{f} } \right)$$

(5)

$$i_{t} = \sigma \left( {W_{i} x_{t} + U_{i} h_{t - 1} + b_{i} } \right)$$

(6)

$$\tilde{C}_{t} = \tanh \left( {W_{C} x_{t} + U_{C} h_{t - 1} + b_{C} } \right)$$

(7)

$$C_{t} = f_{t} \cdot C_{t - 1} + i_{t} \cdot \tilde{C}_{t}$$

(8)

$$h_{t} = o_{t} \cdot \tanh \left( {C_{t} } \right)$$

(9)

where $f_{t}$, $i_{t}$ and $o_{t}$ are the forget, input and output gates, $\tilde{C}_{t}$ is the candidate cell state, $C_{t}$ is the cell state, $h_{t}$ is the hidden state, and $\sigma$ is the sigmoid function. The output gate takes the same form as the other two gates, $o_{t} = \sigma \left( {W_{o} x_{t} + U_{o} h_{t - 1} + b_{o} } \right)$.
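The gate equations can be traced with a minimal single-unit sketch in which the input, hidden state and cell state are all scalars. The `params` layout and function names are assumptions made for the example; the output gate uses the standard sigmoid form with its own weights. A real model would use vector states and a deep learning library.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, params):
    """One forward step of a single-unit LSTM cell (scalar input and state).

    params maps each name in {"f", "i", "c", "o"} to a (W, U, b) triple.
    """
    def pre(name):
        w, u, b = params[name]
        return w * x_t + u * h_prev + b

    f_t = sigmoid(pre("f"))               # forget gate, Eq. (5)
    i_t = sigmoid(pre("i"))               # input gate, Eq. (6)
    c_tilde = math.tanh(pre("c"))         # candidate state, Eq. (7)
    c_t = f_t * c_prev + i_t * c_tilde    # cell state update, Eq. (8)
    o_t = sigmoid(pre("o"))               # output gate (standard form)
    h_t = o_t * math.tanh(c_t)            # hidden state, Eq. (9)
    return h_t, c_t


# With all weights zero every gate equals 0.5 and the candidate state is 0.
zero_params = {name: (0.0, 0.0, 0.0) for name in "fico"}
h1, c1 = lstm_step(x_t=1.0, h_prev=0.0, c_prev=2.0, params=zero_params)
```

The zero-weight case is a convenient sanity check: half of the previous cell state is kept, and nothing new is written.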

3.3 Hybrid Model

Traffic data belong to typical time-series data. The workday and non-workday traffic data can be extracted using the SDK means++ algorithm from reference [27]. The workday traffic data can then be decomposed into a linear part and a nonlinear part as ${x}_{t}={L}_{t}+{N}_{t}$, where ${L}_{t}$ is the linear part of the data at time t and ${N}_{t}$ is the nonlinear part. To fully combine the cyclic fluctuation features of workday server traffic with the advantages of the ARIMA and LSTM models, and thereby improve the prediction accuracy for long-term traffic data, this paper constructs an ARIMA-LSTM-CF hybrid model, as shown in Fig. 7.

The ARIMA-LSTM-CF hybrid model is built in four steps. (1) The time series clustering algorithm SDK means++ is used to divide the original data set into workday data and non-workday data, so as to better capture the data characteristics of workdays. (2) The ARIMA model is used to fit the linear part of the traffic data, and the residuals ${\varepsilon }_{t}$ are obtained by subtracting the predicted value from the real value. (3) The LSTM model takes the residual term generated in (2) as the target variable to fit the nonlinear part of the traffic data. The residuals and the CF data set obtained by analyzing the data characteristics are used as the input of LSTM to predict the nonlinear part ${N}_{t}$ of the traffic data. (4) Finally, the prediction results of the two component models (ARIMA and LSTM-CF) are added to obtain the final prediction of the hybrid model ARIMA-LSTM-CF. Algorithm 1 gives the pseudocode of the complete ARIMA-LSTM-CF procedure.
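Steps (2) to (4) can be summarized in a short sketch. This is not the paper's Algorithm 1: the function `hybrid_forecast` and its callable arguments are hypothetical placeholders, and the stand-in models in the usage lines are deliberately trivial so that the data flow stays visible. Step (1), the clustering, is assumed to have been done already.

```python
def hybrid_forecast(history, linear_fit, linear_forecast, nonlinear_model, cf, horizon):
    """Add a linear forecast and a residual (nonlinear) forecast.

    linear_fit(history)            -> in-sample linear predictions
    linear_forecast(history, h)    -> list of h linear forecasts (L_t)
    nonlinear_model(resid, cf, h)  -> list of h residual forecasts (N_t)
    cf                             -> cyclic fluctuation feature for the horizon
    """
    fitted = linear_fit(history)
    residuals = [y - y_hat for y, y_hat in zip(history, fitted)]   # step (2)
    linear_part = linear_forecast(history, horizon)
    nonlinear_part = nonlinear_model(residuals, cf, horizon)       # step (3)
    return [l + n for l, n in zip(linear_part, nonlinear_part)]    # step (4)


# Trivial stand-ins: persistence as the linear model, CF echoed as the residual.
forecast = hybrid_forecast(
    history=[1.0, 2.0, 4.0],
    linear_fit=lambda h: [h[0]] + h[:-1],
    linear_forecast=lambda h, k: [h[-1]] * k,
    nonlinear_model=lambda resid, cf, k: list(cf[:k]),
    cf=[0.5, -0.5],
    horizon=2,
)
```

Passing the component models as callables keeps the combination logic independent of whichever ARIMA and LSTM implementations are actually used.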

3.4 Index

To evaluate the prediction effect of different models, four evaluation indexes are selected to test the prediction accuracy of traffic data, including root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and R² score. The mathematical description of these evaluation indexes is given in formulas (10)–(13).

$$\begin{array}{*{20}c} {{\text{RMSE}} = \sqrt {\frac{1}{n}\mathop \sum \nolimits_{i = 1}^{n} \left( {x_{i} - x_{i}^{^{\prime}} } \right)^{2} } } \\ \end{array}$$

(10)

$$\begin{array}{*{20}c} {{\text{MAE}} = \frac{1}{n}\mathop \sum \limits_{i = 1}^{n} \left| {x_{i} - x_{i}^{^{\prime}} } \right|} \\ \end{array}$$

(11)

$$\begin{array}{*{20}c} {{\text{MAPE}} = \frac{1}{n}\mathop \sum \limits_{i = 1}^{n} \left| {\frac{{x_{i} - x_{i}^{^{\prime}} }}{{x_{i} }}} \right|} \\ \end{array}$$

(12)

$$R^{2} = 1 - \frac{{{\text{SSE}}}}{{{\text{SST}}}} = 1 - \frac{{\frac{1}{n}\mathop \sum \nolimits_{i = 1}^{n} \left( {x_{i} - x_{i}^{\prime} } \right)^{2} }}{{\frac{1}{n}\mathop \sum \nolimits_{i = 1}^{n} \left( {x_{i} - \overline{x}} \right)^{2} }} = 1 - \frac{{{\text{RMSE}}^{2} }}{{{\text{Var}}}}$$

(13)

where ${x}_{i}$ is the original value, ${x}_{i}^{\prime}$ is the prediction value, $\overline{x}$ is the mean of the original values, and $n$ is the number of time-series data. SSE is the residual sum of squares, SST is the total sum of squares, and Var is the variance of the original values. Detailed information on these evaluation indexes is given in Table 1.
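The four indexes can be computed with a few lines of standard-library Python. The sketch below follows formulas (10)–(13), with R² computed as 1 − SSE/SST. One choice here is an assumption rather than something stated in the paper: points whose actual value is zero are skipped in MAPE, since the ratio is undefined there and night-time traffic is close to zero.

```python
import math


def evaluate(actual, predicted):
    """Return RMSE, MAE, MAPE and R^2 for two equal-length sequences."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n
    mean_actual = sum(actual) / n
    variance = sum((a - mean_actual) ** 2 for a in actual) / n
    # Assumption: zero actual values are excluded from MAPE (division by zero).
    ratios = [abs(e / a) for e, a in zip(errors, actual) if a != 0]
    return {
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(e) for e in errors) / n,
        "MAPE": sum(ratios) / len(ratios) if ratios else float("nan"),
        "R2": 1.0 - mse / variance,
    }


scores = evaluate([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 2.0])
```

Because the denominator of R² is the variance of the actual values, a model that predicts a flat line at the mean scores 0, and a model worse than that scores below 0.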

Table 1 The detailed information on the evaluation index


4 Experiments

4.1 ARIMA Prediction

There are three steps to predict the linear part ${L}_{t}$ and calculate the residuals ${\varepsilon }_{t}$ with the ARIMA model. First, judge whether the input data are stationary through the unit root test. If they are not, difference the series and increase the parameter d by 1. For the server traffic data set, the stationarity condition is satisfied when d = 1. Second, select the model type from the ACF and PACF curves and determine the maximum values of the parameters p and q. Both the ACF and PACF curves of the traffic data set are tailing, so the ARMA(p, q) model is selected, and the maximum values of p and q are 2 and 6, respectively. Third, determine the orders p and q by comparing the values of the information criteria AIC and BIC under different parameters. The resulting model is ARIMA(2, 1, 2), and the model check is given in Table 2.

Table 2 ARIMA (2, 1, 2) checklist


In the experiment, the data set is divided into two parts: the training data set and the test data set. The data of the first 12 days are used as training data for ARIMA model fitting, and the last day is used as test data. The prediction results of the ARIMA model are given in Fig. 8. The model captures the initial upward trend but not the subsequent M-type fluctuation or the final downward trend, so it maintains accuracy only in short-term prediction. The residual term, obtained as the difference between the actual value and the predicted value, is given in Fig. 9. The residual term still contains the M-type data feature. In Fig. 9, the blue line is used as the training data of the LSTM model, and the black line is used as the test data of the LSTM model.

4.2 LSTM Prediction

LSTM and LSTM-CF predict the nonlinear part. LSTM uses only the residual values generated from the linear prediction as the only input to the model. LSTM-CF adds the historical data means $\overline{x}_{t}$ of prediction time t to the input of the LSTM model. After many experiments, considering the prediction accuracy and over-fitting, the LSTM model parameters are finally given in Table 3.
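One way to build the additional CF input is sketched below: average each intra-day time slot across the training days, then attach the mean of the target slot to every residual window. The paper does not spell out the exact input layout, so the window construction, the helper names, and the assumption that the residual series starts at the first slot of a day are all illustrative choices.

```python
def time_of_day_means(days):
    """Average each intra-day slot across all training days."""
    slots = len(days[0])
    if any(len(day) != slots for day in days):
        raise ValueError("all days must have the same number of samples")
    return [sum(day[s] for day in days) / len(days) for s in range(slots)]


def build_samples(residuals, cf_profile, window):
    """Pair a window of past residuals plus the CF value of the target slot
    with the residual to be predicted.

    Assumes residuals[0] falls on slot 0 of a day.
    """
    slots = len(cf_profile)
    samples = []
    for t in range(window, len(residuals)):
        features = residuals[t - window:t] + [cf_profile[t % slots]]
        samples.append((features, residuals[t]))
    return samples


# Toy data: two "days" of three slots each.
profile = time_of_day_means([[0.0, 2.0, 4.0], [2.0, 4.0, 6.0]])
samples = build_samples([0.1, 0.2, 0.3, 0.4], profile, window=2)
```

Since the CF value depends only on the time of day, it is known for every future step, which is what allows it to steer a long-horizon forecast.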

Table 3 LSTM model parameters


The prediction results of LSTM and LSTM-CF are given in Fig. 10. Table 4 lists the evaluation results for predicting the ARIMA residual term with the LSTM and LSTM-CF models. When using the LSTM-CF model, the values of RMSE, MAE, and MAPE decreased, and the R² score increased significantly. The results show that the LSTM-CF model is better than the single-input LSTM model in long-term prediction.

Table 4 The evaluation of prediction results of LSTM and LSTM-CF


4.3 Hybrid Model Prediction

The linear part predicted by the ARIMA model and the nonlinear part predicted by the LSTM or LSTM-CF model are added to obtain the final prediction results of the ARIMA-LSTM and ARIMA-LSTM-CF models. The single-day traffic prediction results based on different prediction models are given in Fig. 11. The prediction results of the ARIMA, RNN, LSTM, GRU, CNN-LSTM, ARIMA-LSTM, and ARIMA-LSTM-CF models are compared.

As shown in Fig. 11, the ARIMA, RNN, LSTM, CNN-LSTM and ARIMA-LSTM models can accurately predict the bank traffic data at 00:00–08:00 and 20:00–24:00. However, they cannot reproduce the data oscillation at 08:00–20:00. This is because the values predicted by these models are fed back as new input values. As the window slides, the share of actual data in the input decreases and the share of forecast data increases, so the error accumulates and grows, which degrades the long-term prediction ability of the model. ARIMA-LSTM-CF takes the average value of the historical data into account during prediction, which corrects the prediction results without requiring additional actual values and makes long-term prediction of periodic traffic data possible.
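The feedback mechanism described above can be sketched as a recursive multi-step forecast. The step models in the usage lines are toy stand-ins chosen to make the behavior visible, not the trained networks of the paper, and `recursive_forecast` is a hypothetical helper. The optional `exog` argument plays the role of the historical-mean input.

```python
def recursive_forecast(history, step_model, window, horizon, exog=None):
    """Multi-step forecast that feeds each prediction back into the input window.

    step_model(window_values, exog_value) -> next value; exog_value is None
    when no exogenous series is supplied.
    """
    buffer = list(history[-window:])
    out = []
    for k in range(horizon):
        x = exog[k] if exog is not None else None
        y_hat = step_model(buffer, x)
        out.append(y_hat)
        buffer = buffer[1:] + [y_hat]  # a prediction replaces the oldest value
    return out


history = [0.0, 0.0, 0.0, 0.0]

# Without exogenous input: a window-average model can only continue a flat line.
plain = recursive_forecast(history, lambda w, x: sum(w) / len(w), window=4, horizon=3)

# With a known time-of-day profile the forecast can follow the expected shape.
anchored = recursive_forecast(
    history, lambda w, x: 0.5 * w[-1] + 0.5 * x, window=4, horizon=3, exog=[1.0, 2.0, 1.0]
)
```

After `window` steps the buffer contains only predicted values, so any early error is carried forward; an input that is known in advance for every future step gives the model a reference that does not drift.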

The prediction results of the seven algorithms are compared using the evaluation indexes above, and Table 5 shows the results. The closer the values of RMSE, MAE and MAPE are to 0 and the closer the R² score is to 1, the better the performance. According to Table 5, the RMSE and MAE of ARIMA-LSTM-CF are 43.24 and 31.62, respectively, which are more than 50% lower than those of the other models. Its MAPE is 0.79, which is higher than that of RNN, LSTM and CNN-LSTM. However, the long-term prediction of those models is a straight line that matches only the low-traffic period after work, which is why they obtain a better MAPE; they cannot predict the long-term trend of traffic data. The R² value of the ARIMA-LSTM-CF hybrid model is the highest, at 0.6491, indicating that the prediction results match the original data well. This lays a solid foundation for subsequent anomaly detection based on prediction results.

Table 5 Evaluation results of different prediction models


5 Discussion

To further verify the effectiveness of the model, we apply all models to a new dataset to observe their performance. The data set describes the traffic data (Mbps) of another banking device from 2021.10.4 to 2021.10.13 and has the same M-type feature as the dataset studied above. The prediction results of all models are shown in Fig. 12, and the indicator performance is shown in Table 6.

Table 6 Evaluation results of different prediction models on another dataset


As shown in Fig. 12, the ARIMA model can only predict the rise of the data, and RNN can only fit the data trend at 00:00–02:00. LSTM, GRU, CNN-LSTM and ARIMA-LSTM fit the data trend at 00:00–12:00 better, but as errors accumulate, these models cannot capture the subsequent fluctuation and decline. By comparison, only ARIMA-LSTM-CF, which considers the average value of historical data in the prediction process, can complete the task of accurate long-term prediction.

We used the same evaluation method to compare the performance of the seven models on the new data set. The evaluation results are shown in Table 6. The RMSE, MAE and MAPE of ARIMA-LSTM-CF are 13.63, 10.09 and 0.4587, respectively, which are superior to all other models. From the perspective of trend prediction, other models only maintain accuracy in the early stage and cannot predict long-term M-type trends. ARIMA-LSTM-CF has the highest R² value of 0.6480, indicating that the prediction results are in good agreement with the original data, and the model taking into account the data characteristics has good long-term prediction ability.

6 Conclusion

The focus of this research is to provide reliable and accurate long-term traffic data prediction to help enterprises better allocate network resources and improve user experience. Firstly, this paper uses the SDK means++ clustering algorithm to accurately extract the workday period data. Then, the linear part is predicted by the ARIMA model, the nonlinear part is predicted by the LSTM-CF model, and the auxiliary CF data set obtained by analyzing the data feature is used as the additional input of the LSTM model to improve the long-term prediction accuracy of traffic data. Finally, the model is verified on an additional dataset to further illustrate the effectiveness of the model.

Comparing the different prediction models, ARIMA can only model the linear part, and LSTM can only model the nonlinear part. Although ARIMA-LSTM can model both linear and nonlinear parts, it maintains only short-term accuracy. In the long-term prediction of dataset 1, the RMSE and MAE of the ARIMA-LSTM-CF model were reduced by 55% and 51.8%, respectively, compared with the ARIMA-LSTM model. The model fully captures the M-type feature of the traffic data, and its R² score reached 0.64, which is much higher than that of the other models. In the long-term prediction of dataset 2, the RMSE, MAE, MAPE and R² score of ARIMA-LSTM-CF are 13.63, 10.09, 0.4587 and 0.6480, respectively, which are superior to all other models. From the perspective of trend prediction, the other models maintain accuracy only in the early stage and cannot predict long-term M-type trends. Therefore, we can conclude that the ARIMA-LSTM-CF model proposed in this paper has significant advantages in long-term time-series prediction and can better serve the long-term traffic prediction of server equipment, helping enterprises formulate network resource allocation strategies in advance.

The ARIMA-LSTM-CF model built in this paper combines the advantages of only the ARIMA and LSTM models. Future research can combine more models to further improve the accuracy of long-term time-series prediction. In addition, the traffic data of bank equipment shows an obvious M-type characteristic on workdays, and the ARIMA-LSTM-CF model improves its long-term prediction mainly by exploiting this characteristic of bank traffic data. Traffic data from different enterprises often has different characteristics. In the future, we will incorporate server traffic data with different characteristics to improve the generalization ability of the model.

Availability of Data and Material

The original data set involved in this study cannot be shared because the bank information is confidential.

Abbreviations

RMSE: Root mean square error
MAE: Mean absolute error
MAPE: Mean absolute percentage error
HA: Historical average model
AR: Autoregressive model
ARMA: Autoregressive moving average model
ARIMA: Autoregressive integrated moving average model
RNN: Recurrent neural network
CNN: Convolutional neural network
ANN: Artificial neural network
LSTM: Long short-term memory
GRU: Gated recurrent unit
SA: Simulated annealing
BPNN: Back propagation neural network
FFT: Fast Fourier transform
CF: Fluctuation features
DBI: Davies–Bouldin index

Acknowledgements

I thank my senior labmate Guoyu Du for introducing me to the field of time series prediction and for helping me solve various difficult problems while I was writing this paper. I could not have finished this paper without his help.

Funding

This work was supported by the Beijing Natural Science Foundation (L222004).

Author information

Authors and Affiliations

Information and Communication Engineering, Beijing Information Science and Technology University, Beijing, China
Erzhuang Yao, Lanjie Zhang & Xuehua Li
Beijing Baicells Technology Co., LTD, Beijing, China
Xiang Yun

Contributions

EY: conceptualization, methodology, software, investigation, writing – original draft, writing – review & editing, visualization. LZ: software, investigation, visualization, writing – original draft, funding acquisition. XL: conceptualization, methodology, formal analysis, writing – review & editing. XY: code guidance for data provision.

Corresponding author

Correspondence to Lanjie Zhang.

Ethics declarations

Conflict of Interest

All authors disclosed no relevant relationships.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Yao, E., Zhang, L., Li, X. et al. Traffic Forecasting of Back Servers Based on ARIMA-LSTM-CF Hybrid Model. Int J Comput Intell Syst 16, 65 (2023). https://doi.org/10.1007/s44196-023-00232-7

Received: 11 October 2022
Accepted: 26 March 2023
Published: 28 April 2023
DOI: https://doi.org/10.1007/s44196-023-00232-7

1 Introduction