Article

Prediction of PM2.5 Concentration Based on Deep Learning for High-Dimensional Time Series

by Jie Hu 1,2, Yuan Jia 3, Zhen-Hong Jia 1,2,*, Cong-Bing He 1,2, Fei Shi 1,2 and Xiao-Hui Huang 1,2
1 School of Computer Science and Technology, Xinjiang University, Urumqi 830046, China
2 Xinjiang Uygur Autonomous Region Signal Detection and Processing Key Laboratory, Xinjiang University, Urumqi 830046, China
3 School of Statistics, Renmin University of China, Beijing 100872, China
* Author to whom correspondence should be addressed.
Submission received: 13 August 2024 / Revised: 25 September 2024 / Accepted: 26 September 2024 / Published: 27 September 2024
(This article belongs to the Section Ecology Science and Engineering)

Abstract

PM2.5 poses a serious threat to human life and health, so accurate prediction of PM2.5 concentration is essential for controlling air pollution. However, the models in previous studies generalize poorly when predicting high-dimensional PM2.5 concentration time series. This paper therefore proposes a new model for predicting PM2.5 concentration. First, the leaky rectified linear unit (LeakyReLU) replaces the activation function in the Temporal Convolutional Network (TCN) to better capture long-range dependencies in the feature data. Next, the residual structure, dilation rate, and feature-matching convolution position of the TCN are adjusted to improve the performance of the improved TCN (LR-TCN) and reduce the amount of computation. Finally, a new prediction model (GRU-LR-TCN) is established, which adaptively fuses the predictions of the Gated Recurrent Unit (GRU) and LR-TCN using weights inversely proportional to each model's root mean square error (RMSE). The experimental results show that, for monitoring station #1001, LR-TCN improved the RMSE, mean absolute error (MAE), and coefficient of determination (R2) by 12.9%, 11.3%, and 3.8%, respectively, compared with the baseline TCN. Compared with LR-TCN, GRU-LR-TCN improved the symmetric mean absolute percentage error (SMAPE) by 7.1%. In addition, comparisons with other models on other air quality datasets show advantages across the indicators, further demonstrating that the GRU-LR-TCN model generalizes well across datasets and is more efficient and applicable for predicting urban PM2.5 concentration. This can contribute to enhancing air quality and safeguarding public health.

1. Introduction

As urbanization in China accelerates rapidly, people’s production activities have caused an increase in the pollutant emission base, resulting in the haze phenomenon occurring from time to time in certain areas and cities across China; this issue has emerged as a significant challenge in both urban and regional air pollution across China in recent years. During hazy conditions, the particulate matter (PM) concentration rises sharply compared to clear weather, highlighting that elevated levels of particulate matter are a key factor contributing to haze formation [1].
Particles with a diameter of 2.5 μm or smaller are referred to as PM2.5 [2]. These minuscule particles not only pose a significant threat to human life and well-being [3,4] but also induce various other detrimental impacts [5]. The Global Air Quality Database, published by the World Health Organization in 2018, reports that air pollution, both indoors and outdoors, is responsible for around 7 million deaths each year globally [6]. Moreover, the burden of air pollution on the global economy is about USD 225 billion per year [6]. Hence, it is imperative to predict PM2.5 concentration accurately and reliably in order to evaluate air pollution severity, so that measures can be implemented to lower PM2.5 levels and reduce the associated health risks. Traditional machine learning models such as Autoregressive Integrated Moving Average (ARIMA), Support Vector Machine (SVM), hidden Markov model (HMM), and Random Forest (RF) have been used to predict PM2.5 concentration [7,8,9,10,11]. However, due to the nonlinear relationship between the change in PM2.5 concentration and external factors, these models cannot accurately predict PM2.5 concentration.
Neural network-based machine learning models effectively predict PM2.5 concentration [12]. Recurrent Neural Network (RNN) and Convolutional Neural Network (CNN) have been used to learn complex nonlinear relationships in PM2.5 concentration time series data to improve prediction accuracy [13,14]. However, RNNs face challenges such as the vanishing gradient and exploding gradient issues when dealing with long sequences.
To address the vanishing gradient issue and the long-term dependency challenge in RNNs, the Long Short-Term Memory neural network (LSTM) [15] and the Gated Recurrent Unit (GRU) [16] have been proposed. Since then, Shi et al. have proposed a balanced social LSTM (BS-LSTM) for predicting PM2.5 concentration in cities [17]. Bhimavarapu and Sreedevi introduced an enhanced loss function (ELF) to decrease the error and improve the accuracy of daily PM2.5 concentration prediction in India [18]. Zhang et al. proposed a model based on inverse convolution and LSTM that extracts the spatial feature correlation of atmospheric pollutant concentration data to predict PM2.5 concentration accurately [19]. Yang et al. established a new hybrid model combining CNN, LSTM, and GRU for predicting PM2.5 concentration in Seoul, South Korea [20]. In addition, Ding and Zhu developed an LSTM model that incorporates Principal Component Analysis (PCA) and an attention mechanism. This approach mitigated the correlation effects between indicators, simplified the model, and yielded improved prediction results in their experiments [21]. However, as the length of time series data increases, LSTM and GRU models may deteriorate into random guessing.
Compared to the LSTM network, the GRU network features a simpler architecture, reduced computational complexity, and quicker training [22]. Unlike LSTM and GRU, the Temporal Convolutional Network (TCN) offers the more comprehensive parallel processing of information, a simpler structure, fewer parameters, and is better suited for time series modeling [23]. The TCN has been shown to be more effective than LSTMs for time series prediction tasks [24].
Since PM2.5 concentration is measured at specific monitoring stations, traditional methods may struggle to generalize effectively across different locations and environmental conditions. The advantage of hybrid deep learning methods lies in their ability to integrate multiple data sources and modeling techniques, capturing the complex relationships between meteorological factors, geographic variations, and pollutant dispersion patterns. This not only improves the accuracy and generalization capability of PM2.5 predictions but also provides more reliable support for air quality management and public health protection.
Therefore, a combined model called GRU-LR-TCN is proposed in this paper to achieve more accurate PM2.5 concentration prediction under complex time-varying conditions in cities. Based on the original TCN, an improved version, known as LR-TCN, is introduced to strengthen the feature extraction ability of the TCN. The enhanced feature extraction of LR-TCN is then combined with the time series prediction capabilities of GRU. In contrast to other studies that aim to enhance accuracy by optimizing parameters or increasing model complexity, this model emphasizes leveraging the strengths of different models to reduce complexity, shorten training time, and boost prediction accuracy.
Compared with previous work on PM2.5 concentration prediction, the method proposed in this paper has the following contributions.
  • In this paper, LR-TCN is built upon the foundation of the TCN, and LR-TCN can predict future PM2.5 concentrations.
  • Based on LR-TCN proposed in this paper, a combined prediction model is established by combining GRU and LR-TCN, and the outputs of the GRU prediction model and the LR-TCN prediction model are weighted and fused according to the inverse root mean square error ratio to realize the short-term prediction of PM2.5 concentration.
  • Comparison experiments with other models reveal that the GRU-LR-TCN prediction model demonstrates better prediction performance and generalization ability, helping to improve air quality and protect public health.
The remainder of the paper is organized as follows: Section 2 cites some recent work using TCN-based prediction. A brief description of the TCN principle, the GRU principle, and the proposed method is given in Section 3. The dataset, hyperparameter settings, and experimental results are discussed in Section 4. Finally, the experiments and the proposed method are summarized with an outlook in Section 5.

2. Related Work

Many improvement methods have been proposed for prediction models based on the TCN. These methods can be roughly categorized into two main groups: (1) TCN-based combinatorial modeling methods and (2) TCN structure-based improvement methods. The former aims to apply new attention mechanisms and combinatorial networks on the TCN to improve the accuracy of prediction, while the latter is based on the network structure of the TCN and tries to improve the model’s performance by improving the activation function, weight initialization, and other aspects.

2.1. Combination Model

Shi et al. suggested extracting features with the TCN first and then combining them with Bi-GRU to achieve more accurate PM2.5 concentration predictions [25]. Similarly, attention-based mechanisms, time-window strategies, and autoencoders have been designed to capture the importance of different temporal stages and different feature states [26]. Chen proposed a new attention mechanism combined with the TCN for accurate hour-by-hour prediction of PM2.5 concentration [27]. Liu and Deng proposed an enhanced hybrid integrated deep learning model that applies modal decomposition to PM2.5 concentration data and fuses parallel predictions, which was able to accurately predict PM2.5 concentration [28]. Yuan et al. integrated four basic models (simple-RNN, LSTM, GRU, and TCN) into a new hybrid deep learning (HDL) model for predicting PM2.5 concentration in Changsha City [29]. For predicting PM2.5 concentration across various areas within a city, Zhang et al. used the correlation features between urban areas to train the TCN and improve the accuracy of PM2.5 concentration prediction for the next hour [30]. Shi et al. integrated the LASSO regression algorithm, an attention mechanism, and the TCN to predict indoor PM2.5 concentrations [31].

2.2. Modified TCN

Zeng et al. proposed a two-channel TCN (DD-TCN) for improving the accuracy of the regression prediction of mixed gas concentrations [32]. Li et al. adjusted the activation function and weight initialization of the TCN (GL-TCN), and the resulting model fits high-dimensional time series datasets better [33]. Ni et al. adjusted the activation function and residual structure of the TCN (Gaussian-TCN), and the resulting model outperformed the traditional recurrent network in terms of prediction accuracy [34]. Lei et al. proposed a multi-channel asymmetric structure prediction model based on the TCN for PM2.5 concentration prediction in Fushun City, Liaoning Province [35].
The methods proposed in this paper differ from previous approaches in the following aspects.
  • In this paper, the TCN is improved in more aspects. The activation function, residual structure, dilation rate, and feature-matching convolution position of the TCN are adjusted to make the LR-TCN model perform better.
  • The proposed LR-TCN is combined with other models to enhance the generalization ability of the model.
  • Adaptive weights are used, which can be adapted to different datasets.
  • Better generalization capability makes the model robust.
  • Model complexity is reduced and less training time is required.

3. Materials and Methods

In this section, the fundamentals of the TCN with LeakyRelu (L-TCN) are first briefly described. Then, the improved TCN model (LR-TCN) is highlighted. Finally, the combined GRU-LR-TCN model is introduced. The method flow of this paper is shown in Figure 1.

3.1. TCN with LeakyRelu (L-TCN)

The leaky rectified linear unit (LeakyReLU) [36] is more effective for time series prediction tasks than the rectified linear unit (ReLU). To enhance the learning of long-term dependencies in time series data, LeakyReLU can be used in place of ReLU as the activation function in the TCN. The resulting structure, abbreviated as L-TCN in this paper, is shown in Figure 2. When the input value $x$ is negative, the gradient of LeakyReLU is a constant $\lambda \in (0, 1)$ instead of 0, allowing the neuron to update its weights on negative inputs. LeakyReLU and ReLU are identical when the input value is positive. The operations are given in Equations (1) and (2):
$$\tilde{x} = \max\left(0,\; W^{T} x + b\right), \quad \text{if } x > 0 \tag{1}$$
$$\tilde{x} = \max\left(0,\; W^{T} \lambda x + b\right), \quad \text{if } x \le 0 \tag{2}$$
where $\lambda$ is a constant and $\lambda \in (0, 1)$.
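As a minimal illustration of the difference between the two activations, the sketch below implements the standard element-wise ReLU and LeakyReLU in NumPy, together with the LeakyReLU gradient. The slope value `lam=0.01` is an assumed default for illustration only; the paper does not state the value of λ it used.

```python
import numpy as np

def relu(x):
    """Standard ReLU: zero output (and zero gradient) for negative inputs."""
    return np.maximum(0.0, x)

def leaky_relu(x, lam=0.01):
    """Standard LeakyReLU: identity for x > 0, slope lam for x <= 0.

    lam is an assumed value in (0, 1); it is not taken from the paper.
    """
    return np.where(x > 0, x, lam * x)

def leaky_relu_grad(x, lam=0.01):
    """Gradient of LeakyReLU: 1 for x > 0, the constant lam otherwise."""
    return np.where(x > 0, 1.0, lam)

if __name__ == "__main__":
    x = np.array([-2.0, -0.5, 1.5])
    print(relu(x))                      # negative inputs are zeroed
    print(leaky_relu(x, lam=0.1))       # negative inputs are scaled by lam
    print(leaky_relu_grad(x, lam=0.1))  # gradient never vanishes entirely
```

Because the gradient on negative inputs is λ rather than 0, weights feeding a unit with negative pre-activation still receive updates, which is the property the text relies on.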
In addition, using LeakyReLU instead of ReLU helps maintain model performance while keeping the dilation rate small. Usually, the dilation rate increases exponentially with the number of dilated convolution layers. A larger dilation rate enlarges the receptive field but also increases the amount of computation. Therefore, unlike the usual choice of TCN dilation rates, large dilation rates are avoided in this paper to reduce the amount of model computation, as shown in Figure 3. The dilated causal convolution is given in Equations (3)–(5).
Given a one-dimensional time series $X = (x_0, x_1, x_2, \ldots, x_t)$ with corresponding output sequence $Y = (y_0, y_1, y_2, \ldots, y_t)$, the causal convolution operation on the time series is formulated as follows:
$$P(x_t) = \prod_{t=0}^{T} P\left(x_t \mid x_0, x_1, x_2, \ldots, x_{t-1}\right) \tag{3}$$
where $P(x_t)$ is the predicted probability and $T$ is the total number of time steps.
The dilated convolution $D$ on the time series is formulated as follows:
$$D(T) = (X *_{d} f)(t) = \sum_{i=0}^{k-1} f_i \cdot X_{t-2}, \quad i \ge 1 \tag{4}$$
$$D(T) = (X *_{d} f)(t) = \sum_{i=0}^{k-1} f_i \cdot X_{t-1}, \quad i = 0 \tag{5}$$
where $d$ denotes the dilation factor, $k$ is the filter size, $f_i$ is the $i$th element in the convolution kernel, and $*$ denotes the convolution operation.
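To make the effect of the dilation rate concrete, the following sketch implements the standard dilated causal convolution from the TCN literature, y[t] = Σ_i f[i]·x[t − d·i], with zero padding on the left. It is an illustration under that textbook definition, not the authors' code. The helper `receptive_field` assumes a simple stack with one convolution per listed dilation.

```python
import numpy as np

def dilated_causal_conv1d(x, f, d):
    """y[t] = sum_i f[i] * x[t - d*i], treating x as 0 before the series starts."""
    x = np.asarray(x, dtype=float)
    k = len(f)
    pad = d * (k - 1)                      # left padding keeps the output causal
    xp = np.concatenate([np.zeros(pad), x])
    y = np.zeros_like(x)
    for i in range(k):
        start = pad - d * i                # x[t - d*i] lives at xp[t + pad - d*i]
        y += f[i] * xp[start:start + len(x)]
    return y

def receptive_field(kernel_size, dilations):
    """Receptive field of a stack with one causal conv per dilation (assumption)."""
    return 1 + (kernel_size - 1) * sum(dilations)

if __name__ == "__main__":
    print(dilated_causal_conv1d([1, 2, 3, 4, 5], f=[1.0, 1.0], d=2))
    print(receptive_field(2, [1, 2, 4, 8]))  # exponentially growing dilations
    print(receptive_field(2, [1, 2, 2, 2]))  # capped dilations, as in Table 1
```

Under this simplified stacking assumption, capping the dilation rates at 2 shrinks the receptive field compared with the exponential schedule, which is the trade-off the text describes.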

3.2. Improved TCN (LR-TCN)

Studies have concluded that residual structures with more than two layers are usually needed to maintain stability as the network becomes deeper and larger, and that their advantages then become more obvious [37]. Because urban PM2.5 concentration time series have pronounced nonlinear and dynamic features, deeper residual structures are needed to extract these complex features, so the residual structure of the original TCN architecture is not adopted in this paper. The new residual structure proposed in this paper contains two L-TCNs, as shown in Figure 4. In addition, the feature-matching convolution of the residual module in the TCN is moved before the first dilated convolution to match the hidden features of the dilated convolution. The resulting model is referred to as LR-TCN in this paper. Because the feature-matching convolution is applied fewer times, the training time of the model is shortened.
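One possible reading of this residual structure is sketched below as a NumPy forward pass. The 1×1 feature-matching convolution is applied first, and two dilated causal convolutions with LeakyReLU follow. The wiring of the skip connection (added from the matched features), the hidden width of 16, the slope `lam`, and the omission of dropout and weight normalization are all assumptions made for illustration; Figure 4 of the paper is the authoritative description.

```python
import numpy as np

def leaky_relu(x, lam=0.01):
    # lam is an assumed slope, not a value reported in the paper
    return np.where(x > 0, x, lam * x)

def causal_conv(h, w, d):
    """Multi-channel dilated causal conv. h: (C_in, T), w: (C_out, C_in, k)."""
    c_out, c_in, k = w.shape
    T = h.shape[1]
    pad = d * (k - 1)
    hp = np.concatenate([np.zeros((c_in, pad)), h], axis=1)
    out = np.zeros((c_out, T))
    for i in range(k):
        start = pad - d * i                     # tap i looks d*i steps back
        out += w[:, :, i] @ hp[:, start:start + T]
    return out

def lr_tcn_block(x, w_match, w1, w2, d1, d2, lam=0.01):
    """Hypothetical LR-TCN residual block (forward pass only)."""
    h = w_match @ x                             # 1x1 feature-matching conv, applied first
    z = leaky_relu(causal_conv(h, w1, d1), lam) # first L-TCN layer
    z = leaky_relu(causal_conv(z, w2, d2), lam) # second L-TCN layer
    return h + z                                # assumed skip from the matched features

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(12, 24))               # 12 input variables, 24 time steps
    out = lr_tcn_block(
        x,
        w_match=rng.normal(size=(16, 12)),
        w1=rng.normal(size=(16, 16, 2)),
        w2=rng.normal(size=(16, 16, 2)),
        d1=1, d2=2,
    )
    print(out.shape)
```

Applying the 1×1 convolution once at the block input means the skip path and the convolution path already share a channel width, so no further matching convolution is needed inside the block.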

3.3. Integrated Model (GRU-LR-TCN)

TCN and LSTM are two different types of neural networks used to process time series data. Since each has its own characteristics and advantages, better prediction results can be obtained by combining them through adaptive weighted fusion into an integrated prediction model [38]. The model in this paper is inspired by the TCN-LSTM integrated prediction model: LR-TCN is better than the TCN at capturing local and global patterns in the time series, and GRU is superior to LSTM in handling long-term dependencies. For this reason, the GRU-LR-TCN integrated prediction model is proposed, which fuses the prediction results of the GRU and LR-TCN prediction models and combines the strengths of both to enhance prediction accuracy. The GRU-LR-TCN integrated prediction model is shown in Figure 5.
The GRU-LR-TCN integrated prediction model uses the root mean square error of each model on its validation set to calculate the weights for the adaptive weighted fusion. The smaller a model's RMSE, the greater its weight in the combined prediction model. Assuming that both models predict the $n$th time step concurrently, the weights are given by Equations (6)–(8):
$$W_n^{T} = \frac{1/S_n^{T}}{1/S_n^{T} + 1/S_n^{G}} \tag{6}$$
$$W_n^{G} = \frac{1/S_n^{G}}{1/S_n^{T} + 1/S_n^{G}} \tag{7}$$
$$y_n = W_n^{T} \odot y_n^{T} + W_n^{G} \odot y_n^{G} \tag{8}$$
where $S_n^{T}$ is the LR-TCN validation set RMSE metric, $S_n^{G}$ is the GRU validation set RMSE metric, $W_n^{T}$ is the LR-TCN prediction result weight, $W_n^{G}$ is the GRU prediction result weight, $y_n$ is the fusion prediction result, and the operator '$\odot$' denotes the element-wise multiplication of arrays.
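The weighting scheme can be sketched in a few lines of NumPy. The function and argument names are illustrative, and the RMSE values in the usage example are hypothetical numbers chosen to make the arithmetic easy to follow, not results from the paper.

```python
import numpy as np

def inverse_rmse_weights(rmse_tcn, rmse_gru):
    """Weights proportional to 1/RMSE, normalized to sum to 1."""
    inv_t, inv_g = 1.0 / rmse_tcn, 1.0 / rmse_gru
    total = inv_t + inv_g
    return inv_t / total, inv_g / total

def fuse(y_tcn, y_gru, rmse_tcn, rmse_gru):
    """Weighted sum of the two models' predictions."""
    w_t, w_g = inverse_rmse_weights(rmse_tcn, rmse_gru)
    return w_t * np.asarray(y_tcn, dtype=float) + w_g * np.asarray(y_gru, dtype=float)

if __name__ == "__main__":
    # Hypothetical validation RMSEs: LR-TCN = 10, GRU = 20
    print(inverse_rmse_weights(10.0, 20.0))      # the lower-RMSE model gets 2/3
    print(fuse([30.0, 60.0], [60.0, 30.0], 10.0, 20.0))
```

With these hypothetical values, LR-TCN has half the error of GRU and therefore receives twice the weight. The weights always sum to 1, so the fused prediction stays within the range spanned by the two individual predictions.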

4. Experimental Results and Discussion

4.1. Data Description and Setup

The air quality dataset from the Urban Air project of the Urban Computing team at Microsoft Research is used in this paper [39,40,41]. The dataset contains urban data, regional data, air monitoring station data, air quality data, meteorological data, and weather forecast data for 43 cities, including Beijing, for the period from 1 May 2014 to 30 April 2015. All data are geographically aligned with latitude and longitude, with air quality data recorded every hour and meteorological data recorded every three hours.
In this paper, air quality and meteorological data from this dataset are chosen as experimental datasets and preprocessed. The specific preprocessing is as follows:
(1) The air quality data at each monitoring station are missing for some time periods, and at a given moment the concentration data for certain gases may also be missing. To prevent information leakage and minimize the impact on the experimental results, missing values in the air quality data are filled with the gas concentration data from the preceding time point. For time periods with missing data, the data from the time step preceding that period are used, resulting in a complete set of air quality data comprising 8760 time steps.
(2) Because meteorological data are recorded every three hours and cannot be matched to the gas concentration data at each moment, each meteorological record is also applied to the following two hours. This yields 8760 time steps of meteorological data.
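The two preprocessing steps above can be sketched as follows. This is a minimal NumPy illustration with hypothetical function names; it assumes missing values are encoded as NaN and that the three-hourly records start at the first hour of the series, neither of which is stated in the paper.

```python
import numpy as np

def forward_fill(values):
    """Replace each NaN with the most recent preceding observation.

    A NaN at the very start has no predecessor and is left unchanged.
    """
    out = np.array(values, dtype=float)
    for t in range(1, len(out)):
        if np.isnan(out[t]):
            out[t] = out[t - 1]
    return out

def expand_three_hourly(values, n_hours):
    """Repeat each 3-hourly record so it also covers the following two hours."""
    return np.repeat(np.asarray(values, dtype=float), 3)[:n_hours]

if __name__ == "__main__":
    pm25 = [35.0, np.nan, np.nan, 50.0]        # synthetic hourly readings
    print(forward_fill(pm25))
    temperature = [12.0, 15.0]                 # synthetic 3-hourly readings
    print(expand_three_hourly(temperature, 6))
    # One year of 3-hourly records (2920) expands to 8760 hourly steps
    print(len(expand_three_hourly(np.zeros(2920), 8760)))
```

Filling only from earlier time steps means no future observation leaks into the inputs, which matches the stated goal of preventing information leakage.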
In this paper, the dataset of monitoring station 1001 is selected for ablation and comparison experiments of the model, and the generalization of the model is verified on the dataset of monitoring station 1002, monitoring station 1003, and monitoring station 1023. In this paper, sensor data samples were normalized using min–max normalization to a range of [0, 1]. The dataset was split into training, validation, and test sets with a ratio of 8:1:1.
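A minimal sketch of the normalization and the 8:1:1 split is given below. Two choices here are assumptions rather than details from the paper: the split is chronological, and the min–max statistics are computed on the training portion only, so validation and test values can fall slightly outside [0, 1].

```python
import numpy as np

def split_811(data):
    """Chronological 8:1:1 split into training, validation, and test sets."""
    n = len(data)
    n_train = n * 8 // 10
    n_val = n // 10
    return data[:n_train], data[n_train:n_train + n_val], data[n_train + n_val:]

def min_max_scale(train, *others):
    """Scale each column to [0, 1] using the training set's min and max."""
    lo, hi = train.min(axis=0), train.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)     # guard against constant columns
    return tuple((a - lo) / span for a in (train,) + others)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.uniform(0, 300, size=(8760, 12))  # synthetic: 8760 steps, 12 variables
    train, val, test = split_811(data)
    print(len(train), len(val), len(test))
    train_s, val_s, test_s = min_max_scale(train, val, test)
    print(train_s.min(), train_s.max())
```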

4.2. Multiple Linear Regression and Collinearity Analysis

For a given dataset, strong linear relationships among input variables (collinearity) can obscure the importance of individual variables and inflate parameter estimation errors, which could adversely affect the accuracy of the model. Therefore, evaluating the linear relationships among input variables is necessary. In this paper, Pearson’s correlation coefficient is employed as the key metric for correlation analysis [42]. Pearson’s correlations between PM2.5-related features at monitoring station 1001 are shown in Figure 6. In addition, O3 concentration, temperature, wind speed, and wind direction are all correlated to some degree with PM2.5 concentration in the city [43,44]. Therefore, all 12 variables were used as input variables in this paper.
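A correlation matrix of this kind can be computed with NumPy's `corrcoef`. The sketch below uses synthetic data and hypothetical variable names. The threshold of 0.8 for flagging strongly correlated pairs is an illustrative choice, not a value from the paper.

```python
import numpy as np

def pearson_matrix(features):
    """Pearson correlations between columns of an (n_samples, n_features) array."""
    return np.corrcoef(features, rowvar=False)

def strong_pairs(corr, names, threshold=0.8):
    """List variable pairs whose absolute correlation reaches the threshold."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) >= threshold:
                pairs.append((names[i], names[j], float(corr[i, j])))
    return pairs

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pm25 = rng.uniform(0, 300, size=1000)             # synthetic values
    pm10 = pm25 + rng.normal(0, 10, size=1000)        # built to track pm25 closely
    wind = rng.uniform(0, 10, size=1000)              # independent of pm25
    names = ["PM2.5", "PM10", "wind_speed"]
    corr = pearson_matrix(np.column_stack([pm25, pm10, wind]))
    print(np.round(corr, 2))
    print(strong_pairs(corr, names))
```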

4.3. Evaluation Metrics

The output variable of the model is the future PM2.5 concentration. This paper selects seven representative evaluation metrics to assess the model’s performance: root mean square error (RMSE), mean squared error (MSE), mean absolute error (MAE), symmetric mean absolute percentage error (SMAPE), Normalized Absolute Error (NAE), coefficient of determination (R2), and the Index of Agreement (IA). The seven metrics are given in Equations (9)–(15):
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \tag{9}$$
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2 \tag{10}$$
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right| \tag{11}$$
$$\mathrm{SMAPE} = \frac{100\%}{n}\sum_{i=1}^{n}\frac{\left|\hat{y}_i - y_i\right|}{\left(\left|\hat{y}_i\right| + \left|y_i\right|\right)/2} \tag{12}$$
$$\mathrm{NAE} = \frac{1}{n}\sum_{i=1}^{n}\frac{\left|\hat{y}_i - y_i\right|}{\max\left(\hat{y}_i,\; y_i + \varepsilon\right)} \tag{13}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(\bar{y}_i - y_i\right)^2} \tag{14}$$
$$\mathrm{IA} = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(\left|\bar{y}_i - y_i\right| + \left|\hat{y}_i - y_i\right|\right)^2} \tag{15}$$
where $y_i$ is the true value, $\hat{y}_i$ is the predicted value, $\bar{y}_i$ is the mean of the true values, and $\varepsilon = 1 \times 10^{-8}$.
Among the seven evaluation metrics, smaller values of RMSE, MSE, MAE, SMAPE, and NAE indicate better estimation performance, while smaller values of R2 and IA indicate worse estimation performance.
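As a rough illustration, the sketch below computes five of the seven metrics (RMSE, MSE, MAE, SMAPE, and R2) in their common textbook forms. NAE and IA are deliberately left out because their exact form should be taken from Equations (13) and (15). The function name and the sample values are hypothetical.

```python
import numpy as np

def evaluate(y_true, y_pred):
    """Textbook RMSE, MSE, MAE, SMAPE (in %), and R2 for two equal-length series.

    SMAPE is undefined where both the true and predicted values are 0.
    """
    y = np.asarray(y_true, dtype=float)
    p = np.asarray(y_pred, dtype=float)
    err = y - p
    mse = np.mean(err ** 2)
    return {
        "RMSE": float(np.sqrt(mse)),
        "MSE": float(mse),
        "MAE": float(np.mean(np.abs(err))),
        "SMAPE": float(100.0 * np.mean(np.abs(err) / ((np.abs(p) + np.abs(y)) / 2))),
        "R2": float(1.0 - np.sum(err ** 2) / np.sum((y.mean() - y) ** 2)),
    }

if __name__ == "__main__":
    # Synthetic example: true concentrations vs. a hypothetical model's output
    scores = evaluate([40.0, 55.0, 70.0, 90.0], [42.0, 50.0, 75.0, 88.0])
    for name, value in scores.items():
        print(f"{name}: {value:.3f}")
```

The error metrics shrink toward 0 as predictions improve, while R2 grows toward 1, which is why the two groups are read in opposite directions.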
In addition, the training time required by different models on the same data is used to assess model complexity and highlight the advantages of the proposed model.

4.4. Model Parameter Selection

Parameter tuning is an important step in improving the model's ability to solve the problem, because the parameter settings directly affect the performance of the model. For this reason, randomized multi-parameter combinations were evaluated over many experiments to select the best combination; the specific parameter settings are shown in Table 1.
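Table 1. Parameter settings of each model.
| Model | Learning Rate | Epochs | Optimizer | Filter Size | Dilation Factor | Levels | Dropout | Loss Function | Batch Size |
|---|---|---|---|---|---|---|---|---|---|
| LSTM [15] | 0.0001 | 100 | Adam | None | None | 2 | None | MSELoss | 128 |
| GRU [16] | 0.0001 | 100 | Adam | None | None | 2 | None | MSELoss | 128 |
| SRU [45] | 0.0001 | 100 | Adam | None | None | 2 | None | MSELoss | 128 |
| TCN [23] | 0.0001 | 100 | Adam | 2 | [1,2,4,8] | 4 | 0.2 | MSELoss | 128 |
| Gaussian-TCN [34] | 0.0001 | 100 | Adam | 2 | [1,2,4,8] | 4 | 0.2 | MSELoss | 128 |
| GL-TCN [33] | 0.0001 | 100 | Adam | 2 | [1,2,4,8] | 4 | 0.2 | MSELoss | 128 |
| DD-TCN [32] | 0.0001 | 100 | Adam | 2 | [1,2,4,8] | 4 | 0.2 | MSELoss | 128 |
| D-TCN [46] | 0.0001 | 100 | Adam | 2 | [1,2,4,8] | 4 | 0.2 | MSELoss | 128 |
| ST-TCN [44] | 0.0001 | 100 | Adam | 4 | [1,2,4,8] | 4 | 0.2 | MSELoss | 128 |
| DMSnet [47] | 0.0001 | 100 | Adam | 4 | [1,2,4,8,16] | 5 | 0.2 | MSELoss | 128 |
| LR-TCN | 0.0001 | 100 | Adam | 2 | [1,2,2,2] | 2 | 0.2 | MSELoss | 128 |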
In Table 1, given the need for the same number of dilated causal convolutional layers, each LR-TCN contains two L-TCN, so the LR-TCN model is set to two layers. Usually, the dilated rate increases exponentially with the number of dilated convolutional layers, and model computation becomes larger. To achieve the effect of reducing model computation, the dilated rate of the four L-TCN is [1,2,2,2] in turn. In addition, according to the characteristics of the dataset, 24 time steps were chosen to estimate one time step.
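The choice of 24 input time steps per predicted step corresponds to a sliding-window layout of the data, sketched below. The function name is hypothetical, and the sketch assumes the target is the PM2.5 value immediately following each 24-step window, which is the usual one-step-ahead setup but is not spelled out in the paper.

```python
import numpy as np

def make_windows(features, target, lookback=24):
    """Build (samples, lookback, n_features) inputs and one-step-ahead targets.

    Sample j uses features[j : j + lookback] to predict target[j + lookback].
    """
    features = np.asarray(features)
    target = np.asarray(target)
    X = np.stack([features[t - lookback:t] for t in range(lookback, len(features))])
    y = target[lookback:]
    return X, y

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    features = rng.uniform(size=(8760, 12))   # synthetic: one year, 12 variables
    pm25 = features[:, 0]                     # assume column 0 holds PM2.5
    X, y = make_windows(features, pm25, lookback=24)
    print(X.shape, y.shape)
```

Each sample therefore covers one full day of hourly history, and a series of 8760 steps yields 8760 − 24 samples.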
All algorithms were implemented in Python V3.8 using the integrated development environment PyCharm 2021.2.3 (Community Edition). The programs were executed on a Windows 11 (x64) operating system with a 12th Gen Intel(R) Core (TM) i7-12700H CPU (Intel, Santa Clara, CA, USA), NVIDIA RTX 3060 GPU (NVIDIA, Santa Clara, CA, USA), and 16 GB of RAM, utilizing the CUDA-enabled version of PyTorch (V1.13.1) as the primary computational framework.

4.5. Ablation Experiment

In this paper, the structure of the TCN is adjusted in terms of its activation function, residual structure, dilation rate, and feature-matching convolution position to improve the performance of the neural network. To show that each of these optimizations contributes to the model, an ablation experiment is carried out using the monitoring station 1001 dataset, and the experimental results are shown in Table 2. In Table 2, CNN-TCN denotes that the feature-matching convolutional layer is applied first, and [1,2,4,8] indicates the dilation rates used by the model in sequence. The absence of [1,2,4,8] indicates that the model employs dilation rates of [1,2,2,2]. ‘Re’ indicates that the activation function used by the model is ReLU, and ‘Lr’ indicates that it is LeakyReLU. All eight networks in the experiment other than the ‘original TCN’ use the new residual structure proposed in this paper.
As shown in Table 2, in the nine ablation experiments, all the metrics of the ‘original TCN’ are the worst, and all metrics are improved after adjusting the residual structure. The adjustment of the activation function and the dilated rate improves each indicator, and the positional adjustment of the feature-matching convolution resulted in significant improvements in each metric. Therefore, the optimal improved TCN model can be obtained by combining all adjustment strategies.

4.6. LR-TCN Comparison Experiment

To thoroughly evaluate the performance of LR-TCN, eight comparison models were selected, and their estimation results were compared across seven metrics to highlight the performance advantages of LR-TCN. Considering the characteristics of the dataset in this paper, preference was given to models designed for time series data. The models are LSTM [15], GRU [16], SRU [44], TCN [23], Gaussian-TCN [34], GL-TCN [33], DD-TCN [32], D-TCN [45], and LR-TCN, and the results are shown in Table 3. Table 3 shows that LR-TCN performs best on every metric except SMAPE. Even on SMAPE, LR-TCN still has an advantage over some of the other models, and it requires the least training time among the improved TCN models. Compared with the TCN, LR-TCN improves RMSE by 12.9%, MAE by 11.3%, and R2 by 3.8%. The GRU model outperforms the LSTM and TCN models on all seven metrics. Compared with the TCN, GRU reduces RMSE by 8.9%, MAE by 18.3%, and SMAPE by 11.6%. The R2 value of the GRU model is 0.970, ranking second among all models, just behind LR-TCN, indicating a high goodness of fit.

4.7. Integrated Model Ablation Experiment

To better evaluate the performance of GRU-LR-TCN, different improved TCNs were combined with different gated recurrent network models, and their estimation results were compared across seven metrics and training time. The experimental results are shown in Table 4. Table 4 shows that, on the monitoring station 1001 dataset, GRU-LR-TCN outperforms all other combination models on all seven metrics. In addition, compared with LR-TCN, all seven metrics are improved, with SMAPE improving by 7.1%.

4.8. Generality Experiment

To better evaluate the generality of LR-TCN and GRU-LR-TCN, the datasets of monitoring stations 1002, 1003, and 1023 were selected for experiments. The estimation performance of different improved TCNs, different gated networks, and the proposed integrated model is compared using seven metrics and training time. To make the conclusions more convincing, comparisons were also made with two integrated algorithms, the spatiotemporal causal convolutional network (ST-TCN) [46] and the dual memory scale network (DMSnet) [47]. The experimental results are shown in Table 5. As shown in Table 5, on the same dataset, LR-TCN outperforms the other improved TCN models in most metrics, while GRU outperforms the other models based on the gating mechanism in most metrics. The integrated GRU-LR-TCN further improves on the performance of the GRU and LR-TCN models. Across the different datasets and taking all metrics together, LR-TCN generalizes better than the other improved TCN models. However, on datasets with higher complexity and severe data loss, its estimation performance is not satisfactory. Addressing this requires a detailed analysis of the spatial relationships between monitoring stations and the specific conditions during data collection. Combining datasets from correlated monitoring stations for model training can improve the accuracy of the evaluation results. Compared with other gating-based models, GRU has better generalization ability, while GRU-LR-TCN effectively compensates for the limitations of LR-TCN, combining the advantages of the two models to achieve better generalization. In addition, the experimental comparisons of GRU-LR-TCN with other combined models on different datasets all demonstrate the better estimation performance and generalization ability of the proposed model. For station 1002, the RMSE of GRU-LR-TCN improved by 1.3% and 4% compared to GRU and LR-TCN, respectively. For station 1003, the RMSE of GRU-LR-TCN improved by 1% compared to GRU. For station 1023, the RMSE of GRU-LR-TCN improved by 1.7% compared to LR-TCN.

4.9. Estimating Results

Figure 7 shows the estimation results of PM2.5 concentration for the test sets of monitoring stations 1001, 1002, 1003, and 1023 using GRU, LR-TCN, and GRU-LR-TCN. The subplots display the zoomed-in estimation results from the 20th to the 30th hour for each monitoring station. In Figure 7, the red solid line represents the actual measurements, the yellow dashed line represents the GRU estimation, the black dashed line represents the LR-TCN estimation, and the blue dashed line represents the GRU-LR-TCN estimation. From Figure 7, it can be observed that the GRU-LR-TCN estimation is closer to the actual measurements, indicating better estimation performance. However, GRU, LR-TCN, and GRU-LR-TCN all show inaccuracies in estimating both high and low local PM2.5 concentrations, which may be due to excessive missing values in the dataset. This suggests that the data preprocessing methods need to be further improved. In the future, machine learning-based missing value imputation techniques can be considered to further enhance data quality. It is also necessary to consider the spatial relationships between monitoring stations and the spatiotemporal characteristics of the dataset to improve the model’s feature extraction capabilities during training, thereby enhancing the accuracy of the evaluation results. Additionally, the estimation results across different monitoring stations indicate that GRU-LR-TCN generally outperforms GRU and LR-TCN in terms of estimation performance and spatial generalization.

5. Conclusions

PM2.5 concentration is an important indicator for environmental evaluation and occupies an important position in the field of air pollutant monitoring. This paper considers the relationship between meteorological characteristics and the concentrations of six air quality features, including PM2.5, as well as the influence of the nonlinearity and dynamics of the time series on PM2.5 concentration prediction, and proposes an integrated prediction model based on GRU and LR-TCN. The air quality and meteorological data of monitoring stations 1001, 1002, 1003, and 1023 in Beijing are used in experiments that compare the estimation performance of LR-TCN with traditional models, single models, and integrated models and verify the estimation performance and generality of GRU-LR-TCN. LR-TCN is proposed as an improvement on the TCN. First, LeakyReLU replaces the activation function in the TCN, which helps LR-TCN learn long-term dependencies in the time series data. Then, adjusting the residual structure and the feature-matching convolution position of the TCN and optimizing its dilation rate help to stabilize the LR-TCN model and reduce model training time. The experimental results on the dataset of monitoring station 1001 in Beijing show that LR-TCN effectively improves the estimation accuracy of PM2.5 concentration while shortening the model training time. Finally, the GRU-LR-TCN model built on LR-TCN combines the estimation results of the GRU and LR-TCN models in a weighted manner, using inverse-RMSE weights to give less weight to the model with the larger error, which reduces the error of relying on a single model.
The experimental results show that, on the dataset of monitoring station 1001, LR-TCN improved the RMSE, mean absolute error (MAE), and coefficient of determination (R2) by 12.9%, 11.3%, and 3.8%, respectively, compared with the baseline model. Compared with LR-TCN, GRU-LR-TCN improved the symmetric mean absolute percentage error (SMAPE) by 7.1%. Datasets from Beijing monitoring stations 1002, 1003, and 1023 were then used to test the generalization ability of the model. The results show that the GRU-LR-TCN integrated model generalizes better than both the LR-TCN and GRU models and outperforms some integrated models. However, the estimation performance of GRU-LR-TCN is unsatisfactory on datasets with high complexity and severe data loss. Addressing this requires machine learning-based missing value imputation to improve data quality, together with a detailed analysis of the spatial relationships between monitoring stations and the conditions at the time of data collection, so that spatiotemporal correlation features between stations can be extracted. The model can be extended to other cities for PM2.5 concentration prediction, but its parameters need to be tuned to obtain optimal results, and the spatiotemporal correlation between monitoring stations in the target city should be taken into account when adapting the model.
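The percentage improvements quoted above can be checked with a short calculation. The conclusions do not name the baseline; the sketch below assumes it is the LSTM row of Table 3, because those values give figures close to the ones reported. The helper name `relative_improvement` is hypothetical, and the metric values are copied from Tables 3 and 4 for station 1001.

```python
def relative_improvement(baseline, candidate, higher_is_better=False):
    """Relative change in percent, signed so that positive means 'better'."""
    if higher_is_better:
        return 100.0 * (candidate - baseline) / baseline
    return 100.0 * (baseline - candidate) / baseline


# Station 1001 values from Table 3 (LSTM, LR-TCN) and Table 4 (GRU-LR-TCN).
lstm = {"RMSE": 18.719, "MAE": 11.941, "R2": 0.863}  # assumed baseline
lr_tcn = {"RMSE": 16.306, "MAE": 10.593, "R2": 0.896, "SMAPE": 20.440}
gru_lr_tcn = {"SMAPE": 18.978}

rmse_gain = relative_improvement(lstm["RMSE"], lr_tcn["RMSE"])
mae_gain = relative_improvement(lstm["MAE"], lr_tcn["MAE"])
r2_gain = relative_improvement(lstm["R2"], lr_tcn["R2"], higher_is_better=True)
smape_gain = relative_improvement(lr_tcn["SMAPE"], gru_lr_tcn["SMAPE"])

for name, gain in [("RMSE", rmse_gain), ("MAE", mae_gain),
                   ("R2", r2_gain), ("SMAPE", smape_gain)]:
    print(f"{name}: {gain:.2f}%")
```

Under this assumption the RMSE, MAE, and R2 gains come out at roughly 12.9%, 11.3%, and 3.8%, and the SMAPE gain of GRU-LR-TCN over LR-TCN at roughly 7.15%.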

Author Contributions

Conceptualization, J.H. and Y.J.; methodology, J.H.; software, Z.-H.J.; validation, J.H., C.-B.H. and Y.J.; formal analysis, Z.-H.J.; investigation, F.S.; resources, Z.-H.J.; data curation, X.-H.H.; writing—original draft preparation, J.H.; writing—review and editing, Z.-H.J.; visualization, Y.J.; supervision, F.S.; project administration, X.-H.H.; funding acquisition, Z.-H.J. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by Key R&D Program Projects in Xinjiang Autonomous Region (No. 2022B01010-3) and Tianshan Talent Training Project-Xinjiang Science and Technology Innovation Team Program (No. 2023TSYCTD).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The dataset used is in the public domain. The code can be requested from the authors.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Gan, T.; Liang, W.; Yang, H.; Liao, X. The effect of economic development on haze pollution (PM2.5) based on a spatial perspective: Urbanization as a mediating variable. J. Clean. Prod. 2020, 266, 121880. [Google Scholar] [CrossRef]
  2. Wang, C.; Tu, Y.; Yu, Z.; Lu, R. PM2.5 and cardiovascular diseases in the elderly: An overview. Int. J. Environ. Res. Public Health 2015, 12, 8187–8197. [Google Scholar] [CrossRef] [PubMed]
  3. Wang, L.; Bao, S.; Liu, X.; Wang, F.; Zhang, J.; Dang, P.; Huang, W.; Li, B.; Lin, Y. Low-dose exposure to black carbon significantly increase lung injury of cadmium by promoting cellular apoptosis. Ecotoxicol. Environ. Saf. 2021, 224, 112703. [Google Scholar] [CrossRef] [PubMed]
  4. Kranc, H.; Novack, V.; Shtein, A.; Sonkin, R.; Jaffe, E.; Novack, L. Ambient air pollution and out-of-hospital cardiac arrest. Israel nation wide assessment. Atmos. Environ. 2021, 261, 118567. [Google Scholar] [CrossRef]
  5. Hao, Y.; Peng, H.; Temulun, T.; Liu, L.Q.; Mao, J.; Lu, Z.N.; Chen, H. How harmful is air pollution to economic development? New evidence from PM2.5 concentrations of Chinese cities. J. Clean. Prod. 2018, 172, 743–757. [Google Scholar] [CrossRef]
  6. AirVisual, IQAir. 2018. Available online: https://www.airvisual.com/worldmost-polluted-cities/world-air-quality-report-2018-en.pdf (accessed on 3 April 2021).
  7. Zhang, L.; Lin, J.; Qiu, R.; Hu, X.; Zhang, H.; Chen, Q.; Tan, H.; Lin, D.; Wang, J. Trend analysis and forecast of PM2.5 in Fuzhou, China using the ARIMA model. Ecol. Indic. 2018, 95, 702–710. [Google Scholar] [CrossRef]
  8. Yan, X.; Enhua, X. ARIMA and Multiple Regression Additive Models for PM2.5 Based on Linear Interpolation. In Proceedings of the 2020 International Conference on Big Data & Artificial Intelligence & Software Engineering (ICBASE), Bangkok, Thailand, 30 October–1 November 2020; pp. 266–269. [Google Scholar] [CrossRef]
  9. Vong, C.M.; Ip, W.F.; Wong, P.K.; Yang, J.Y. Short-term prediction of air pollution in Macau using support vector machines. J. Control. Sci. Eng. 2012, 2012, 518032. [Google Scholar] [CrossRef]
  10. Yang, W.; Deng, M.; Xu, F.; Wang, H. Prediction of hourly PM2.5 using a space-time support vector regression model. Atmos. Environ. 2018, 181, 12–19. [Google Scholar] [CrossRef]
  11. Laña, I.; Del Ser, J.; Padró, A.; Vélez, M.; Casanova-Mateo, C. The role of local urban traffic and meteorological conditions in air pollution: A data-based case study in Madrid, Spain. Atmos. Environ. 2016, 145, 424–438. [Google Scholar] [CrossRef]
  12. Collado, J.; Pinzon, C. Air Pollution Prediction Using Machine Learning Algorithms: A Literature Review. In Proceedings of the 2022 V Congreso Internacional en Inteligencia Ambiental, Ingeniería de Software y Salud Electrónica y Móvil (AmITIC), San Jose, Costa Rica, 14–16 September 2022; pp. 1–6. [Google Scholar] [CrossRef]
  13. Singh, K.P.; Gupta, S.; Kumar, A.; Shukla, S.P. Linear and nonlinear modeling approaches for urban air quality prediction. Sci. Total Environ. 2012, 426, 244–255. [Google Scholar] [CrossRef]
  14. Samal, K.K.R.; Babu, K.S.; Das, S.K. Multi-directional temporal convolutional artificial neural network for PM2.5 forecasting with missing values: A deep learning approach. Urban Clim. 2021, 36, 100800. [Google Scholar] [CrossRef]
  15. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef] [PubMed]
  16. Dey, R.; Salem, F.M. Gate-variants of gated recurrent unit (GRU) neural networks. In Proceedings of the 2017 IEEE 60th International Midwest Symposium on Circuits and Systems (MWSCAS), Boston, MA, USA, 6–9 August 2017; pp. 1597–1600. [Google Scholar] [CrossRef]
  17. Shi, L.; Zhang, H.; Xu, X.; Han, M.; Zuo, P. A balanced social LSTM for PM2.5 concentration prediction based on local spatiotemporal correlation. Chemosphere 2022, 291, 133124. [Google Scholar] [CrossRef] [PubMed]
  18. Bhimavarapu, U.; Sreedevi, M. An enhanced loss function in deep learning model to predict PM2.5 in India. Intell. Decis. Technol. 2023, 17, 363–376. [Google Scholar] [CrossRef]
  19. Zhang, B.; Liu, Y.; Yong, R.; Zou, G.; Yang, R.; Pan, J.; Li, M. A spatial correlation prediction model of urban PM2.5 concentration based on deconvolution and LSTM. Neurocomputing 2023, 544, 126280. [Google Scholar] [CrossRef]
  20. Yang, G.; Lee, H.; Lee, G. A hybrid deep learning model to forecast particulate matter concentration levels in Seoul, South Korea. Atmosphere 2020, 11, 348. [Google Scholar] [CrossRef]
  21. Ding, W.; Zhu, Y. Prediction of PM2.5 concentration in NingxiaHui autonomous region based on PCA-Attention-LSTM. Atmosphere 2022, 13, 1444. [Google Scholar] [CrossRef]
  22. Wang, B.; Kong, W.; Zhao, P. An air quality forecasting model based on improved convnet and RNN. Soft Comput. 2022, 25, 9209–9218. [Google Scholar] [CrossRef]
  23. Bai, S.; Kolter, J.Z.; Koltun, V. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv 2018, arXiv:1803.01271. [Google Scholar] [CrossRef]
  24. Zhu, R.; Liao, W.; Wang, Y. Short-term prediction for wind power based on temporal convolutional network. Energy Rep. 2020, 6, 424–429. [Google Scholar] [CrossRef]
  25. Shi, T.; Li, P.; Yang, W.; Qi, A.; Qiao, J. Application of TCN-biGRU neural network in PM2.5 concentration prediction. Environ. Sci. Pollut. Res. 2023, 30, 119506–119517. [Google Scholar] [CrossRef] [PubMed]
  26. Samal, K.K.R.; Babu, K.S.; Das, S.K. A neural network approach with iterative strategy for long-term PM2.5 forecasting. In Proceedings of the 2021 IEEE 18th India Council International Conference (INDICON), Guwahati, India, 19–21 December 2021; pp. 1–6. [Google Scholar] [CrossRef]
  27. Chen, J. Short-Term Prediction of PM2.5 Concentration based on Self-Attention Mechanism Improved Temporal Convolution Network. In Proceedings of the 2023 International Seminar on Computer Science and Engineering Technology (SCSET), New York, NY, USA, 29–30 April 2023; pp. 528–534. [Google Scholar] [CrossRef]
  28. Liu, H.; Deng, D.H. An enhanced hybrid ensemble deep learning approach for forecasting daily PM2.5. J. Cent. South Univ. 2022, 29, 2074–2083. [Google Scholar] [CrossRef]
  29. Yuan, P.; Mei, Y.; Zhong, Y.; Xia, Y.; Fang, L. A Hybrid Deep Learning Model for Predicting PM2.5. In Proceedings of the 2022 7th International Conference on Intelligent Computing and Signal Processing (ICSP), Xi’an, China, 15–17 April 2022; pp. 274–278. [Google Scholar] [CrossRef]
  30. Zhang, H.; Zhan, Y.; Li, J.; Chao, C.Y.; Liu, Q.; Wang, C.; Jia, S.; Ma, L.; Biswas, P. Using Kriging incorporated with wind direction to investigate ground-level PM2.5 concentration. Sci. Total Environ. 2021, 751, 141813. [Google Scholar] [CrossRef] [PubMed]
  31. Shi, T.; Yang, W.; Qi, A.; Li, P.; Qiao, J. LASSO and attention-TCN: A concurrent method for indoor particulate matter prediction. Appl. Intell. 2023, 53, 20076–20090. [Google Scholar] [CrossRef]
  32. Zeng, L.; Xu, Y.; Ni, S.; Xu, M.; Jia, P. A mixed gas concentration regression prediction method for electronic nose based on two-channel TCN. Sens. Actuators B Chem. 2023, 382, 133528. [Google Scholar] [CrossRef]
  33. Li, X.; Jiang, Q.; Ni, S.; Xu, Y.; Xu, M.; Jia, P. An electronic nose for CO concentration prediction based on GL-TCN. Sens. Actuators B Chem. 2023, 387, 133821. [Google Scholar] [CrossRef]
  34. Ni, S.; Jia, P.; Xu, Y.; Zeng, L.; Li, X.; Xu, M. Prediction of CO concentration in different conditions based on Gaussian-TCN. Sens. Actuators B Chem. 2023, 376, 133010. [Google Scholar] [CrossRef]
  35. Lei, F.; Zhang, X.; Yang, Y. PM2.5 concentration prediction based on temporal convolutional network. In Proceedings of the International Conference on Cloud Computing, Performance Computing, and Deep Learning (CCPCDL 2022), Wuhan, China, 11–13 March 2022; pp. 472–479. [Google Scholar] [CrossRef]
  36. Xu, J.; Li, Z.; Du, B.; Zhang, M.; Liu, J. Reluplex made more practical: Leaky ReLU. In Proceedings of the 2020 IEEE Symposium on Computers and communications (ISCC), Rennes, France, 7–10 July 2020; pp. 1–7. [Google Scholar] [CrossRef]
  37. He, K.; Zhang, X.; Ren, S.; Sun, J. Identity mappings in deep residual networks. In Proceedings of the Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, 11–14 October 2016; Proceedings, Part IV 14. pp. 630–645. [Google Scholar]
  38. Zuo, K. Integrated Forecasting Models Based on LSTM and TCN for Short-Term Electricity Load Forecasting. In Proceedings of the 2023 9th International Conference on Electrical Engineering, Control and Robotics (EECR), Wuhan, China, 24–26 February 2023; pp. 207–211. [Google Scholar] [CrossRef]
  39. Zheng, Y.; Capra, L.; Wolfson, O.; Yang, H. Urban computing: Concepts, methodologies, and applications. ACM Trans. Intell. Syst. Technol. (TIST) 2014, 5, 1–55. [Google Scholar] [CrossRef]
  40. Zheng, Y.; Liu, F.; Hsieh, H.P. U-air: When urban air quality inference meets big data. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Chicago, IL, USA, 11–14 August 2013; pp. 1436–1444. [Google Scholar] [CrossRef]
  41. Zheng, Y.; Yi, X.; Li, M.; Li, R.; Shan, Z.; Chang, E.; Li, T. Forecasting fine-grained air quality based on big data. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Sydney, Australia, 10–13 August 2015; pp. 2267–2276. [Google Scholar] [CrossRef]
  42. Zheng, Q.; Tian, X.; Yu, Z.; Jin, B.; Jiang, N.; Ding, Y.; Yang, M.; Elhanashi, A.; Saponara, S.; Kpalma, K. Application of complete ensemble empirical mode decomposition based multi-stream informer (CEEMD-MsI) in PM2.5 concentration long-term prediction. Expert Syst. Appl. 2024, 245, 123008. [Google Scholar] [CrossRef]
  43. Shao, M.; Xu, X.; Lu, Y.; Dai, Q. Spatio-temporally differentiated impacts of temperature inversion on surface PM2.5 in eastern China. Sci. Total Environ. 2023, 855, 158785. [Google Scholar] [CrossRef]
  44. Zhang, L.; Na, J.; Zhu, J.; Shi, Z.; Zou, C.; Yang, L. Spatiotemporal causal convolutional network for forecasting hourly PM2.5 concentrations in Beijing, China. Comput. Geosci. 2021, 155, 104869. [Google Scholar] [CrossRef]
  45. Lei, T.; Zhang, Y.; Wang, S.I.; Dai, H.; Artzi, Y. Simple recurrent units for highly parallelizable recurrence. arXiv 2017, arXiv:1709.02755. [Google Scholar] [CrossRef]
  46. Liu, C.; Zhang, L.; Yao, R.; Wu, C. Dual attention-based temporal convolutional network for fault prognosis under time-varying operating conditions. IEEE Trans. Instrum. Meas. 2021, 70, 1–10. [Google Scholar] [CrossRef]
  47. Guo, Y.; Zhang, S.; Yang, J.; Yu, G.; Wang, Y. Dual memory scale network for multi-step time series forecasting in thermal environment of aquaculture facility: A case study of recirculating aquaculture water temperature. Expert Syst. Appl. 2022, 208, 118218. [Google Scholar] [CrossRef]
Figure 1. The flowchart diagram of the research design.
Figure 2. L-TCN structure.
Figure 3. Dilated causal convolution. (a) TCN dilated causal convolution structure; (b) L-TCN dilated causal convolution structure.
Figure 4. LR-TCN structure.
Figure 5. Flowchart of the GRU-LR-TCN integrated predicting model.
Figure 6. Pearson correlation between features related to PM2.5 in monitoring station 1001.
Figure 7. Results of PM2.5 concentration estimation for the next hour: (a) estimated results at monitoring station 1001; (b) estimated results at monitoring station 1002; (c) estimated results at monitoring station 1003; (d) estimated results at monitoring station 1023.
Table 2. Ablation experiments for regression estimation of PM2.5 concentration in monitoring station 1001.

| Model | RMSE | MSE | MAE | SMAPE | NAE | R2 | IA |
|---|---|---|---|---|---|---|---|
| Original TCN[1,2,4,8,Re] | 18.787 | 352.977 | 13.067 | 23.342 | 0.042 | 0.862 | 0.959 |
| TCN[1,2,4,8,Re] | 18.773 | 352.442 | 12.887 | 22.753 | 0.041 | 0.862 | 0.962 |
| TCN[1,2,4,8,Lr] | 17.686 | 312.809 | 11.401 | 20.356 | 0.036 | 0.878 | 0.966 |
| TCN[Re] | 17.929 | 321.462 | 11.917 | 21.541 | 0.038 | 0.874 | 0.966 |
| TCN[Lr] | 17.298 | 299.246 | 11.221 | 20.059 | 0.036 | 0.883 | 0.967 |
| CNN-TCN[1,2,4,8,Re] | 16.433 | 270.064 | 10.680 | 19.542 | 0.034 | 0.894 | 0.970 |
| CNN-TCN[1,2,4,8,Lr] | 16.529 | 273.223 | 10.695 | 19.596 | 0.034 | 0.893 | 0.970 |
| CNN-TCN[Re] | 16.504 | 272.414 | 10.762 | 20.174 | 0.034 | 0.893 | 0.971 |
| CNN-TCN[Lr] | 16.306 | 265.900 | 10.593 | 20.440 | 0.034 | 0.896 | 0.972 |
Table 3. Comparison experiments of regression estimation monolithic models for PM2.5 concentration in monitoring station 1001.

| Model | RMSE | MSE | MAE | SMAPE | NAE | R2 | IA | Time (s) |
|---|---|---|---|---|---|---|---|---|
| LSTM | 18.719 | 350.411 | 11.941 | 26.959 | 0.036 | 0.863 | 0.963 | 21.46 |
| GRU | 17.090 | 292.075 | 10.678 | 20.628 | 0.034 | 0.886 | 0.970 | 24.10 |
| SRU | 17.223 | 296.632 | 10.543 | 20.720 | 0.033 | 0.884 | 0.969 | 23.29 |
| TCN | 18.787 | 352.977 | 13.067 | 23.342 | 0.042 | 0.862 | 0.959 | 64.66 |
| Gaussian-TCN | 17.592 | 309.499 | 11.427 | 20.874 | 0.036 | 0.879 | 0.969 | 80.38 |
| GL-TCN | 17.349 | 300.988 | 11.361 | 21.106 | 0.036 | 0.882 | 0.968 | 67.69 |
| DD-TCN | 17.621 | 310.518 | 11.222 | 19.906 | 0.036 | 0.879 | 0.969 | 108.34 |
| D-TCN | 17.675 | 312.411 | 11.2111 | 19.921 | 0.036 | 0.878 | 0.968 | 80.01 |
| LR-TCN | 16.306 | 265.900 | 10.593 | 20.440 | 0.034 | 0.896 | 0.972 | 55.62 |
Table 4. Comparison experiments of estimation integrated models for PM2.5 concentration in monitoring station 1001.

| Model | RMSE | MSE | MAE | SMAPE | NAE | R2 | IA | Time (s) |
|---|---|---|---|---|---|---|---|---|
| LSTM-TCN | 17.951 | 322.265 | 11.771 | 21.864 | 0.037 | 0.874 | 0.964 | 97.09 |
| GRU-TCN | 16.862 | 284.340 | 10.561 | 19.192 | 0.033 | 0.889 | 0.969 | 91.47 |
| SRU-TCN | 17.364 | 301.542 | 11.475 | 21.171 | 0.036 | 0.882 | 0.967 | 90.09 |
| LSTM-Gaussian-TCN | 18.019 | 324.693 | 11.723 | 21.881 | 0.037 | 0.873 | 0.967 | 116.38 |
| GRU-Gaussian-TCN | 16.954 | 287.445 | 10.620 | 20.046 | 0.034 | 0.888 | 0.970 | 117.46 |
| SRU-GaussianTCN | 17.144 | 293.943 | 10.869 | 20.533 | 0.034 | 0.885 | 0.970 | 104.47 |
| LSTM-GL-TCN | 17.855 | 318.835 | 11.477 | 21.693 | 0.036 | 0.875 | 0.966 | 95.38 |
| GRU-GL-TCN | 17.038 | 290.294 | 10.724 | 20.514 | 0.034 | 0.887 | 0.969 | 95.49 |
| SRU-GL-TCN | 16.848 | 283.867 | 10.626 | 19.737 | 0.034 | 0.889 | 0.970 | 92.94 |
| LSTM-DD-TCN | 17.351 | 301.066 | 10.921 | 20.597 | 0.035 | 0.882 | 0.969 | 143.65 |
| DDTCN-GRU | 17.040 | 290.375 | 10.704 | 20.480 | 0.034 | 0.886 | 0.970 | 144.91 |
| SRU-DD-TCN | 17.119 | 293.062 | 10.719 | 19.613 | 0.034 | 0.885 | 0.970 | 133.45 |
| LSTM-D-TCN | 17.754 | 315.232 | 11.495 | 21.512 | 0.036 | 0.877 | 0.967 | 112.55 |
| GRU-D-TCN | 17.333 | 300.458 | 10.822 | 20.069 | 0.034 | 0.883 | 0.969 | 112.46 |
| SRU-D-TCN | 17.063 | 291.179 | 10.573 | 19.508 | 0.034 | 0.886 | 0.970 | 100.72 |
| LSTM-LR-TCN | 17.001 | 289.021 | 10.816 | 20.184 | 0.034 | 0.887 | 0.970 | 84.68 |
| GRU-LR-TCN | 16.261 | 264.444 | 10.138 | 18.978 | 0.032 | 0.897 | 0.972 | 78.54 |
| SRU-LR-TCN | 16.369 | 267.969 | 10.348 | 19.476 | 0.033 | 0.897 | 0.895 | 80.01 |
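Table 5. Experiment on the generalization of regression estimation of PM2.5 concentration.

| Station | Network | RMSE | MSE | MAE | SMAPE | NAE | R2 | IA | Time (s) |
|---|---|---|---|---|---|---|---|---|---|
| 1002 | LSTM | 17.100 | 292.438 | 11.141 | 29.090 | 0.049 | 0.874 | 0.966 | 25.19 |
| 1002 | GRU | 15.053 | 226.618 | 9.092 | 20.795 | 0.039 | 0.903 | 0.975 | 23.71 |
| 1002 | SRU | 15.236 | 232.158 | 9.184 | 21.066 | 0.041 | 0.900 | 0.974 | 24.49 |
| 1002 | TCN | 15.577 | 242.661 | 9.626 | 20.725 | 0.043 | 0.896 | 0.973 | 62.88 |
| 1002 | Gaussian-TCN | 15.370 | 236.251 | 9.149 | 20.088 | 0.039 | 0.898 | 0.973 | 90.53 |
| 1002 | GL-TCN | 16.172 | 261.561 | 9.955 | 20.791 | 0.042 | 0.888 | 0.972 | 68.26 |
| 1002 | DD-TCN | 15.752 | 248.132 | 9.488 | 20.085 | 0.042 | 0.893 | 0.973 | 119.98 |
| 1002 | D-TCN | 16.111 | 259.579 | 10.736 | 23.352 | 0.047 | 0.888 | 0.970 | 84.22 |
| 1002 | ST-TCN | 18.352 | 341.812 | 12.281 | 35.152 | 0.285 | 0.668 | 0.913 | 67.38 |
| 1002 | DMSnet | 16.622 | 280.291 | 11.371 | 28.682 | 0.210 | 0.824 | 0.957 | 17391 |
| 1002 | LR-TCN | 15.454 | 238.839 | 9.637 | 20.862 | 0.042 | 0.897 | 0.974 | 59.99 |
| 1002 | GRU-LR-TCN | 14.837 | 220.158 | 9.024 | 19.813 | 0.039 | 0.905 | 0.975 | 86.24 |
| 1003 | LSTM | 19.620 | 384.953 | 11.154 | 20.505 | 0.037 | 0.842 | 0.957 | 21.64 |
| 1003 | GRU | 18.934 | 358.513 | 10.625 | 17.873 | 0.035 | 0.853 | 0.961 | 22.77 |
| 1003 | SRU | 18.919 | 357.954 | 10.601 | 18.075 | 0.035 | 0.853 | 0.961 | 19.83 |
| 1003 | TCN | 18.822 | 354.279 | 10.820 | 17.985 | 0.036 | 0.855 | 0.959 | 70.24 |
| 1003 | Gaussian-TCN | 19.096 | 364.671 | 10.901 | 17.524 | 0.036 | 0.850 | 0.961 | 81.81 |
| 1003 | GL-TCN | 18.834 | 354.727 | 10.503 | 17.250 | 0.035 | 0.855 | 0.962 | 68.13 |
| 1003 | DD-TCN | 18.857 | 355.611 | 10.609 | 16.955 | 0.035 | 0.854 | 0.961 | 119.54 |
| 1003 | D-TCN | 19.049 | 362.886 | 11.132 | 17.706 | 0.037 | 0.851 | 0.960 | 77.66 |
| 1003 | ST-TCN | 19.643 | 390.855 | 18.281 | 20.117 | 0.233 | 0.721 | 0.902 | 65.41 |
| 1003 | DMSnet | 18.845 | 360.126 | 13.345 | 19.091 | 0.190 | 0.736 | 0.932 | 16974 |
| 1003 | LR-TCN | 18.695 | 349.530 | 10.556 | 16.162 | 0.035 | 0.857 | 0.962 | 57.34 |
| 1003 | GRU-LR-TCN | 18.746 | 351.422 | 10.530 | 16.417 | 0.035 | 0.856 | 0.962 | 85.42 |
| 1023 | LSTM | 18.692 | 349.421 | 11.534 | 34.034 | 0.042 | 0.867 | 0.965 | 22.04 |
| 1023 | GRU | 17.832 | 317.985 | 10.132 | 19.563 | 0.039 | 0.879 | 0.968 | 24.32 |
| 1023 | SRU | 17.830 | 317.910 | 10.531 | 19.585 | 0.041 | 0.879 | 0.968 | 21.09 |
| 1023 | TCN | 18.586 | 345.452 | 11.003 | 20.384 | 0.035 | 0.869 | 0.964 | 63.52 |
| 1023 | Gaussian-TCN | 18.607 | 346.228 | 11.003 | 23.537 | 0.037 | 0.868 | 0.966 | 88.23 |
| 1023 | GL-TCN | 18.317 | 335.532 | 11.343 | 19.693 | 0.043 | 0.872 | 0.967 | 69.91 |
| 1023 | DD-TCN | 18.067 | 326.416 | 10.741 | 20.532 | 0.041 | 0.876 | 0.968 | 119.86 |
| 1023 | D-TCN | 18.130 | 328.718 | 10.910 | 20.068 | 0.041 | 0.875 | 0.967 | 88.71 |
| 1023 | ST-TCN | 19.628 | 411.201 | 15.056 | 29.230 | 0.306 | 0.665 | 0.885 | 63.72 |
| 1023 | DMSnet | 18.662 | 358.742 | 12.469 | 26.936 | 0.194 | 0.801 | 0.943 | 16810 |
| 1023 | LR-TCN | 18.200 | 331.270 | 10.980 | 18.917 | 0.040 | 0.874 | 0.968 | 54.65 |
| 1023 | GRU-LR-TCN | 17.878 | 319.646 | 10.630 | 21.768 | 0.041 | 0.878 | 0.969 | 83.10 |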
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Hu, J.; Jia, Y.; Jia, Z.-H.; He, C.-B.; Shi, F.; Huang, X.-H. Prediction of PM2.5 Concentration Based on Deep Learning for High-Dimensional Time Series. Appl. Sci. 2024, 14, 8745. https://doi.org/10.3390/app14198745
