Abstract
Sea ice plays a pivotal role in ocean-related research, necessitating the development of highly accurate and robust techniques for its extraction from diverse satellite remote sensing imagery. However, conventional learning methods face limitations due to the soaring cost and time associated with manually collecting sufficient sea ice data for model training. This paper introduces an innovative approach where Neural Dynamics (ND) algorithms are seamlessly integrated with a recurrent neural network, resulting in a Transfer-Learning-Like Neural Dynamics (TLLND) algorithm specifically tailored for sea ice extraction. Firstly, given the susceptibility of the image extraction process to noise in practical scenarios, an ND algorithm with noise tolerance and high extraction accuracy is proposed to address this challenge. Secondly, the internal coefficients of the ND algorithm are determined using a parametric method. Subsequently, the ND algorithm is formulated as a decoupled dynamical system. This enables the coefficients trained on a linear equation problem dataset to be directly generalized to solve the sea ice extraction challenges. Theoretical analysis ensures that the effectiveness of the proposed TLLND algorithm remains unaffected by the specific characteristics of various datasets. To validate its efficacy, robustness, and generalization performance, several comparative experiments are conducted using diverse Arctic sea ice satellite imagery with varying levels of noise. The outcomes of these experiments affirm the competence of the proposed TLLND algorithm in addressing the complexities associated with sea ice extraction.
1 Introduction
The escalating challenges posed by global warming and the subsequent melting of sea ice have put the resource-rich polar regions into the global spotlight. Specifically, the opening of Arctic sea routes could potentially alter the worldwide navigation landscape, underlining the growing strategic significance of this region [1,2,3]. Furthermore, the Arctic’s integral role in Earth’s ecology as a natural cold source is vital, as it influences ecological equilibrium, energy transfer, and climate change [4]. Consequently, changes in the motion, melting, area, and thickness of Arctic sea ice directly impact the energy transfer and balance between the Arctic and neighboring oceans. Therefore, real-time dynamic monitoring and observation of Arctic sea ice are of great importance for the exploitation of Arctic resources, maintaining the balance of the ecosystem, and predicting the world’s extreme climate.
With the improvement of the quality and accuracy of high-resolution satellite images, remote sensing has become one of the most effective means of monitoring sea ice conditions [5]. Sea ice identification is an important part of sea ice monitoring using remote sensing imagery, but noise disturbances cannot be ignored in the practical application of remote sensing imagery in real scenarios [6]. In recent years, SOTA methods for the extraction of sea ice can be divided into the following three categories: observation-based methods, scattering-coefficients-based methods, and neural-networks-based methods.
Observation-based methods rely on observations of sea ice conditions to identify and extract sea ice from remote sensing imagery. These methods often use manual or semi-automatic schemes for sea ice identification, such as threshold and edge detection [7]. Scattering-coefficients-based methods use the reflected signals from satellite scatterometers to distinguish sea ice from seawater [8]. Neural-networks-based methods apply machine learning techniques to analyse and identify sea ice types and distributions. These methods can use satellite remote sensing data, such as synthetic aperture radar imagery, to extract sea ice characteristics and parameters such as concentration, thickness, shape, and movement [9].
Although these three major methods and other related investigations for sea ice extraction have demonstrated constructive and promising results in methodology and practical applications, they are not without limitations. For example, the observation-based methods rely on intuition or experience to build a model and therefore lack rationality and generalization, which limits their broader application. The scattering-coefficients-based methods depend heavily on professional knowledge in remote sensing information processing and on accumulated technique in remote image acquisition, which also limits their wider application. The neural-networks-based methods are competent for various complex sea ice extraction tasks; however, their performance is constrained by the quantity and quality of the dataset, and it can be expensive to collect enough sea ice data manually to build a specialized model. Therefore, it is necessary to construct a new method that combines the high performance of learning methods with the convenience of traditional numerical methods.
In the 1990s, Zhu et al. presented a theory of utilizing learning methods to learn partial differential equations for image noise processing [10]. The combination of learning methods and numerical methods had achieved positive results before deep learning became popular [11, 12]. Subsequently, researchers discussed the idea of modelling high dimensional non-linear functions with dynamical systems, i.e. the relationship between differential equations and deep learning. In [13], Weinan E demonstrated that deep neural networks can be viewed as discrete dynamical systems. A well-known example is that each residual block of ResNets can be considered as \({X}_{k + 1} - {X_k}=f(X_k)\), where the number of iterations k can be interpreted as the artificial time [14]. This recurrence is the forward Euler discretization, with unit step size, of the differential equation \(\dot{X}=f(X)\). Inspired by the connection between deep learning and dynamic systems, many meaningful works have been carried out in this field in recent years. Examples include the structural design of deep learning models inspired by the connection between network structures and numerical differential equations [15], the design of optimizers that use numerical methods to improve performance [16], and the transfer of optimal control to deep learning to improve its stability [17]. The above works all exploit the advantages and characteristics of numerical methods to improve the performance of deep learning models; however, work that uses the advantages of deep learning methods to improve the performance of dynamical system models is rarely seen.
Humans have a remarkable ability to generalize from past experiences and learn new tasks quickly, which is essential for survival in a complex and dynamic world [18]. However, current deep learning models rely heavily on large-scale supervised data, which limits their generalization performance and makes them vulnerable to data quality issues [19]. In contrast, human beings can learn from a few examples and transfer their knowledge across domains and tasks. This is especially important for applications such as sea ice extraction, where collecting enough labelled data to train a specialized model can be both costly and time-consuming, as sea ice data collection is challenging and requires specialized equipment and expertise [20]. Therefore, developing an algorithm with transfer and generalization capabilities is very meaningful. Such an algorithm can not only adapt and optimize in different domains and tasks, but can also learn and reason from a small amount of data.
The purpose of a deep learning algorithm is to learn a function that maps the input (i.e., the set of independent variables X) to the output (i.e., the target variable Y), for example, \(Y=z(X)+\epsilon \). To estimate the unknown function z(X), a model should be fitted with appropriate data. The form of z(X) is usually unknown, so it may not be possible to obtain it without fitting different models or making some assumptions about its form. The approach of assuming a form for z(X) and then using a suitable dataset to train the model and estimate the coefficients of z(X) through the learning process is called a parametric method [21]. For example, assume that the unknown function z(X) is linear, i.e. \(z(X)={\zeta }_0+{\zeta }_1{X}_1+...+{\zeta }_n{X}_n\), in which \({\zeta }_i(i=0,1,\dots ,n)\) are the coefficients to be learned, n is the number of independent variables, and X is the input. Based on an assumption about the form of the function and the selection of a model that fits the assumption, the learning process trains the model and estimates the coefficients.
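The linear example above can be sketched in a few lines. This is only an illustration under stated assumptions: the data are synthetic, ordinary least squares stands in for the learning process, and the names `fit_linear_coefficients`, `true_zeta`, and the noise level are hypothetical rather than taken from the paper.

```python
import numpy as np

def fit_linear_coefficients(X, Y):
    """Estimate zeta_0..zeta_n of Y ~ zeta_0 + sum_i zeta_i * X_i by least squares."""
    design = np.hstack([np.ones((X.shape[0], 1)), X])  # column of ones carries zeta_0
    zeta, *_ = np.linalg.lstsq(design, Y, rcond=None)
    return zeta

rng = np.random.default_rng(0)
true_zeta = np.array([0.5, 2.0, -1.0, 3.0])   # hypothetical ground-truth coefficients
X = rng.normal(size=(200, 3))                 # 200 samples, n = 3 independent variables
Y = true_zeta[0] + X @ true_zeta[1:] + 0.01 * rng.normal(size=200)  # Y = z(X) + eps
zeta_hat = fit_linear_coefficients(X, Y)
```

Because the assumed form has only n + 1 coefficients, a modest number of samples is enough to recover them, which is the low data requirement that motivates the parametric approach.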
It is feasible to estimate the coefficients of the ND algorithm using parametric methods. Parametric methods have the advantage of high computational performance and low training data requirements. The ND approaches have the adaptability to solve dynamical system problems with both the robustness and generalizability guaranteed by theoretical analyses. The combination of these two methods paves a new way to solve specific problems.
This paper designs and analyses a transfer-learning-like neural dynamics (TLLND) algorithm for aiding the constrained energy minimization (CEM) scheme. An ND algorithm for solving the CEM scheme with learnable coefficients is proposed, and then parameters of the TLLND algorithm are determined by a parametric method with the structure diagram shown in Fig. 1. To ensure the validity of the proposed TLLND algorithm, theoretical analyses are provided. Finally, comparative experiments are conducted, which show that the TLLND algorithm has better efficiency and robustness compared to SOTA methods. The rest of this paper is divided into four sections. Section 2 introduces the CEM scheme and parametric methods. Section 3 presents theoretical analyses on the convergence and stability of the proposed TLLND algorithm. In Sect. 4, comparative experiments of Arctic sea ice extraction are presented. Section 5 concludes this paper.
2 Preliminary and TLLND Algorithm
In this section, the CEM scheme formulations and the TLLND algorithm are provided.
2.1 Base Scheme
It should be noted that the extraction of sea ice from the Arctic needs to be supported by remote sensing satellite imagery. CEM schemes are presented to detect the targets in images, and the main principle is to suppress background information so that the target is highlighted. The CEM scheme does not require the background information of the image and can adaptively detect the spectral information of the target. More specifically, it determines a finite impulse response (FIR) filter \(\varvec{\omega }(t) \in \mathbb {R}^{r}\) from the original image data and targets, and allows the original image to pass through the filter to obtain detection results, where r is the number of bands in the original image. The CEM scheme can be derived by solving the following linear constrained optimization problem:
$$\begin{aligned} \min _{\varvec{\omega }(t)}\ \varvec{\omega }^{\textrm{T}}(t){R}\varvec{\omega }(t) \quad \text {s.t.}\quad \varvec{\omega }^{\textrm{T}}(t)\varvec{v}={I}, \end{aligned}$$(1)
where \(\varvec{v}\in \mathbb {R}^{r\times 1}\) denotes the light spectrum vector of sea ice, the symmetric matrix \({R} \in \mathbb {R}^{r\times r}\) represents the self-correlation matrix of the original sea ice image, \(I\in \mathbb {R}^{1\times 1}\) denotes an identity matrix, and \(^{\textrm{T}}\) represents the transpose of a matrix.
In this work, the above CEM scheme for sea ice extraction is computed with the aid of the TLLND algorithm. Next, a Lagrange-multiplier vector is utilized to solve CEM problem (1). The related Lagrange function is \( \vartheta (\varvec{\omega }(t), \zeta (t), t)=\varvec{\omega }^{\textrm{T}}(t){R}\varvec{\omega }(t)-\zeta (t)^{\textrm{T}}(\varvec{\omega }^{\textrm{T}}(t)\varvec{v}-{I}) , \) where \(\zeta (t)\in {\mathbb {R}^{ r}}\) denotes the Lagrange-multiplier vector.
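For reference, setting the partial derivatives of the Lagrange function to zero gives the stationarity conditions on which the subsequent rearrangement relies. This is a sketch of the standard step under the sign convention of the Lagrange function above, using the symmetry of R and treating the multiplier of the single constraint as a scalar; it is not copied from the original derivation.

$$\begin{aligned} \frac{\partial \vartheta }{\partial \varvec{\omega }(t)}&=2{R}\varvec{\omega }(t)-\zeta (t)\varvec{v}=\varvec{0},\\ \frac{\partial \vartheta }{\partial \zeta (t)}&=-\big (\varvec{\omega }^{\textrm{T}}(t)\varvec{v}-{I}\big )=0. \end{aligned}$$

The two conditions are linear in \(\varvec{\omega }(t)\) and \(\zeta (t)\), which is why the constrained problem can be stacked into a single linear equation.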
Let
At last, (1) can be rearranged as a time-dependent linear equation form:
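As a minimal sketch of how such a linear system can be assembled and solved, the snippet builds a block matrix from the stationarity conditions of the Lagrange function and compares the result with the well-known closed-form CEM filter. The block layout, the function name `cem_filter`, the synthetic 4-band data, and the target spectrum are all assumptions made for illustration; the paper's own definitions may differ in sign or ordering.

```python
import numpy as np

def cem_filter(pixels, target):
    """Solve min w^T R w  s.t.  w^T v = 1 through the stationarity linear system.

    pixels: (r, N) array with one column per pixel spectrum; target: (r,) spectrum v.
    """
    r, n_pixels = pixels.shape
    R = pixels @ pixels.T / n_pixels          # sample autocorrelation matrix
    # Assumed layout: W [w; zeta] = g, from 2 R w - zeta v = 0 and v^T w = 1
    W = np.zeros((r + 1, r + 1))
    W[:r, :r] = 2.0 * R
    W[:r, r] = -target
    W[r, :r] = target
    g = np.zeros(r + 1)
    g[r] = 1.0
    solution = np.linalg.solve(W, g)
    return solution[:r], R

rng = np.random.default_rng(1)
pixels = rng.uniform(0.0, 1.0, size=(4, 500))  # synthetic 4-band image, 500 pixels
target = np.array([0.9, 0.8, 0.7, 0.6])        # hypothetical sea ice spectrum
w, R = cem_filter(pixels, target)
R_inv_v = np.linalg.solve(R, target)
closed_form = R_inv_v / (target @ R_inv_v)     # classical CEM solution
response = w @ pixels                          # filter output for every pixel
```

Pixels whose spectrum resembles the target give a response near 1, while the minimized output energy suppresses the background.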
2.2 Continuous and Discrete ND Models
To solve the CEM scheme, the following loss function is first defined:
The integration-enhanced continuous ND algorithm is shown as follows [22]:
where \(\varrho >0\) and \(\lambda >0\). A continuous ND model for solving the CEM scheme is obtained by substituting (3) into (4)
where \( W^{\dagger }(t)\) represents the pseudoinverse of W(t).
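To illustrate why an integral term helps against noise, the following sketch simulates a scalar error governed by a proportional-plus-integral design rule, de/dt = -rho*e - lam*integral(e), with a constant disturbance added. This form is an assumption based on the usual integration-enhanced design and is not a transcription of (4); the gains and the disturbance are arbitrary illustrative values.

```python
def simulate_error(rho, lam, bias, e0=1.0, dt=1e-3, t_end=10.0):
    """Forward-Euler simulation of de/dt = -rho*e - lam*integral(e) + bias."""
    e, integral = e0, 0.0
    for _ in range(int(t_end / dt)):
        de = -rho * e - lam * integral + bias
        integral += dt * e      # running integral of the error
        e += dt * de
    return e

final_error = simulate_error(rho=8.0, lam=16.0, bias=5.0)        # with integral term
proportional_only = simulate_error(rho=8.0, lam=0.0, bias=5.0)   # without integral term
```

With the integral term the error decays to zero despite the constant disturbance, whereas the proportional-only rule settles at bias/rho.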
Since very few continuous models admit analytical solutions, numerical solutions are usually required, for example via Runge-Kutta methods. These methods can approximate the solutions of differential equations by iterative calculations, thus simulating complex dynamical systems. In view of this, we need a discretization method to approximate the future \(\textbf{v}_{k+1}\) with the available information at time moment \(t_{k}\), where k means the k-th iteration. The form of the three-order discrete formula for discretizing (5) is shown as follows:
where \(\alpha \in \mathbb {R}, \beta \in \mathbb {R}\), and \(\gamma \in \mathbb {R}\) denote coefficients of the discrete formula.
2.3 Dynamical System and Recurrent Neural Network
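In general, a model of an ordinary differential equation (ODE) can be written in the form of a dynamical system as follows:
$$\begin{aligned} \dot{\varvec{x}}(t)=f(\varvec{x}(t),t). \end{aligned}$$(7)
There are many numerical methods for solving ODEs, such as the Euler method, the Runge–Kutta methods, and the Adams methods. The Euler iteration formula is presented here:
$$\begin{aligned} \varvec{x}(t+h)=\varvec{x}(t)+hf(\varvec{x}(t),t), \end{aligned}$$(8)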
where h represents the step size. The idea of the Euler method is simple, which is to use \((\varvec{x}(t+h)-\varvec{x}(t))/{h}\) to approximate the derivative term \(\dot{\varvec{x}}(t)\). As long as the initial condition \(\varvec{x}(0)\) is given, we can iteratively calculate the result of each time point according to (8).
In short, as long as the model takes \((\varvec{x}_1,\varvec{x}_2,...,\varvec{x}_n)\) as input, \((\varvec{y}_1,\varvec{y}_2,...,\varvec{y}_k)\) as output, and satisfies the following recurrence relation
it can be called a recurrent neural network (RNN).
As mentioned above, the number of iterations k of a deep learning model can be considered as the artificial time t [14]. Comparing Eqs. (8) and (9), t is a real-valued variable in (8) and an integer variable in (9); otherwise there is no significant difference between the two. In fact, in (8) we can treat h as the unit of time and let \(t = nh\), and then (8) becomes
where the time variable n is an integer. In this way, we know that the discrete dynamical system (8) is in fact just a special case of an RNN, and the same holds for the discrete form of (5).
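The correspondence can be made concrete with a small sketch: one Euler step is written as a cell that is reused at every integer time index, and unrolling the cell reproduces the Euler solution. The test equation dx/dt = -x and the names `euler_cell` and `unroll` are illustrative choices, not part of the paper.

```python
import math

def euler_cell(x, t, h, f):
    """One recurrent step: the same map is applied at every integer index n."""
    return x + h * f(x, t)

def unroll(x0, h, n_steps, f):
    """Unrolling the cell n_steps times is forward Euler on dx/dt = f(x, t)."""
    x, states = x0, [x0]
    for n in range(n_steps):
        x = euler_cell(x, n * h, h, f)   # t = n*h, with n acting as the RNN time index
        states.append(x)
    return states

states = unroll(x0=1.0, h=0.001, n_steps=1000, f=lambda x, t: -x)
exact = math.exp(-1.0)                   # exact solution of dx/dt = -x at t = 1
```

The final state agrees with the exact solution up to the first-order discretization error of the Euler method.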
2.4 Parametric Method
Deep neural networks and discrete dynamical systems have some similarities and connections. In this section, we propose a parametric method using the RNN to estimate the coefficients inside of a discrete dynamical system, which is in the form of (6). The network structure of the RNN can be designed according to the ND model for the solution of CEM schemes as follows:
where \(\alpha \) and \(\beta \) are trainable coefficients shared within each layer. During the training process, the learnable coefficients are continuously updated and the TLLND algorithm for solving CEM scheme (1) is obtained after the training phase. A schematic of the RNN training architecture is presented in Fig. 1. In the training phase of the RNN, the predicted value from the previous time step is fed back into the network as input at each subsequent time step, and the output is used to predict the value for the next time step. The true value for each time step is also provided as the label for the network to learn from and adjust its parameters to minimize the difference between the predicted values and the true values. Since we are using parametric methods to try to obtain a model of a dynamical system through deep learning, this differs from the standard dynamical system model inference and standard deep learning procedure, so some elements in the training process need to be addressed.
2.4.1 Supervisory Label
We can view the problem that the parametric method solves as a time-series prediction problem, where future information is predicted based on known information. During the training process, true values of future moments are used as the supervisory label.
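A minimal sketch of how such labels can be formed from a sampled trajectory is given here. The window length of 3 and the sine trajectory are hypothetical; the paper does not specify how the sequences are sliced.

```python
import numpy as np

def make_training_pairs(sequence, window):
    """Split a trajectory into (past window, next value) pairs for one-step prediction."""
    inputs = np.stack([sequence[i:i + window] for i in range(len(sequence) - window)])
    labels = sequence[window:]            # true future values act as supervisory labels
    return inputs, labels

trajectory = np.sin(0.1 * np.arange(50))  # hypothetical sampled solution
inputs, labels = make_training_pairs(trajectory, window=3)
```

Each row of `inputs` holds the three most recent known values, and the matching entry of `labels` is the value the model is asked to predict.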
2.4.2 Dataset
Now that the problem (1) has been transformed into a system of linear equations problem, we can generate the training data. To generate the training data, we first determine the coefficient matrix and constant term vector of the system of linear equations. We can then use any solution method, such as Gaussian elimination or iterative methods, to obtain the solution vectors for the system of linear equations. This gives us a set of training data, including the input (coefficient matrix and constant term vector) and the output (solution vector). We can repeat this process several times to generate different training data. Finally, we can use this training data to train and evaluate deep learning models.
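The data generation procedure described above can be sketched as follows. The sampling distribution, the diagonal shift that keeps every system well conditioned, and the function name `generate_linear_systems` are assumptions made for illustration.

```python
import numpy as np

def generate_linear_systems(n_samples, dim, seed=0):
    """Create random well-conditioned systems W v = g together with their solutions."""
    rng = np.random.default_rng(seed)
    dataset = []
    for _ in range(n_samples):
        A = rng.normal(size=(dim, dim))
        W = A @ A.T + dim * np.eye(dim)   # symmetric positive definite, hence solvable
        g = rng.normal(size=dim)
        v = np.linalg.solve(W, g)         # output: solution vector used as the label
        dataset.append((W, g, v))
    return dataset

dataset = generate_linear_systems(n_samples=100, dim=5)
```

Since the samples are synthetic, the dataset can be made as large as needed without any manual collection of sea ice data.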
2.4.3 Coefficients Initialization
In the back-propagation of the RNN model, the initialization should be done carefully and the learning rate should be chosen carefully to avoid the issue of gradient explosion. To prevent the solution from becoming divergent, we set the initial coefficients to be zero stable. Zero stability can effectively improve the convergence speed and generalization ability of the model, because it can reduce the model’s sensitivity to noise and disturbance, thereby improving the model’s robustness and reliability [23]. In this paper, we will introduce a training method for the RNN model based on zero-stable initialization, and the characteristic equation of Eq. (10) is
To ensure zero stability at the initialization stage, we apply the root condition for Eq. (6) to coefficients \(\alpha \) and \(\beta \). During training, we use a low learning rate of 0.001 and train for 200 epochs to achieve optimal performance.
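The root condition can be checked numerically, as in the sketch that follows. It assumes a formula of the generic form v_{k+1} = c0*v_k + c1*v_{k-1} + c2*v_{k-2} + (step-size terms); the coefficient values are inferred from the constants a, b, c, and d reported in Sect. 3.2 and are used only as an illustration, not as the paper's stated initialization.

```python
import numpy as np

def satisfies_root_condition(coeffs, tol=1e-6):
    """Root condition for v_{k+1} = c0*v_k + c1*v_{k-1} + c2*v_{k-2} + ...

    All roots of z^3 - c0*z^2 - c1*z - c2 must lie in the closed unit disc,
    and any root on the unit circle must be simple.
    """
    roots = np.roots([1.0, *(-np.asarray(coeffs))])
    moduli = np.abs(roots)
    if np.any(moduli > 1.0 + tol):
        return False
    on_circle = roots[np.abs(moduli - 1.0) <= tol]
    # A repeated root on the circle shows up as two nearly identical entries
    for i in range(len(on_circle)):
        for j in range(i + 1, len(on_circle)):
            if abs(on_circle[i] - on_circle[j]) <= 1e-3:
                return False
    return True

# Illustrative coefficients (assumption): inferred from the constants in Sect. 3.2
stable = satisfies_root_condition([1.4391, -0.8782, 0.4391])
unstable = satisfies_root_condition([3.0, -3.0, 1.0])   # (z - 1)^3, triple root at 1
```

For the first coefficient set the polynomial factors as (z - 1)(z^2 - 0.4391z + 0.4391), so the only root on the unit circle is the simple root z = 1.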
One of the challenges of solving a three-order ODE is that there are countless sets of valid coefficients, which means that a different initial value always leads to a different set of valid coefficients. This makes it difficult to find a general solution that works for any initial value. However, one possible approach is to use an RNN to learn the coefficients from data. By training the RNN on a sequence of initial values and their corresponding solutions, we can obtain a set of coefficients that approximate the ODE well. Substituting Eq. (6) with the coefficients trained by the RNN leads to a discretization equation that can be used to predict future values of the ODE given any initial value:
Substituting the parametric discrete Eq. (12) into (5) leads to
where \({h_1}=\varrho \tau \) and \({h_2}=\lambda \tau ^{2}\) are parameters of the discrete-time system of linear equations. The proposed TLLND algorithm (13) is obtained by embedding the coefficients into the ND algorithm after the training phase of the parametric method.
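As a rough sketch of what an iteration of this kind looks like on a static problem, the snippet combines a multistep update with proportional and accumulated-residual feedback. It is not the paper's Eq. (13): W and g are held constant so that time-derivative terms vanish, and the multistep coefficients are inferred from the constants a, b, c, and d in Sect. 3.2. The function name and the test system are hypothetical.

```python
import numpy as np

def tllnd_like_solve(W, g, h1=0.8, h2=0.08, steps=400,
                     coeffs=(1.4391, -0.8782, 0.4391)):
    """Multistep iteration with proportional and accumulated-residual feedback.

    Assumptions: W and g are constant, so time-derivative terms drop out, and
    `coeffs` are inferred from the constants of the convergence analysis.
    """
    W_pinv = np.linalg.pinv(W)
    v_hist = [np.zeros_like(g)] * 3             # v_{k-2}, v_{k-1}, v_k
    residual_sum = np.zeros_like(g)
    for _ in range(steps):
        v_km2, v_km1, v_k = v_hist
        residual = W @ v_k - g                  # l_k = W v_k - g
        residual_sum = residual_sum + residual  # sum of l_i for i = 0..k
        v_next = (coeffs[0] * v_k + coeffs[1] * v_km1 + coeffs[2] * v_km2
                  - W_pinv @ (h1 * residual + h2 * residual_sum))
        v_hist = [v_km1, v_k, v_next]
    return v_hist[-1]

rng = np.random.default_rng(2)
A = rng.normal(size=(4, 4))
W = A @ A.T + 4.0 * np.eye(4)                   # hypothetical well-conditioned system
g = rng.normal(size=4)
v = tllnd_like_solve(W, g)
residual_norm = np.linalg.norm(W @ v - g)
```

With \(h_1=0.8\) and \(h_2=0.08\) the residual of this sketch shrinks geometrically, in line with the convergence behaviour analysed in Sect. 3.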
2.5 SOTA Algorithms
The SOTA algorithms for solving CEM scheme (1) are given in this subsection with the comparisons listed in Table 1.
1. The proportional-integral iterative (PII) algorithm [24] designed to solve (1) is constructed as
$$\begin{aligned} {\textbf{v}_{k + 1}}= {\textbf{v}_{k}}-W_{k}^{\dagger } (h_1 (W_k\textbf{v}_{k}-\textbf{g}_{k})+h_2\sum _{i=0}^{k} (W_i\textbf{v}_{i}-\textbf{g}_{i})), \end{aligned}$$(14)
2. The discrete-time zeroing neurodynamics (DTZN) model [25, 26] designed to solve (1) is given as
$$\begin{aligned} \begin{aligned} \textbf{v}_{k + 1} =&\frac{1}{2} {\textbf{v}_k}+\frac{1}{3} {\textbf{v}_{k-1}}+\frac{1}{6} {\textbf{v}_{k-2}}-\frac{5}{3} W_{k}^{\dagger }\bigg (\tau \dot{W}_{k}\textbf{v}_{k}\\&+h_1(W_k\textbf{v}_{k}-\textbf{g}_{k})-\tau \dot{\textbf{g}}_{k} \bigg ). \end{aligned} \end{aligned}$$(15)
3. In addition, the modified Newton integration (MNI) neural algorithm [27] is provided here for comparison:
$$\begin{aligned} {\textbf{v}_{k + 1}}= {\textbf{v}_{k}}-W_{k}^{\dagger } (W_k\textbf{v}_{k}-\textbf{g}_{k}+h_2\sum _{i=0}^{k}( W_i \textbf{v}_i-\textbf{g}_i)). \end{aligned}$$(16)
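The PII iteration (14) and the MNI iteration (16) share the same structure, since (16) is (14) with the proportional gain set to 1. A minimal sketch on a static system W v = g makes this explicit. The constant disturbance added to each update is an assumption about where noise enters, and the test system and function name are hypothetical.

```python
import numpy as np

def integral_iteration(W, g, h2, steps, h1=None, noise=0.0):
    """Run PII (14) when h1 is given, otherwise MNI (16), on a static system W v = g.

    `noise` is a constant disturbance added to every update (an assumption);
    the accumulated residual is what compensates for it.
    """
    gain = 1.0 if h1 is None else h1
    W_pinv = np.linalg.pinv(W)
    v = np.zeros_like(g)
    residual_sum = np.zeros_like(g)
    for _ in range(steps):
        residual = W @ v - g
        residual_sum = residual_sum + residual   # sum of residuals for i = 0..k
        v = v - W_pinv @ (gain * residual + h2 * residual_sum) + noise
    return v

rng = np.random.default_rng(3)
A = rng.normal(size=(4, 4))
W = A @ A.T + 4.0 * np.eye(4)                    # hypothetical well-conditioned system
g = rng.normal(size=4)
v_pii = integral_iteration(W, g, h2=0.08, steps=500, h1=0.8, noise=0.5)
v_mni = integral_iteration(W, g, h2=0.08, steps=500, noise=0.5)
pii_residual = np.linalg.norm(W @ v_pii - g)
mni_residual = np.linalg.norm(W @ v_mni - g)
```

In this static setting both iterations drive the residual to zero even with the constant disturbance, because the summation term absorbs it.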
3 Theoretical Analyses
In this section, we provide theoretical analyses on the convergence and stability of the proposed TLLND algorithm (13).
3.1 Nonlinear Transformation
By defining the error function and using the following theorem, we present a simple form of TLLND algorithm (13) to enable further analysis.
Theorem 1
From the error perspective, TLLND algorithm (13) is transformed into a linear decoupled dynamical system to solve CEM scheme (1) as follows:
where \(\varvec{l}_{k}=W_k\textbf{v}_{k}-\textbf{g}_{k} \) denotes the residual error and \(\varvec{o}({\tau ^2})\) denotes the vector of truncation errors.
Proof
The proposed TLLND algorithm (13) is written as
Expanding \(\textbf{v}_{k + 1}\), \({\textbf{v}_k}\), and \({\textbf{v}_{k - 1}}\) using the Taylor expansion results in the following equations:
Combining the above four equations, we can get
which is transformed into
The equation above can be simplified further as follows:
Since \(\varvec{l}_{k}=W_k\textbf{v}_{k}-\textbf{g}_{k} \), we have
Thus, (19) can be simplified as
Substituting \( \dot{\varvec{l}}_{k}\) with our parametric discrete Eq. (12) leads to
which can be further written as
At last, (20) can be further simplified as
The proof is thus completed.
3.2 Convergence Analyses
To demonstrate the convergence of the proposed algorithm, the following theorem is given.
Theorem 2
The proposed TLLND algorithm (13), used for solving CEM scheme (1), has a residual error \(\mathop {\lim }\limits _{k \rightarrow \infty } || {\varvec{l}_{k}}|{|_2}\) that globally converges to \({o}({\tau ^2})\).
Proof
The proposed TLLND algorithm (13) is converted into a linear system that includes a residual error term, as shown in Eq. (17). Let \(\varvec{l}_k^u\) be the \(u\text {th}\) element of \({\varvec{l}_{k}}\) with \(u \in \{1, \ldots , r+1\}\). Then, we get
and
When Eq. (22) is subtracted from Eq. (21), the resulting expression can be shown as follows:
Let \(\varvec{\sigma } _{k+1}^u = {[{{l}_{k + 1}^u} ;{{l}_{k }^u} ;{{l}_{k - 1}^u} ;{{l}_{k - 2}^u} ]}\), and then we get \(\varvec{\sigma } _{k}^u = {[{{l}_{k }^u} ;{{l}_{k-1 }^u} ;{{l}_{k - 2}^u} ;{{l}_{k - 3}^u} ]}\). Equation (23) can be described as the following equation:
where matrix D is defined as
with \(a=2.4391-{h_1}-{h_2}\); \(b={h_1}-2.3173\); \(c= 1.3173\); \(d=- 0.4391\). Expanding (24) leads to
The proposed TLLND algorithm (13) in this paper selects the parameters \({h_1=0.8}\) and \(h_2=0.08\), for which every eigenvalue of matrix D has a modulus less than 1, i.e., the spectral radius of D is less than 1. Consequently, we have \(\mathop {\lim }\limits _{k \rightarrow \infty } D^k=0\), which leads to \(\varvec{\sigma }_{k+1}^{u}={o}\left( \tau ^{2}\right) \). From this, we can derive the following equation:
Based on the definition of \(\varvec{\sigma } _{k+1}^u\), if the parameters \({h_1}\) and \({h_2}\) meet certain conditions, it can be concluded that the steady-state computation error \(\lim _{k \rightarrow \infty }\left\| \varvec{l} _{k+1}\right\| _{2}\) of TLLND algorithm (13) for solving CEM scheme (1) is \({o}({\tau ^2})\).
Theorem 2 proves that TLLND algorithm (13) for solving CEM scheme (1) converges globally to the theoretical solution. Research on the robust performance analyses of TLLND algorithm (13) is presented in the following.
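The spectral radius condition used in the proof of Theorem 2 can be checked numerically. The sketch assumes that D is the companion-form matrix implied by the definition of \(\varvec{\sigma } _{k+1}^u\), with a, b, c, and d in its first row; this layout is an assumption, as is the function name.

```python
import numpy as np

def error_transition_matrix(h1, h2):
    """Companion-form matrix built from a, b, c, d (the layout is an assumption)."""
    a = 2.4391 - h1 - h2
    b = h1 - 2.3173
    c = 1.3173
    d = -0.4391
    return np.array([[a, b, c, d],
                     [1.0, 0.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0, 0.0],
                     [0.0, 0.0, 1.0, 0.0]])

D = error_transition_matrix(h1=0.8, h2=0.08)
spectral_radius = np.max(np.abs(np.linalg.eigvals(D)))
power_norm = np.linalg.norm(np.linalg.matrix_power(D, 500))  # should be near zero
```

A spectral radius below 1 is exactly what makes the powers of D vanish, so the same check can be repeated for other choices of \(h_1\) and \(h_2\).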
3.3 Robustness Analyses
The proposed TLLND algorithm (13) contains a sum term that helps it resist noises. This results in robust performance, which is proven by the following theorem.
Theorem 3
For CEM scheme (1) solved by TLLND algorithm (13), its steady-state computation error \(\lim _{k \rightarrow \infty }\left\| \varvec{l} _{k}\right\| _{2}\) is bounded by
with \({{k}} \rightarrow \infty \) in an environment with noise \({\varvec{f}}_{k} = \varvec{a}k\tau + \varvec{b} +\varvec{\varpi }\), where \(\varvec{a}k\tau + \varvec{b} \) represents the linear noise, and \(\varvec{\varpi }\) represents other noises.
Proof
The proposed TLLND algorithm (13) is a linear system, and the steady-state computation error \(\lim _{k \rightarrow \infty }\left\| \varvec{l} _{k}\right\| _{2}\) in a noise-polluted environment can be split into three subsystems, including \({o}\left( \tau ^{2}\right) \) generated by TLLND algorithm (13) itself, \(\left\| \varvec{a} \tau / h_{2}\right\| _{2}\) caused by a linear noise, and \(2 (r+1) \sup _{1 \le n \le k, 1 \le u\le r+1 }\left| \varvec{\varpi }_{n}^{u}\right| /\left( 1-\Vert D\Vert _{2}\right) \) derived from other noises. The residual error of TLLND algorithm (13) under different noises can be analyzed from different perspectives.
3.3.1 Linear Noise
Equation (17) with the linear noise \({\varvec{f}}_k = \varvec{a}k\tau + \varvec{b}\) can be presented as
The u-th subsystem of Eq. (25) is derived in the following way
Using the Z-transform to Eq. (26), we have
which is simplified as
The above equation can be further developed as
According to the final value theorem, we have
The above conclusion proves that the steady-state error of TLLND algorithm (13) is \(\left\| \varvec{a} \tau / h_{2}\right\| _{2}+{o}({\tau ^2})\) with linear noise \({\varvec{f}}_{k} = \varvec{a}k\tau + \varvec{b}\). The constant noise is a special case of linear noise \({\varvec{f}}_{k} = \varvec{a}k\tau + \varvec{b}\) with \( \varvec{a}=\varvec{0}\). Thus, the steady-state error of TLLND algorithm (13) for solving CEM scheme (1) problems with constant noise is \( {o}({\tau ^2})\).
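The steady-state value \(\varvec{a} \tau / h_{2}\) can be reproduced with a scalar simulation. The sketch assumes that the noise is added directly to the residual recurrence, ignores the truncation error, and uses multistep coefficients inferred from the constants in Sect. 3.2; the noise parameters are arbitrary illustrative values.

```python
def steady_state_residual(a, b, tau, h1=0.8, h2=0.08, steps=2000,
                          coeffs=(1.4391, -0.8782, 0.4391)):
    """Iterate a scalar residual recurrence driven by the noise f_k = a*k*tau + b.

    Assumptions: the noise is added directly to the residual update, the
    truncation error is ignored, and `coeffs` are inferred from Sect. 3.2.
    """
    l_km2, l_km1, l_k = 0.0, 0.0, 1.0
    residual_sum = l_k                      # running sum of l_i for i = 0..k
    for k in range(steps):
        noise = a * k * tau + b
        l_next = (coeffs[0] * l_k + coeffs[1] * l_km1 + coeffs[2] * l_km2
                  - h1 * l_k - h2 * residual_sum + noise)
        residual_sum += l_next
        l_km2, l_km1, l_k = l_km1, l_k, l_next
    return l_k

linear_case = steady_state_residual(a=2.0, b=3.0, tau=0.01)    # expected a*tau/h2
constant_case = steady_state_residual(a=0.0, b=3.0, tau=0.01)  # expected 0
```

With a = 2, \(\tau = 0.01\), and \(h_2 = 0.08\) the residual settles at 0.25, while pure constant noise is rejected completely, matching the two cases discussed above.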
3.3.2 Other Noise
The u-th subsystem of Eq. (17) under other noise \({\varvec{f}}_k=\varvec{\varpi }\) can be derived as follows:
Similarly, we can get the following equation:
By subtracting Eq. (28) from Eq. (27), we get the following equation:
Define \(\varvec{\xi } _{{{k }}}^u=[\varvec{\varpi }_{{{k }}}^u-\varvec{\varpi }_{{{k-1}}}^u;0;0;0]\). According to Theorem 2, we get
which can be further derived as follows:
According to Theorem 2, we get \(\mathop {\lim }\limits _{k \rightarrow \infty } D^k=0\). Therefore, the above equation is further deduced:
Equation (30) can be expanded as follows:
Based on the definition of \(\varvec{\varpi }_{n}^{u}\) and the fact that the term \({o}\left( \tau ^{2}\right) \) represents the residual error of TLLND algorithm (13), we can conclude that
In summary, under the noise \({\varvec{f}}_k = \varvec{a}k\tau + \varvec{b}+\varvec{\varpi }\), the steady-state error of TLLND algorithm (13) for solving CEM scheme (1) is bounded by \(\left\| \varvec{a} \tau / h_{2}\right\| _{2}+2 (r+1)\sup _{1 \le n \le k, 1 \le u \le r+1 }\left| \varvec{\varpi }_{n}^{u}\right| /\left( 1-\Vert D\Vert _{2}\right) +{o}\left( \tau ^{2}\right) .\) Thus, the proof is completed.
4 Application to Arctic Sea Ice Extraction
Theoretical analyses guarantee the convergence performances of the proposed TLLND algorithm (13) to solve CEM problem (1). To further verify its generalization performance, we apply it to an intuitive problem: Arctic sea ice extraction.
4.1 Satellite Imagery
It is important to note that the Arctic sea ice extraction is based on remote sensing satellite imagery, so reliable and accurate satellite imagery is essential. Comparative information on the various optional remote sensing satellites can be found in Table 2. In particular, the LandSat 8, Sentinel-2A, and Terra/Aqua satellites have the advantage of wide coverage of observations and simple pre-processing of the images. Therefore, imagery from these three satellites is selected to show that the proposed algorithm can reliably and accurately extract sea ice from different satellite imagery at different image resolutions, which also demonstrates the generalization of the proposed algorithm.
4.2 Evaluation Indices
In order to reflect the superiority of the proposed algorithm, the following evaluation indices are given to measure the extraction performance: mean square error (MSE), normalized mean square error (NMSE), peak signal-to-noise ratio (PSNR), overall classification accuracy (OA), average classification accuracy (AA), and producer’s accuracy (PA). In addition, the definitions of some indices are given below:
where I(i, j) and K(i, j) denote the gray value of the ith-row jth-column pixel in the input and ground-truth Arctic sea ice remote-sensing image, respectively, and \(\textrm{MAX}\) represents the maximum possible value. In addition, TP represents the number of positive pixels correctly identified, TN denotes the number of negative pixels correctly identified, FP stands for the number of pixels falsely identified as positive, and FN denotes the number of pixels falsely identified as negative.
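Some of these indices can be computed as in the following sketch. It uses the standard textbook definitions of MSE, PSNR, OA, and the producer's accuracy of the sea ice class, which are assumed here and may differ in detail from the paper's own formulas; NMSE and AA are omitted because their normalizations vary between works.

```python
import numpy as np

def image_metrics(extracted, truth, max_value=255.0):
    """MSE and PSNR between two gray images, using the standard definitions."""
    diff = extracted.astype(float) - truth.astype(float)
    mse = np.mean(diff ** 2)
    psnr = float("inf") if mse == 0 else 10.0 * np.log10(max_value ** 2 / mse)
    return mse, psnr

def classification_metrics(predicted, truth):
    """OA and PA from boolean masks (True = sea ice), via TP, TN, FP, FN counts."""
    tp = np.sum(predicted & truth)
    tn = np.sum(~predicted & ~truth)
    fp = np.sum(predicted & ~truth)
    fn = np.sum(~predicted & truth)
    oa = (tp + tn) / (tp + tn + fp + fn)
    pa = tp / (tp + fn)          # producer's accuracy of the ice class (assumed form)
    return oa, pa

truth_mask = np.array([[True, True], [False, False]])
predicted_mask = np.array([[True, False], [False, False]])
oa, pa = classification_metrics(predicted_mask, truth_mask)
mse, psnr = image_metrics(predicted_mask * 255, truth_mask * 255)
```

In this toy 2 by 2 example one of the two ice pixels is missed, so three of the four pixels are classified correctly and half of the ice pixels are recovered.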
4.3 Experiment 1: Different Satellite Imagery
In this subsection, satellite images from Sentinel-2A, LandSat 8, and Terra/Aqua are employed as experimental data to perform Arctic sea ice extraction experiments with the proposed TLLND algorithm (13), DTZN algorithm (15), and MNI algorithm (16). The experiments evaluate the performance of these algorithms in the presence of random noise \(\varvec{f}_k \in 4 \times [-1,1]\) and in the absence of noise \(\varvec{f}_k =0\) with \(h_1=0.8\), \(h_2=0.08\).
4.3.1 Sentinel-2A
Figure 2a shows the original satellite image of the Arctic sea ice downloaded from the European Space Agency Copernicus Open Access Hub, whose size is 1098 by 1098 pixels with a 10 m by 10 m spatial resolution. The visualized results in Fig. 2b–d indicate that TLLND algorithm (13) and MNI algorithm (16) perform well in Arctic sea ice extraction under a zero noise environment. By contrast, the higher-order DTZN algorithm (15) does not seem able to handle this problem.
In an environment with random noise, the extraction results are shown in Fig. 2f–h, and the processing result of DTZN algorithm (15) is still very unsatisfactory. The proposed TLLND algorithm (13) seems to be less affected by the noise than MNI algorithm (16). From Table 3, we can also see that TLLND algorithm (13) has better MSE, NMSE, PSNR, OA, and PA values, thus verifying the previous visual performance.
4.3.2 LandSat 8
Next, Fig. 3a shows the original satellite image of the Arctic sea ice downloaded from the USGS Global Visualization Viewer, whose size is 1098 by 1071 pixels with a 30 m by 30 m spatial resolution. Again, DTZN algorithm (15) does not seem able to handle this kind of problem. The proposed TLLND algorithm (13) and MNI algorithm (16) have the same performance under a zero noise environment. Similarly, in the random noise environment, Table 4 shows that TLLND algorithm (13) performs better than MNI algorithm (16).
4.3.3 Terra/Aqua
The third extraction experiment is based on the satellite image in Fig. 4a obtained from the National Aeronautics and Space Administration website, whose size is 1092 by 1092 pixels with a 500 m by 500 m spatial resolution. Compared with the other two algorithms, TLLND algorithm (13) still performs better in dealing with the sea ice extraction problem, especially in the random noise environment.
The three experiments select remote sensing imagery from different satellites, where the background includes not only sea ice and seawater, but also clouds and fragmented sea ice, which greatly increases the difficulty of the Arctic sea ice extraction task. The proposed TLLND algorithm (13) still has high accuracy under such conditions, reflecting its stable performance and robustness.
4.4 Experiment 2: Different Noise Environment
In this part, a high-resolution observation image from Landsat 8 OLI is used as experimental data to extract Arctic sea ice under four noise settings: zero noise \(\varvec{f}_k =0\), constant noise \(\varvec{f}_k =4\), linear noise \(\varvec{f}_k =5k\), and random noise \(\varvec{f}_k \in 4 \times [-1,1]\). The original Landsat 8 satellite image of the Arctic sea ice, downloaded from the USGS Global Visualization Viewer, has a size of 6504 × 6261 pixels and a spatial resolution of 30 m × 30 m. Figure 5 and Table 6 present the results of the extraction experiments.
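The four noise settings listed above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `make_noise`, the choice to apply the same scalar value to every element of \(\varvec{f}_k\), and the uniform distribution for the random case are not specified in this section and are introduced here for clarity.

```python
import numpy as np

def make_noise(kind, k, shape, rng=None):
    """Return a noise term f_k at iteration k (hypothetical helper).

    Assumption: scalar settings such as f_k = 4 are broadcast to
    every element of the noise vector.
    """
    if kind == "zero":
        return np.zeros(shape)              # f_k = 0
    if kind == "constant":
        return np.full(shape, 4.0)          # f_k = 4
    if kind == "linear":
        return np.full(shape, 5.0 * k)      # f_k = 5k, grows with k
    if kind == "random":
        # Assumption: uniform samples; NumPy draws from [-4, 4),
        # so the upper endpoint itself is never returned.
        rng = np.random.default_rng() if rng is None else rng
        return rng.uniform(-4.0, 4.0, size=shape)
    raise ValueError(f"unknown noise kind: {kind!r}")
```

For instance, `make_noise("linear", 3, (5,))` returns a length-5 vector whose entries all equal 15, which makes clear why linear noise is the hardest bounded-free case: its magnitude increases without limit as the iteration index grows.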
4.4.1 Zero Noise
The MSE, NMSE, PSNR, OA, AA, and PA values of the TLLND algorithm (13) and the MNI algorithm (16) are identical, so the two algorithms perform equally well in the zero-noise environment. The results of the DTZN algorithm (15) remain poor; only a faint outline is extracted.
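As a reminder of what some of these indexes measure, the sketch below computes MSE, NMSE, PSNR, and OA using their common textbook definitions. These are assumptions: the exact formulas used in the paper are defined outside this section and may differ (for example in the NMSE normalization or the PSNR peak value). AA and PA are omitted because their definitions cannot be recovered from this section. The function names and the `peak` parameter are hypothetical.

```python
import numpy as np

def error_metrics(pred, ref, peak=1.0):
    """MSE, NMSE, and PSNR between an extracted image and a reference.

    Assumptions: NMSE is the squared error normalized by the reference
    energy (ref must not be all zeros), and PSNR uses `peak` as the
    maximum possible pixel value.
    """
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    sq_err = (pred - ref) ** 2
    mse = float(np.mean(sq_err))
    nmse = float(np.sum(sq_err) / np.sum(ref ** 2))
    psnr = float("inf") if mse == 0 else float(10 * np.log10(peak ** 2 / mse))
    return {"MSE": mse, "NMSE": nmse, "PSNR": psnr}

def overall_accuracy(pred_mask, ref_mask):
    """Fraction of pixels whose ice/non-ice label matches the reference."""
    pred_mask = np.asarray(pred_mask, dtype=bool)
    ref_mask = np.asarray(ref_mask, dtype=bool)
    return float(np.mean(pred_mask == ref_mask))
```

Under these definitions, two algorithms that produce the same extraction map necessarily share every index, which is consistent with the identical zero-noise values reported for the TLLND and MNI algorithms.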
4.4.2 Constant Noise
Under constant noise, all performance indexes of the TLLND algorithm (13) are better than those of the DTZN algorithm (15) and the MNI algorithm (16), indicating better noise tolerance on the Arctic sea ice extraction task.
4.4.3 Linear Noise
Consistent with the previous experiments, the TLLND algorithm (13) outperforms the other algorithms under time-varying linear noise.
4.4.4 Random Noise
The proposed TLLND algorithm (13) again yields the best extraction results among the compared algorithms under random noise, even on this high-resolution image.
4.5 Summary
The above comparative experiments show that the TLLND algorithm (13) and the MNI algorithm (16) perform identically in the zero-noise environment. Under the various noise environments, however, the proposed TLLND algorithm (13) performs best among the three algorithms, which indicates that it has the best generalization and noise-tolerance ability.
5 Conclusion
In this paper, a transfer-learning-like neural dynamics (TLLND) algorithm has been proposed, which combines the advantages of neural dynamics (ND) and deep learning and can extract sea ice from different satellite remote sensing imagery. A parametric method has been used to estimate the coefficients in the ND algorithm, yielding the TLLND algorithm, which then aids the constrained energy minimization scheme in the sea ice extraction task. Rigorous theoretical proofs of the convergence and stability of the proposed TLLND algorithm have been provided, and its superiority in effectiveness, robustness, and generalization performance has been shown through comparative experiments with other state-of-the-art algorithms. This research provides a new idea and framework for solving complex image processing problems by combining dynamical systems and deep learning methods. Although the proposed algorithm has demonstrated promising performance in Arctic sea ice extraction, there is still room for improvement in its robustness in complex environments, its generalization to new datasets, its computational efficiency, and its applicability to other remote sensing image processing tasks. Future work could address these limitations and combine the algorithm with other advanced techniques to broaden its practical applications.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Peng, B., Zhang, K., Jin, L. et al. A Transfer-Learning-Like Neural Dynamics Algorithm for Arctic Sea Ice Extraction. Neural Process Lett 56, 221 (2024). https://doi.org/10.1007/s11063-024-11664-3
DOI: https://doi.org/10.1007/s11063-024-11664-3