1 Introduction

The escalating challenges posed by global warming and the consequent melting of sea ice have put the resource-rich polar regions in the global spotlight. Specifically, the opening of Arctic sea routes could potentially alter the worldwide navigation landscape, underlining the growing strategic significance of this region [1,2,3]. Furthermore, the Arctic plays an integral role in Earth's ecology as a natural cold source, influencing ecological equilibrium, energy transfer, and climate change [4]. Consequently, changes in the motion, melting, area, and thickness of Arctic sea ice directly impact the energy transfer and balance between the Arctic and neighboring oceans. Therefore, real-time dynamic monitoring and observation of Arctic sea ice are of great importance for exploiting Arctic resources, maintaining the balance of the ecosystem, and predicting extreme climate events worldwide.

With the improvement of the quality and accuracy of high-resolution satellite images, remote sensing has become one of the most effective means of monitoring sea ice conditions [5]. Sea ice identification is an important part of sea ice monitoring using remote sensing imagery, but noise disturbances cannot be ignored in practical applications in real scenarios [6]. In recent years, state-of-the-art (SOTA) methods for sea ice extraction can be divided into the following three categories: observation-based methods, scattering-coefficients-based methods, and neural-networks-based methods.

Observation-based methods rely on observations of sea ice conditions to identify and extract sea ice from remote sensing imagery. These methods often use manual or semi-automatic schemes for sea ice identification, such as thresholding and edge detection [7]. In contrast, scattering-coefficients-based methods use the reflected signals from satellite scatterometers to distinguish sea ice from seawater [8]. Neural-networks-based methods apply machine learning techniques to analyse and identify sea ice types and distributions. These methods can use satellite remote sensing data, such as synthetic aperture radar imagery, to extract sea ice characteristics and parameters such as concentration, thickness, shape, and movement [9].

Although these three major categories of methods and other related investigations for sea ice extraction have demonstrated constructive and promising results in methodology and practical applications, they are not without limitations. For example, the observation-based methods, relying on intuition or experience to build a model, lack rationality and generalization, thereby limiting their broader application. The scattering-coefficients-based methods heavily depend on professional knowledge of remote sensing information processing and accumulated technique in remote image acquisition, which likewise restricts their wider use. The neural-networks-based methods are competent for various complex sea ice extraction tasks; however, their performance is constrained by the quantity and quality of the dataset, and it can be expensive to collect enough sea ice data manually to build a specialized model. Therefore, it is necessary to construct a new method that combines the high performance of learning methods with the convenience of traditional numerical methods.

In the 1990s, Zhu et al. presented a theory of utilizing learning methods to learn partial differential equations for image noise processing [10]. The combination of learning methods and numerical methods achieved positive results before deep learning became popular [11, 12]. Subsequently, researchers discussed the idea of modelling high-dimensional non-linear functions with dynamical systems, i.e., the relationship between differential equations and deep learning. In [13], Weinan E demonstrates that deep neural networks can be viewed as discrete dynamical systems. A well-known example is that each residual block of ResNets can be considered as \({X}_{k + 1} - {X_k}=f(X_k)\), where the number of iterations k can be interpreted as the artificial time [14]; this is exactly the forward Euler discretization of \(\dot{X}=f(X)\). Inspired by the connection between deep learning and dynamical systems, much meaningful work has been carried out in this field in recent years, for example, the structural design of deep learning models inspired by the connection between network structures and numerical differential equations [15], the design of optimizers that use numerical methods to improve performance [16], and the transfer of optimal control to deep learning to improve its stability [17]. The above works all exploit the advantages and characteristics of numerical methods to improve the performance of deep learning models; however, work utilizing the advantages of deep learning methods to improve the performance of dynamical system models is rarely seen.
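As a minimal illustration of this correspondence (our own sketch, not code from [13] or [14]), the snippet below iterates a toy residual block and makes explicit that stacking residual blocks performs forward Euler integration with unit step size.

```python
import numpy as np

def f(x):
    """A toy residual branch; any smooth map from R^d to R^d works here."""
    return np.tanh(x) - 0.5 * x

# A stack of residual blocks computes x_{k+1} = x_k + f(x_k), which is the
# forward Euler discretization of dX/dt = f(X) with unit step size; the
# block index k plays the role of artificial time.
x = np.array([1.0, -2.0, 0.5])
for k in range(10):
    x = x + f(x)
print(x)
```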

Humans have a remarkable ability to generalize from past experiences and learn new tasks quickly, which is essential for survival in a complex and dynamic world [18]. However, current deep learning models rely heavily on large-scale supervised data, which limits their generalization performance and makes them vulnerable to data quality issues [19]. In contrast, human beings can learn from a few examples and transfer their knowledge across domains and tasks. This is especially important for applications such as sea ice extraction, where collecting enough labelled data to train a specialized model can be both costly and time-consuming, as sea ice data collection is challenging and requires specialized equipment and expertise [20]. Therefore, developing an algorithm with transfer and generalization capabilities is very meaningful. Such an algorithm can not only adapt and optimize across different domains and tasks, but also learn and reason from a small amount of data.

The purpose of a deep learning algorithm is to learn a function that maps the input (i.e., the set of independent variables X) to the output (i.e., the target variable Y), for example, \(Y=z(X)+\epsilon \). To estimate the unknown function z(X), a model should be fitted with appropriate data. The form of the function z(X) is usually unknown, so it may not be possible to obtain it without fitting different models or making some assumptions about its form. The method of assuming a form for the function z(X) and using a suitable dataset to train the model and estimate the coefficients of z(X) through the learning process is called a parametric method [21]. For example, assume that the unknown function z(X) is linear, i.e., \(z(X)={\zeta }_0+{\zeta }_1{X}_1+\dots +{\zeta }_n{X}_n\), in which \({\zeta }_i\ (i=0,1,\dots ,n)\) are the coefficients to be learned, n is the number of independent variables, and X is the input. Based on this assumption about the form of the function and a model that fits the assumption, the learning process trains the model and estimates the coefficients.
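As a minimal sketch of this parametric procedure (our own illustration with \(n=2\) and synthetic data), the following snippet fixes the linear form and estimates the coefficients \(\zeta _i\) by ordinary least squares.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data generated under the assumed linear form
# z(X) = zeta_0 + zeta_1*X_1 + zeta_2*X_2, plus noise epsilon.
zeta_true = np.array([0.5, 2.0, -1.0])
X = rng.normal(size=(200, 2))
Y = zeta_true[0] + X @ zeta_true[1:] + 0.1 * rng.normal(size=200)

# Parametric fit: fix the functional form first, then estimate its
# coefficients from data (here by ordinary least squares).
A = np.hstack([np.ones((200, 1)), X])        # design matrix [1, X_1, X_2]
zeta_hat, *_ = np.linalg.lstsq(A, Y, rcond=None)
print(zeta_hat)                              # close to zeta_true
```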

It is feasible to estimate the coefficients of a neural dynamics (ND) algorithm using parametric methods. Parametric methods have the advantages of high computational performance and low training data requirements, while ND approaches have the adaptability to solve dynamical system problems with both robustness and generalizability guaranteed by theoretical analyses. The combination of these two methods paves a new way to solve specific problems.

Fig. 1 Overview of this paper

This paper designs and analyses a transfer-learning-like neural dynamics (TLLND) algorithm for aiding the constrained energy minimization (CEM) scheme. An ND algorithm for solving the CEM scheme with learnable coefficients is proposed, and then parameters of the TLLND algorithm are determined by a parametric method with the structure diagram shown in Fig. 1. To ensure the validity of the proposed TLLND algorithm, theoretical analyses are provided. Finally, comparative experiments are conducted, which show that the TLLND algorithm has better efficiency and robustness compared to SOTA methods. The rest of this paper is divided into four sections. Section 2 introduces the CEM scheme and parametric methods. Section 3 presents theoretical analyses on the convergence and stability of the proposed TLLND algorithm. In Sect. 4, comparative experiments of Arctic sea ice extraction are presented. Section 5 concludes this paper.

2 Preliminary and TLLND Algorithm

In this section, the CEM scheme formulations and the TLLND algorithm are provided.

2.1 Base Scheme

It should be noted that the extraction of sea ice from the Arctic relies on remote sensing satellite imagery. The CEM scheme is designed to detect targets in images, and its main principle is to suppress background information so that the target is highlighted. The CEM scheme does not require the background information of the image and can adaptively detect the spectral information of the target. More specifically, it determines a finite impulse response (FIR) filter \(\varvec{\omega }(t) \in \mathbb {R}^{r}\) from the original image data and targets, and passes the original image through the filter to obtain detection results, where r is the number of bands in the original image. The CEM scheme can be derived by solving the following linearly constrained optimization problem:

$$\begin{aligned} \begin{aligned}&\text {min } \varvec{\omega }^{\textrm{T}}(t){R}\varvec{\omega }(t)\\&\text {s.t. } \varvec{\omega }^{\textrm{T}}(t)\varvec{v}={I}, \end{aligned} \end{aligned}$$
(1)

where \(\varvec{v}\in \mathbb {R}^{r\times 1}\) denotes the light spectrum vector of sea ice, the symmetric matrix \({R} \in \mathbb {R}^{r\times r}\) represents the self-correlation matrix of the original sea ice image, \(I\in \mathbb {R}^{1\times 1}\) denotes an identity matrix, and \(^{\textrm{T}}\) represents the matrix transpose.

In this work, the above CEM scheme for sea ice extraction is computed with the aid of the TLLND algorithm. Next, a Lagrange multiplier is utilized to solve CEM problem (1). The related Lagrange function is \( \vartheta (\varvec{\omega }(t), \zeta (t), t)=\varvec{\omega }^{\textrm{T}}(t){R}\varvec{\omega }(t)-\zeta (t)^{\textrm{T}}(\varvec{\omega }^{\textrm{T}}(t)\varvec{v}-{I}) , \) where \(\zeta (t)\in {\mathbb {R}}\) denotes the Lagrange multiplier.

Let

$$\begin{aligned} \begin{aligned}&W(t)=\left[ \begin{array}{ll} 2R(t) &{}\varvec{v}(t) \\ \varvec{v}^{\textrm{T}}(t) &{} 0_{1 \times 1} \end{array}\right] \in \mathbb {R}^{(r+1) \times (r+1)}, \\&\textbf{v}(t)=\left[ \begin{array}{c} \varvec{\omega }(t) \\ \zeta (t) \end{array}\right] \in \mathbb {R}^{r+1}, \quad \textbf{g}(t)=\left[ \begin{array}{c} \varvec{0} \\ 1 \end{array}\right] \in \mathbb {R}^{r+1}. \end{aligned} \end{aligned}$$

Thus, (1) can be rearranged into the following time-dependent linear equation:

$$\begin{aligned} W(t)\textbf{v}(t)= \textbf{g}(t). \end{aligned}$$
(2)
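For a static image, system (2) can be assembled and solved directly, which also serves as a reference solution for the iterative algorithms developed below. The following sketch is our own illustration on toy data; the function name `cem_filter` and the random inputs are hypothetical.

```python
import numpy as np

def cem_filter(pixels, v):
    """Solve CEM scheme (1) through linear system (2).

    pixels: (N, r) array of image spectra; v: (r,) target (sea ice) spectrum.
    Returns the FIR filter omega satisfying omega^T v = 1.
    """
    N, r = pixels.shape
    R = pixels.T @ pixels / N            # self-correlation matrix of the image
    W = np.zeros((r + 1, r + 1))
    W[:r, :r] = 2.0 * R                  # upper-left block 2R
    W[:r, r] = v                         # right column v
    W[r, :r] = v                         # bottom row v^T
    g = np.zeros(r + 1)
    g[r] = 1.0                           # encodes the constraint omega^T v = 1
    sol = np.linalg.solve(W, g)          # sol = [omega; zeta]
    return sol[:r]

rng = np.random.default_rng(1)
pixels = rng.random((1000, 6))           # toy image with r = 6 bands
v = rng.random(6)                        # toy sea-ice spectrum
omega = cem_filter(pixels, v)
print(omega @ v)                         # prints 1.0: constraint satisfied
```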

2.2 Continuous and Discrete ND Models

To solve the CEM scheme, the following loss function is first defined:

$$\begin{aligned} \varvec{l}(t)= W(t)\textbf{v}(t)- \textbf{g}(t). \end{aligned}$$
(3)

The integration-enhanced continuous ND algorithm is shown as follows [22]:

$$\begin{aligned} \dot{\varvec{l}}(t)=-\varrho {\varvec{l}}(t)-\lambda \int _{0}^{t} {\varvec{l}}(\delta ) \textrm{d} \delta , \end{aligned}$$
(4)

where \(\varrho >0\) and \(\lambda >0\) are design parameters. A continuous ND model for solving the CEM scheme is obtained by substituting (3) into (4):

$$\begin{aligned} \begin{aligned} \dot{\textbf{v}}(t)=&W^{\dagger }(t)\bigg (\dot{\textbf{g}}(t)-\dot{ W}(t)\textbf{v}(t)-\varrho ( W(t)\textbf{v}(t)-\textbf{g}(t))\\&-\lambda \int _{0}^{t}( W(\delta ) \textbf{v}(\delta )-\textbf{g}(\delta )) \textrm{d} \delta \bigg ), \end{aligned} \end{aligned}$$
(5)

where \( W^{\dagger }(t)\) represents the pseudoinverse of W(t).

Since very few continuous models admit accurate analytical solutions, numerical solutions, such as those produced by Runge–Kutta methods, are often needed. These methods approximate the solutions of differential equations by iterative calculations, thus simulating complex dynamical systems. In view of this, we need a discretization method to approximate the future \(\textbf{v}_{k+1}\) with the information available at time instant \(t_{k}\), where k denotes the k-th iteration. The third-order discrete formula for discretizing (5) is shown as follows:

$$\begin{aligned} {\textbf{v}_{k + 1}} -\alpha {\textbf{v}_k} -\beta {\textbf{v}_{k - 1}} -{\gamma \textbf{v}_{k - 2}}-\tau {\dot{\textbf{v}}_k} =0, \end{aligned}$$
(6)

where \(\alpha \in \mathbb {R}\), \(\beta \in \mathbb {R}\), and \(\gamma \in \mathbb {R}\) denote coefficients of the discrete formula, and \(\tau \) denotes the sampling gap.

2.3 Dynamical System and Recurrent Neural Network

In general, a model of an ordinary differential equation (ODE) can be written in the form of a dynamical system as follows:

$$\begin{aligned} \dot{\varvec{x}}(t)=g(\varvec{x}(t), t), \quad t \in [0,+\infty ). \end{aligned}$$
(7)

There are many numerical methods for solving ODEs, such as the Euler method, Runge–Kutta methods, and Adams methods. The Euler iteration formula is presented here:

$$\begin{aligned} \varvec{x}(t+h)=\varvec{x}(t)+hg(\varvec{x}(t),t), \end{aligned}$$
(8)

where h represents the step size. The idea of the Euler method is simple: use \((\varvec{x}(t+h)-\varvec{x}(t))/{h}\) to approximate the derivative term \(\dot{\varvec{x}}(t)\). As long as the initial condition \(\varvec{x}(0)\) is given, we can iteratively calculate the result at each time point according to (8).
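As a minimal sketch of iteration (8), the following snippet (our own illustration) integrates the toy scalar ODE \(\dot{x}=-x\), whose exact solution \(e^{-t}\) is known, and compares the Euler estimate at \(t=1\) with the true value.

```python
import numpy as np

def euler(g, x0, h, steps):
    """Forward Euler: x(t+h) = x(t) + h*g(x(t), t), started from x(0) = x0."""
    x, t = x0, 0.0
    for _ in range(steps):
        x = x + h * g(x, t)
        t += h
    return x

# Integrate dx/dt = -x from x(0) = 1 up to t = 1 and compare with exp(-1).
print(euler(lambda x, t: -x, 1.0, h=0.01, steps=100), np.exp(-1.0))
```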

In short, as long as a model takes \((\varvec{x}_1,\varvec{x}_2,\ldots ,\varvec{x}_n)\) as input, produces \((\varvec{y}_1,\varvec{y}_2,\ldots ,\varvec{y}_m)\) as output, and satisfies the following recurrence relation

$$\begin{aligned} {\varvec{y}_{k+1}} = f({\varvec{y}_{k}},{\varvec{x}_{k+1}},t),\end{aligned}$$
(9)

it can be called a recurrent neural network (RNN).

As mentioned above, the number of iterations k of a deep learning model can be considered as the artificial time t [14]. From Eqs. (8) and (9), we can see that t is a continuous variable in (8) and an integer variable in (9); otherwise, there seems to be no significant difference between (8) and (9). In fact, in (8) we can regard h as the time unit and let \(t = nh\), and then (8) becomes

$$\begin{aligned} \varvec{x}((n+1)h)=\varvec{x}(nh)+hf(\varvec{x}(nh),nh), \end{aligned}$$

where the time variable n is an integer. In this way, we know that the discrete dynamical system (8) is in fact just a special case of an RNN, and the same holds for the discrete form of (5).

2.4 Parametric Method

Deep neural networks and discrete dynamical systems have some similarities and connections. In this section, we propose a parametric method that uses an RNN to estimate the coefficients of a discrete dynamical system of the form (6). The network structure of the RNN can be designed according to the ND model for the solution of CEM schemes as follows:

$$\begin{aligned} \varvec{U}_{n+1}=\alpha \varvec{U}_n+\beta \varvec{U}_{n-1}+(1-\alpha -\beta ) \varvec{U}_{n-2}+f_n\left( \varvec{U}_n\right) , \end{aligned}$$
(10)

where \(\alpha \) and \(\beta \) are trainable coefficients shared within each layer. During the training process, the learnable coefficients are continuously updated, and the TLLND algorithm for solving CEM scheme (1) is obtained after the training phase. A schematic of the RNN training architecture is presented in Fig. 1. In the training phase of the RNN, the predicted value from the previous time step is fed back into the network as input at each subsequent time step, and the output is used to predict the value for the next time step. The true value for each time step is provided as the supervisory label, and the network adjusts its parameters to minimize the difference between the predicted and true values. Since we use a parametric method to obtain a dynamical system model through deep learning, which differs from both standard dynamical system identification and the standard deep learning procedure, some elements of the training process need to be addressed.
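The following PyTorch sketch is our own minimal illustration of this setup, not the authors' released code: the toy sequence `seq` and the placeholder correction term `f_seq` are hypothetical, and a real implementation would generate them from trajectories of system (2).

```python
import torch

class NDCell(torch.nn.Module):
    """Recurrence (10) with trainable coefficients alpha and beta; the third
    coefficient is tied to 1 - alpha - beta as in the text."""
    def __init__(self):
        super().__init__()
        self.alpha = torch.nn.Parameter(torch.tensor(0.5))
        self.beta = torch.nn.Parameter(torch.tensor(0.25))

    def forward(self, u_n, u_n1, u_n2, f_n):
        gamma = 1.0 - self.alpha - self.beta
        return self.alpha * u_n + self.beta * u_n1 + gamma * u_n2 + f_n

cell = NDCell()
opt = torch.optim.SGD(cell.parameters(), lr=1e-3)

seq = torch.randn(64)          # toy ground-truth trajectory (supervisory labels)
f_seq = torch.zeros(64)        # placeholder for the f_n(U_n) correction term

loss = torch.tensor(0.0)
u2, u1, u0 = seq[0], seq[1], seq[2]      # warm-start with true values
for n in range(2, 63):
    pred = cell(u0, u1, u2, f_seq[n])    # predict U_{n+1}
    loss = loss + (pred - seq[n + 1]) ** 2
    u2, u1, u0 = u1, u0, pred            # feed the prediction back as input
opt.zero_grad(); loss.backward(); opt.step()
```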

2.4.1 Supervisory Label

We can view the problem that the parametric method solves as a time-series prediction problem, where future information is predicted based on known information. During the training process, true values of future moments are used as the supervisory label.

2.4.2 Dataset

Now that problem (1) has been transformed into the system of linear equations (2), we can generate the training data. To do so, we first determine the coefficient matrix and the constant-term vector of the system of linear equations. We can then use any solution method, such as Gaussian elimination or an iterative method, to obtain the corresponding solution vector. This gives us one training sample, consisting of the input (coefficient matrix and constant-term vector) and the output (solution vector). We can repeat this process several times to generate diverse training data. Finally, we use these training data to train and evaluate deep learning models.
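A minimal sketch of this data-generation recipe, under assumed (hypothetical) sizes and distributions:

```python
import numpy as np

def make_sample(r, rng):
    """One training pair: a solvable linear system (input) and its solution
    vector (output)."""
    A = rng.normal(size=(r, r)) + r * np.eye(r)   # well-conditioned coefficients
    b = rng.normal(size=r)                        # constant term vector
    x = np.linalg.solve(A, b)                     # e.g. Gaussian elimination (LU)
    return (A, b), x

rng = np.random.default_rng(42)
dataset = [make_sample(8, rng) for _ in range(1000)]  # repeat for diversity
```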

2.4.3 Coefficients Initialization

In the back-propagation of the RNN model, the coefficient initialization and the learning rate should be chosen carefully to avoid the issue of gradient explosion. To prevent the solution from diverging, we set the initial coefficients to be zero stable. Zero stability can effectively improve the convergence speed and generalization ability of the model, because it reduces the model’s sensitivity to noise and disturbance, thereby improving the model’s robustness and reliability [23]. In this paper, we introduce a training method for the RNN model based on zero-stable initialization, and the characteristic equation of Eq. (10) is

$$\begin{aligned} r(\rho )=\rho ^{3}-\alpha \rho ^{2}-\beta \rho -(1-\alpha -\beta ). \end{aligned}$$
(11)

To ensure zero stability at the initialization stage, we apply the root condition for Eq. (6) to the coefficients \(\alpha \) and \(\beta \). During training, we use a small learning rate of 0.001 and train for 200 epochs to achieve optimal performance.
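The root condition can be checked numerically; the sketch below is our own check using the characteristic polynomial of (6), with the trained coefficients of Eq. (12) below as an example.

```python
import numpy as np

def zero_stable(alpha, beta, tol=1e-9):
    """Root condition for (6): all roots of
    rho^3 - alpha*rho^2 - beta*rho - (1 - alpha - beta)
    lie in the closed unit disk, and roots on the unit circle are simple."""
    roots = np.roots([1.0, -alpha, -beta, -(1.0 - alpha - beta)])
    mods = np.abs(roots)
    if np.any(mods > 1.0 + tol):
        return False
    boundary = roots[np.abs(mods - 1.0) <= tol]
    # every boundary root must coincide (numerically) only with itself
    return all(np.sum(np.abs(boundary - z) < 1e-6) == 1 for z in boundary)

print(zero_stable(1.4391, -0.8782))   # trained coefficients of (12) -> True
```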

One of the challenges of determining the coefficients of a third-order discrete formula such as (6) is that there are countless sets of valid coefficients, which means that a different initial value always leads to a different set of valid coefficients. This makes it difficult to find a general solution that works for any initial value. However, one possible approach is to use an RNN to learn the coefficients from data. By training the RNN on a sequence of initial values and their corresponding solutions, we can obtain a set of coefficients that approximates the dynamics well. Substituting the coefficients trained by the RNN into Eq. (6) leads to a discretization equation that can be used to predict future values given any initial value:

$$\begin{aligned} {\textbf{v}_{k + 1}} -1.4391 {\textbf{v}_k} +0.8782 {\textbf{v}_{k - 1}} -{0.4391 \textbf{v}_{k - 2}}-\tau {\dot{\textbf{v}}_k} =0. \end{aligned}$$
(12)

Substituting our parametric discrete Eq. (12) into (5) leads to

$$\begin{aligned} \begin{aligned} \textbf{v}_{k + 1}=&\,1.4391 {\textbf{v}_k} -0.8782 {\textbf{v}_{k - 1}} +{0.4391 \textbf{v}_{k - 2}}\\&+W_{k}^{\dagger } \bigg (\tau \dot{\textbf{g}}_{k}-\tau \dot{ W_{k}}\textbf{v}_{k}-h_1( W_k\textbf{v}_{k}-\textbf{g}_{k})\\&-h_2 \sum _{i=0}^{k}( W_i \textbf{v}_i-\textbf{g}_i) \bigg ), \end{aligned} \end{aligned}$$
(13)

where \({h_1}=\varrho \tau \) and \({h_2}=\lambda \tau ^{2}\) are parameters of the discrete-time system of linear equations. The proposed TLLND algorithm (13) is obtained by embedding the coefficients into the ND algorithm after the training phase of the parametric method.
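For a time-invariant W and \(\textbf{g}\), the derivative terms in (13) vanish, and the iteration reduces to the trained three-step recurrence plus the proportional (\(h_1\)) and integral (\(h_2\)) corrections. The sketch below is our own illustration of this static special case on toy data.

```python
import numpy as np

def tllnd_solve(W, g, h1=0.8, h2=0.08, iters=200):
    """TLLND iteration (13) specialized to a static system W v = g.

    With time-invariant W and g, tau*dg and tau*dW*v vanish, leaving the
    trained three-step recurrence with error corrections.
    """
    n = W.shape[0]
    W_pinv = np.linalg.pinv(W)
    v = [np.zeros(n)] * 3                   # v_{k-2}, v_{k-1}, v_k
    integral = np.zeros(n)                  # running sum of residuals
    for _ in range(iters):
        err = W @ v[-1] - g
        integral += err
        v_next = (1.4391 * v[-1] - 0.8782 * v[-2] + 0.4391 * v[-3]
                  + W_pinv @ (-h1 * err - h2 * integral))
        v = [v[-2], v[-1], v_next]
    return v[-1]

# Toy check: recover the solution of a small well-posed system.
rng = np.random.default_rng(3)
W = rng.normal(size=(5, 5)) + 5 * np.eye(5)
g = rng.normal(size=5)
print(np.linalg.norm(W @ tllnd_solve(W, g) - g))   # small residual
```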

2.5 SOTA Algorithms

The SOTA algorithms for solving CEM scheme (1) are given in this subsection with the comparisons listed in Table 1.

  1. The proportional-integral iterative algorithm (PII) [24] designed for solving (1) is constructed as

    $$\begin{aligned} {\textbf{v}_{k + 1}}= {\textbf{v}_{k}}-W_{k}^{\dagger } (h_1 (W_k\textbf{v}_{k}-\textbf{g}_{k})+h_2\sum _{i=0}^{k} (W_i\textbf{v}_{i}-\textbf{g}_{i})), \end{aligned}$$
    (14)
  2. The discrete-time zeroing neurodynamics (DTZN) model [25, 26] designed for solving (1) is given as

    $$\begin{aligned} \begin{aligned} \textbf{v}_{k + 1} =&\frac{1}{2} {\textbf{v}_k}+\frac{1}{3} {\textbf{v}_{k-1}}+\frac{1}{6} {\textbf{v}_{k-2}}-\frac{5}{3} W_{k}^{\dagger }\bigg (\tau \dot{ W_{k}}\textbf{v}_{k}\\&+h_1(W_k\textbf{v}_{k}+\textbf{g}_{k})-\tau \dot{\textbf{g}}_{k} \bigg ). \end{aligned} \end{aligned}$$
    (15)
  3. In addition, the modified Newton integration (MNI) neural algorithm [27] is provided here for comparison:

    $$\begin{aligned} {\textbf{v}_{k + 1}}= {\textbf{v}_{k}}-W_{k}^{\dagger } (W_k\textbf{v}_{k}-\textbf{g}_{k}+h_2\sum _{i=0}^{k}( W_i \textbf{v}_i-\textbf{g}_i)). \end{aligned}$$
    (16)
Table 1 Comparisons among different methods for CEM scheme

3 Theoretical Analyses

In this section, we provide theoretical analyses on the convergence and stability of the proposed TLLND algorithm (13).

3.1 Nonlinear Transformation

By defining the error function and using the following theorem, we present a simple form of TLLND algorithm (13) to enable further analysis.

Theorem 1

From the error perspective, TLLND algorithm (13) is transformed into a linear decoupled dynamical system to solve CEM scheme (1) as follows:

$$\begin{aligned} \begin{aligned} {\varvec{l}_{k + 1}} -(1.4391-h_{1}) {\varvec{l}_k}&+0.8782 {\varvec{l}_{k - 1}} -{0.4391\varvec{l}_{k - 2}}\\&+h_{2} \sum _{i=0}^{k}\varvec{l}_{i}+\varvec{o}({\tau ^2})=\varvec{0}, \end{aligned} \end{aligned}$$
(17)

where \(\varvec{l}_{k}=W_k\textbf{v}_{k}-\textbf{g}_{k} \) and \(\varvec{o}({\tau ^2})\) denotes a vector of truncation errors.

Proof

The proposed TLLND algorithm (13) is written as

$$\begin{aligned} \begin{aligned} \textbf{v}_{k + 1}&=1.4391 {\textbf{v}_k} -0.8782 {\textbf{v}_{k - 1}} +{0.4391 \textbf{v}_{k - 2}}\\&+W_{k}^{\dagger } \bigg (\tau \dot{\textbf{g}}_{k}-\tau \dot{ W_{k}}\textbf{v}_{k}-h_{1}\varvec{l}_{k}-h_{2} \sum _{i=0}^{k}\varvec{l}_{i}\bigg ). \end{aligned} \end{aligned}$$
(18)

Expanding \(\textbf{v}_{k + 1}\), \({\textbf{v}_{k - 1}}\), and \({\textbf{v}_{k - 2}}\) around \(\textbf{v}_{k}\) using the Taylor expansion results in the following equations:

$$\begin{aligned} \begin{aligned} \textbf{v}_{k+1}=\textbf{v}_{k}+\tau \dot{\textbf{v}}_{k}+\varvec{o}\left( \tau ^{2}\right) \\ \textbf{v}_{k-1}=\textbf{v}_{k}-\tau \dot{\textbf{v}}_{k}+\varvec{o}\left( \tau ^{2}\right) \\ \textbf{v}_{k-2}=\textbf{v}_{k}-2 \tau \dot{\textbf{v}}_{k}+\varvec{o}\left( \tau ^{2}\right) . \end{aligned} \end{aligned}$$

Combining the above four equations, we can get

$$\begin{aligned} \tau {\dot{\textbf{v}}_k} + \varvec{o}({\tau ^2}) =W_{k}^{\dagger } \bigg (\tau \dot{\textbf{g}}_{k}-\tau \dot{ W_{k}}\textbf{v}_{k}-h_{1}\varvec{l}_{k}-h_{2} \sum _{i=0}^{k}\varvec{l}_{i}\bigg ), \end{aligned}$$

which is transformed into

$$\begin{aligned} \tau {W_{k}}\dot{\textbf{v}}_k + \tau {\dot{W_{k}}\textbf{v}}_k- \tau {\dot{\textbf{g}}_k}=-h_{1}\varvec{l}_{k}-h_{2} \sum _{i=0}^{k}\varvec{l}_{i}+\varvec{o}({\tau ^2}). \end{aligned}$$

The equation above can be simplified further as follows:

$$\begin{aligned} {W_{k}}\dot{\textbf{v}}_k + {\dot{W_{k}}\textbf{v}}_k- {\dot{\textbf{g}}_k}=-\frac{1}{\tau }\bigg (h_{1}\varvec{l}_{k}+h_{2} \sum _{i=0}^{k}\varvec{l}_{i}+\varvec{o}({\tau ^2})\bigg ). \end{aligned}$$
(19)

Since \(\varvec{l}_{k}=W_k\textbf{v}_{k}-\textbf{g}_{k} \), we have

$$\begin{aligned} \dot{\varvec{l}}_{k}= {W_{k}}\dot{\textbf{v}}_k + {\dot{W_{k}}\textbf{v}}_k- {\dot{\textbf{g}}_k}. \end{aligned}$$

Thus, (19) can be simplified as

$$\begin{aligned} \dot{\varvec{l}}_{k}=-\frac{1}{\tau }\left( h_{1}\varvec{l}_{k}+h_{2} \sum _{i=0}^{k}\varvec{l}_{i}+\varvec{o}({\tau ^2})\right) . \end{aligned}$$

Replacing \( \dot{\varvec{l}}_{k}\) with our parametric discrete formula (12) leads to

$$\begin{aligned} \begin{aligned} {\dot{\varvec{l}}_k}&= \frac{{\varvec{l}_{k + 1}} -1.4391 {\varvec{l}_k} +0.8782 {\varvec{l}_{k - 1}} -{0.4391\varvec{l}_{k - 2}}}{\tau }\\&=-\frac{1}{\tau }\left( h_{1}\varvec{l}_{k}+h_{2} \sum _{i=0}^{k}\varvec{l}_{i}+\varvec{o}({\tau ^2})\right) , \end{aligned} \end{aligned}$$

which can be further written as

$$\begin{aligned} \begin{aligned}&{\varvec{l}_{k + 1}} -1.4391 {\varvec{l}_k} +0.8782 {\varvec{l}_{k - 1}} -{0.4391\varvec{l}_{k - 2}}=\\&\quad -h_{1}\varvec{l}_{k}-h_{2} \sum _{i=0}^{k}\varvec{l}_{i}+\varvec{o}({\tau ^2}). \end{aligned} \end{aligned}$$
(20)

At last, (20) can be further simplified as

$$\begin{aligned} \begin{aligned} {\varvec{l}_{k + 1}} -(1.4391-h_{1}) {\varvec{l}_k}&+0.8782 {\varvec{l}_{k - 1}} -{0.4391\varvec{l}_{k - 2}}\\&+h_{2} \sum _{i=0}^{k}\varvec{l}_{i}+\varvec{o}({\tau ^2})=\varvec{0}. \end{aligned} \end{aligned}$$

The proof is thus completed.

3.2 Convergence Analyses

To demonstrate the convergence of the proposed algorithm, the following theorem is given.

Theorem 2

The proposed TLLND algorithm (13), used for solving CEM scheme (1), has a residual error \(\mathop {\lim }\limits _{k \rightarrow \infty } \Vert {\varvec{l}_{k}}\Vert _{2}\) that globally converges to \({o}({\tau ^2})\).

Proof

The proposed TLLND algorithm (13) is converted into a linear system that includes a residual error term, as shown in Eq. (17). Let \(\varvec{l}_k^u\) be the \(u\text {th}\) element of \({\varvec{l}_{k}}\) with \(u \in \{1, \ldots , r+1\}\). Then, we get

$$\begin{aligned} \begin{aligned} {{l}_{k + 1}^u} -(1.4391-h_{1}) {{l}_k^u}&+0.8782 {{l}_{k - 1}^u} -{0.4391{l}_{k - 2}^u}\\&+h_{2} \sum _{i=0}^{k}{l}_{i}^u+{o}({\tau ^2})=0 \end{aligned} \end{aligned}$$
(21)

and

$$\begin{aligned} \begin{aligned} {{l}_{k}^u} -(1.4391-h_{1}) {{l}_{k-1}^u}&+0.8782 {{l}_{k - 2}^u} -{0.4391{l}_{k - 3}^u}\\&+h_{2} \sum _{i=0}^{k-1}{l}_{i}^u+{o}({\tau ^2})=0. \end{aligned} \end{aligned}$$
(22)

When Eq. (22) is subtracted from Eq. (21), the resulting expression can be shown as follows:

$$\begin{aligned} \begin{aligned} {{l}_{k+1}^u} -(2.4391-h_{1}-h_{2}) {{l}_{k}^u} +(2.3173-h_{1}){{l}_{k - 1}^u} \\ -{1.3173{l}_{k - 2}^u}+{0.4391{l}_{k - 3}^u}+{o}({\tau ^2})=0. \end{aligned} \end{aligned}$$
(23)

Let \(\varvec{\sigma } _{k+1}^u = {[{{l}_{k + 1}^u} ;{{l}_{k }^u} ;{{l}_{k - 1}^u} ;{{l}_{k - 2}^u} ]}\), and then we get \(\varvec{\sigma } _{k}^u = {[{{l}_{k }^u} ;{{l}_{k-1 }^u} ;{{l}_{k - 2}^u} ;{{l}_{k - 3}^u} ]}\). Equation (23) can be described as the following equation:

$$\begin{aligned} \varvec{\sigma } _{k+1}^u =D\varvec{\sigma } _{k}^u+{o}({\tau ^2}), \end{aligned}$$
(24)

where matrix D is defined as

$$\begin{aligned} D=\left[ \begin{array}{cccc} a &{} b &{} c &{} d\\ 1 &{} 0 &{} 0 &{} 0\\ 0 &{} 1 &{} 0 &{} 0\\ 0 &{} 0 &{} 1 &{} 0 \end{array}\right] , \end{aligned}$$

with \(a=2.4391-{h_1}-{h_2}\); \(b={h_1}-2.3173\); \(c= 1.3173\); \(d=- 0.4391\). Expanding (24) leads to

$$\begin{aligned} \begin{aligned} \varvec{\sigma }_{k+1}^{u}&=D \varvec{\sigma }_{k}^{u}+{o}\left( \tau ^{2}\right) \\&=D\left( D\varvec{\sigma }_{k-1}^{u}+{o}\left( \tau ^{2}\right) \right) +{o}\left( \tau ^{2}\right) \\&=D^{2} \varvec{\sigma }_{k-1}^{u}+{o}\left( \tau ^{2}\right) \\&=D^{3} \varvec{\sigma }_{k-2}^{u}+{o}\left( \tau ^{2}\right) \\&\quad \vdots \\&=D^{k} \varvec{\sigma }_{1}^{u}+{o}\left( \tau ^{2}\right) . \end{aligned} \end{aligned}$$

The proposed TLLND algorithm (13) in this paper selects the parameters \({h_1=0.8}\) and \(h_2=0.08\), for which the moduli of all eigenvalues of matrix D are less than 1. According to the spectral radius theorem, we have \(\mathop {\lim }\limits _{k \rightarrow \infty } D^k=0\), which leads to \(\varvec{\sigma }_{k+1}^{u}={o}\left( \tau ^{2}\right) \). From this, we can derive the following equation:

$$\begin{aligned} \lim _{k \rightarrow \infty }\left\| \varvec{\sigma } _{k+1}\right\| _{2} ={o}\left( \tau ^{2}\right) . \end{aligned}$$

Based on the definition of \(\varvec{\sigma } _{k+1}^u\), if the parameters \({h_1}\) and \({h_2}\) meet certain conditions, it can be concluded that the steady-state computation error \(\lim _{k \rightarrow \infty }\left\| \varvec{l} _{k+1}\right\| _{2}\) of TLLND algorithm (13) for solving CEM scheme (1) is \({o}({\tau ^2})\).
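This eigenvalue condition can also be verified numerically. The sketch below is our own check, not part of the formal proof: it builds D for \(h_1=0.8\) and \(h_2=0.08\) and prints its spectral radius.

```python
import numpy as np

h1, h2 = 0.8, 0.08
a, b = 2.4391 - h1 - h2, h1 - 2.3173
c, d = 1.3173, -0.4391
D = np.array([[a, b, c, d],
              [1, 0, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 1, 0]])
print(np.max(np.abs(np.linalg.eigvals(D))))   # spectral radius ~0.93 < 1
```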

Theorem 2 proves that TLLND algorithm (13) for solving CEM scheme (1) converges globally to the theoretical solution. Research on the robust performance analyses of TLLND algorithm (13) is presented in the following.

3.3 Robustness Analyses

The proposed TLLND algorithm (13) contains a sum term that helps it resist noises. This results in robust performance, which is proven by the following theorem.

Theorem 3

For CEM scheme (1) solved by TLLND algorithm (13), its steady-state computation error \(\lim _{k \rightarrow \infty }\left\| \varvec{l} _{k}\right\| _{2}\) is bounded by

$$\begin{aligned} \left\| \varvec{a} \tau / h_{2}\right\| _{2}+2 (r+1) \sup _{1 \le n \le k, 1 \le u \le r+1 }\left| \varvec{\varpi }_{n}^{u}\right| /\left( 1-\Vert D\Vert _{2}\right) +{o}\left( \tau ^{2}\right) , \end{aligned}$$

with \({{k}} \rightarrow \infty \) in an environment with noise \({\varvec{f}}_{k} = \varvec{a}k\tau + \varvec{b} +\varvec{\varpi }\), where \(\varvec{a}k\tau + \varvec{b} \) represents the linear noise, and \(\varvec{\varpi }\) represents other noises.

Proof

The proposed TLLND algorithm (13) is a linear system, and the steady-state computation error \(\lim _{k \rightarrow \infty }\left\| \varvec{l} _{k}\right\| _{2}\) in a noise-polluted environment can be split into three subsystems, including \({o}\left( \tau ^{2}\right) \) generated by TLLND algorithm (13) itself, \(\left\| \varvec{a} \tau / h_{2}\right\| _{2}\) caused by a linear noise, and \(2 (r+1) \sup _{1 \le n \le k, 1 \le u\le r+1 }\left| \varvec{\varpi }_{n}^{u}\right| /\left( 1-\Vert D\Vert _{2}\right) \) derived from other noises. The residual error of TLLND algorithm (13) under different noises can be analyzed from different perspectives.

3.3.1 Linear Noise

Equation (17) with the linear noise \({\varvec{f}}_k = \varvec{a}k\tau + \varvec{b}\) can be presented as

$$\begin{aligned} \begin{aligned} {\varvec{l}_{k + 1}} =(1.4391-h_{1}) {\varvec{l}_k}&-0.8782 {\varvec{l}_{k - 1}} +{0.4391\varvec{l}_{k - 2}}\\&-h_{2} \sum _{i=0}^{k}\varvec{l}_{i}+\varvec{a}k\tau + \varvec{b}. \end{aligned} \end{aligned}$$
(25)

The u-th subsystem of Eq. (25) is derived as follows:

$$\begin{aligned} \begin{aligned} {\varvec{l}_{k + 1}^u} =(1.4391-h_{1}) {\varvec{l}_k^u}&-0.8782 {\varvec{l}_{k - 1}^u} +{0.4391\varvec{l}_{k - 2}^u}\\&-h_{2} \sum _{i=0}^{k}\varvec{l}_{i}^u+\varvec{a}^u k\tau + \varvec{b}^u. \end{aligned} \end{aligned}$$
(26)

Applying the Z-transform to Eq. (26), we have

$$\begin{aligned} \begin{aligned}&z\varvec{l}^u(z)-z\varvec{l}^u(0)\\&\quad =(1.4391-h_{1}) \varvec{l}^u(z)-0.8782z^{-1}(\varvec{l}^u(z)+z\varvec{l}^u(-1))\\&\qquad +0.4391z^{-2}(\varvec{l}^u(z)+z^2\varvec{l}^u(-2)+z\varvec{l}^u(-1))-\frac{{z{h_2}}}{{z - 1}}{\varvec{l}^u(z)}\\&\quad \quad +\frac{{z\varvec{a}^u\tau }}{({z - 1})^2}+\frac{{z\varvec{b}^u}}{{z - 1}}, \end{aligned} \end{aligned}$$

which is simplified as

$$\begin{aligned} \begin{aligned}&z^3({z - 1})^2\varvec{l}^u(z)-(1.4391-h_{1})z^2({z - 1})^2 \varvec{l}^u(z)\\&\qquad +0.8782z({z - 1})^2\varvec{l}^u(z)-0.4391({z - 1})^2\varvec{l}^u(z)\\&\qquad +h_{2}z^3({z - 1})\varvec{l}^u(z)\\&\quad =z^3({z - 1})^2\varvec{l}^u(0)+(0.4391-0.8782z)z({z - 1})^2\varvec{l}^u(-1)\\&\qquad +0.4391z^2({z - 1})^2\varvec{l}^u(-2)+z^3\varvec{a}^u\tau +z^3(z-1)\varvec{b}^u. \end{aligned} \end{aligned}$$

The above equation can be further developed as

$$\begin{aligned} \begin{aligned} \varvec{l}^u(z)&=\big (z^3({z - 1})^2\varvec{l}^u(0)+(0.4391-0.8782z)z({z - 1})^2\\ \cdot \varvec{l}^u(-1)&+0.4391z^2({z - 1})^2\varvec{l}^u(-2)+z^3\varvec{a}^u\tau +z^3(z-1)\varvec{b}^u\big )\\ /(z-1)&\big (z^4-(2.4391-h_{1}-h_{2})z^3+(2.3173-h_{1})z^2 \\&-1.3173z+0.4391\big ). \end{aligned} \end{aligned}$$

According to the final value theorem, we have

$$\begin{aligned} \begin{aligned}&\mathop {\lim }\limits _{k \rightarrow \infty }{\varvec{l}_k^u}=\mathop {\lim }\limits _{z \rightarrow 1} (z - 1) \varvec{l}^u(z)\\&\quad =\frac{{{\varvec{a}^u}\tau }}{{1 - 2.4391 + {h_1} + {h_2} + 2.3173 - {h_1} - 1.3173 + 0.4391}}\\&\quad =\frac{{{\varvec{a}^u}\tau }}{{{h_2}}}. \end{aligned} \end{aligned}$$

The above conclusion proves that the steady-state error of TLLND algorithm (13) is \(\left\| \varvec{a} \tau / h_{2}\right\| _{2}+{o}({\tau ^2})\) with linear noise \({\varvec{f}}_{k} = \varvec{a}k\tau + \varvec{b}\). The constant noise is a special case of linear noise \({\varvec{f}}_{k} = \varvec{a}k\tau + \varvec{b}\) with \( \varvec{a}=\varvec{0}\). Thus, the steady-state error of TLLND algorithm (13) for solving CEM scheme (1) problems with constant noise is \( {o}({\tau ^2})\).

3.3.2 Other Noise

The u-th subsystem of Eq. (17) under other noise \({\varvec{f}}_k=\varvec{\varpi }\) can be derived as follows:

$$\begin{aligned} \begin{aligned} {\varvec{l}_{k + 1}^u} =(1.4391-h_{1}) {\varvec{l}_k^u}&-0.8782 {\varvec{l}_{k - 1}^u} +{0.4391\varvec{l}_{k - 2}^u}\\&-h_{2} \sum _{i=0}^{k}\varvec{l}_{i}^u+\varvec{\varpi }_{k}^u. \end{aligned} \end{aligned}$$
(27)

Similarly, we can get the following equation:

$$\begin{aligned} \begin{aligned} {\varvec{l}_{k }^u} =(1.4391-h_{1}) {\varvec{l}_{k - 1}^u}&-0.8782 {\varvec{l}_{k - 2}^u} +{0.4391\varvec{l}_{k - 3}^u}\\&-h_{2} \sum _{i=0}^{k-1}\varvec{l}_{i}^u+\varvec{\varpi }_{k-1}^u. \end{aligned} \end{aligned}$$
(28)

By subtracting Eq. (28) from Eq. (27), we get the following equation:

$$\begin{aligned} \begin{aligned} {\varvec{l}_{k + 1}^u}&=(2.4391 -h_{1}-h_{2}) {\varvec{l}_k^u} -(2.3173-h_{1}) {\varvec{l}_{k - 1}^u} \\&+{1.3173 \varvec{l}_{k - 2}^u}-{0.4391\varvec{l}_{k - 3}^u}+\varvec{\varpi }_{k}^u-\varvec{\varpi }_{k-1}^u. \end{aligned} \end{aligned}$$
(29)

Define \(\varvec{\xi } _{{{k }}}^u=[\varvec{\varpi }_{{{k }}}^u-\varvec{\varpi }_{{{k-1}}}^u;0;0;0]\). According to Theorem 2, we get

$$\begin{aligned} \varvec{\sigma }_{k+1}^{u}=D\varvec{\sigma }_{k}^{u}+\varvec{\xi } _{{{k }}}^u, \end{aligned}$$

which can be further derived as follows:

$$\begin{aligned} \begin{aligned} \varvec{\sigma }_{k+1}^{u}&= D\varvec{\sigma }_{k}^{u}+ \varvec{\xi } _{{{k }}}^u\\&=D(D\varvec{\sigma }_{k-1}^{u}+\varvec{\xi } _{{{k-1}}}^u)+\varvec{\xi } _{{{k }}}^u\\&=D^2\varvec{\sigma }_{k-1}^{u}+D\varvec{\xi } _{{{k-1}}}^u+\varvec{\xi } _{{{k}}}^u\\&\quad \vdots \\&=D^k\varvec{\sigma }_{1}^{u}+\sum _{n=0}^{k-1} D^{n} \varvec{\xi }_{k-n}^{u}. \end{aligned} \end{aligned}$$

According to Theorem 2, we get \(\mathop {\lim }\limits _{k \rightarrow \infty } D^k=0\). Therefore, the above equation can be further deduced as:

$$\begin{aligned} \mathop {\lim }\limits _{k \rightarrow \infty }{\varvec{\sigma }_{k+1}^u}=\sum _{n=0}^{k-1} D^{n} \varvec{\xi }_{k-n}^{u}. \end{aligned}$$
(30)

Equation (30) can be expanded as follows:

$$\begin{aligned} \begin{aligned} \lim _{k \rightarrow \infty }\left\| \varvec{\sigma }_{k+1}\right\| _{2}&=\left\| \sum _{n=0}^{k-1} D^{n} \varvec{\xi }_{k-n}\right\| _{2} \\&\le \sum _{n=0}^{k-1}\left\| D^{n}\right\| _{2} \max _{1 \le n \le k}\left\| \varvec{\xi }_{n}\right\| _{2} \\&=\max _{1 \le n \le k}\left\| \varvec{\xi }_{n}\right\| _{2} \left( 1-\Vert D\Vert _{2}^{k}\right) /\left( 1-\Vert D\Vert _{2}\right) \\&<\max _{1 \le n \le k}\left\| \varvec{\xi }_{n}\right\| _{2} /\left( 1-\Vert D\Vert _{2}\right) \\&<2 (r+1) \sup _{1 \le n \le k, 1 \le u \le r+1 }\left| \varvec{\varpi }_{n}^{u}\right| /\left( 1-\Vert D\Vert _{2}\right) . \end{aligned} \end{aligned}$$

Based on the definition of \(\varvec{\varpi }_{n}^{u}\) and the fact that the term \({o}\left( \tau ^{2}\right) \) represents the residual error of TLLND algorithm (13), we can conclude that

$$\begin{aligned} \lim _{k \rightarrow \infty }\left\| \varvec{l}_{k}\right\| _{2}<2 (r+1) \sup _{1 \le n \le k, 1 \le u \le r+1 }\left| \varvec{\varpi }_{n}^{u}\right| /\left( 1-\Vert D\Vert _{2}\right) +{o}\left( \tau ^{2}\right) . \end{aligned}$$

In summary, under the noise \({\varvec{f}}_k = \varvec{a}k\tau + \varvec{b}+\varvec{\varpi }\), the steady-state error of TLLND algorithm (13) for solving CEM scheme (1) is bounded by \(\left\| \varvec{a} \tau / h_{2}\right\| _{2}+2 (r+1)\sup _{1 \le n \le k, 1 \le u \le r+1 }\left| \varvec{\varpi }_{n}^{u}\right| /\left( 1-\Vert D\Vert _{2}\right) +{o}\left( \tau ^{2}\right) .\) Thus, the proof is completed.

4 Application to Arctic Sea Ice Extraction

Theoretical analyses guarantee the convergence performance of the proposed TLLND algorithm (13) for solving CEM problem (1). To further verify its generalization performance, we apply it to an intuitive problem: Arctic sea ice extraction.

4.1 Satellite Imagery

It is important to note that Arctic sea ice extraction is based on remote sensing satellite imagery, so reliable and accurate satellite imagery is essential. Comparative information on the various optional remote sensing satellites can be found in Table 2. In particular, the LandSat 8, Sentinel-2A, and Terra/Aqua satellites have the advantages of wide observation coverage and simple pre-processing of the images. Therefore, the selection of imagery from these three satellites suffices to demonstrate that the proposed algorithm can reliably and accurately extract sea ice from different satellite imagery at different image resolutions, which also demonstrates the generalization of the proposed algorithm.

Table 2 Comparisons among different optional remote sensing satellites

4.2 Evaluation Indices

In order to reflect the superiority of the proposed algorithm, the following evaluation indices are given to measure the extraction performance: mean square error (MSE), normalized mean square error (NMSE), peak signal noise ratio (PSNR), overall classification accuracy (OA), average classification accuracy (AA), and producer’s accuracy (PA). In addition, the definitions of some indices are given below:

$$\begin{aligned} \left\{ \begin{aligned} \textrm{M S E}&=\frac{1}{n^2} \sum _{i=0}^{n-1} \sum _{j=0}^{n-1}[I(i, j)-K(i, j)]^2\\ \textrm{N M S E}&=\frac{\sum _{i=0}^{n-1} \sum _{j=0}^{n-1}[I(i, j)-K(i, j)]^2}{\sum _{i=0}^{n-1} \sum _{j=0}^{n-1}[I(i, j)]^2}\\ \textrm{PSNR}&=10\cdot \log _{10}{\frac{\textrm{MAX}^{2} }{ \textrm{MSE} } } \\ \textrm{OA}&=\frac{\textrm{TP}+\textrm{TN}}{\textrm{TP}+\textrm{FP}+\textrm{TN}+\textrm{FN}} \\ \textrm{AA}&=\frac{1}{2}\left( \frac{\textrm{TP}}{\textrm{TP}+\textrm{FN}}+\frac{\textrm{TN}}{\textrm{TN}+\textrm{FP}}\right) \\ \textrm{PA}&=\frac{\textrm{TP}}{\textrm{TP}+\textrm{FP}}, \end{aligned}\right. \end{aligned}$$

where I(i, j) and K(i, j) denote the gray value of the ith-row, jth-column pixel in the input and ground-truth Arctic sea ice remote-sensing images, respectively, and \(\textrm{MAX}\) represents the maximum possible pixel value. In addition, TP represents the number of correctly identified positive pixels, TN denotes the number of correctly identified negative pixels, FP stands for the number of pixels falsely identified as positive, and FN denotes the number of pixels falsely identified as negative.
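A compact sketch of these indices is given below (our own illustration; treating pixels with positive gray value as sea ice is an assumed binarization, and np.mean generalizes the 1/n² factor to non-square images).

```python
import numpy as np

def evaluation_indices(I, K, max_val=255.0):
    """MSE/NMSE/PSNR on gray values; OA/AA/PA on binarized maps.

    I: input (extracted) image, K: ground-truth image, as 2-D arrays.
    """
    I = I.astype(float); K = K.astype(float)
    mse = np.mean((I - K) ** 2)
    nmse = np.sum((I - K) ** 2) / np.sum(I ** 2)
    psnr = 10.0 * np.log10(max_val ** 2 / mse)
    p, t = I > 0, K > 0                      # positive pixel = sea ice
    tp = np.sum(p & t); tn = np.sum(~p & ~t)
    fp = np.sum(p & ~t); fn = np.sum(~p & t)
    oa = (tp + tn) / (tp + fp + tn + fn)
    aa = 0.5 * (tp / (tp + fn) + tn / (tn + fp))
    pa = tp / (tp + fp)
    return dict(MSE=mse, NMSE=nmse, PSNR=psnr, OA=oa, AA=aa, PA=pa)
```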

4.3 Experiment 1: Different Satellite Imagery

In this subsection, three satellite images from Sentinel-2A, LandSat 8, and Terra/Aqua are employed as experimental data to perform Arctic sea ice extraction experiments aided by the proposed TLLND algorithm (13), DTZN algorithm (15), and MNI algorithm (16). In addition, the following experiments focus on evaluating the performance of the above algorithms in the presence of random noise \(\varvec{f}_k \in 4 \times [-1,1]\) and in the absence of noise \(\varvec{f}_k =0\), with \(h_1=0.8\) and \(h_2=0.08\).

4.3.1 Sentinel-2A

Figure 2a shows the original satellite image of the Arctic sea ice, downloaded from the European Space Agency Copernicus Open Access Hub, whose size is 1098 × 1098 pixels with a 10 m × 10 m spatial resolution. The visualized results in Fig. 2b–d indicate that TLLND algorithm (13) and MNI algorithm (16) perform well in Arctic sea ice extraction under the zero noise environment, whereas the higher-order DTZN algorithm (15) appears unable to handle such problems.

In an environment with random noise, the extraction results are shown in Fig. 2f–h, and the processing result of DTZN algorithm (15) is still very unsatisfactory. The proposed TLLND algorithm (13) appears to be less affected by the noise than MNI algorithm (16). From Table 3, we can also see that TLLND algorithm (13) achieves better MSE, NMSE, PSNR, OA, and PA values, which confirms the visual comparison.

Fig. 2 Visualized results of different algorithms to aid CEM scheme (1) for the Arctic sea ice extraction from Sentinel-2A image. a Original satellite image. b–d Extraction results of TLLND algorithm (13), DTZN algorithm (15), and MNI algorithm (16) under the zero noise environment. e Ground-truth image. f–h Extraction results of TLLND algorithm (13), DTZN algorithm (15), and MNI algorithm (16) under random noise \(\varvec{f}_k \in 4 \times [-1,1]\)

Fig. 3 Visualized results of different algorithms to aid CEM scheme (1) for the Arctic sea ice extraction from LandSat 8 image. a Original satellite image. b–d Extraction results of TLLND algorithm (13), DTZN algorithm (15), and MNI algorithm (16) under the zero noise environment. e Ground-truth image. f–h Extraction results of TLLND algorithm (13), DTZN algorithm (15), and MNI algorithm (16) under random noise \(\varvec{f}_k \in 4 \times [-1,1]\)

Table 3 Comparison of Arctic sea ice extraction performance indexes of different algorithms on Sentinel-2A satellite imagery
Table 4 Comparison of Arctic sea ice extraction performance indexes of different algorithms on LandSat 8 satellite imagery


4.3.2 LandSat 8

Next, Fig. 3a shows the original satellite image of the Arctic sea ice, downloaded from the USGS Global Visualization Viewer, whose size is 1098 × 1071 pixels with a 30 m × 30 m spatial resolution. The DTZN algorithm (15) still appears unable to handle this kind of problem, whereas the proposed TLLND algorithm (13) and MNI algorithm (16) achieve the same performance under the zero noise environment. In the random noise environment, Table 4 shows that TLLND algorithm (13) performs better than MNI algorithm (16).

4.3.3 Terra/Aqua

The third extraction experiment is based on the satellite image in Fig. 4a, obtained from the National Aeronautics and Space Administration website, whose size is 1092 × 1092 pixels with a 500 m × 500 m spatial resolution. Compared comprehensively with the other two algorithms, TLLND algorithm (13) still performs better on the sea ice extraction problem, especially in the random noise environment.

The three experiments use remote sensing imagery from different satellites, whose backgrounds include not only sea ice and seawater but also clouds and fragmented sea ice, which greatly increases the difficulty of the Arctic sea ice extraction task. The proposed TLLND algorithm (13) still achieves high accuracy under such conditions, reflecting its stable performance and robustness.

Fig. 4 Visualized results of different algorithms to aid CEM scheme (1) for the Arctic sea ice extraction from Terra/Aqua image. a Original satellite image. b–d Extraction results of TLLND algorithm (13), DTZN algorithm (15), and MNI algorithm (16) under the zero noise environment. e Ground-truth image. f–h Extraction results of TLLND algorithm (13), DTZN algorithm (15), and MNI algorithm (16) under random noise \(\varvec{f}_k \in 4 \times [-1,1]\)

Table 5 Comparison of Arctic sea ice extraction performance indexes of different algorithms on Terra/Aqua satellite imagery

4.4 Experiment 2: Different Noise Environment

In this part, a high-resolution observation image from Landsat 8 OLI is used as experimental data to extract Arctic sea ice under various noise disturbances, that is, zero noise \(\varvec{f}_k =0\), constant noise \(\varvec{f}_k =4\), linear noise \(\varvec{f}_k =5k\), and random noise \(\varvec{f}_k \in 4 \times [-1,1]\). The original LandSat 8 satellite image of the Arctic sea ice, downloaded from the USGS Global Visualization Viewer, has a size of 6504 × 6261 pixels and a spatial resolution of 30 m × 30 m. Figure 5 and Table 6 illustrate the results of the extraction experiments.

4.4.1 Zero Noise

It can be found that the MSE, NMSE, PSNR, OA, AA, and PA values of TLLND algorithm (13) and MNI algorithm (16) are the same in the extraction results, suggesting that the two algorithms have the same performance in the zero noise environment. The results of DTZN algorithm (15) are still poor, and only a rough outline can be extracted.

4.4.2 Constant Noise

Under the constant noise environment, all performance indexes of TLLND algorithm (13) are better, indicating that TLLND algorithm (13) has better noise tolerance than DTZN algorithm (15) and MNI algorithm (16) on the Arctic sea ice extraction task.

4.4.3 Linear Noise

Similar to the previous experimental results, TLLND algorithm (13) performs better than the other algorithms in the face of the time-varying linear noise.

4.4.4 Random Noise

The extraction results of the proposed TLLND algorithm (13) still show the best performance among the compared algorithms when faced with high-resolution images.

4.5 Summary

Through the above comparative experiments, it can be concluded that the extraction results of TLLND algorithm (13) and MNI algorithm (16) have the same performance in the zero noise environment. However, under the various noise environments, the extraction performance of the proposed TLLND algorithm (13) is the best among the three algorithms, which indicates that the proposed TLLND algorithm (13) has the best generalization and noise-tolerance ability.

Fig. 5 Visualized results of different algorithms to aid CEM scheme (1) for the Arctic sea ice extraction from Landsat 8 image under various noise environments. a Original satellite image. b Ground-truth image. c–e Extraction results of TLLND algorithm (13), DTZN algorithm (15), and MNI algorithm (16) under the zero noise environment. f–h Extraction results of TLLND algorithm (13), DTZN algorithm (15), and MNI algorithm (16) under constant noise \(\varvec{f}_k =4\). i–k Extraction results of TLLND algorithm (13), DTZN algorithm (15), and MNI algorithm (16) under linear noise \(\varvec{f}_k =5k\). l–n Extraction results of TLLND algorithm (13), DTZN algorithm (15), and MNI algorithm (16) under random noise \(\varvec{f}_k \in 4 \times [-1,1]\)

Table 6 Comparison of Arctic sea ice extraction performance indexes of different algorithms on high-resolution LandSat 8 satellite imagery under different noise disturbances

5 Conclusion

In this paper, a transfer-learning-like neural dynamics (TLLND) algorithm has been proposed, which combines the advantages of neural dynamics (ND) and deep learning and can extract sea ice from different satellite remote sensing imagery. A parametric method has been used to estimate the coefficients in the ND algorithm to obtain the TLLND algorithm, which is then used to aid the constrained energy minimization scheme for the sea ice extraction task. Rigorous theoretical proofs for the convergence and stability of the proposed TLLND algorithm have been provided, and its superiority in effectiveness, robustness, and generalization performance has been shown through comparative experiments with other state-of-the-art algorithms. This research provides a new idea and framework for solving complex image processing problems by combining dynamical systems and deep learning methods. While the proposed algorithm has demonstrated promising performance in Arctic sea ice extraction, there is still room for improvement in terms of its robustness in complex environments, generalization capability to new datasets, computational efficiency, and applicability to other remote sensing image processing tasks. Future work could focus on addressing these limitations and combining the algorithm with other advanced techniques to broaden its practical applications.