Article

An Infrared Small Moving Target Detection Method in Complex Scenes Based on Dual-Region Search

1 Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China
2 Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, Chinese Academy of Sciences, Beijing 100190, China
3 Key Laboratory of Target Cognition and Application Technology (TCAT), Chinese Academy of Sciences, Beijing 100190, China
4 School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 101408, China
* Author to whom correspondence should be addressed.
Submission received: 2 December 2024 / Revised: 13 January 2025 / Accepted: 13 January 2025 / Published: 17 January 2025

Abstract: Infrared (IR) small target detection is a crucial component of infrared imaging systems and is vital for applications in surveillance, security, and early warning systems. However, most existing algorithms for detecting small targets in infrared imagery struggle to achieve both high accuracy and high speed, particularly in complex scenes. Additionally, infrared image sequences frequently exhibit gradual background changes as well as sudden alterations, which further complicates the detection of small targets. To address these issues, a dual-region search method (DRSM) is proposed and combined with multi-directional filtering, min-sum fusion, and clustering techniques, forming an infrared small moving target detection method for complex scenes. First, a multi-directional filter bank is proposed so that, after filtering, the original infrared image sequence retains only point-like features. Then, several consecutive filtered feature maps are superimposed into one, where the moving target leaves a trajectory due to its motion characteristics. Finally, based on the trajectory, a dual-region search strategy is employed to pinpoint the exact location of the target. The experimental results show that the proposed approach outperforms alternative algorithms in both detection accuracy and speed, particularly in diverse real-world complex scenarios.

1. Introduction

An infrared imaging system typically translates the infrared emissions from both the target and its surroundings into electronic signals for visualization [1]. In recent years, as thermal imaging technology has steadily advanced, infrared searching and tracking systems (IRSTSs) have found increasing application in surveillance, security, and early warning. Accordingly, algorithms for detecting small moving targets in infrared imagery, the core step of IRSTSs, have developed rapidly as well, and a range of single-frame and sequence-based detection methods have emerged. Nevertheless, because of the wide imaging range, the target typically occupies only a few pixels in the system’s output image, leaving no detailed texture, structural, or color information. Furthermore, atmospheric refraction, scattering, and other effects weaken the target’s brightness. Concurrently, random electrical noise generated within the system produces high-brightness noise points in infrared images, making small targets even harder to distinguish. Finally, in a variety of real complex scenes, problems such as occlusion may also arise, which heightens the challenge of detecting small, faint infrared targets. Thus, infrared detection of small moving targets remains a formidable challenge.

1.1. Related Works

An intuitive approach to detection is to imitate the mechanisms of human perception. Thus, target detection methods based on the human visual system (HVS) have emerged [2]. Chen et al. [3] first proposed the local contrast measure (LCM) based on the HVS. In the following years, Han et al. [4] added a sliding window to improve the LCM algorithm. Wei et al. [5] further proposed the multi-scale patch-based contrast measure (MPCM), which is often used as a fast algorithm in relatively simple scenarios. Subsequently, by adding an adaptive threshold, Han et al. [6] proposed the relative local contrast measure (RLCM). With a novel tri-layer sliding window, Han et al. [7] proposed the tri-layer local contrast measure (TLLCM) and achieved superior performance. Furthermore, through the weighting and stretching of the LCM, the weighted strengthened local contrast measure (WSLCM) was also introduced [8]. In the past two years, Cui et al. [9] fully utilized the saliency map to design a hollow side window filter (HSWF), which is very effective for background estimation. Moreover, Li et al. [10] leveraged the homogeneous compactness of the small target and its discontinuity with the surroundings and proposed an innovative approach for detecting small infrared targets. Generally speaking, these HVS-based algorithms are relatively simple and fast and require no prior information. However, they can be ineffective in real complex scenes, especially when the background contains an excessive number of bright spots or the target is too small and dim, causing the target to be obscured by the background.
With continuing scientific and technological advances, target detection has also embraced unsupervised methods based on Robust Principal Component Analysis (RPCA). Under the hypothesis that an infrared image generally consists of a uniform background, a single object of interest, and sparse noise, it can be roughly decomposed into a low-rank component and a sparse component. The challenge of target detection can thus be formulated as a low-rank and sparse decomposition (LRSD) problem. Gao et al. [11] proposed the infrared patch-image (IPI) model based on LRSD. Zhang et al. [12] proposed the partial sum of the tensor nuclear norm (PSTNN) through local prior mapping and tensor singular value decomposition. Hu et al. [13] introduced an innovative multi-frame spatiotemporal patch-tensor (MFSTPT) model for detecting IR targets against complex backgrounds; this model samples simultaneously in the spatial and temporal domains and approximates the rank of the tensor. Based on the coarse-to-fine structure feature (MCFS), Ma et al. [14] proposed another detection method to suppress the background. Making full use of spatiotemporal information, Aliha et al. [15] proposed a novel spatiotemporal block-matching patch-tensor (STBMPT) model to improve the IPI. Considering that targets typically exhibit high local salience (HLS) in contrast to noise, Liu et al. [16] introduced a new approach that utilizes HLS. Furthermore, Wu et al. [17] proposed a novel method that suppresses strong edges and utilizes local feature information. Luo et al. [18] introduced a 4-D spatiotemporal field to address the LRSD optimization problem. Moreover, Luo et al. [19] also estimated the low-rank background accurately and enhanced the target effectively through spatiotemporal information and an improved tensor nuclear norm. Liu et al. [20] incorporated prior factors into the sparse decomposition method, enhancing the accuracy of background estimation. It is worth noting that these RPCA-based algorithms can achieve good results when the model and the scene are consistent, but their effectiveness is strictly limited by these conditions. Meanwhile, the complexity of the mathematical model somewhat diminishes their real-time capability.
Furthermore, considering the motion characteristics of the targets, scholars have also proposed sequence-based detection algorithms (multi-frame algorithms) such as the three-dimensional matched-filtering algorithm [21], spatiotemporal saliency model [22], pipeline filter [23], frame difference [24,25], and so on. Deng [26] utilized spatial and temporal local contrast to propose a sequence-based method but failed to suppress the background well. These algorithms primarily exploit the target’s movement characteristics, in conjunction with the temporal information of the sequences, to obtain the target’s location for conducting detection and tracking in subsequent work. They can solve some problems that the single-frame-based algorithms cannot deal with and enhance the precision of target detection, yet demand extensive data storage and high space occupancy rates, which also lead to poor real-time performance.
With the adoption of deep learning, target detection algorithms have grown noticeably more diverse. Wang et al. [27] put forward an asymmetric patch attention fusion network (APAFNet) designed to integrate semantic content with spatial information. Chen et al. [28] proposed a sliced spatiotemporal network (SSTNet) based on cross-slice motion characteristics. Considering that existing techniques often fail to capture comprehensive global information, Yuan et al. [29] introduced the Spatial-Channel Cross Transformer Network (SCTransNet) to tackle this issue, especially when targets closely resemble the background. Additionally, taking advantage of features related to target dimensions and grayscale values, Sun et al. [30] put forward the Receptive-Field and Direction-Influenced Attention Network (RDIAN). Li et al. [31] introduced mixed precision into network quantization, improving segmentation performance. Furthermore, Xiao et al. [32] combined traditional methods with deep learning, reducing the need for labels and significantly improving computational efficiency, providing a new approach for IR small target detection. Compared with traditional algorithms, deep-learning-based IR target detection methods can automatically extract features and show improved robustness and real-time execution. However, infrared images tend to be blurry and the targets diminutive, which makes it challenging to provide usable features for the network to learn. This creates a bottleneck for deep learning algorithms.

1.2. Motivation

Detection algorithms based on HVS have a significant advantage in terms of fast detection speed, such as LCM, MPCM, and HB-MLCM, but their drawbacks lie in their inability to handle scenes with complex backgrounds or targets that are too small. If the target is too small, it is easily filtered out, leading to missed detections. To ensure the target is retained, the proposed algorithm (DRSM) introduces a multi-directional filtering and min-sum fusion strategy. The DRSM designs the filter kernel based on the morphological characteristics of the target to enhance target points and then uses the anisotropy of the background to suppress the background and edges through feature fusion, which further enhances the target. In this way, simple convolutional filtering can effectively suppress the background and retain the target to the greatest extent possible.
RPCA-based detection algorithms perform well in terms of target detection rate and background suppression. However, these methods treat images as matrices, construct tensors, and use constraint conditions to perform a sparse decomposition that solves for the optimal model, a process that involves a large amount of mathematical computation and thus has a lower detection speed. For example, MFSTPT [13] and STBMPT [15], as sequence-based detection methods, fully utilize the spatiotemporal information of infrared image sequences to construct tensors; although they achieve good detection accuracy, their detection speed is relatively slow. To avoid complex computations and enhance timeliness, the DRSM employs a clustering method to compensate for imperfections in the filtering stage, such as retained point-like noise, so that there is no need to design complex algorithms in pursuit of a perfect filtering effect.
Most current algorithms cannot identify the true target in a saliency feature map containing multiple candidate points, or can only retain a single feature point as the target through threshold segmentation. However, such methods are highly uncertain, and designing the segmentation threshold is difficult. Figure 1 shows such a situation, where (a) and (b) are saliency feature maps containing one true target and two false alarms whose grayscales vary. If the true target has the maximum grayscale value, as shown in (a), then an appropriate segmentation threshold can successfully retain the target, as shown in (c). Conversely, if a false alarm has a higher grayscale value, then determining the target through the segmentation threshold fails, as shown in (e). Consequently, we extract the approximate area of the target through clustering and then precisely lock onto the unique target through a dual-region search method, efficiently solving the target judgment problem even when the target’s grayscale is lower than that of the false alarms.
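As a concrete toy illustration of this failure mode (all grayscale values below are invented for the example, not taken from Figure 1), selecting the target by global maximum or by a high threshold keeps the brighter false alarm rather than the true target:

```python
import numpy as np

# Toy saliency map mimicking the Figure 1(b) situation: one true target and
# one false alarm whose grayscale is higher. Values are invented.
sal = np.zeros((32, 32))
sal[10, 10] = 120.0   # true target
sal[25, 5] = 200.0    # brighter false alarm

# Deciding the target by the global maximum (or any threshold above 120)
# retains only the false alarm, which is the failure shown in Figure 1(e).
picked = np.unravel_index(np.argmax(sal), sal.shape)
```

Any grayscale-based decision rule inherits this ambiguity, which is why the DRSM instead decides by trajectory (position over time) rather than by intensity.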
In this paper, to cope with the aforementioned issues, we introduce a detection method for small moving targets in complex infrared scenes based on dual-region search (DRSM). This method primarily utilizes the target’s motion characteristics to aggregate trajectories, identifies the area of target movement, and thereby locks the target’s location through dual-region search. The principal contributions of this manuscript are compiled as follows:
  • A multi-directional filter bank and min-sum feature fusion are proposed to quickly eliminate the background and retain the weak target.
  • The DBSCAN clustering strategy is introduced to extract the target’s potential area. It fully exploits the spatiotemporal information of infrared image sequences and acquires the trajectory of the moving target. Moreover, the target area obtained from clustering directly suppresses false alarms in the non-target area, significantly enhancing the performance of the algorithm.
  • To precisely detect the target’s location, a dual-region search method based on fix region and dynamic region is proposed. It can dynamically adjust the search location and range according to the changes in the background, thereby improving the detection rate.
  • A method for detecting small infrared (IR) targets has been proposed to identify small targets swiftly and precisely within infrared image sequences, including those with complex backgrounds.
The structure of the paper is organized in the following way. Section 2 elaborates on the proposed approach with a detailed explanation and includes theoretical discussions. Section 3 offers a comprehensive analysis of the experimental findings on different infrared image sequences in contrast to alternative methods. In Section 4, we discuss the advantages and disadvantages of the proposed method by combining two examples. We present the conclusion of this paper in Section 5.

2. Proposed Algorithm

2.1. Multi-Directional Filtering

Firstly, inspired by the Sobel edge detection algorithm [33], a new filter bank f_bk is designed, which filters out the large-scale background of the infrared image while preserving the edges. Figure 2 shows the configuration of the filter bank. It includes eight filter kernels, each measuring 5 × 5, representing eight filtering directions. In each direction, correction factors are set so that the filter kernel coefficients sum to zero, which is the key to filtering out the large-scale background. The design of the filter kernel follows the optimization process of the Sobel edge detection operator. In theory, more filtering directions yield better background suppression but also higher computational complexity, so eight directions are a reasonable compromise. To differentiate eight directions, the kernel must be 5 × 5 or larger; however, enlarging the kernel increases the algorithm’s computational complexity. Therefore, to balance filtering effect and timeliness, a 5 × 5 kernel with 8 directions is appropriate.
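The paper’s actual kernel coefficients are given in its Figure 2, which is not reproduced here. The following Python sketch therefore only illustrates the stated design constraints: one 5 × 5 kernel per direction, a dominant positive centre weight (consistent with point-like targets), and coefficients corrected to sum to zero so that a flat background filters to exactly zero. The specific weights are illustrative assumptions, not the paper’s values.

```python
import numpy as np

# Eight filtering directions (unit steps on the pixel grid).
DIRECTIONS = [(0, 1), (1, 1), (1, 0), (1, -1),
              (0, -1), (-1, -1), (-1, 0), (-1, 1)]

def make_kernel(direction):
    """Build one illustrative 5x5 directional kernel of the bank f_bk."""
    dy, dx = direction
    k = np.zeros((5, 5))
    k[2 + dy, 2 + dx] = -2.0          # negative lobe one step along the direction
    k[2 + 2 * dy, 2 + 2 * dx] = -1.0  # weaker negative tap two steps out
    k[2, 2] = 3.0                     # positive centre balances the lobe: sum == 0
    return k

filter_bank = [make_kernel(d) for d in DIRECTIONS]
```

With zero-sum kernels, a uniform region responds with zero in every direction; an isolated bright point responds positively in all eight directions, while an edge gives a non-positive response in at least one direction, which is exactly what the minimum fusion in the next step exploits.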
Secondly, each frame of the original IR image sequence is processed with the filter bank to derive 8 directional feature maps I_n^k (k = 1, …, 8), where n denotes the serial number of each infrared image and k denotes the k-th directional feature map generated through multi-directional filtering. Min-sum fusion is performed on the 8 directional feature maps to obtain a minimum filtered feature map M_n and a summation filtered feature map S_n, where all pixel values less than 0 are set to 0.
M_n(x, y) = \min_k I_n^k(x, y)

S_n = \sum_{k=1}^{8} I_n^k
We compute the Hadamard product of M_n and S_n and normalize the result to obtain the final filtered feature map IF_n for each frame IR_n of the original IR image sequence. The complete procedure of multi-directional filtering and min-sum fusion is given in Algorithm 1, where IR and IF refer to the original image sequence and the filtered feature maps, respectively, and f_bk is the filter bank. The parameters len and width correspond to the length and width of an image frame. Each filtered feature map is arranged in chronological order in the sequence IF. Meanwhile, a map layer index is established to store relevant information for each frame in the sequence.
IF_n = M_n \circ S_n
Algorithm 1 Multi-Directional Filtering and Min-Sum Fusion
Input: IR, f_bk, len, width
Output: IF
 1: for n = 1 : N do
 2:     for k = 1 : 8 do
 3:         I_n^k ← IR_n ∗ f_bk^k
 4:     end for
 5:     for x = 1 : len do
 6:         for y = 1 : width do
 7:             M_n(x, y) ← min_k(I_n^k(x, y))
 8:         end for
 9:     end for
10:     S_n ← Σ_{k=1}^{8} I_n^k
11:     IF_n ← M_n ∘ S_n
12: end for
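Algorithm 1 maps directly onto array operations. The Python sketch below implements one iteration (one frame), assuming some 8-kernel filter bank is supplied; the paper’s actual kernel coefficients are in its Figure 2, so any concrete `kernels` argument here is an assumption.

```python
import numpy as np
from scipy.ndimage import correlate

def filter_frame(ir_frame, kernels):
    """One iteration of Algorithm 1: multi-directional filtering followed by
    min-sum fusion. `kernels` stands in for the filter bank f_bk; negative
    responses are clipped to zero as stated in the paper."""
    # 8 directional feature maps I_n^k
    feats = np.stack([correlate(ir_frame.astype(float), k) for k in kernels])
    m = np.clip(feats.min(axis=0), 0.0, None)  # M_n: minimum fusion suppresses edges
    s = np.clip(feats.sum(axis=0), 0.0, None)  # S_n: summation fusion boosts points
    f = m * s                                  # IF_n = M_n o S_n (Hadamard product)
    peak = f.max()
    return f / peak if peak > 0 else f         # normalise IF_n to [0, 1]
```

Applying `filter_frame` to every frame of the sequence IR, in chronological order, yields the filtered feature map sequence IF used by the subsequent trajectory clustering step.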
In summary, the multi-directional filter bank filters out the large-scale background and preserves the edges, while the minimum fusion suppresses the edges and preserves the points. Moreover, the summation fusion strategy enhances the targets without introducing any clutter, thanks to the Hadamard product. The entire filtering and fusion process involves only convolution and simple computations, making it extremely time-efficient. It is worth mentioning that this process can also enhance point noise; however, this does not affect the final detection outcome, for reasons explained in the following subsection.

2.2. Trajectory Clustering

Trajectory clustering aims to accurately extract target trajectories by classifying data on a two-dimensional plane. Firstly, we select the filtered feature maps within one background-stationary period of the time sequence. These frames are subjected to threshold segmentation using the threshold th and then superimposed to obtain a single-frame feature map I_s.
I_s = \sum_{n=1}^{period} IF_n
Then, to determine the target’s motion range, we adopt a clustering strategy for the feature map I_s. Given the benefits of the DBSCAN clustering algorithm, namely that it does not require the number of clusters to be preset, it can identify clusters of diverse shapes, and it is robust against noise, it is the natural choice for the clustering step of this detection algorithm. The previous stage has filtered out the large areas of clutter, so the target leaves a linear trajectory on the feature image I_s due to its motion characteristics, while the remaining point noise leaves only a few scattered points, whose number increases slightly when the background jitters. Therefore, if the target trajectory is clustered into the category G_t and the clustering yields a total of k classes, denoted G_i (i = 1, …, k), then G_t is the class with the longest Euclidean extent in I_s. The schematic diagram is shown in Figure 3. Consequently, the potential target moving region is determined.
len(G_t) = max_i(len(G_i))
In summary, DBSCAN clustering is adopted to extract target trajectories and, thanks to its characteristics, effectively separates the noise. On the one hand, clustering relies on the positions of pixels, not their grayscale values, which allows faint targets to participate in forming trajectory clusters while bright noise points do not affect the clustering results. On the other hand, the discrimination of trajectories relies solely on the length of the clusters and is independent of grayscale values. These two points also explain why the enhancement of point noise does not affect the final detection results.
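Assuming scikit-learn’s DBSCAN is an acceptable stand-in for the clustering step, the trajectory extraction of this subsection can be sketched as follows. The threshold and the DBSCAN parameters (eps / min_points, the paper’s ε / MinPoints) are illustrative defaults, not the paper’s tuned values.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def target_cluster(i_s, th, eps=3.0, min_points=4):
    """Threshold the superimposed feature map I_s, cluster the surviving
    pixel coordinates with DBSCAN, and return the cluster with the longest
    spatial extent as the target trajectory G_t (None if no cluster)."""
    ys, xs = np.nonzero(i_s > th)
    pts = np.column_stack([ys, xs]).astype(float)
    labels = DBSCAN(eps=eps, min_samples=min_points).fit_predict(pts)
    best, best_len = None, -1.0
    for lbl in set(labels) - {-1}:          # label -1 is DBSCAN noise
        g = pts[labels == lbl]
        # "length" of a cluster: its largest pairwise Euclidean distance
        d = np.linalg.norm(g[:, None, :] - g[None, :, :], axis=-1).max()
        if d > best_len:
            best, best_len = g, d
    return best
```

Note that the decision uses only pixel positions and cluster extent, never grayscale values, mirroring the two robustness properties discussed above.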

2.3. Dual-Region Search

Infrared image sequences are often captured with a stationary camera, leading to slow changes in the background. After a target flies for a while, the lens will move significantly to refocus and center the target, which is a process that has a certain periodicity and can result in abrupt changes in the background. This results in the moving target having the following two spatial characteristics in the image sequence. One is the spatial position continuity of the flying object across successive frames. The other is the periodic regression, meaning that after a period, the target will return to a position near the initial location.
Additionally, the analysis of the infrared image sequence indicates that within the same infrared image sequence, the target’s grayscale value remains constant or varies slightly. Therefore, we use it as the primary criterion for verifying the correctness of the search results.
According to these characteristics, a dual-region search method is proposed to accurately obtain the target’s position. Firstly, based on the clustering results, we obtain the target trajectory within the initial period frames, which gives the initial motion range of the target. Scanning the filtered feature map sequence IF within this initial range and taking the maximum grayscale point as the target allows us to confirm the target’s position in the initial period frames. Then, using these positions as reference, we statistically determine the most frequent grayscale value of the target in the original images, which is used as the standard grayscale value g_s of the target throughout the infrared image sequence. Next, we traverse the remaining filtered feature map sequence; each search range has a radius r centered on the target’s position in the previous frame. This process is called dynamic region search. Subsequently, the result of the dynamic region search is checked: if it does not meet the criteria, this indicates that the camera has moved and the target has returned to a position near the initial location. At that point, we search the initial range instead, a process called fixed region search. Figure 4 presents a schematic diagram of the dual-region search, where the green square represents the fixed search region, the blue square the dynamic search region, and the red square the target. The diagram illustrates two scenarios: in the first, both search regions can detect the target, while in the second, only the dynamic search region can detect the target.
The dual-region search method is given in Algorithm 2, where IR and IF have the same meanings as in Algorithm 1. [x_l, x_r, y_t, y_b] are the four boundary coordinates of the fixed region, derived from the trajectory clustering boundary; r is the radius of the dynamic search range; and δ and Δ are the grayscale difference threshold and the distance threshold, respectively.
Algorithm 2 Dual-Region Search Method
Input: IR, IF, period, [x_l, x_r, y_t, y_b], g_s, r, δ, Δ
Output: P(x_n, y_n)
1: for i = 1 : period do
2:     P(x_i, y_i) ← arg max(IF_i[x_l : x_r, y_t : y_b])
3: end for
4: for i = period + 1 : n do
5:     P(x_i, y_i) ← arg max(IF_i[x_{i−1} − r : x_{i−1} + r, y_{i−1} − r : y_{i−1} + r])
6:     if |IR_i(x_i, y_i) − g_s| > δ and √((x_i − x_{i−1})² + (y_i − y_{i−1})²) > Δ then
7:         P(x_i, y_i) ← arg max(IF_i[x_l : x_r, y_t : y_b])
8:     end if
9: end for
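Algorithm 2 can be sketched directly in Python. The sketch below follows the pseudocode step by step; the interfaces (lists of 2-D arrays for IR and IF, a tuple for the fixed-region bounds) are assumptions made for illustration.

```python
import numpy as np

def dual_region_search(ir, feats, period, fixed, gs, r, delta_g, delta_d):
    """Algorithm 2 sketch. `ir`/`feats`: original and filtered frame lists;
    `fixed = (xl, xr, yt, yb)`: fixed-region bounds from trajectory clustering;
    `gs`: standard target grayscale; `r`: dynamic search radius;
    `delta_g`/`delta_d`: the grayscale (delta) and distance (Delta) thresholds."""
    xl, xr, yt, yb = fixed

    def argmax_in(f, x0, x1, y0, y1):
        win = f[x0:x1, y0:y1]
        i, j = np.unravel_index(np.argmax(win), win.shape)
        return x0 + i, y0 + j

    pos = []
    for i in range(period):                 # fixed-region search, first `period` frames
        pos.append(argmax_in(feats[i], xl, xr, yt, yb))
    for i in range(period, len(feats)):     # dynamic-region search afterwards
        xp, yp = pos[-1]
        h, w = feats[i].shape
        x, y = argmax_in(feats[i], max(xp - r, 0), min(xp + r + 1, h),
                         max(yp - r, 0), min(yp + r + 1, w))
        # Verify grayscale consistency and spatial continuity; otherwise assume
        # the camera re-centred and fall back to the fixed region.
        if abs(float(ir[i][x, y]) - gs) > delta_g and np.hypot(x - xp, y - yp) > delta_d:
            x, y = argmax_in(feats[i], xl, xr, yt, yb)
        pos.append((x, y))
    return pos
```

A typical use is a target drifting frame to frame and then jumping back near its initial location when the camera re-centres: the dynamic region tracks the drift, and the grayscale-plus-distance check triggers the fixed-region fallback on the jump.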
Finally, by filling the obtained target position coordinates into the map layer index and establishing a complete correspondence between each image frame and the target’s location, the detection of the target within the infrared image sequence is accomplished.
To offer a straightforward visual explanation of the proposed method, the flowchart of the target detection is provided in Figure 5. The core advantages of this algorithm are listed below.
  • Low algorithm complexity and fast detection speed. Multi-directional filtering and min-sum fusion quickly filter out the background, preserving point-like targets and noise. Then, trajectory clustering utilizes the motion characteristics of targets to determine target areas. Finally, the maximum grayscale point in the search region is considered as the target. Thus, the entire detection process avoids complex calculations and improves detection speed.
  • High detection rate and low false alarm rate. Multi-directional filtering and min-sum fusion retain the target as much as possible, improving detection rate. The search area is established to quickly separate the target from a large number of noise points, so the region outside the search area can be fully suppressed, which diminishes the false alarm rate.
  • Detecting the precise position of the target without threshold segmentation. The DRSM extracts the approximate area of the target through clustering and precisely locks the unique target through a dual-region search method, which solves the target judgment problem successfully without threshold segmentation.

3. Experiment and Analysis

The research presented in this paper utilizes the dataset for detecting and tracking small aircraft targets in infrared images provided by Hui et al. [34]. This dataset includes backgrounds like the sky and ground, along with multiple scenarios, totaling 22 segments of data. Five segments of data with complex background and high detection difficulty are selected to conduct the relevant experiments. Table 1 presents the data information.
To illustrate the advantages of the proposed method, eight current mainstream infrared target detection algorithms are selected for comparative analysis: HB-MLCM [35], MPCM [5], PSTNN [12], STLCF [26], TLLCM [7], WTLLCM [36], MFSTPT [13], and STBMPT [15]. The experimental parameters are detailed in Table 2, and the code of the proposed method (DRSM) can be accessed at https://github.com/MagiRabbit/DRSM (accessed on 12 January 2025).

3.1. Evaluation Metrics

To provide an impartial and visual representation of the algorithm’s performance, we have established a comprehensive suite of metrics encompassing both qualitative and quantitative assessments. The former is realized through observation of the two-dimensional detection outcomes complemented by their respective three-dimensional grayscale imagery, thereby providing a direct comparative evaluation of the algorithm’s efficacy. The latter makes the results more objective, precise, and quantifiable through data. The target region and its surrounding neighborhood used in the evaluation metrics are depicted in Figure 6. The target region, indicated by the red square, measures a × b, and the surrounding neighborhood, demarcated by the yellow square, has dimensions of (a + 2d) × (b + 2d).
  • The Background Suppression Factor (BSF) [37] indicates the extent of background suppression; a greater value signifies stronger suppression and superior algorithm performance.
    BSF = δ_in / δ_out
    where δ_in and δ_out are the standard deviations of the input and output images, respectively.
  • The Signal-to-Clutter Ratio Gain (SCRG) demonstrates the degree to which the algorithm enhances the target and suppresses background clutter in the vicinity of the target; a larger SCRG value indicates better performance. SCRG is defined as follows:
    SCRG = SCR_out / SCR_in
    where SCR denotes the signal-to-clutter ratio:
    SCR = (μ_t − μ_b) / σ_b
    where μ_t and μ_b represent the grayscale means of the target and its neighborhood, respectively, and σ_b represents the standard deviation of the neighborhood.
  • In this paper, the receiver operating characteristic (ROC) is a three-dimensional curve plotted over the false alarm rate (F_a), the threshold (τ), and the detection rate (P_d). F_a and P_d are defined as follows [38]:
    P_d = (number of true targets detected) / (total number of true targets)
    F_a = (number of false pixels detected) / (total number of pixels in the image)
    Furthermore, the area under the curve (AUC) indicates the algorithm’s effectiveness. AUC(D, F) is derived from the two-dimensional ROC curve (P_d, F_a) and is employed to assess the overall performance of the algorithm. AUC(D, τ) relates to the two-dimensional ROC curve (P_d, τ), reflecting the detection probability P_d at different threshold values τ, and is used to assess the target detection capability of a detector. AUC(F, τ) is derived from the two-dimensional ROC curve (F_a, τ), focusing on the variation of the false alarm rate with the threshold, and evaluates the background suppression capability of a detector. AUC_OA represents overall accuracy and AUC_SNPR is the signal-to-noise probability ratio; specified as follows, they reflect the comprehensive performance of a detector in signal detection and background noise suppression.
    AUC_OA = AUC(D, F) + AUC(D, τ) − AUC(F, τ)
    AUC_SNPR = AUC(D, τ) / AUC(F, τ)
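For concreteness, the per-image metrics above can be computed as follows. This is a minimal sketch: the target window size (a × b) and neighborhood margin d are illustrative defaults in the Figure 6 layout, not the paper’s experimental settings.

```python
import numpy as np

def scr(img, tx, ty, a=3, b=3, d=10):
    """SCR of an a x b target window centred at (tx, ty) against its
    d-pixel-wide surrounding neighbourhood (Figure 6 layout)."""
    t = img[tx - a // 2 : tx + a // 2 + 1, ty - b // 2 : ty + b // 2 + 1]
    region = img[tx - a // 2 - d : tx + a // 2 + 1 + d,
                 ty - b // 2 - d : ty + b // 2 + 1 + d].astype(float)
    mask = np.ones(region.shape, dtype=bool)
    mask[d : d + t.shape[0], d : d + t.shape[1]] = False  # exclude target window
    mu_t, mu_b, sigma_b = t.mean(), region[mask].mean(), region[mask].std()
    return (mu_t - mu_b) / sigma_b if sigma_b > 0 else float("inf")

def bsf(img_in, img_out):
    """BSF = delta_in / delta_out: ratio of image standard deviations."""
    s_out = img_out.std()
    return img_in.std() / s_out if s_out > 0 else float("inf")

def scrg(img_in, img_out, tx, ty):
    """SCRG = SCR_out / SCR_in around the target position (tx, ty)."""
    return scr(img_out, tx, ty) / scr(img_in, tx, ty)
```

A filtered output that flattens the background while keeping the target should give both BSF and SCRG well above 1.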

3.2. Detection Results of DRSM

The detection outcomes of the proposed approach are depicted in Figure 7, with the actual target indicated by a red square and the fixed search area by a green square. The original images reflect many challenges faced in detecting weak targets in real complex scenarios: an extremely faint target, as in (a1); the target submerged by the road, as in (a2); the target set at the margin of the background, as in (a3); a large area of bright imaging background, as in (a4); and numerous environmental noise points, as in (a5). However, the detection results show that the DRSM effectively addresses these situations by establishing a small search area. It not only suppresses the background well but also accurately detects and marks the targets.

3.3. Robustness to Noise

Robustness to noise is another vital criterion for assessing the algorithm’s effectiveness. Therefore, we added Gaussian white noise with a mean of 0 and variances of 0.002 and 0.01 to the datasets, with the results shown in Figure 8 and Figure 9, respectively. As depicted in Figure 8(a1–e1), the added noise blurs the image and substantially reduces the contrast near the target, making it more difficult to recognize. Moreover, it can be seen from (a2) to (e2) that more noise points appear within the search area, but the final results shown in (a3) to (e3) indicate that the algorithm continues to locate the target precisely. However, as shown in Figure 9, as the variance of the Gaussian white noise increases, the images become more blurred. Relatively salient targets such as those in data06, data08, data15, and data22 can still be accurately detected, but data05 loses its salience because the target is too faint and the image excessively blurred, leading to detection failure.
We can analyze this result from a theoretical perspective. First, in the design of the filter kernel, the central point has a larger weight, which is consistent with the infrared characteristics of point-like small targets; after convolution, point-like targets are enhanced a first time. Then, through min-sum fusion, small targets are enhanced a second time while edges in different directions are suppressed. This continuous enhancement of the target during multi-directional filtering and fusion ensures that infrared small targets are retained, which is the fundamental source of the algorithm’s robustness to noise. On the other hand, the case where excessive noise causes detection failure on data05 indicates that the algorithm still requires the target to have a certain level of local contrast.
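The noise experiments above can be reproduced by adding zero-mean Gaussian white noise of a given variance to an intensity-normalized image (mirroring the variance-on-[0, 1] convention of MATLAB's imnoise); an illustrative sketch:

```python
import numpy as np

def add_gaussian_noise(image, variance, seed=None):
    """Add zero-mean Gaussian white noise to an image normalized to [0, 1]."""
    rng = np.random.default_rng(seed)
    noisy = image + rng.normal(0.0, np.sqrt(variance), size=image.shape)
    return np.clip(noisy, 0.0, 1.0)  # keep intensities in the valid range
```

With variance 0.002 the noise standard deviation is about 0.045 of full scale, already enough to blur a 1 × 1 target of low contrast.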

3.4. Parameter Analyses

The setting of parameters can have a significant impact on the detection results of the algorithm [39]. The DRSM has four main parameters. First, the segmentation threshold th, which filters out some false alarms before DBSCAN clustering. Second, the image sequence jitter period Period, which is also the number of frames whose filtered feature maps are superimposed for clustering; this number should not be too small, otherwise the moving target’s trajectory will be too short, affecting category judgment. Third, the DBSCAN parameters ϵ and MinPoints, which mainly affect the clustering results. Fourth, the radius r of the search range, which determines how far the search range expands once the search center is determined.
Nevertheless, the value of Period, i.e., the number of frames superimposed for clustering, can be appropriately large and has little impact on the algorithm. Moreover, since the algorithm is mainly designed for scenes with complex backgrounds and small targets, the motion trajectories of targets in these scenes are essentially consistent, so the impact of the clustering parameters (ϵ and MinPoints) is also minimal. Therefore, the following mainly discusses the impact of the other two parameters.
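The threshold-then-cluster step that these parameters control can be sketched as follows. The minimal DBSCAN below is for illustration only (in practice a library implementation would be used), the feature-map values are assumed normalized to [0, 1], and the default parameter values follow Table 2:

```python
import numpy as np

def dbscan(points, eps=3.0, min_points=2):
    """Minimal DBSCAN over 2-D points; returns one label per point (-1 = noise)."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    labels = np.full(n, -1)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    neighbors = [np.nonzero(dist[i] <= eps)[0] for i in range(n)]
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or len(neighbors[i]) < min_points:
            continue  # already assigned, or not a core point
        labels[i] = cluster
        frontier = list(neighbors[i])
        while frontier:  # grow the cluster outward from the core point
            j = frontier.pop()
            if labels[j] == -1:
                labels[j] = cluster
                if len(neighbors[j]) >= min_points:
                    frontier.extend(neighbors[j])
        cluster += 1
    return labels

def cluster_candidates(stacked_map, th=0.7, eps=3.0, min_points=2):
    """Threshold a stacked feature map, then cluster the surviving pixels.

    A moving target leaves a trajectory across the superimposed frames, so
    its pixels form one elongated cluster; isolated noise points get label -1.
    """
    coords = np.column_stack(np.nonzero(stacked_map > th))
    labels = dbscan(coords, eps, min_points) if len(coords) else np.array([], int)
    return coords, labels
```

A trajectory of closely spaced bright points clusters together, while an isolated bright pixel is rejected as noise, which is exactly the behavior the trajectory-based class judgment relies on.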
  • Segmentation threshold th. The key to the success of this algorithm lies in clustering and classification: only by correctly identifying the target class (i.e., the motion trajectory) can the target be searched for within that range. Therefore, we varied th and plotted Figure 10 according to whether clustering and trajectory recognition succeeded (1 for success, 0 for failure); the detailed clustering results are shown in Figure 11. Figures 10 and 11 show that clustering succeeds as long as th is greater than or equal to 0.4. Thus, although th affects the clustering results, its range of workable values is relatively broad, because multi-directional filtering enhances the target and thereby relaxes the requirements on th.
  • Search radius r. Clustering and trajectory identification determine the initial motion range; in the subsequent dynamic-region search, the size of the search area is a critical factor. If r is too large, as shown in Figure 12, noise points brighter than the target may enter the search region, leading to false detection. Conversely, if r is too small, as shown in Figure 13, the search area may fail to cover the target, leading to missed detection.
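The dynamic search itself reduces to taking the brightest pixel inside a square window of radius r around the predicted search center. A sketch (the function name is illustrative; the window is clipped at the image borders):

```python
import numpy as np

def search_region_max(feature_map, center, r=5):
    """Return the brightest pixel inside a (2r+1) x (2r+1) search window.

    `center` is the (row, col) search center predicted from the trajectory;
    `r` is the search radius, and the window is clipped at the borders.
    """
    rows, cols = feature_map.shape
    r0, r1 = max(center[0] - r, 0), min(center[0] + r + 1, rows)
    c0, c1 = max(center[1] - r, 0), min(center[1] + r + 1, cols)
    window = feature_map[r0:r1, c0:c1]
    dr, dc = np.unravel_index(np.argmax(window), window.shape)
    return (r0 + dr, c0 + dc)  # position in full-image coordinates
```

With an appropriate r the true target wins; an oversized r lets a brighter distant noise point into the window, reproducing the false detection of Figure 12.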

3.5. 2-D Detection Results

To avoid the impact of threshold segmentation on the detection results of the different methods, all the 2-D detection results in Figure 14 are presented without threshold segmentation, which more directly reflects the background suppression of each algorithm. The targets are marked with red squares; images without a marked target indicate detection failure. The results show that some algorithms, such as PSTNN and STLCF, preserve weak targets but suppress the background poorly, producing many false alarms. Conversely, others, such as HB-MLCM, MPCM, TLLCM, and WTLLCM, suppress the background well but fail to detect weak targets. MFSTPT and STBMPT, two sequence-based, RPCA-based methods that fully exploit the spatiotemporal information of image sequences, perform excellently in background suppression and detection rate but require longer detection times. The DRSM, by contrast, retains weak targets and effectively suppresses background noise in less time. This result can be explained from two aspects: on the one hand, multi-directional filtering and fusion ensure that the target is retained; on the other hand, the dual-region search strategy divides the filtered feature map into a target area and a non-target area and completely suppresses the latter, achieving a strong background suppression effect.

3.6. 3-D Detection Results

The three-dimensional plots in Figure 15 reflect the brightness values of the detection results more intuitively; the targets that can be identified are marked with red ellipses. The two-dimensional results only reflect whether the targets are preserved, but a good detection result should also ensure that the brightest spots correspond exactly to the real targets. For instance, Figure 14 shows that PSTNN and STLCF preserve the targets, but Figure 15 indicates that the targets are very difficult to discern in their results. From this perspective, the DRSM offers the best performance: its background suppression is stronger, and the target is more prominent and easier to identify. Compared with MFSTPT, which achieves suboptimal performance, the DRSM also has fewer false alarms.

3.7. Average Detection Time and Complexity Analysis

This paper conducts the target detection experiments in MATLAB. For each dataset, timing starts when the image sequence is read (the “tic” command) and ends when the filtered feature-map data matrix is output (the “toc” command), excluding plotting. The average time per frame for each dataset is given in Table 3. HB-MLCM and MPCM, two rapid detection algorithms, perform well on detection time and are often applied to simple scenes. PSTNN, a classic algorithm that recovers targets via tensor construction and sparse decomposition, involves a large amount of computation and is relatively time-consuming. TLLCM uses three layers to suppress the background, which takes considerable time; building on TLLCM, WTLLCM introduces sliding windows to handle varying target scales and improve the detection rate. STLCF spends a considerable amount of time computing spatiotemporal local contrast. Notably, although MFSTPT and STBMPT perform well in detection rate and background suppression, Table 3 reveals their biggest drawback: they are very time-consuming.
Compared with the other algorithms, the DRSM achieves the fastest detection speed, and the reasons can be analyzed from the algorithm’s complexity. Suppose the original image has M rows and N columns, so its size is M × N. The algorithm has three parts: multi-directional filtering, trajectory clustering, and dual-region search. First, during multi-directional filtering, the filter kernel size is 5 × 5, so the cost of convolving the original image with the filter bank is O(MN); the fusion process involves summation and taking the minimum, which also costs O(MN). Second, thanks to the strong background suppression during filtering, only a few pixels participate in clustering, so the cost of DBSCAN clustering is negligible. Finally, the search area used to lock the target location is approximately 10 × 10, and the strategy is simply to take the maximum value within that area, which is also negligible. Therefore, the time complexity of the entire algorithm is O(MN).
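The O(MN) filtering-and-fusion stage can be sketched as follows. The 5 × 5 kernels here are illustrative placeholders (center-weighted and direction-sensitive), not the paper's exact filter bank, and the fusion is taken as the pixel-wise minimum of the directional responses: a point target responds strongly in every direction, while an edge responds weakly in at least one, so the minimum suppresses it.

```python
import numpy as np
from scipy.ndimage import convolve

def directional_kernels():
    """Four illustrative 5 x 5 kernels: a large center weight (point-like
    prior) minus a symmetric pair of samples along one direction."""
    kernels = []
    for dr, dc in [(0, 1), (1, 0), (1, 1), (1, -1)]:
        k = np.zeros((5, 5))
        k[2, 2] = 4.0                     # enhance the point-like center
        k[2 + 2 * dr, 2 + 2 * dc] = -2.0  # penalize structure along the
        k[2 - 2 * dr, 2 - 2 * dc] = -2.0  # direction, on both sides
        kernels.append(k)
    return kernels

def filter_and_fuse(image):
    """Convolve with every kernel (each pass costs O(MN)), then fuse by the
    pixel-wise minimum across the directional responses."""
    responses = np.stack([convolve(image, k, mode="nearest")
                          for k in directional_kernels()])
    return responses.min(axis=0)
```

On a synthetic test, an isolated bright point keeps a high fused response, while a pixel on a straight line is driven to zero by the kernel aligned with that line, which is the edge-suppression behavior described above.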
Consequently, the DRSM fully leverages the characteristics of infrared moving targets, accomplishing target detection by skillfully combining filtering, clustering, and search strategies rather than relying on a large amount of complex computation, so it performs best in detection speed.

3.8. BSF and SCRG

The BSF and SCRG of the various algorithms across the datasets are presented in Table 4. The DRSM performs significantly better in BSF, meaning our algorithm achieves superior background suppression. This is because the DRSM delimits a search area and completely suppresses the regions outside it. Nevertheless, the SCRG of the DRSM is lower than that of some other methods, such as HB-MLCM and STBMPT. The reasons can be summarized as follows. First, SCRG assesses background suppression in the vicinity of the target, which lies precisely within the search range. Second, within the search region, background suppression relies entirely on the multi-directional filtering. Finally, the DRSM aims to improve detection speed, so it does not aggressively pursue background suppression in the target neighborhood. As a consequence, some methods that mainly aim to enhance the target or improve local contrast, such as HB-MLCM and STBMPT, achieve a higher SCRG than this method; nonetheless, the SCRG of the DRSM is still superior to that of most algorithms. In conclusion, taking both BSF and SCRG into account, the performance of the DRSM remains the best.
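Under the commonly used definitions (SCR = |μt − μb| / σb computed over the target and its neighborhood, SCRG as the ratio of output to input SCR, and BSF as the ratio of input to output background standard deviation; the paper's exact neighborhood convention may differ), the two metrics can be computed as:

```python
import numpy as np

def scr(image, target_mask, background_mask):
    """Signal-to-clutter ratio: |mu_target - mu_background| / sigma_background."""
    mu_t = image[target_mask].mean()
    bg = image[background_mask]
    return abs(mu_t - bg.mean()) / bg.std()

def bsf_scrg(img_in, img_out, target_mask, background_mask):
    """Background suppression factor and SCR gain (higher is better for both)."""
    bsf = img_in[background_mask].std() / img_out[background_mask].std()
    scrg = (scr(img_out, target_mask, background_mask)
            / scr(img_in, target_mask, background_mask))
    return bsf, scrg
```

Note that BSF rewards flattening the background anywhere in the evaluated region, while SCRG only rewards suppression near the target, which is why the two metrics can rank methods differently, as discussed above.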

3.9. Receiver Operation Characteristic (ROC) Curves and Area Under the Curve (AUC)

The ROC curves in Figure 16 provide comprehensive and intuitive information. First, the curve for the DRSM is almost parallel to the th axis, indicating that threshold segmentation has little impact on its performance; this is because the targets detected by the DRSM are generally the brightest, so they are always retained after threshold segmentation. Additionally, in subfigures (b) (data06) and (d) (data15), Pd does not reach 1 for the DRSM, because missed detections occurred in some frames of these two datasets. Subfigure (a) indicates that all methods except the DRSM perform poorly on data05, as its target has low local contrast and is the most challenging to detect; by contrast, data08 (subfigure (c)) is relatively easy. Moreover, Figure 16 illustrates that the DRSM, MFSTPT, and STBMPT perform best, showing superior detection rates and extremely low false alarm rates. In comparison, PSTNN and STLCF perform well but suffer from a significant false alarm rate, while the remaining methods perform poorly when facing weak targets and changing backgrounds.
Table 5 provides a comprehensive, quantitative comparison across all datasets and algorithms. AUC_OA and AUC_SNPR reflect an algorithm’s overall performance; AUC_OA ranges between 0 and 2, with higher values indicating better performance. As in Table 4, the optimal values are marked in red and the suboptimal values are underlined. Overall, the best-performing algorithms are the DRSM, MFSTPT, and STBMPT, while the worst-performing are HB-MLCM and MPCM. Among the five AUC metrics, the DRSM has the lowest AUC(F,τ), meaning it has the lowest false alarm rate. Compared with the DRSM, MFSTPT and STBMPT perform poorly on data06 and data05, respectively. In contrast, the DRSM performs excellently across all datasets, with superior detection rates and the lowest false alarm rates. This is mainly because the DRSM enhances and retains the target through multi-directional filtering, improving the detection rate for weak, small targets, while also suppressing the vast majority of false alarms by delimiting the target area.

4. Discussion

Nowadays, there is an increasing variety of infrared weak target detection algorithms, yet their fundamental principles remain unchanged. Methods based on background estimation are simple but applicable only to specific scenarios. Component-analysis methods, such as IPI and PSTNN, offer good results but require substantial computation and are slow. Local contrast-based algorithms, such as HB-MLCM and MPCM, are fast but struggle with complex scenes. Moreover, most detection methods rely solely on single-frame images, do not leverage the motion characteristics of targets across frames, and distinguish targets from false alarms by threshold segmentation. Therefore, this paper introduces a rapid detection algorithm aimed at identifying weak targets in complex environments, which locks onto the real target by dual-region search instead of threshold segmentation. The experiments have proven the algorithm’s effectiveness; however, it may still fail in certain special scenarios.
  • Insufficient local contrast. Essentially, this algorithm remains a detection method based on local features. Although it significantly relaxes the local-contrast requirement compared with other local-feature-based methods, excessively low local contrast still greatly increases the difficulty of DBSCAN clustering. The resulting issues fall into two categories: setting the segmentation threshold too high can lose the target, hindering effective clustering, while setting it too low can let dense noise points form clusters, leading to incorrect discrimination. Taking data05 as an example, after adding Gaussian white noise with a mean of 0 and a variance of 0.005, the original segmentation threshold results in clustering failure. We therefore varied the segmentation threshold and ran multiple experiments; as shown in Figure 17, only the group with a threshold of 0.6 achieved successful clustering.
  • Consecutive frame loss in the dataset. One key to the algorithm’s success is clustering, which relies on the continuity of motion, so the algorithm is unsuitable for datasets where the target’s position spans too wide a range or is randomly dispersed. If a dataset suffers from consecutive frame loss, the continuity of the target’s position is broken, introducing a degree of randomness that can cause clustering failure or render the regional search ineffective. Figure 18 illustrates an example, where (a) and (b) are nominally consecutive frames from data15; multiple frames may have been lost between them, producing a large disparity in the target’s location across the two adjacent frames and causing the dual-region search to fail. If these two frames are used for stacking and clustering, clustering may also fail.
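A cheap guard against this failure mode is to flag a probable frame gap whenever the detected target jumps farther between consecutive frames than the dynamic search window can cover. This heuristic is a sketch we suggest, not part of the paper's method; the threshold simply reuses the search radius r:

```python
def frame_gap_suspected(prev_pos, cur_pos, r=5):
    """Flag a probable dropped-frame gap: if the target moves farther than
    the search radius r between consecutive frames, the dual-region search
    can no longer follow the trajectory. Chebyshev distance is used because
    the search window is square."""
    dr = abs(cur_pos[0] - prev_pos[0])
    dc = abs(cur_pos[1] - prev_pos[1])
    return max(dr, dc) > r
```

When the flag fires, the stacking buffer could be reset and the fixed search region re-estimated by clustering, rather than trusting a trajectory that the lost frames have broken.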

5. Conclusions

In this paper, an infrared small moving target detection method based on dual-region search (DRSM) is proposed. It can pinpoint small moving IR targets quickly and accurately in complex environments, especially those with changing backgrounds. Multi-directional filtering and min-sum fusion improve target identification and reduce background clutter, while the dual-region search locks onto the target’s location among numerous noise points. The experimental results demonstrate that, relative to other detection methods, the DRSM achieves faster detection speed, a higher detection rate, a lower false alarm rate, and better AUC and BSF performance on IR image sequences with changing backgrounds. It can still be improved in future work, for example by adapting the algorithm to multi-target scenes.

Author Contributions

Y.L. and H.C. proposed the original idea. H.C. designed the experiments. H.C. and Z.W. performed the experiments and wrote the manuscript. Y.L. and J.Y. reviewed and edited the manuscript. W.W. revised the manuscript. Y.H. and G.Z. contributed the computational resources and edited the manuscript. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Bingwei Hui et al., “A dataset for infrared image dim-small aircraft target detection and tracking under ground/air background”. Science Data Bank, 28 October 2019 (online). Available at https://www.scidb.cn/en/detail?dataSetId=720626420933459968 (accessed on 5 November 2023).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Chen, Q.; Hamdulla, A. Summary about Detection and Tracking of Infrared Small Targets. In Proceedings of the 2019 12th International Conference on Intelligent Computation Technology and Automation (ICICTA), Xiangtan, China, 26–27 October 2019; pp. 250–253.
  2. Dong, X.; Huang, X.; Zheng, Y.; Shen, L.; Bai, S. Infrared dim and small target detecting and tracking method inspired by Human Visual System. Infrared Phys. Technol. 2014, 62, 100–109.
  3. Chen, C.L.P.; Li, H.; Wei, Y.; Xia, T.; Tang, Y.Y. A Local Contrast Method for Small Infrared Target Detection. IEEE Trans. Geosci. Remote Sens. 2014, 52, 574–581.
  4. Han, J.; Ma, Y.; Zhou, B.; Fan, F.; Liang, K.; Fang, Y. A Robust Infrared Small Target Detection Algorithm Based on Human Visual System. IEEE Geosci. Remote Sens. Lett. 2014, 11, 2168–2172.
  5. Wei, Y.; You, X.; Li, H. Multiscale patch-based contrast measure for small infrared target detection. Pattern Recognit. 2016, 58, 216–226.
  6. Han, J.; Liang, K.; Zhou, B.; Zhu, X.; Zhao, J.; Zhao, L. Infrared Small Target Detection Utilizing the Multiscale Relative Local Contrast Measure. IEEE Geosci. Remote Sens. Lett. 2018, 15, 612–616.
  7. Han, J.; Moradi, S.; Faramarzi, I.; Liu, C.; Zhang, H.; Zhao, Q. A Local Contrast Method for Infrared Small-Target Detection Utilizing a Tri-Layer Window. IEEE Geosci. Remote Sens. Lett. 2020, 17, 1822–1826.
  8. Han, J.; Moradi, S.; Faramarzi, I.; Zhang, H.; Zhao, Q.; Zhang, X.; Li, N. Infrared Small Target Detection Based on the Weighted Strengthened Local Contrast Measure. IEEE Geosci. Remote Sens. Lett. 2021, 18, 1670–1674.
  9. Cui, Y.; Lei, T.; Chen, G.; Zhang, Y.; Peng, L.; Hao, X.; Zhang, G. Hollow Side Window Filter with Saliency Prior for Infrared Small Target Detection. IEEE Geosci. Remote Sens. Lett. 2024, 21, 6001505.
  10. Li, Y.; Li, Z.; Guo, Z.; Siddique, A.; Liu, Y.; Yu, K. Infrared Small Target Detection Based on Adaptive Region Growing Algorithm with Iterative Threshold Analysis. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5003715.
  11. Gao, C.; Meng, D.; Yang, Y.; Wang, Y.; Zhou, X.; Hauptmann, A.G. Infrared Patch-Image Model for Small Target Detection in a Single Image. IEEE Trans. Image Process. 2013, 22, 4996–5009.
  12. Zhang, L.; Peng, Z. Infrared Small Target Detection Based on Partial Sum of the Tensor Nuclear Norm. Remote Sens. 2019, 11, 382.
  13. Hu, Y.; Ma, Y.; Pan, Z.; Liu, Y. Infrared Dim and Small Target Detection from Complex Scenes via Multi-Frame Spatial–Temporal Patch-Tensor Model. Remote Sens. 2022, 14, 2234.
  14. Ma, Y.; Liu, Y.; Pan, Z.; Hu, Y. Method of Infrared Small Moving Target Detection Based on Coarse-to-Fine Structure in Complex Scenes. Remote Sens. 2023, 15, 1508.
  15. Aliha, A.; Liu, Y.; Ma, Y.; Hu, Y.; Pan, Z.; Zhou, G. A Spatial–Temporal Block-Matching Patch-Tensor Model for Infrared Small Moving Target Detection in Complex Scenes. Remote Sens. 2023, 15, 4316.
  16. Liu, Y.; Liu, X.; Hao, X.; Tang, W.; Zhang, S.; Lei, T. Single-Frame Infrared Small Target Detection by High Local Variance, Low-Rank and Sparse Decomposition. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5614317.
  17. Wu, A.; Fan, X.; Min, L.; Qin, W.; Yu, L. Dim and Small Target Detection Based on Local Feature Prior and Tensor Train Nuclear Norm. IEEE Photonics J. 2024, 16, 7800514.
  18. Luo, Y.; Li, X.; Chen, S.; Xia, C. 4DST-BTMD: An Infrared Small Target Detection Method Based on 4-D Data-Sphered Space. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5000520.
  19. Luo, Y.; Li, X.; Chen, S. Feedback Spatial–Temporal Infrared Small Target Detection Based on Orthogonal Subspace Projection. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5001919.
  20. Liu, T.; Yang, J.; Li, B.; Wang, Y.; An, W. Infrared Small Target Detection via Nonconvex Tensor Tucker Decomposition with Factor Prior. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5617317.
  21. Reed, I.; Gagliardi, R.; Stotts, L. Optical moving target detection with 3-D matched filtering. IEEE Trans. Aerosp. Electron. Syst. 1988, 24, 327–336.
  22. Li, Y.; Zhang, Y. Robust infrared small target detection using local steering kernel reconstruction. Pattern Recognit. 2018, 77, 113–125.
  23. Xu, J.; Zhang, J.Q.; Liang, C.H. Prediction of the performance of an algorithm for the detection of small targets in infrared images. Infrared Phys. Technol. 2001, 42, 17–22.
  24. Kim, S.; Sun, S.G.; Kim, K.T. Highly efficient supersonic small infrared target detection using temporal contrast filter. Electron. Lett. 2014, 50, 81–83.
  25. Qu, J.J.; Xin, Y.H. Combined Continuous Frame Difference with Background Difference Method for Moving Object Detection. Acta Photonica Sin. 2014, 43, 219–226.
  26. Deng, L.; Zhu, H.; Tao, C.; Wei, Y. Infrared moving point target detection based on spatial–temporal local contrast filter. Infrared Phys. Technol. 2016, 76, 168–173.
  27. Wang, Z.; Yang, J.; Pan, Z.; Liu, Y.; Lei, B.; Hu, Y. APAFNet: Single-Frame Infrared Small Target Detection by Asymmetric Patch Attention Fusion. IEEE Geosci. Remote Sens. Lett. 2023, 20, 7000405.
  28. Chen, S.; Ji, L.; Zhu, J.; Ye, M.; Yao, X. SSTNet: Sliced Spatio-Temporal Network with Cross-Slice ConvLSTM for Moving Infrared Dim-Small Target Detection. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5000912.
  29. Yuan, S.; Qin, H.; Yan, X.; Akhtar, N.; Mian, A. SCTransNet: Spatial-Channel Cross Transformer Network for Infrared Small Target Detection. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5002615.
  30. Sun, H.; Bai, J.; Yang, F.; Bai, X. Receptive-Field and Direction Induced Attention Network for Infrared Dim Small Target Detection with a Large-Scale Dataset IRDST. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5000513.
  31. Li, B.; Wang, L.; Wang, Y.; Wu, T.; Lin, Z.; Li, M.; An, W.; Guo, Y. Mixed-Precision Network Quantization for Infrared Small Target Segmentation. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5000812.
  32. Xiao, C.; An, W.; Zhang, Y.; Su, Z.; Li, M.; Sheng, W.; Pietikäinen, M.; Liu, L. Highly Efficient and Unsupervised Framework for Moving Object Detection in Satellite Videos. IEEE Trans. Pattern Anal. Mach. Intell. 2024, 46, 11532–11539.
  33. Kanopoulos, N.; Vasanthavada, N.; Baker, R. Design of an image edge detection filter using the Sobel operator. IEEE J. Solid-State Circuits 1988, 23, 358–367.
  34. Hui, B.; Song, Z.; Fan, H.; Zhong, P.; Hu, W.; Zhang, X.; Ling, J.; Su, H.; Jin, W.; Zhang, Y.; et al. A dataset for infrared detection and tracking of dim-small aircraft targets under ground/air background. China Sci. 2020, 5, 291–302.
  35. Shi, Y.; Wei, Y.; Yao, H.; Pan, D.; Xiao, G. High-Boost-Based Multiscale Local Contrast Measure for Infrared Small Target Detection. IEEE Geosci. Remote Sens. Lett. 2018, 15, 33–37.
  36. Cui, H.; Li, L.; Liu, X.; Su, X.; Chen, F. Infrared Small Target Detection Based on Weighted Three-Layer Window Local Contrast. IEEE Geosci. Remote Sens. Lett. 2022, 19, 7505705.
  37. Guan, X.; Peng, Z.; Huang, S.; Chen, Y. Gaussian Scale-Space Enhanced Local Contrast Measure for Small Infrared Target Detection. IEEE Geosci. Remote Sens. Lett. 2020, 17, 327–331.
  38. Bai, X.; Bi, Y. Derivative Entropy-Based Contrast Measure for Infrared Small-Target Detection. IEEE Trans. Geosci. Remote Sens. 2018, 56, 2452–2466.
  39. Liu, T.; Liu, Y.; Yang, J.; Li, B.; Wang, Y.; An, W. Graph Laplacian regularization for fast infrared small target detection. Pattern Recognit. 2025, 158, 111077.
Figure 1. Comparison of two methods for target decision from saliency feature maps. (a,b) are filtered feature maps for two different cases, where the target is brightest in (a) and the false alarm is brightest in (b). (c–f) are the results of the two decision-making methods in both cases.
Figure 2. Multi-directional filter bank. Correction factors are highlighted in yellow.
Figure 3. DBSCAN clustering. Area 1 is the target clustering, Area 2 is the clutter clustering, and Areas 3 and 4 do not form clusters due to the dispersion of points.
Figure 4. Dual-region search. The green square represents the fixed search region, the blue square represents the dynamic search region, and the red square represents the target.
Figure 5. Flowchart of the target system. The green box represents the fixed search region, the yellow box represents the dynamic search region, and the red box represents the target or target class.
Figure 6. Depiction of the target area and neighborhood.
Figure 7. (a1–e1) Original representative images in data05, data06, data08, data15, and data22, respectively. (a2–e2) The filtered feature maps. (a3–e3) Final detection results. The red box represents the target and the green box represents the fixed search region.
Figure 8. (a1–e1) Original images with added Gaussian white noise (variance 0.002) in data05, data06, data08, data15, and data22, respectively. (a2–e2) The filtered feature maps. (a3–e3) Final detection results. The red box represents the target and the green box represents the fixed search region.
Figure 9. (a1–d1) Original images with added Gaussian white noise (variance 0.01) in data06, data08, data15, and data22, respectively. (a2–d2) The filtered feature maps. (a3–d3) Final detection results. The red box represents the target and the green box represents the fixed search region.
Figure 10. Bar chart of clustering results under different segmentation thresholds th.
Figure 11. Clustering results under different segmentation thresholds th. The red box represents the target class detected by the algorithm.
Figure 12. False detection because of large r.
Figure 13. Missed detection because of small r.
Figure 14. (a1–e1) Original representative images in the 5 datasets. (a2–e10) 2-D detection results of HB-MLCM, MPCM, PSTNN, STLCF, TLLCM, WTLLCM, MFSTPT, STBMPT, and DRSM, respectively. The red box represents the target; images without a red box indicate detection failure.
Figure 15. (a1–e1) Original representative images in the 5 datasets. (a2–e10) 3-D detection results of HB-MLCM, MPCM, PSTNN, STLCF, TLLCM, WTLLCM, MFSTPT, STBMPT, and DRSM, respectively. The red box represents the target and the red circle marks the targets that can be identified from the 3-D results.
Figure 16. (a–e) The ROC curves of the different algorithms on data05, data06, data08, data15, and data22, respectively.
Figure 17. (a) Original image with the marked target. (b–f) Clustering results at thresholds of 0.5, 0.55, 0.6, 0.65, and 0.7, respectively. The red boxes represent the targets detected by the algorithm.
Figure 18. Frame loss in the dataset. (a) Frame 322 from data15; (b) frame 323. The green box represents the fixed search area, the red box the detection result, and the yellow box the true target.
Table 1. Information of the infrared sequences.

| Dataset | Frames | ASNR ¹ | Resolution | Target Size |
|---|---|---|---|---|
| data05 | 400 | 5.45 | 256 × 256 | 1 × 1 |
| data06 | 398 | 5.11 | 256 × 256 | 1 × 1 |
| data08 | 332 | 6.07 | 256 × 256 | 1 × 1 |
| data15 | 400 | 3.42 | 256 × 256 | 1 × 1 |
| data22 | 460 | 2.20 | 256 × 256 | 1 × 1 |

¹ Average Signal-to-Noise Ratio.
Table 2. Parameters of the different methods.

| Method | Parameters |
|---|---|
| HB-MLCM [35] | Filter size: 9 × 9; external window: 15 × 15; len = 3, 5, 7, 9 |
| MPCM [5] | Mask size: 3 × 3, 5 × 5, 7 × 7, 9 × 9 |
| PSTNN [12] | Patch size: 40 × 40; step: 40; λ = 3.7/√(min(m, n) · n3); ε = 10⁻⁷ |
| STLCF [26] | tspan = 5, swind = 5 |
| TLLCM [7] | Mask size: 27 × 27 |
| WTLLCM [36] | Window size: 3 × 3, K = 4 |
| MFSTPT [13] | Patch size: 60 × 60; step: 60; λ = 1.0/√(max(n1, n2) · n3) |
| STBMPT [15] | Block size: 30 × 30; μ = 3 × 10³; λ = 0.7/√(max(n1, n2) · n3); θ = 0.4 |
| DRSM | dbscan: [3, 2]; period = 15, th = 0.7, r = 5 |
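Assuming the dbscan pair [3, 2] in Table 2 denotes the neighborhood radius (eps = 3 pixels) and the minimum cluster size (MinPts = 2) — an interpretation not spelled out in the table — the clustering step over candidate pixel coordinates can be sketched with a minimal, self-contained DBSCAN:

```python
from math import hypot

def dbscan(points, eps=3.0, min_pts=2):
    """Minimal DBSCAN over 2-D points; returns one label per point (-1 = noise)."""
    labels = [None] * len(points)

    def neighbors(i):
        # Indices within eps of point i (i itself is included, as in standard DBSCAN).
        return [j for j, (x, y) in enumerate(points)
                if hypot(points[i][0] - x, points[i][1] - y) <= eps]

    cid = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1              # too isolated: noise (a lone false alarm)
            continue
        cid += 1
        labels[i] = cid
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cid         # noise reachable from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cid
            nb = neighbors(j)
            if len(nb) >= min_pts:      # j is a core point: keep expanding
                queue.extend(k for k in nb if labels[k] is None)
    return labels

# Two tight groups of candidate pixels and one isolated point:
labels = dbscan([(0, 0), (1, 1), (2, 0), (50, 50), (51, 50), (100, 100)])
# labels == [0, 0, 0, 1, 1, -1]
```

With MinPts = 2, any candidate pixel with no neighbor within 3 pixels is labeled −1 and discarded, which is how isolated false alarms would be rejected before the trajectory search.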
Table 3. The average detection time (in milliseconds) of the algorithms in 5 datasets.

| Algorithm | data05 | data06 | data08 | data15 | data22 |
|---|---|---|---|---|---|
| HB-MLCM [35] | 19 | 18 | 18 | 18 | 18 |
| MPCM [5] | 26 | 25 | 25 | 25 | 25 |
| PSTNN [12] | 505 | 576 | 525 | 571 | 528 |
| STLCF [26] | 283 | 289 | 266 | 261 | 265 |
| TLLCM [7] | 488 | 496 | 478 | 480 | 486 |
| WTLLCM [36] | 119 | 115 | 118 | 117 | 118 |
| MFSTPT [13] | 101,800 | 81,000 | 65,200 | 55,400 | 59,200 |
| STBMPT [15] | 13,300 | 12,550 | 12,550 | 10,350 | 10,400 |
| DRSM | **9** | **6** | **6** | **6** | **7** |

NOTES: Bold represents the best results.
Table 4. The BSF and SCRG of the algorithms in 5 datasets.

| Dataset | Indicator | HB-MLCM [35] | MPCM [5] | PSTNN [12] | STLCF [26] | TLLCM [7] | WTLLCM [36] | MFSTPT [13] | STBMPT [15] | DRSM |
|---|---|---|---|---|---|---|---|---|---|---|
| Data05 | BSF | 3.3 | 2.2 | 2.5 | 1.3 | 3.7 | 5.1 | *10.5* | 6.7 | **513.2** |
| | SCRG | *16.51* | 11.26 | 4.02 | 3.11 | 2.74 | 5.84 | 4.24 | **55.0** | 4.31 |
| Data06 | BSF | 2.7 | 3.6 | 4.3 | 1.6 | 4.8 | 6.9 | *22.1* | 11.9 | **749.9** |
| | SCRG | *11.19* | **40.55** | 4.43 | 3.61 | 2.15 | 4.17 | 4.93 | 1.94 | 6.33 |
| Data08 | BSF | 4.8 | 4.7 | 3.8 | 2.1 | 5.8 | 9.5 | *23.5* | 10.8 | **519.5** |
| | SCRG | **13.28** | 6.26 | 4.44 | 3.28 | 1.12 | 5.51 | 7.49 | *11.21* | 5.45 |
| Data15 | BSF | 5.5 | 5.0 | 3.5 | 1.5 | 7.4 | 9.5 | *120.8* | 19.4 | **1019.6** |
| | SCRG | **108.61** | 6.31 | 1.82 | 2.51 | 2.29 | 2.99 | 1.59 | *26.01* | 3.04 |
| Data22 | BSF | 9.8 | 9.2 | 11.1 | 4.9 | 13.7 | 16.1 | *144.0* | 53.4 | **333.6** |
| | SCRG | **21.19** | 3.37 | 5.21 | 3.26 | 0.55 | 3.13 | 6.30 | *20.68* | 6.78 |

NOTES: Bold marks the highest value in each row and italics the second highest.
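For context, the two indicators are conventionally defined as BSF = σ_in/σ_out (background standard deviation before vs. after processing) and SCRG = SCR_out/SCR_in, with SCR = |μ_t − μ_b|/σ_b. A minimal sketch under these standard definitions — the paper's exact target and background windows are not reproduced here, so the pixel lists are assumed inputs:

```python
from statistics import mean, pstdev

def scr(target_px, background_px):
    """Signal-to-clutter ratio: |mu_t - mu_b| / sigma_b."""
    return abs(mean(target_px) - mean(background_px)) / pstdev(background_px)

def bsf(background_in, background_out):
    """Background suppression factor: background sigma before
    processing over background sigma after processing."""
    return pstdev(background_in) / pstdev(background_out)

def scrg(target_in, background_in, target_out, background_out):
    """SCR gain: SCR of the processed image over SCR of the input image."""
    return scr(target_out, background_out) / scr(target_in, background_in)

# Toy example: processing scales the background clutter down by 10x
# while the target pixel stays bright.
bg_in, bg_out = [10, 20, 30, 20], [1, 2, 3, 2]
suppression = bsf(bg_in, bg_out)            # ratio of the two sigmas
gain = scrg([100], bg_in, [100], bg_out)    # ratio of output/input SCR
```

Both indicators are "higher is better": a large BSF means flatter residual clutter, and a large SCRG means the target stands out more after processing than before.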
Table 5. The AUC of the algorithms in 5 datasets.

| Dataset | Indicator | HB-MLCM [35] | MPCM [5] | PSTNN [12] | STLCF [26] | TLLCM [7] | WTLLCM [36] | MFSTPT [13] | STBMPT [15] | DRSM |
|---|---|---|---|---|---|---|---|---|---|---|
| Data05 | AUC(D,F) | 0.05 | 0.12 | 0.51 | 0.75 | 0.92 | 0.90 | *0.97* | 0.69 | **1.00** |
| | AUC(D,τ) | 0.05 | 0.02 | 0.22 | 0.13 | 0.39 | 0.23 | *0.58* | 0.40 | **0.67** |
| | AUC(F,τ) (×10⁻³) | 2.9 | 4.0 | 4.8 | 2.6 | 4.3 | 2.2 | 2.3 | *0.8* | **0.1** |
| | AUC_OA | 0.05 | 0.13 | 0.73 | 0.89 | 1.31 | 1.13 | *1.55* | 1.09 | **1.66** |
| | AUC_SNPR | 1.6 | 6.1 | 46.3 | 52.0 | 92.0 | 102.0 | 254.6 | *469.6* | **7698.7** |
| Data06 | AUC(D,F) | 0.26 | 0.35 | 0.35 | 0.65 | 0.85 | *0.90* | 0.58 | **1.00** | **1.00** |
| | AUC(D,τ) | 0.05 | 0.10 | 0.16 | 0.33 | 0.55 | 0.22 | 0.40 | **0.67** | *0.65* |
| | AUC(F,τ) (×10⁻³) | 2.8 | 3.6 | 4.2 | 2.9 | 6.8 | 2.0 | 0.9 | *0.4* | **0.2** |
| | AUC_OA | 0.31 | 0.44 | 0.51 | 0.98 | 1.39 | 1.12 | 1.00 | **1.67** | *1.65* |
| | AUC_SNPR | 16.9 | 26.5 | 37.5 | 114.1 | 81.7 | 109.6 | 427.4 | *1749.5* | **3934.1** |
| Data08 | AUC(D,F) | 0.87 | 0.72 | 0.24 | 0.96 | *0.99* | *0.99* | **1.00** | **1.00** | **1.00** |
| | AUC(D,τ) | 0.34 | 0.24 | 0.06 | 0.44 | *0.51* | 0.30 | **0.67** | **0.67** | **0.67** |
| | AUC(F,τ) (×10⁻³) | 3.1 | 3.0 | 3.5 | 1.7 | 4.4 | 2.2 | 1.3 | *0.6* | **0.1** |
| | AUC_OA | 1.21 | 0.96 | 0.29 | 1.39 | *1.49* | 1.29 | **1.67** | **1.67** | **1.67** |
| | AUC_SNPR | 110.5 | 80.4 | 15.7 | 256.2 | 116.3 | 135.1 | 534.0 | *1204.3* | **6872.1** |
| Data15 | AUC(D,F) | 0 | 0 | 0.55 | 0.58 | 0.62 | 0.81 | **1.00** | 0.70 | *0.94* |
| | AUC(D,τ) | 0 | 0 | 0.13 | 0.03 | 0.25 | 0.19 | **0.67** | 0.42 | *0.62* |
| | AUC(F,τ) (×10⁻³) | 3.1 | 3.6 | 3.5 | 1.8 | 5.1 | 2.5 | 0.7 | *0.3* | **0.1** |
| | AUC_OA | 0 | 0 | 0.68 | 0.61 | 0.87 | 1.00 | **1.67** | 1.12 | *1.60* |
| | AUC_SNPR | 0 | 0 | 37.2 | 17.9 | 49.8 | 74.4 | 933.0 | *1600.0* | **6531.8** |
| Data22 | AUC(D,F) | 0.09 | 0.11 | 0.05 | 0.31 | *0.98* | *0.98* | **1.00** | 0.90 | *0.98* |
| | AUC(D,τ) | 0.01 | 0.02 | 0.02 | 0.06 | 0.49 | 0.29 | **0.67** | 0.60 | *0.66* |
| | AUC(F,τ) (×10⁻³) | 3.2 | 2.3 | 4.3 | 2.9 | 3.8 | 2.4 | 0.6 | *0.4* | **0.1** |
| | AUC_OA | 0.10 | 0.13 | 0.06 | 0.36 | 1.47 | 1.26 | **1.67** | 1.50 | *1.64* |
| | AUC_SNPR | 4.2 | 8.0 | 3.6 | 20.0 | 128.6 | 119.4 | 1149.9 | *1478.8* | **4655.1** |

NOTES: Bold marks the best value in each row and italics the second best; higher is better for all indicators except AUC(F,τ), where lower is better.
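The composite scores in Table 5 follow the usual 3-D ROC conventions: AUC_OA = AUC(D,F) + AUC(D,τ) − AUC(F,τ) and AUC_SNPR = AUC(D,τ)/AUC(F,τ). This can be checked against the table, e.g. for TLLCM on data05: 0.92 + 0.39 − 0.0043 ≈ 1.31. A sketch, with a trapezoid-rule helper for computing each individual AUC from sampled ROC points (names are illustrative):

```python
def auc_trapezoid(xs, ys):
    """Area under a sampled curve via the trapezoid rule."""
    pts = sorted(zip(xs, ys))
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))

def composite_aucs(auc_df, auc_dt, auc_ft):
    """Composite 3-D ROC scores from AUC(D,F), AUC(D,tau), AUC(F,tau)."""
    return {
        "OA": auc_df + auc_dt - auc_ft,  # overall accuracy, higher is better
        "SNPR": auc_dt / auc_ft,         # detection/false-alarm ratio, higher is better
    }

# TLLCM on data05 (Table 5; AUC(F,tau) is reported in units of 1e-3):
scores = composite_aucs(0.92, 0.39, 4.3e-3)   # OA rounds to 1.31
```

Small discrepancies between recomputed and tabulated values (e.g. SNPR) come from the rounding of the three reported AUCs.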

Share and Cite

Cao, H.; Hu, Y.; Wang, Z.; Yang, J.; Zhou, G.; Wang, W.; Liu, Y. An Infrared Small Moving Target Detection Method in Complex Scenes Based on Dual-Region Search. Remote Sens. 2025, 17, 323. https://doi.org/10.3390/rs17020323