Search Results (1,406)

Search Parameters:
Keywords = channel state information

26 pages, 5594 KiB  
Article
Multiscale Residual Weighted Classification Network for Human Activity Recognition in Microwave Radar
by Yukun Gao, Lin Cao, Zongmin Zhao, Dongfeng Wang, Chong Fu and Yanan Guo
Sensors 2025, 25(1), 197; https://doi.org/10.3390/s25010197 - 1 Jan 2025
Abstract
Human activity recognition by radar sensors plays an important role in healthcare and smart homes. However, labeling a large number of radar datasets is difficult and time-consuming, and it is difficult for models trained on insufficient labeled data to obtain exact classification results. In this paper, we propose a multiscale residual weighted classification network with large-scale, medium-scale, and small-scale residual networks. Firstly, an MRW image encoder is used to extract salient feature representations from all time-Doppler images through contrastive learning. This can extract the representative vector of each image and also obtain the pre-training parameters of the MRW image encoder. During the pre-training process, large-scale residual networks, medium-scale residual networks, and small-scale residual networks are used to extract global information, texture information, and semantic information, respectively. Moreover, the time–channel weighting mechanism can allocate weights to important time and channel dimensions to achieve more effective extraction of feature information. The model parameters obtained from pre-training are frozen, and the classifier is added to the backend. Finally, the classifier is fine-tuned using a small amount of labeled data. In addition, we constructed a new dataset with eight dangerous activities. The proposed MRW-CN model was trained on this dataset and achieved a classification accuracy of 96.9%. We demonstrated that our method achieves state-of-the-art performance. The ablation analysis also demonstrated the role of multi-scale convolutional kernels and time–channel weighting mechanisms in classification. Full article
(This article belongs to the Section Biomedical Sensors)
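The abstract does not spell out the time–channel weighting mechanism, so the following is only a minimal squeeze-and-excitation-style sketch of how per-time and per-channel weights could be applied to a (batch, channel, time) radar feature map; the module name, shapes, and reduction ratio are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class TimeChannelWeighting(nn.Module):
    """Reweight (batch, channels, time) features along both the channel and
    time axes with squeeze-and-excitation-style gates (illustrative sketch)."""
    def __init__(self, channels: int, time_steps: int, reduction: int = 4):
        super().__init__()
        self.channel_gate = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())
        self.time_gate = nn.Sequential(
            nn.Linear(time_steps, time_steps // reduction), nn.ReLU(),
            nn.Linear(time_steps // reduction, time_steps), nn.Sigmoid())

    def forward(self, x):                           # x: (B, C, T)
        c_w = self.channel_gate(x.mean(dim=2))      # channel weights, (B, C)
        t_w = self.time_gate(x.mean(dim=1))         # time weights, (B, T)
        return x * c_w.unsqueeze(2) * t_w.unsqueeze(1)

feats = torch.randn(8, 64, 128)                     # hypothetical radar feature map
print(TimeChannelWeighting(64, 128)(feats).shape)   # torch.Size([8, 64, 128])
```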
28 pages, 9860 KiB  
Article
Spatial Cognitive EEG Feature Extraction and Classification Based on MSSECNN and PCMI
by Xianglong Wan, Yue Sun, Yiduo Yao, Wan Zuha Wan Hasan and Dong Wen
Bioengineering 2025, 12(1), 25; https://doi.org/10.3390/bioengineering12010025 - 31 Dec 2024
Viewed by 202
Abstract
With the aging population rising, the decline in spatial cognitive ability has become a critical issue affecting the quality of life among the elderly. Electroencephalogram (EEG) signal analysis presents substantial potential in spatial cognitive assessments. However, conventional methods struggle to effectively classify spatial cognitive states, particularly in tasks requiring multi-class discrimination of pre- and post-training cognitive states. This study proposes a novel approach for EEG signal classification, utilizing Permutation Conditional Mutual Information (PCMI) for feature extraction and a Multi-Scale Squeezed Excitation Convolutional Neural Network (MSSECNN) model for classification. Specifically, the MSSECNN classifies spatial cognitive states into two classes—before and after cognitive training—based on EEG features. First, the PCMI extracts nonlinear spatial features, generating spatial feature matrices across different channels. SENet then adaptively weights these features, highlighting key channels. Finally, the MSCNN model captures local and global features using convolution kernels of varying sizes, enhancing classification accuracy and robustness. This study systematically validates the model using cognitive training data from a brain-controlled car and manually operated UAV tasks, with cognitive state assessments performed through spatial cognition games combined with EEG signals. The experimental findings demonstrate that the proposed model significantly outperforms traditional methods, offering superior classification accuracy, robustness, and feature extraction capabilities. The MSSECNN model’s advantages in spatial cognitive state classification provide valuable technical support for early identification and intervention in cognitive decline. Full article
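PCMI builds on ordinal-pattern (permutation) statistics of EEG time series. As background, a minimal permutation-entropy sketch for a single channel is given below; roughly speaking, the full PCMI additionally conditions the mutual information between two channels' ordinal-pattern series on a past state, which this sketch does not implement.

```python
import math
from itertools import permutations
import numpy as np

def permutation_entropy(x, order: int = 3, delay: int = 1) -> float:
    """Bandt-Pompe permutation entropy of a 1-D signal, normalised to [0, 1]."""
    counts = {p: 0 for p in permutations(range(order))}
    n = len(x) - (order - 1) * delay
    for i in range(n):
        window = x[i:i + order * delay:delay]
        counts[tuple(int(r) for r in np.argsort(window))] += 1
    probs = np.array([c for c in counts.values() if c > 0], dtype=float) / n
    return float(-(probs * np.log(probs)).sum() / math.log(math.factorial(order)))

eeg_channel = np.random.randn(2000)        # hypothetical single EEG channel
print(permutation_entropy(eeg_channel, order=3, delay=1))
```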

18 pages, 8282 KiB  
Article
Adaptive Asymptotic Shape Synchronization of a Chaotic System with Applications for Image Encryption
by Yangxin Luo, Yuanyuan Huang, Fei Yu, Diqing Liang and Hairong Lin
Mathematics 2025, 13(1), 128; https://doi.org/10.3390/math13010128 - 31 Dec 2024
Viewed by 171
Abstract
In contrast to previous research that has primarily focused on distance synchronization of states in chaotic systems, shape synchronization emphasizes the geometric shape of the attractors of two chaotic systems. Diverging from the existing work on shape synchronization, this paper introduces the application of adaptive control methods to achieve asymptotic shape synchronization for the first time. By designing an adaptive controller using the proposed adaptive rule, the response system under control is able to attain asymptotic synchronization with the drive system. This method is capable of achieving synchronization for models with parameters requiring estimation in both the drive and response systems. The control approach remains effective even in the presence of uncertainties in model parameters. The paper presents relevant theorems and proofs, and simulation results demonstrate the effectiveness of adaptive asymptotic shape synchronization. Due to the pseudo-random nature of chaotic systems and their extreme sensitivity to initial conditions, which make them suitable for information encryption, a novel channel-integrated image encryption scheme is proposed. This scheme leverages the shape synchronization method to generate pseudo-random sequences, which are then used for shuffling, scrambling, and diffusion processes. Simulation experiments demonstrate that the proposed encryption algorithm achieves exceptional performance in terms of correlation metrics and entropy, with a competitive value of 7.9971. Robustness is further validated through key space analysis, yielding a value of 10^210 × 2^512, as well as visual tests, including center and edge cropping. The results confirm the effectiveness of adaptive asymptotic shape synchronization in the context of image encryption. Full article
(This article belongs to the Special Issue Nonlinear Dynamics, Chaos and Complex Systems)
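The adaptive rule itself is not given in the abstract. For orientation only, a generic adaptive synchronization scheme (ordinary state synchronization, not the paper's shape-synchronization criterion) takes the following form, with drive state x, response state y, and adaptation rate γ > 0:

```latex
% Drive system, controlled response system, and synchronization error:
\dot{x} = f(x), \qquad \dot{y} = f(y) + u, \qquad e = y - x
% Adaptive controller with estimated gain k(t) and adaptation rate \gamma > 0:
u = -k(t)\, e, \qquad \dot{k} = \gamma\, \lVert e \rVert^{2}
% Lyapunov candidate used to argue e \to 0 for a sufficiently large constant k^{*}
% (under suitable conditions on f):
V = \tfrac{1}{2}\,\lVert e \rVert^{2} + \tfrac{1}{2\gamma}\,\bigl(k - k^{*}\bigr)^{2}
```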
33 pages, 1251 KiB  
Article
Psychological Poverty Traps in Rural Farm Households: Implications for Sustainable Agricultural Development and Rural Revitalization in China
by Dong Zhang, Hongfeng Zhang, Ziran Meng and Jianxu Liu
Viewed by 290
Abstract
In the context of sustainable agricultural development and rural revitalization in China, understanding and addressing psychological poverty traps among rural farm households is crucial. The poverty mindset represents a crucial factor affecting rural poverty. This study focuses on two key questions: first, whether and how material poverty influences the poverty mindset; and second, whether this psychological state affects economic behavior, potentially intensifying material poverty. Using data from the China Family Panel Studies (CFPS) spanning 2014–2018, the data collection employed a multi-stage stratified sampling approach. Multiple methods, including questionnaire surveys and in-depth interviews, were utilized to gather information. Through matching and merging processes based on personal questionnaire IDs, a total of 30,143 observations were obtained over a three-year period. We employ Causal Mediation Analysis (CMA) to examine the micro-level mechanisms between material and psychological poverty among rural farm households. Our findings reveal three key insights. First, material poverty significantly reduces aspiration levels and behavioral capabilities of rural farm households, with impoverished groups scoring approximately 10% lower than non-poor groups. Second, this negative impact operates through two primary channels: stigma effects (self-stigmatization 11.29%, social stigma 4.71%) and psychological resource depletion (negative emotions 1.5%, psychological stress 1.27%). Third, psychological poverty reinforces material poverty through aspiration failure (72.3%) and capability deficiency (75.68%), creating a self-perpetuating “psychological poverty trap” that particularly affects agricultural production efficiency. These findings suggest that sustainable agricultural development requires addressing both material and psychological dimensions of rural poverty. Policy recommendations include strengthening psychological support for farm households, enhancing agricultural capacity building, mitigating stigma effects in rural communities, and reconstructing psychological resources for sustainable development. This integrated approach can help break psychological poverty traps, improve agricultural productivity, and support rural revitalization in China. Full article

14 pages, 3941 KiB  
Article
Low-Resolution Target Detection with High-Frequency Information Preservation
by Feng Zhang, Hongyang Bai, Wenlong Yin, Ze Li, Hailong Ma and Lei Chen
Appl. Sci. 2025, 15(1), 103; https://doi.org/10.3390/app15010103 - 26 Dec 2024
Viewed by 379
Abstract
In the absence of high-frequency visual observation, low-resolution (LR) targets (e.g., objects, human body keypoints) are intrinsically difficult to detect in unconstrained images. This challenge can be further exacerbated by typical downsampling operations (e.g., pooling, stride) of existing deep networks (e.g., CNNs). To tackle this challenge, in this work, we introduce a generic, High-Frequency Information Preservation (HFIP) block as a replacement for existing downsampling operations. It is composed of two key components: (1) the decoupled high-frequency learning component, which extracts the high-frequency information along the vertical and horizontal directions separately, and (2) the dilated frequency-aware channel correlation component, which decomposes the input feature map into multiple smaller ones in a dilated manner, concatenates them by channel, and then correlates the combined channels in the frequency space. Our module can generally be integrated into existing network architectures for target detection (e.g., YOLO, HRNet). Extensive experiments on low-resolution human pose estimation and object detection tasks show that our HFIP technique can generally boost the performance of state-of-the-art detection models significantly, e.g., improving the object detection accuracy of YOLOv5s by an absolute margin of 3.30% in mAP under a resolution of 640 × 640 on the COCO benchmark. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
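The exact HFIP block is not specified in the abstract; the sketch below only illustrates the general idea of carrying directional (horizontal/vertical) high-frequency cues through a stride-2 downsampling step. The layer names, padding choices, and the 1 × 1 mixing convolution are assumptions for illustration, not the authors' design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HighFreqDownsample(nn.Module):
    """Illustrative stride-2 downsampling that keeps directional high-frequency
    cues (horizontal/vertical first differences) alongside a learned path."""
    def __init__(self, channels: int):
        super().__init__()
        self.learned = nn.Conv2d(channels, channels, 3, stride=2, padding=1)
        self.mix = nn.Conv2d(3 * channels, channels, 1)

    def forward(self, x):
        dx = x[:, :, :, 1:] - x[:, :, :, :-1]            # horizontal differences
        dy = x[:, :, 1:, :] - x[:, :, :-1, :]            # vertical differences
        dx = F.avg_pool2d(F.pad(dx, (0, 1, 0, 0)), 2)    # pad back, then halve
        dy = F.avg_pool2d(F.pad(dy, (0, 0, 0, 1)), 2)
        return self.mix(torch.cat([self.learned(x), dx, dy], dim=1))

out = HighFreqDownsample(32)(torch.randn(1, 32, 64, 64))
print(out.shape)   # torch.Size([1, 32, 32, 32])
```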

15 pages, 442 KiB  
Article
Performance Analysis of Artificial-Noise-Based Secure Transmission in Wiretap Channel
by Hyukmin Son
Mathematics 2025, 13(1), 32; https://doi.org/10.3390/math13010032 - 26 Dec 2024
Viewed by 320
Abstract
Artificial noise (AN)-aided techniques have been considered to be promising and practical candidates for enhancing physical layer security. However, there has been a lack of analysis of the AN effect on the eavesdropper (EV) from the perspective of the signal-to-interference plus noise ratio (SINR) with respect to whether the EV’s channel state information (CSI) is available at the legitimate transmitter. In this paper, we analyze the performance of AN-aided secure transmission from the SINR perspective when a legitimate transmitter has and does not have the EV’s CSI. Based on the analyzed EV’s SINRs for these two cases, the secrecy gap, defined as the difference between the two secrecy capacities, is derived and analyzed. Based on the derived secrecy gap, we analyze the asymptotic behavior of the secrecy capacity and gap when the numbers of antennas at the legitimate transmitter and at the EV are large. Through this asymptotic analysis, it is demonstrated that AN-aided secure transmission under the practical setting (i.e., the case where the EV’s CSI is not available at the legitimate transmitter) can nearly achieve the ideal performance (i.e., the performance when the EV’s CSI is available at the legitimate transmitter) in a massive antenna system. Full article
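For reference, the standard AN-aided wiretap-channel formulation that analyses of this kind start from is sketched below (general textbook form; the paper's specific SINR derivations are not reproduced here):

```latex
% Transmit signal with total power P split by \phi between the data beam w s and
% artificial noise V z, where V spans the null space of the legitimate channel
% h^{H} (so h^{H} V = 0 and the AN does not disturb the legitimate receiver):
\mathbf{x} = \sqrt{\phi P}\,\mathbf{w}\, s \;+\; \sqrt{\tfrac{(1-\phi)P}{N_t - 1}}\,\mathbf{V}\,\mathbf{z}
% Secrecy capacity in terms of the legitimate and eavesdropper SINRs:
C_s = \Bigl[\log_2\bigl(1+\gamma_B\bigr) - \log_2\bigl(1+\gamma_E\bigr)\Bigr]^{+}
```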

20 pages, 3176 KiB  
Article
Spectral Weaver: A Study of Forest Image Classification Based on SpectralFormer
by Haotian Yu, Xuyang Li, Xinggui Xu, Hong Li and Xiangsuo Fan
Forests 2025, 16(1), 21; https://doi.org/10.3390/f16010021 - 26 Dec 2024
Viewed by 178
Abstract
In forest ecosystems, the application of hyperspectral (HS) imagery offers unprecedented opportunities for refined identification and classification. The diversity and complexity of forest cover make it challenging for traditional remote-sensing techniques to capture subtle spectral differences. Hyperspectral imagery, however, can reveal the nuanced changes in different tree species, vegetation health status, and soil composition through its nearly continuous spectral information. This detailed spectral information is crucial for the monitoring, management, and conservation of forest resources. While Convolutional Neural Networks (CNNs) have demonstrated excellent local context modeling capabilities in HS image classification, their inherent network architecture limits the exploration and representation of spectral feature sequence properties. To address this issue, we have rethought HS image classification from a sequential perspective and proposed a hybrid model, the Spectral Weaver, which combines CNNs and Transformers. The Spectral Weaver replaces the traditional Multi-Head Attention mechanism with a Channel Attention mechanism (MCA) and introduces Centre-Differential Convolutional Layers (Conv2d-cd) to enhance spatial feature extraction capabilities. Additionally, we designed a cross-layer skip connection that adaptively learns to fuse “soft” residuals, transferring memory-like components from shallow to deep layers. Notably, the proposed model is a highly flexible backbone network, adaptable to both hyperspectral and multispectral image inputs. In comparison to traditional Visual Transformers (ViT), the Spectral Weaver innovates in several ways: (1) It introduces the MCA mechanism to enhance the mining of spectral feature sequence properties; (2) It employs Centre-Differential Convolutional Layers to strengthen spatial feature extraction; (3) It designs cross-layer skip connections to reduce information loss; (4) It supports both multispectral and hyperspectral inputs, increasing the model’s flexibility and applicability. By integrating global and local features, our model significantly improves the performance of HS image classification. We have conducted extensive experiments on the Gaofen dataset, multispectral data, and multiple hyperspectral datasets, validating the superiority of the Spectral Weaver model in forest hyperspectral image classification. The experimental results show that our model achieves 98.59% accuracy on multispectral data, surpassing ViT’s 96.30%. On the Jilin-1 dataset, our proposed algorithm achieved an accuracy of 98.95%, which is 2.17% higher than ViT. The model significantly outperforms classic ViT and other state-of-the-art backbone networks in classification performance. Not only does it effectively capture the spectral features of forest vegetation, but it also significantly improves the accuracy and robustness of classification, providing strong technical support for the refined management and conservation of forest resources. Full article
(This article belongs to the Section Forest Inventory, Modeling and Remote Sensing)
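The centre-differential convolutional layer (Conv2d-cd) is not defined in the abstract; a commonly used central-difference convolution blends a vanilla convolution with a centre-subtracted term, as sketched below. The blending factor theta and the 3 × 3 kernel size are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CentralDiffConv2d(nn.Module):
    """Common central-difference convolution: a vanilla 3x3 convolution minus a
    theta-weighted centre term (the kernel sum applied as a 1x1 convolution)."""
    def __init__(self, in_ch: int, out_ch: int, theta: float = 0.7):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 3, padding=1, bias=False)
        self.theta = theta

    def forward(self, x):
        out = self.conv(x)
        # each 3x3 kernel summed over its window acts as a 1x1 kernel on the centre pixel
        kernel_sum = self.conv.weight.sum(dim=(2, 3), keepdim=True)
        return out - self.theta * F.conv2d(x, kernel_sum)

y = CentralDiffConv2d(4, 8)(torch.randn(2, 4, 16, 16))
print(y.shape)   # torch.Size([2, 8, 16, 16])
```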

24 pages, 5495 KiB  
Article
Generative Image Steganography via Encoding Pose Keypoints
by Yi Cao, Wentao Ge, Chengsheng Yuan and Quan Wang
Appl. Sci. 2025, 15(1), 58; https://doi.org/10.3390/app15010058 - 25 Dec 2024
Viewed by 339
Abstract
Existing generative image steganography methods typically encode secret information into latent vectors, which are transformed into the entangled features of generated images. This approach faces two main challenges: (1) Transmission can degrade the quality of stego-images, causing bit errors in information extraction. (2) High embedding capacity often reduces the accuracy of information extraction. To overcome these limitations, this paper presents a novel generative image steganography via encoding pose keypoints. This method employs an LSTM-based sequence generation model to embed secret information into the generation process of pose keypoint sequences. Each generated sequence is drawn as a keypoint connectivity graph, which serves as input with an original image to a trained pose-guided person image generation model (DPTN-TA) to generate an image with the target pose. The sender uploads the generated images to a public channel to transmit the secret information. On the receiver’s side, an improved YOLOv8 pose estimation model extracts the pose keypoints from the stego-images and decodes the embedded secret information using the sequence generation model. Extensive experiments on the DeepFashion dataset show that the proposed method significantly outperforms state-of-the-art methods in information extraction accuracy, achieving 99.94%. It also achieves an average hiding capacity of 178.4 bits per image. This method is robust against common image attacks, such as salt and pepper noise, median filtering, compression, and screenshots, with an average bit error rate of less than 0.87%. Additionally, the method is optimized for fast inference and lightweight deployment, enhancing its real-world applicability. Full article
(This article belongs to the Section Computing and Artificial Intelligence)

16 pages, 2388 KiB  
Article
Mitigating Data Leakage in a WiFi CSI Benchmark for Human Action Recognition
by Domonkos Varga
Sensors 2024, 24(24), 8201; https://doi.org/10.3390/s24248201 - 22 Dec 2024
Viewed by 526
Abstract
Human action recognition using WiFi channel state information (CSI) has gained attention due to its non-intrusive nature and potential applications in healthcare, smart environments, and security. However, the reliability of methods developed for CSI-based action recognition is often contingent on the quality of the datasets and evaluation protocols used. In this paper, we uncovered a critical data leakage issue, which arises from improper data partitioning, in a widely used WiFi CSI benchmark dataset. Specifically, the benchmark fails to separate individuals between the training and test sets, leading to inflated performance metrics as models inadvertently learn individual-specific features rather than generalizable action patterns. We analyzed this issue in depth, retrained several benchmarked models using corrected data partitioning methods, and demonstrated a significant drop in accuracy when individuals were properly separated across training and testing. Our findings highlight the importance of rigorous data partitioning in CSI-based action recognition and provide recommendations for mitigating data leakage in future research. This work contributes to the development of more robust and reliable human action recognition systems using WiFi CSI. Full article
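The corrected partitioning the paper argues for amounts to a subject-wise split: no person may appear in both the training and test sets. A minimal sketch with scikit-learn's GroupShuffleSplit is shown below; the array names and sizes are hypothetical.

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

# Hypothetical arrays: one CSI sample per row, with its action label and the
# identity of the person who performed the action.
X = np.random.randn(1000, 256)                  # CSI features
y = np.random.randint(0, 6, size=1000)          # action labels
person_id = np.random.randint(0, 20, size=1000) # subject identities

# Subject-wise split: grouping by person_id guarantees no individual appears
# in both partitions, which is the partitioning needed to avoid leakage.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=0)
train_idx, test_idx = next(splitter.split(X, y, groups=person_id))
assert set(person_id[train_idx]).isdisjoint(person_id[test_idx])
```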

18 pages, 7403 KiB  
Article
A Full-Scale Shadow Detection Network Based on Multiple Attention Mechanisms for Remote-Sensing Images
by Lei Zhang, Qing Zhang, Yu Wu, Yanfeng Zhang, Shan Xiang, Donghai Xie and Zeyu Wang
Remote Sens. 2024, 16(24), 4789; https://doi.org/10.3390/rs16244789 - 22 Dec 2024
Viewed by 362
Abstract
Shadows degrade image quality and complicate interpretation, underscoring the importance of accurate shadow detection for many image analysis tasks. However, due to the complex backgrounds and variable shadow characteristics of remote sensing images (RSIs), existing methods often struggle with accurately detecting shadows of various scales and misclassifying dark, non-shaded areas as shadows. To address these issues, we proposed a comprehensive shadow detection network called MAMNet. Firstly, we proposed a multi-scale spatial channel attention fusion module, which extracted multi-scale features incorporating both spatial and channel information, allowing the model to flexibly adapt to shadows of different scales. Secondly, to address the issue of false detection in non-shadow areas, we introduced a criss-cross attention module, enabling non-shadow pixels to be compared with other shadow and non-shadow pixels in the same row and column, learning similar features of pixels in the same category, which improved the classification accuracy of non-shadow pixels. Finally, to address the issue of important information from the other two modules being lost due to continuous upsampling during the decoding phase, we proposed an auxiliary branch module to assist the main branch in decision-making, ensuring that the final output retained the key information from all stages. The experimental results demonstrated that the model outperformed the current state-of-the-art RSI shadow detection method on the aerial imagery dataset for shadow detection (AISD). The model achieved an overall accuracy (OA) of 97.50%, an F1 score of 94.07%, an intersection over union (IOU) of 88.87%, a precision of 95.06%, and a BER of 4.05%, respectively. Additionally, visualization results indicated that our model could effectively detect shadows of various scales while avoiding false detection in non-shadow areas. Therefore, this model offers an efficient solution for shadow detection in aerial imagery. Full article

20 pages, 13020 KiB  
Article
Multi-Dimensional and Multi-Scale Physical Dehazing Network for Remote Sensing Images
by Hao Zhou, Le Wang, Qiao Li, Xin Guan and Tao Tao
Remote Sens. 2024, 16(24), 4780; https://doi.org/10.3390/rs16244780 - 22 Dec 2024
Viewed by 308
Abstract
Haze obscures remote sensing images, making it difficult to extract valuable information. To address this problem, we propose a fine detail extraction network that aims to restore image details and improve image quality. Specifically, to capture fine details, we design multi-scale and multi-dimensional extraction blocks and then fuse them to optimize feature extraction. The multi-scale extraction block adopts multi-scale pixel attention and channel attention to extract and combine global and local information from the image. Meanwhile, the multi-dimensional extraction block uses depthwise separable convolutional layers to capture additional dimensional information. Additionally, we integrate an atmospheric scattering model unit into the network to enhance both the dehazing effectiveness and stability. Our experiments on the SateHaze1k and HRSD datasets demonstrate that the proposed method efficiently handles remote sensing images with varying levels of haze, successfully recovers fine details, and achieves superior results compared to existing state-of-the-art dehazing techniques. Full article
(This article belongs to the Special Issue Deep Learning for Remote Sensing Image Enhancement)
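The atmospheric scattering model that such physics-guided dehazing units build on is standard (the paper's specific unit is not detailed in the abstract): a hazy observation I is related to the haze-free scene J, the transmission t, and the global atmospheric light A as

```latex
% Hazy observation I from haze-free scene J, transmission t, atmospheric light A:
I(x) = J(x)\, t(x) + A\,\bigl(1 - t(x)\bigr), \qquad t(x) = e^{-\beta d(x)}
% Inversion used once t and A are estimated (t_0 is a small lower bound on t):
J(x) = \frac{I(x) - A}{\max\bigl(t(x),\, t_0\bigr)} + A
```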

18 pages, 14931 KiB  
Article
Wavelet-Driven Multi-Band Feature Fusion for RGB-T Salient Object Detection
by Jianxun Zhao, Xin Wen, Yu He, Xiaowei Yang and Kechen Song
Sensors 2024, 24(24), 8159; https://doi.org/10.3390/s24248159 - 20 Dec 2024
Viewed by 436
Abstract
RGB-T salient object detection (SOD) has received considerable attention in the field of computer vision. Although existing methods have achieved notable detection performance in certain scenarios, challenges remain. Many methods fail to fully utilize high-frequency and low-frequency features during information interaction among different scale features, limiting detection performance. To address this issue, we propose a method for RGB-T salient object detection that enhances performance through wavelet transform and channel-wise attention fusion. Through feature differentiation, we effectively extract spatial characteristics of the target, enhancing the detection capability for global context and fine-grained details. First, input features are passed through the channel-wise criss-cross module (CCM) for cross-modal information fusion, adaptively adjusting the importance of features to generate rich fusion information. Subsequently, the multi-scale fusion information is input into the feature selection wavelet transform module (FSW), which selects beneficial low-frequency and high-frequency features to improve feature aggregation performance and achieves higher segmentation accuracy through long-distance connections. Extensive experiments demonstrate that our method outperforms 22 state-of-the-art methods. Full article
(This article belongs to the Special Issue Multi-Modal Image Processing Methods, Systems, and Applications)
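As background for the wavelet-based feature selection, a 2-D discrete wavelet transform splits a feature map into one low-frequency approximation band and three high-frequency detail bands. A minimal sketch with PyWavelets is shown below; the Haar wavelet and the 64 × 64 map are illustrative choices, not the paper's configuration.

```python
import numpy as np
import pywt

# Hypothetical single-channel feature map: one level of 2-D Haar DWT yields a
# low-frequency approximation (cA) and horizontal/vertical/diagonal detail
# bands (cH, cV, cD) - the kind of sub-bands a wavelet fusion module selects from.
feature_map = np.random.rand(64, 64)
cA, (cH, cV, cD) = pywt.dwt2(feature_map, 'haar')
print(cA.shape, cH.shape, cV.shape, cD.shape)    # four 32x32 sub-bands

# The inverse transform reconstructs the original map exactly.
reconstructed = pywt.idwt2((cA, (cH, cV, cD)), 'haar')
print(np.allclose(reconstructed, feature_map))   # True
```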

20 pages, 10333 KiB  
Article
NST-YOLO11: ViT Merged Model with Neuron Attention for Arbitrary-Oriented Ship Detection in SAR Images
by Yiyang Huang, Di Wang, Boxuan Wu and Daoxiang An
Remote Sens. 2024, 16(24), 4760; https://doi.org/10.3390/rs16244760 - 20 Dec 2024
Viewed by 336
Abstract
Due to the significant discrepancies in the distribution of ships in nearshore and offshore areas, the wide range of their size, and the randomness of target orientation in the sea, traditional detection models in the field of computer vision struggle to achieve performance in SAR image ship target detection comparable to that in optical image detection. This paper proposes an oriented ship target detection model based on the YOLO11 algorithm, Neural Swin Transformer-YOLO11 (NST-YOLO11). The proposed model integrates an improved Swin Transformer module called Neural Swin-T and a Cross-Stage connected Spatial Pyramid Pooling-Fast (CS-SPPF) module. By introducing a spatial/channel unified attention mechanism with neuron suppression in the spatial domain, the information redundancy generated by the local window self-attention module in the Swin Transformer Block is cut off. Furthermore, the idea of cross-stage partial (CSP) connections is applied to the fast spatial pyramid pooling (SPPF) module, effectively enhancing the ability to retain information in multi-scale feature extraction. Experiments conducted on the Rotated Ship Detection Dataset in SAR Images (RSDD-SAR) and the SAR Ship Detection Dataset (SSDD+), together with comparisons against other oriented detection models, demonstrate that the proposed NST-YOLO11 achieves state-of-the-art detection performance and exhibits outstanding generalization ability and robustness. Full article
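The abstract does not give the wiring of CS-SPPF; below is a sketch of the plain SPPF baseline it extends, in which one 5 × 5 max-pool applied three times in sequence emulates parallel 5/9/13 pooling windows. A cross-stage partial variant would additionally split the channels and route part of them around the pooled path; that split is not shown here.

```python
import torch
import torch.nn as nn

class SPPF(nn.Module):
    """Spatial Pyramid Pooling - Fast: 1x1 reduce, three sequential max-pools,
    concatenate all four tensors, then a 1x1 projection."""
    def __init__(self, in_ch: int, out_ch: int, k: int = 5):
        super().__init__()
        hidden = in_ch // 2
        self.cv1 = nn.Conv2d(in_ch, hidden, 1)
        self.cv2 = nn.Conv2d(hidden * 4, out_ch, 1)
        self.pool = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)

    def forward(self, x):
        x = self.cv1(x)
        y1 = self.pool(x)
        y2 = self.pool(y1)
        y3 = self.pool(y2)
        return self.cv2(torch.cat([x, y1, y2, y3], dim=1))

print(SPPF(256, 256)(torch.randn(1, 256, 20, 20)).shape)  # torch.Size([1, 256, 20, 20])
```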

27 pages, 9095 KiB  
Article
BMFusion: Bridging the Gap Between Dark and Bright in Infrared-Visible Imaging Fusion
by Chengwen Liu, Bin Liao and Zhuoyue Chang
Electronics 2024, 13(24), 5005; https://doi.org/10.3390/electronics13245005 - 19 Dec 2024
Viewed by 466
Abstract
The fusion of infrared and visible light images is a crucial technology for enhancing visual perception in complex environments. It plays a pivotal role in improving visual perception and subsequent performance in advanced visual tasks. However, due to the significant degradation of visible light image quality in low-light or nighttime scenes, most existing fusion methods often struggle to obtain sufficient texture details and salient features when processing such scenes. This can lead to a decrease in fusion quality. To address this issue, this article proposes a new image fusion method called BMFusion. Its aim is to significantly improve the quality of fused images in low-light or nighttime scenes and generate high-quality fused images around the clock. This article first designs a brightness attention module composed of brightness attention units. It extracts multimodal features by combining the SimAm attention mechanism with a Transformer architecture. Effective enhancement of brightness and features has been achieved, with gradual brightness attention performed during feature extraction. Secondly, a complementary fusion module was designed. This module deeply fuses infrared and visible light features to ensure the complementarity and enhancement of each modal feature during the fusion process, minimizing information loss to the greatest extent possible. In addition, a feature reconstruction network combining CLIP-guided semantic vectors and neighborhood attention enhancement was proposed in the feature reconstruction stage. It uses the KAN module to perform channel adaptive optimization on the reconstruction process, ensuring semantic consistency and detail integrity of the fused image during the reconstruction phase. The experimental results on a large number of public datasets demonstrate that the BMFusion method can generate fusion images with higher visual quality and richer details in night and low-light environments compared with various existing state-of-the-art (SOTA) algorithms. At the same time, the fusion image can significantly improve the performance of advanced visual tasks. This shows the great potential and application prospect of this method in the field of multimodal image fusion. Full article
(This article belongs to the Topic Computer Vision and Image Processing, 2nd Edition)
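The brightness attention unit combines SimAm with a Transformer branch; only the SimAM part is standard enough to sketch here. Its commonly published parameter-free form is shown below, with lambda as a small stabilising constant; the Transformer branch and the brightness-specific gating of BMFusion are not reproduced.

```python
import torch
import torch.nn as nn

class SimAM(nn.Module):
    """Parameter-free SimAM attention: a per-pixel energy term derived from the
    squared deviation from the channel mean, passed through a sigmoid gate."""
    def __init__(self, e_lambda: float = 1e-4):
        super().__init__()
        self.e_lambda = e_lambda

    def forward(self, x):                                    # x: (B, C, H, W)
        _, _, h, w = x.shape
        n = h * w - 1
        d = (x - x.mean(dim=(2, 3), keepdim=True)).pow(2)    # squared deviation
        v = d.sum(dim=(2, 3), keepdim=True) / n              # channel variance
        e_inv = d / (4 * (v + self.e_lambda)) + 0.5          # inverse energy
        return x * torch.sigmoid(e_inv)

print(SimAM()(torch.randn(1, 8, 32, 32)).shape)   # torch.Size([1, 8, 32, 32])
```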

17 pages, 2969 KiB  
Article
GCBAM-UNet: Sun Glare Segmentation Using Convolutional Block Attention Module
by Nabila Zrira, Anwar Jimi, Mario Di Nardo, Issam Elafi, Maryam Gallab and Redouan Chahdi El Ouazzani
Appl. Syst. Innov. 2024, 7(6), 128; https://doi.org/10.3390/asi7060128 - 19 Dec 2024
Viewed by 597
Abstract
Sun glare poses a significant challenge in Advanced Driver Assistance Systems (ADAS) due to its potential to obscure important visual information, reducing accuracy in detecting road signs, obstacles, and lane markings. Effective sun glare mitigation and segmentation are crucial for enhancing the reliability and safety of ADAS. In this paper, we propose a new approach called “GCBAM-UNet” for sun glare segmentation using deep learning. We employ a pre-trained U-Net model, VGG19-UNet, with weights initialized from ImageNet. To further enhance the segmentation performance, we integrated a Convolutional Block Attention Module (CBAM), enabling the model to focus on important features in both spatial and channel dimensions. Experimental results show that GCBAM-UNet performs considerably better than other state-of-the-art methods, supporting more reliable operation of ADAS. Full article
(This article belongs to the Section Artificial Intelligence)
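CBAM itself follows a standard design: channel attention from average- and max-pooled descriptors through a shared MLP, then spatial attention from channel-wise average and max maps through a 7 × 7 convolution. A compact sketch is below; the reduction ratio of 16 is an assumption, and this is the generic module rather than the exact GCBAM-UNet integration.

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Convolutional Block Attention Module: channel attention followed by
    spatial attention, each applied as a sigmoid gate on the feature map."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1), nn.ReLU(),
            nn.Conv2d(channels // reduction, channels, 1))
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        # channel attention: shared MLP over global average- and max-pooled descriptors
        avg = self.mlp(x.mean(dim=(2, 3), keepdim=True))
        mx = self.mlp(x.amax(dim=(2, 3), keepdim=True))
        x = x * torch.sigmoid(avg + mx)
        # spatial attention: 7x7 conv over channel-wise average and max maps
        s = torch.cat([x.mean(dim=1, keepdim=True), x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))

print(CBAM(64)(torch.randn(1, 64, 32, 32)).shape)   # torch.Size([1, 64, 32, 32])
```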
