Search Results (5,163)

Search Parameters:
Keywords = image fusion

12 pages, 1696 KiB  
Article
Early Detection of Residual/Recurrent Lung Malignancies on Post-Radiation FDG PET/CT
by Liyuan Chen, Avanka Lowe and Jing Wang
Algorithms 2024, 17(10), 435; https://doi.org/10.3390/a17100435 (registering DOI) - 1 Oct 2024
Viewed by 143
Abstract
Positron Emission Tomography/Computed Tomography (PET/CT) using Fluorodeoxyglucose (FDG) is an important imaging modality for assessing treatment outcomes in patients with pulmonary malignant neoplasms undergoing radiation therapy. However, distinguishing between benign post-radiation changes and residual or recurrent malignancies on PET/CT images is challenging. Leveraging the potential of artificial intelligence (AI), we aimed to develop a hybrid fusion model integrating radiomics and Convolutional Neural Network (CNN) architectures to improve differentiation between benign post-radiation changes and residual or recurrent malignancies on PET/CT images. We retrospectively collected post-radiation PET/CTs with identified labels for benign changes or residual/recurrent malignant lesions from 95 lung cancer patients who received radiation therapy. Firstly, we developed separate radiomics and CNN models using handcrafted and self-learning features, respectively. Then, to build a more reliable model, we fused the probabilities from the two models through an evidential reasoning approach to derive the final prediction probability. Five-fold cross-validation was performed to evaluate the proposed radiomics, CNN, and fusion models. Overall, the hybrid fusion model outperformed the other two models in terms of sensitivity, specificity, accuracy, and the area under the curve (AUC), with values of 0.67, 0.72, 0.69, and 0.72, respectively. Evaluation results on the three AI models we developed suggest that handcrafted features and learned features may provide complementary information for residual or recurrent malignancy identification in PET/CT. Full article
(This article belongs to the Special Issue Algorithms for Computer Aided Diagnosis: 2nd Edition)
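The probability-level fusion described in the abstract above can be illustrated with a small sketch. The snippet below combines the outputs of two binary classifiers with Dempster's rule of combination, a common instantiation of evidential reasoning; the input probabilities are hypothetical and the authors' exact formulation may differ.

```python
import numpy as np

def fuse_dempster(p1: float, p2: float) -> float:
    """Fuse two binary class probabilities with Dempster's rule.

    Each probability is treated as a basic belief assignment over
    {malignant, benign} with no mass placed on the ignorance set.
    """
    m1 = np.array([p1, 1.0 - p1])  # radiomics model: [malignant, benign]
    m2 = np.array([p2, 1.0 - p2])  # CNN model:       [malignant, benign]

    # Conflict: mass assigned to contradictory hypotheses.
    conflict = m1[0] * m2[1] + m1[1] * m2[0]

    # Combined, normalized belief in the "malignant" hypothesis.
    return (m1[0] * m2[0]) / (1.0 - conflict)

# Example: radiomics model outputs 0.62, CNN outputs 0.58 -> fused probability.
print(fuse_dempster(0.62, 0.58))
```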

14 pages, 828 KiB  
Article
Lightweight MRI Brain Tumor Segmentation Enhanced by Hierarchical Feature Fusion
by Lei Zhang, Rong Zhang, Zhongjie Zhu, Pei Li, Yongqiang Bai and Ming Wang
Tomography 2024, 10(10), 1577-1590; https://doi.org/10.3390/tomography10100116 (registering DOI) - 1 Oct 2024
Viewed by 148
Abstract
Background: Existing methods for MRI brain tumor segmentation often suffer from excessive model parameters and suboptimal performance in delineating tumor boundaries. Methods: To address this issue, a lightweight MRI brain tumor segmentation method, enhanced by hierarchical feature fusion (EHFF), is proposed. This method reduces model parameters while improving segmentation performance by integrating hierarchical features. Initially, a fine-grained feature adjustment network is crafted and guided by global contextual information, leading to the establishment of an adaptive feature learning (AFL) module. This module captures the global features of MRI brain tumor images through macro perception and micro focus, adjusting spatial granularity to enhance feature details and reduce computational complexity. Subsequently, a hierarchical feature weighting (HFW) module is constructed. This module extracts multi-scale refined features through multi-level weighting, enhancing the detailed features of spatial positions and alleviating the lack of attention to local position details in macro perception. Finally, a hierarchical feature retention (HFR) module is designed as a supplementary decoder. This module retains, up-samples, and fuses feature maps from each layer, thereby achieving better detail preservation and reconstruction. Results: Experimental results on the BraTS 2021 dataset demonstrate that the proposed method surpasses existing methods. Dice similarity coefficients (DSC) for the three semantic categories ET, TC, and WT are 88.57%, 91.53%, and 93.09%, respectively. Full article
(This article belongs to the Topic AI in Medical Imaging and Image Processing)
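As a rough illustration of the hierarchical feature retention idea described above, the sketch below up-samples feature maps from several encoder levels to a common resolution and fuses them through learned 1×1 projections; the channel sizes, class count, and summation rule are placeholders, not the authors' EHFF modules.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HierarchicalFusionHead(nn.Module):
    """Toy decoder: project multi-level features, up-sample, and sum."""
    def __init__(self, in_channels=(64, 128, 256), out_channels=32, num_classes=4):
        super().__init__()
        self.proj = nn.ModuleList(nn.Conv2d(c, out_channels, kernel_size=1) for c in in_channels)
        self.head = nn.Conv2d(out_channels, num_classes, kernel_size=1)

    def forward(self, feats):
        target_size = feats[0].shape[-2:]          # highest-resolution level
        fused = 0
        for f, proj in zip(feats, self.proj):
            f = proj(f)                            # align channel dimension
            f = F.interpolate(f, size=target_size, mode="bilinear", align_corners=False)
            fused = fused + f                      # retain and fuse each level
        return self.head(fused)

# Example with dummy multi-level feature maps (batch of 1).
feats = [torch.randn(1, 64, 64, 64), torch.randn(1, 128, 32, 32), torch.randn(1, 256, 16, 16)]
print(HierarchicalFusionHead()(feats).shape)  # torch.Size([1, 4, 64, 64])
```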

20 pages, 3901 KiB  
Article
Multi-Modal Fusion Network with Multi-Head Self-Attention for Injection Training Evaluation in Medical Education
by Zhe Li, Aya Kanazuka, Atsushi Hojo, Yukihiro Nomura and Toshiya Nakaguchi
Electronics 2024, 13(19), 3882; https://doi.org/10.3390/electronics13193882 - 30 Sep 2024
Viewed by 346
Abstract
The COVID-19 pandemic has significantly disrupted traditional medical training, particularly in critical areas such as the injection process, which require expert supervision. To address the challenges posed by reduced face-to-face interactions, this study introduces a multi-modal fusion network designed to evaluate the timing and motion aspects of the injection training process in medical education. The proposed framework integrates 3D reconstructed data and 2D images of hand movements during the injection process. The 3D data are preprocessed and encoded by a Long Short-Term Memory (LSTM) network to extract temporal features, while a Convolutional Neural Network (CNN) processes the 2D images to capture detailed image features. These encoded features are then fused and refined through a proposed multi-head self-attention module, which enhances the model’s ability to capture and weigh important temporal and image dynamics in the injection process. The final classification of the injection process is conducted by a classifier module. The model’s performance was rigorously evaluated using video data from 255 subjects with assessments made by professional physicians according to the Objective Structured Assessment of Technical Skill—Global Rating Score (OSATS-GRS)[B] criteria for time and motion evaluation. The experimental results demonstrate that the proposed data fusion model achieves an accuracy of 0.7238, an F1-score of 0.7060, a precision of 0.7339, a recall of 0.7238, and an AUC of 0.8343. These findings highlight the model’s potential as an effective tool for providing objective feedback in medical injection training, offering a scalable solution for the post-pandemic evolution of medical education. Full article
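A minimal sketch of that two-stream design is given below: an LSTM encodes the sequence derived from the 3D data, a small CNN encodes the 2D frames, and nn.MultiheadAttention mixes the two feature tokens before classification. All dimensions (e.g. a 63-dimensional per-frame pose vector) and the backbone choices are illustrative assumptions, not the published architecture.

```python
import torch
import torch.nn as nn

class FusionNet(nn.Module):
    def __init__(self, seq_dim=63, img_channels=3, d_model=128, num_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(seq_dim, d_model, batch_first=True)         # temporal branch (3D data)
        self.cnn = nn.Sequential(                                        # image branch (2D frames)
            nn.Conv2d(img_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, d_model),
        )
        self.attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
        self.classifier = nn.Linear(d_model, num_classes)

    def forward(self, seq, img):
        _, (h, _) = self.lstm(seq)                            # final hidden state as temporal feature
        tokens = torch.stack([h[-1], self.cnn(img)], dim=1)   # (B, 2, d_model)
        fused, _ = self.attn(tokens, tokens, tokens)          # self-attention over both modalities
        return self.classifier(fused.mean(dim=1))

model = FusionNet()
logits = model(torch.randn(2, 50, 63), torch.randn(2, 3, 128, 128))
print(logits.shape)  # torch.Size([2, 2])
```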

22 pages, 6532 KiB  
Article
Iterative Mamba Diffusion Change-Detection Model for Remote Sensing
by Feixiang Liu, Yihan Wen, Jiayi Sun, Peipei Zhu, Liang Mao, Guanchong Niu and Jie Li
Remote Sens. 2024, 16(19), 3651; https://doi.org/10.3390/rs16193651 - 30 Sep 2024
Viewed by 276
Abstract
In the field of remote sensing (RS), change detection (CD) methods are critical for analyzing the quality of images shot over various geographical areas, particularly for high-resolution images. However, the widely used Convolutional Neural Network (CNN)- and Transformer-based CD methods have shortcomings. The former are limited by insufficient long-range modeling capabilities, while the latter are hampered by computational complexity. Additionally, the commonly used information-fusion methods for pre- and post-change images often lead to information loss or redundancy, resulting in inaccurate edge detection. To address these issues, we propose an Iterative Mamba Diffusion Change Detection (IMDCD) approach to iteratively integrate various pieces of information and efficiently produce fine-grained CD maps. Specifically, the Swin-Mamba-Encoder (SME) within Mamba-CD (MCD) is employed as a semantic feature extractor, capable of modeling long-range relationships with linear computational complexity. Moreover, we introduce the Variable State Space CD (VSS-CD) module, which extracts abundant CD features by training the matrix parameters within the designed State Space Change Detection (SS-CD). The computed high-dimensional CD feature is integrated into the noise predictor using a novel Global Hybrid Attention Transformer (GHAT), while low-dimensional CD features are utilized to calibrate prior CD results at each iterative step, progressively refining the generated outcomes. IMDCD exhibits a high performance across multiple datasets such as the CDD, WHU, LEVIR, and OSCD, marking a significant advancement in the methodologies within the CD field of RS. The code for this work is available on GitHub. Full article

21 pages, 9523 KiB  
Article
A Hybrid Framework for Referring Image Segmentation: Dual-Decoder Model with SAM Complementation
by Haoyuan Chen, Sihang Zhou, Kuan Li, Jianping Yin and Jian Huang
Mathematics 2024, 12(19), 3061; https://doi.org/10.3390/math12193061 - 30 Sep 2024
Viewed by 310
Abstract
In the realm of human–robot interaction, the integration of visual and verbal cues has become increasingly significant. This paper focuses on the challenges and advancements in referring image segmentation (RIS), a task that involves segmenting images based on textual descriptions. Traditional approaches to RIS have primarily focused on pixel-level classification. These methods, although effective, often overlook the interconnectedness of pixels, which can be crucial for interpreting complex visual scenes. Furthermore, while the PolyFormer model has shown impressive performance in RIS, its large number of parameters and high training data requirements pose significant challenges. These factors restrict its adaptability and optimization on standard consumer hardware, hindering further enhancements in subsequent research. Addressing these issues, our study introduces a novel two-branch decoder framework with SAM (segment anything model) for RIS. This framework incorporates an MLP decoder and a KAN decoder with a multi-scale feature fusion module, enhancing the model’s capacity to discern fine details within images. The framework’s robustness is further bolstered by an ensemble learning strategy that consolidates the insights from both the MLP and KAN decoder branches. More importantly, we collect the segmentation target edge coordinates and bounding box coordinates as input cues for the SAM model. This strategy leverages SAM’s zero-shot learning capabilities to refine and optimize the segmentation outcomes. Our experimental findings, based on the widely recognized RefCOCO, RefCOCO+, and RefCOCOg datasets, confirm the effectiveness of this method. The results not only achieve state-of-the-art (SOTA) performance in segmentation but are also supported by ablation studies that highlight the contributions of each component to the overall improvement in performance. Full article
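The prompt-based refinement step can be sketched with the public segment-anything API, feeding a predicted bounding box plus a foreground point as prompts. The checkpoint path, image, and prompt coordinates below are placeholders, and this is only a generic usage sketch, not the paper's pipeline.

```python
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

# Load a SAM checkpoint (model type and path are placeholders).
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")
predictor = SamPredictor(sam)

# Placeholder RGB image; in practice this would be the scene to segment.
image = np.zeros((256, 256, 3), dtype=np.uint8)
predictor.set_image(image)

# Prompts derived from a coarse decoder output: a box and one foreground point.
box = np.array([40, 60, 210, 230])      # x0, y0, x1, y1
point = np.array([[125, 145]])          # a point inside the target
label = np.array([1])                   # 1 = foreground

masks, scores, _ = predictor.predict(
    point_coords=point, point_labels=label, box=box, multimask_output=False
)
print(masks.shape, scores)              # (1, H, W) boolean mask and its predicted quality
```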

17 pages, 1733 KiB  
Article
An Improved Deep Learning Framework for Multimodal Medical Data Analysis
by Sachin Kumar and Shivani Sharma
Big Data Cogn. Comput. 2024, 8(10), 125; https://doi.org/10.3390/bdcc8100125 - 29 Sep 2024
Viewed by 344
Abstract
Lung disease is one of the leading causes of death worldwide. This emphasizes the need for early diagnosis in order to provide appropriate treatment and save lives. Physicians typically require information about patients’ clinical symptoms, various laboratory and pathology tests, along with chest X-rays to confirm the diagnosis of lung disease. In this study, we present a transformer-based multimodal deep learning approach that incorporates imaging and clinical data for effective lung disease diagnosis on a new multimodal medical dataset. The proposed method employs a cross-attention transformer module to merge features from the heterogeneous modalities. The unified fused features are then used for disease classification. Experiments were performed and evaluated on several classification metrics to illustrate the performance of the proposed approach. The study’s results revealed that the proposed method achieved an accuracy of 95% in classifying tuberculosis and outperformed other traditional fusion methods on the multimodal tuberculosis data used in this study. Full article
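The cross-attention fusion idea can be sketched as follows: embedded clinical tokens query image tokens, and the pooled result feeds a classifier. Token counts and dimensions are invented for illustration and do not reflect the published module.

```python
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    """Clinical tokens attend to image tokens; pooled output feeds a classifier."""
    def __init__(self, d_model=256, num_heads=8, num_classes=2):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)
        self.classifier = nn.Linear(d_model, num_classes)

    def forward(self, clinical_tokens, image_tokens):
        # Queries come from clinical data, keys/values from chest X-ray features.
        fused, _ = self.cross_attn(clinical_tokens, image_tokens, image_tokens)
        fused = self.norm(fused + clinical_tokens)     # residual connection
        return self.classifier(fused.mean(dim=1))      # average-pool tokens, then classify

model = CrossAttentionFusion()
clinical = torch.randn(4, 8, 256)     # e.g. 8 embedded clinical/laboratory fields
image = torch.randn(4, 196, 256)      # e.g. 14x14 patch embeddings of a chest X-ray
print(model(clinical, image).shape)   # torch.Size([4, 2])
```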

26 pages, 10689 KiB  
Article
Radar Target Radar Cross-Section Measurement Based on Enhanced Imaging and Scattering Center Extraction
by Xin Tan, Chaoqi Wang, Yang Fang, Bai Wu, Dongyan Zhao and Jiansheng Hu
Sensors 2024, 24(19), 6315; https://doi.org/10.3390/s24196315 - 29 Sep 2024
Viewed by 233
Abstract
Accurate measurement of a Radar Cross-Section (RCS) is a critical technical challenge in assessing the stealth performance and scattering characteristics of radar targets. Traditional RCS measurement methods are limited by high costs, sensitivity to environmental conditions, and difficulties in distinguishing local scattering features of targets. To address these challenges, this paper proposes a novel RCS measurement method based on enhanced imaging and scattering center extraction. This method integrates sub-aperture imaging with image fusion techniques to improve imaging quality and enhance the detail of target scattering characteristics. Additionally, an improved sequence CLEAN algorithm is employed to effectively suppress sidelobe effects and ensure the accuracy of scattering center extraction. Experimental results demonstrate that this method achieves higher precision in RCS measurement of complex targets and is particularly effective in environments with strong interference, where it successfully separates the scattering contributions of the target from those of the interference sources. This method offers a new technological approach for precise RCS measurement of radar stealth targets in the future. Full article
(This article belongs to the Section Radar Sensors)
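The scattering-center extraction step relies on a CLEAN-style iteration; a bare-bones 1-D version is sketched below. The point-spread function, signal, and iteration count are placeholders, and the paper's improved sequence CLEAN adds refinements not shown here.

```python
import numpy as np

def clean_1d(profile, psf, n_centers=5, gain=1.0):
    """Greedy CLEAN: repeatedly find the strongest peak and subtract a scaled PSF."""
    residual = profile.astype(float).copy()
    centers = []
    half = len(psf) // 2
    for _ in range(n_centers):
        k = int(np.argmax(np.abs(residual)))          # strongest scattering center
        amp = residual[k]
        centers.append((k, amp))
        # Subtract the shifted, scaled point-spread function (sidelobe suppression).
        lo, hi = max(0, k - half), min(len(residual), k + half + 1)
        residual[lo:hi] -= gain * amp * psf[half - (k - lo): half + (hi - k)]
    return centers, residual

# Toy example: two point scatterers blurred by a sinc-like PSF plus noise.
x = np.linspace(-4, 4, 81)
psf = np.sinc(x)
profile = 3.0 * np.roll(psf, -15) + 1.5 * np.roll(psf, 20) + 0.05 * np.random.randn(81)
centers, residual = clean_1d(profile, psf, n_centers=2)
print(centers)
```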

24 pages, 13098 KiB  
Article
A Multi-Scale Feature Fusion Based Lightweight Vehicle Target Detection Network on Aerial Optical Images
by Chengrui Yu, Xiaonan Jiang, Fanlu Wu, Yao Fu, Junyan Pei, Yu Zhang, Xiangzhi Li and Tianjiao Fu
Remote Sens. 2024, 16(19), 3637; https://doi.org/10.3390/rs16193637 - 29 Sep 2024
Viewed by 395
Abstract
Vehicle detection with optical remote sensing images has become widely applied in recent years. However, several challenges remain unsolved in remote sensing vehicle target detection: vehicles are densely distributed at arbitrary angles, which makes them difficult to detect; extensive model parameters (Params) block real-time detection; large differences between larger vehicles in terms of their features lead to reduced detection precision; and the unbalanced distribution in vehicle datasets is not conducive to training. First, this paper constructs a small dataset of vehicles, MiVehicle. This dataset includes 3000 corresponding infrared and visible image pairs, offering a more balanced distribution. In the infrared part of the dataset, the proportions of different vehicle types are as follows: cars, 48%; buses, 19%; trucks, 15%; freight cars, 10%; and vans, 8%. Second, we choose the rotated box mechanism for detection with the model and we build a new vehicle detector, ML-Det, with a novel multi-scale feature fusion triple cross-criss FPN (TCFPN), which can effectively capture the vehicle features in three different positions with an mAP improvement of 1.97%. Moreover, we propose LKC–INVO, which allows involution to couple the structure of multiple large kernel convolutions, resulting in an mAP increase of 2.86%. We also introduce a novel C2F_ContextGuided module with global context perception, which enhances the perception ability of the model in the global scope and minimizes model Params. Finally, we propose an assemble–disperse attention module to aggregate local features and improve performance. Overall, ML-Det achieved a 3.22% improvement in accuracy while keeping Params almost unchanged. On the self-built small MiVehicle dataset, we achieved 70.44% on visible images and 79.12% on infrared images with 20.1 GFLOPS, 78.8 FPS, and 7.91 M Params. Additionally, we trained and tested our model on the following public datasets: UAS-AOD and DOTA. ML-Det was found to be ahead of many other advanced target detection algorithms. Full article
(This article belongs to the Section AI Remote Sensing)

20 pages, 33767 KiB  
Article
Multi-Source Data-Driven Extraction of Urban Residential Space: A Case Study of the Guangdong–Hong Kong–Macao Greater Bay Area Urban Agglomeration
by Xiaodie Yuan, Xiangjun Dai, Zeduo Zou, Xiong He, Yucong Sun and Chunshan Zhou
Remote Sens. 2024, 16(19), 3631; https://doi.org/10.3390/rs16193631 - 29 Sep 2024
Viewed by 326
Abstract
The accurate extraction of urban residential space (URS) is of great significance for recognizing the spatial structure of urban function, understanding the complex urban operating system, and the scientific allocation and management of urban resources. The traditional URS identification process is generally conducted through statistical analysis or a manual field survey. Currently, there are also superpixel segmentation and wavelet transform (WT) processes to extract urban spatial information, but these methods have shortcomings in extraction efficiency and accuracy. The superpixel wavelet fusion (SWF) method proposed in this paper is a convenient method to extract URS by integrating multi-source data such as Point of Interest (POI) data, Nighttime Light (NTL) data, LandScan (LDS) data, and High-resolution Image (HRI) data. This method fully considers the distribution law of image information in HRI and incorporates the spatial information of URS into the WT so as to obtain the recognition results of URS based on multi-source data fusion under the perception of spatial structure. The steps of this study are as follows: Firstly, the SLIC algorithm is used to segment HRI in the Guangdong–Hong Kong–Macao Greater Bay Area (GBA) urban agglomeration. Then, the discrete cosine wavelet transform (DCWT) is applied to POI–NTL, POI–LDS, and POI–NTL–LDS data sets, and the SWF is carried out based on different superpixel scale perspectives. Finally, the Otsu adaptive threshold algorithm is used to extract URS. The results show that the extraction accuracy of the NTL–POI data set is 81.52%, that of the LDS–POI data set is 77.70%, and that of the NTL–LDS–POI data set is 90.40%. The method proposed in this paper not only improves the accuracy of the extraction of URS, but also has good practical value for the optimal layout of residential space and regional planning of urban agglomerations. Full article
(This article belongs to the Special Issue Nighttime Light Remote Sensing Products for Urban Applications)
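The building blocks of that pipeline (superpixel segmentation, wavelet-domain fusion, Otsu thresholding) are all available in standard Python libraries. The sketch below uses SLIC, a plain discrete wavelet transform as a stand-in for the paper's DCWT, and Otsu's threshold on random placeholder rasters; it illustrates the flow only, not the published SWF method.

```python
import numpy as np
import pywt
from skimage.segmentation import slic
from skimage.filters import threshold_otsu

rng = np.random.default_rng(0)
poi_ntl = rng.random((256, 256))    # placeholder POI-NTL composite raster
lds = rng.random((256, 256))        # placeholder LandScan raster
hri = rng.random((256, 256, 3))     # placeholder high-resolution image

# 1) Superpixel segmentation of the high-resolution image (spatial structure).
segments = slic(hri, n_segments=500, compactness=10, channel_axis=-1)

# 2) Wavelet-domain fusion: average approximations, keep the stronger detail coefficient.
cA1, d1 = pywt.dwt2(poi_ntl, "haar")
cA2, d2 = pywt.dwt2(lds, "haar")
fused = pywt.idwt2(
    ((cA1 + cA2) / 2, tuple(np.where(np.abs(a) > np.abs(b), a, b) for a, b in zip(d1, d2))),
    "haar",
)

# 3) Average the fused score within each superpixel, then threshold with Otsu.
scores = np.zeros_like(fused)
for label in np.unique(segments):
    mask = segments == label
    scores[mask] = fused[mask].mean()
residential = scores > threshold_otsu(scores)
print(residential.mean())           # fraction of pixels flagged as residential space
```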

16 pages, 5464 KiB  
Article
Estimation of Cotton SPAD Based on Multi-Source Feature Fusion and Voting Regression Ensemble Learning in Intercropping Pattern of Cotton and Soybean
by Xiaoli Wang, Jingqian Li, Junqiang Zhang, Lei Yang, Wenhao Cui, Xiaowei Han, Dulin Qin, Guotao Han, Qi Zhou, Zesheng Wang, Jing Zhao and Yubin Lan
Agronomy 2024, 14(10), 2245; https://doi.org/10.3390/agronomy14102245 - 29 Sep 2024
Viewed by 214
Abstract
The accurate estimation of soil plant analytical development (SPAD) values in cotton under various intercropping patterns with soybean is crucial for monitoring cotton growth and determining a suitable intercropping pattern. In this study, we utilized an unmanned aerial vehicle (UAV) to capture visible (RGB) and multispectral (MS) data of cotton at the bud stage, early flowering stage, and full flowering stage in a cotton–soybean intercropping pattern in the Yellow River Delta region of China, and we used SPAD502 Plus and tapeline to collect SPAD and cotton plant height (CH) data of the cotton canopy, respectively. We analyzed the differences in cotton SPAD and CH under different intercropping ratio patterns. Pearson correlation analysis was conducted between the RGB features, MS features, and cotton SPAD, and then the recursive feature elimination (RFE) method was employed to select image features. Seven feature sets including MS features (five vegetation indices + five texture features), RGB features (five vegetation indices + cotton cover), and CH, as well as combinations of these three types of features with each other, were established. Voting regression (VR) ensemble learning was proposed for estimating cotton SPAD and compared with the performances of three models: random forest regression (RFR), gradient boosting regression (GBR), and support vector regression (SVR). The optimal model was then used to estimate and visualize cotton SPAD under different intercropping patterns. The results were as follows: (1) There was little difference in the mean value of SPAD or CH under different intercropping patterns; a significant positive correlation existed between CH and SPAD throughout the entire growth period. (2) The VR model was optimal for each of the seven feature sets used as input. When the feature set was MS + RGB, the determination coefficient (R2) of the validation set of the VR model was 0.902, the root mean square error (RMSE) was 1.599, and the relative prediction deviation (RPD) was 3.24. (3) When the feature set was CH + MS + RGB, the accuracy of the VR model was further improved: compared with the feature set MS + RGB, the R2 and RPD were increased by 1.55% and 8.95%, respectively, and the RMSE was decreased by 7.38%. (4) In the intercropping of cotton and soybean, cotton grown under the 4:6 planting pattern performed better. The results can provide a reference for the selection of intercropping patterns and the estimation of cotton SPAD. Full article
(This article belongs to the Special Issue AI, Sensors and Robotics for Smart Agriculture—2nd Edition)
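The voting-regression ensemble named in the abstract maps directly onto scikit-learn, where VotingRegressor averages the predictions of its base estimators. The sketch below uses synthetic stand-in features; the real inputs would be the selected RGB/MS/CH features, and the hyperparameters are placeholders.

```python
import numpy as np
from sklearn.ensemble import VotingRegressor, RandomForestRegressor, GradientBoostingRegressor
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score, mean_squared_error

# Synthetic stand-ins for the selected image features and SPAD measurements.
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 11))                       # e.g. 5 MS + 5 RGB features + CH
y = X @ rng.normal(size=11) + rng.normal(scale=0.5, size=300)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

vr = VotingRegressor([
    ("rfr", RandomForestRegressor(n_estimators=200, random_state=0)),
    ("gbr", GradientBoostingRegressor(random_state=0)),
    ("svr", SVR(C=10.0)),
])
vr.fit(X_train, y_train)
pred = vr.predict(X_test)
print("R2:", r2_score(y_test, pred), "RMSE:", mean_squared_error(y_test, pred) ** 0.5)
```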

20 pages, 2515 KiB  
Article
Detection of Thymoma Disease Using mRMR Feature Selection and Transformer Models
by Mehmet Agar, Siyami Aydin, Muharrem Cakmak, Mustafa Koc and Mesut Togacar
Diagnostics 2024, 14(19), 2169; https://doi.org/10.3390/diagnostics14192169 - 29 Sep 2024
Viewed by 351
Abstract
Background: Thymoma is a tumor that originates in the thymus gland, a part of the human body located behind the breastbone. It is a malignant disease that is rare in children but more common in adults and usually does not spread outside the thymus. The exact cause of thymic disease is not known, but it is thought to be more common in people infected with the Epstein–Barr virus (EBV) at an early age. Various surgical methods are used in clinical settings to treat thymoma. Expert opinion is very important in the diagnosis of the disease. Recently, next-generation technologies have become increasingly important in disease detection. Today’s early detection systems already use transformer models that are open to technological advances. Methods: What makes this study different is the use of transformer models instead of traditional deep learning models. The data used in this study were obtained from patients undergoing treatment at Fırat University, Department of Thoracic Surgery. The dataset consisted of two types of classes: thymoma disease images and non-thymoma disease images. The proposed approach consists of preprocessing, model training, feature extraction, feature set fusion between models, efficient feature selection, and classification. In the preprocessing step, unnecessary regions of the images were cropped, and the region of interest (ROI) technique was applied. Four types of transformer models (Deit3, Maxvit, Swin, and ViT) were used for model training. As a result of the training of the models, the feature sets obtained from the best three models were merged between the models (Deit3 and Swin, Deit3 and ViT, Swin and ViT, and Deit3 and Swin and ViT). The combined feature set of the model (Deit3 and ViT) that gave the best performance with fewer features was analyzed using the mRMR feature selection method. The SVM method was used in the classification process. Results: With the mRMR feature selection method, 100% overall accuracy was achieved with feature sets containing fewer features. The cross-validation technique was used to verify the overall accuracy of the proposed approach, and 99.22% overall accuracy was achieved in the analysis with this technique. Conclusions: These findings emphasize the added value of the proposed approach in the detection of thymoma. Full article
(This article belongs to the Special Issue Advanced Computer-Aided Diagnosis Using Medical Images)
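The fusion/selection/classification tail of that pipeline can be sketched as: concatenate the feature vectors from two backbones, run a simple greedy mRMR pass (relevance via mutual information, redundancy via correlation), then train an SVM. The feature matrices below are random placeholders for the transformer embeddings, and this hand-rolled mRMR is a simplification of the method the authors used.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def mrmr_select(X, y, k=20):
    """Greedy mRMR: maximize relevance (MI with y) minus mean redundancy (|corr|)."""
    relevance = mutual_info_classif(X, y, random_state=0)
    corr = np.abs(np.corrcoef(X, rowvar=False))
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        remaining = [i for i in range(X.shape[1]) if i not in selected]
        redundancy = corr[remaining][:, selected].mean(axis=1)
        scores = relevance[remaining] - redundancy
        selected.append(remaining[int(np.argmax(scores))])
    return selected

rng = np.random.default_rng(0)
feat_a = rng.normal(size=(120, 384))          # stand-in for Deit3 embeddings
feat_b = rng.normal(size=(120, 384))          # stand-in for ViT embeddings
y = rng.integers(0, 2, size=120)              # thymoma vs. non-thymoma labels

X = np.concatenate([feat_a, feat_b], axis=1)  # feature-set fusion between models
idx = mrmr_select(X, y, k=20)
acc = cross_val_score(SVC(kernel="linear"), X[:, idx], y, cv=5).mean()
print(f"cross-validated accuracy: {acc:.3f}")
```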

25 pages, 7524 KiB  
Article
Spatial Feature-Based ISAR Image Registration for Space Targets
by Lizhi Zhao, Junling Wang, Jiaoyang Su and Haoyue Luo
Remote Sens. 2024, 16(19), 3625; https://doi.org/10.3390/rs16193625 - 28 Sep 2024
Viewed by 206
Abstract
Image registration is essential for applications requiring the joint processing of inverse synthetic aperture radar (ISAR) images, such as interferometric ISAR, image enhancement, and image fusion. Traditional image registration methods, developed for optical images, often perform poorly with ISAR images due to their differing imaging mechanisms. This paper introduces a novel spatial feature-based ISAR image registration method. The method encodes spatial information by utilizing the distances and angles between dominant scatterers to construct translation- and rotation-invariant feature descriptors. These feature descriptors are then used for scatterer matching, while the coordinate transformation of matched scatterers is employed to estimate image registration parameters. To mitigate the glint effects of scatterers, the random sample consensus (RANSAC) algorithm is applied for parameter estimation. By extracting global spatial information, the constructed feature curves exhibit greater stability and reliability. Additionally, using multiple dominant scatterers ensures adaptability to low signal-to-noise ratio (SNR) conditions. The effectiveness of the method is validated through both simulated and natural ISAR image sequences. Comparative performance results with traditional image registration methods, such as the SIFT, SURF, and SIFT+SURF algorithms, are also included. Full article
(This article belongs to the Section Engineering Remote Sensing)
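The final estimation step (fitting a transform to matched scatterers while rejecting glint-affected outliers) maps onto scikit-image's RANSAC utilities. The matched coordinates below are synthetic, and the rigid (Euclidean) transform is only a guess at what the paper fits.

```python
import numpy as np
from skimage.measure import ransac
from skimage.transform import EuclideanTransform

# Synthetic matched dominant scatterers: rotate/translate, then corrupt a few by glint.
rng = np.random.default_rng(1)
src = rng.uniform(0, 100, size=(30, 2))
true = EuclideanTransform(rotation=np.deg2rad(5), translation=(3.0, -2.0))
dst = true(src) + rng.normal(scale=0.2, size=src.shape)
dst[:4] += rng.normal(scale=15.0, size=(4, 2))        # glint-affected outliers

model, inliers = ransac(
    (src, dst), EuclideanTransform,
    min_samples=2, residual_threshold=1.0, max_trials=1000,
)
print("estimated rotation (deg):", np.rad2deg(model.rotation))
print("estimated translation:", model.translation)
print("inliers kept:", int(inliers.sum()), "of", len(src))
```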

17 pages, 15850 KiB  
Article
Ancient Painting Inpainting with Regional Attention-Style Transfer and Global Context Perception
by Xiaotong Liu, Jin Wan and Nan Wang
Appl. Sci. 2024, 14(19), 8777; https://doi.org/10.3390/app14198777 - 28 Sep 2024
Viewed by 293
Abstract
Ancient paintings, as a vital component of cultural heritage, encapsulate a profound depth of cultural significance. Over time, they often suffer from different degradation conditions, leading to damage. Existing ancient painting inpainting methods struggle with semantic discontinuities and with blurred textures and details in missing areas. To address these issues, this paper proposes a generative adversarial network (GAN)-based ancient painting inpainting method named RG-GAN. Firstly, to address the inconsistency between the styles of missing and non-missing areas, this paper proposes a Regional Attention-Style Transfer Module (RASTM) to achieve complex style transfer while maintaining the authenticity of the content. Meanwhile, a multi-scale fusion generator (MFG) is proposed to use the multi-scale residual downsampling module to reduce the size of the feature map and effectively extract and integrate the features of different scales. Secondly, a multi-scale fusion mechanism leverages the Multi-scale Cross-layer Perception Module (MCPM) to enhance the feature representation of filled areas and resolve the semantic incoherence of the missing region of the image. Finally, the Global Context Perception Discriminator (GCPD) is proposed to address deficiencies in capturing detailed information; it enhances information interaction across dimensions and improves the discriminator’s ability to identify specific spatial areas and extract critical detail information. Experiments on the ancient painting and ancient Huaniao++ datasets demonstrate that our method achieves the highest PSNR values of 34.62 and 23.46 and the lowest LPIPS values of 0.0507 and 0.0938, respectively. Full article
(This article belongs to the Special Issue Advanced Technologies in Cultural Heritage)

19 pages, 3429 KiB  
Article
An Insulator Fault Diagnosis Method Based on Multi-Mechanism Optimization YOLOv8
by Chuang Gong, Wei Jiang, Dehua Zou, Weiwei Weng and Hongjun Li
Appl. Sci. 2024, 14(19), 8770; https://doi.org/10.3390/app14198770 - 28 Sep 2024
Viewed by 257
Abstract
Insulator image backgrounds are complex and fault types are diverse, which makes it difficult for existing deep learning algorithms to achieve accurate insulator fault diagnosis. To address this, an insulator fault diagnosis method based on a multi-mechanism-optimized YOLOv8, YOLOv8-DCP, is proposed. Firstly, a feature extraction and fusion module, named CW-DRB, was designed. This module enhances the C2f structure of YOLOv8 by incorporating the dilation-wise residual module and the dilated re-param module. The introduction of this module improves YOLOv8’s capability for multi-scale feature extraction and multi-level feature fusion. Secondly, the CARAFE module, which is feature content-aware, was introduced to replace the up-sampling layer in YOLOv8n, thereby enhancing the model’s feature map reconstruction ability. Finally, an additional small-object detection layer was added to improve the detection accuracy of small defects. Simulation results indicate that YOLOv8-DCP achieves an accuracy of 97.7% and an mAP@0.5 of 93.9%. Compared to YOLOv5, YOLOv7, and YOLOv8n, the accuracy improved by 1.5%, 4.3%, and 4.8%, while the mAP@0.5 increased by 3.0%, 4.3%, and 3.1%. This results in a significant enhancement in the accuracy of insulator fault diagnosis. Full article
(This article belongs to the Special Issue Deep Learning for Object Detection)

21 pages, 5986 KiB  
Article
A Transformer-Based Image-Guided Depth-Completion Model with Dual-Attention Fusion Module
by Shuling Wang, Fengze Jiang and Xiaojin Gong
Sensors 2024, 24(19), 6270; https://doi.org/10.3390/s24196270 - 27 Sep 2024
Viewed by 205
Abstract
Depth information is crucial for perceiving three-dimensional scenes. However, depth maps captured directly by depth sensors are often incomplete and noisy. Our objective in the depth-completion task is to generate dense and accurate depth maps from sparse depth inputs by fusing guidance information from corresponding color images obtained from camera sensors. To address these challenges, we introduce transformer models, which have shown great promise in the field of vision, into the task of image-guided depth completion. By leveraging the self-attention mechanism, we propose a novel network architecture that effectively meets the requirements of high accuracy and resolution in depth data. To be more specific, we design a dual-branch model with a transformer-based encoder that serializes image features into tokens step by step and extracts multi-scale pyramid features suitable for pixel-wise dense prediction tasks. Additionally, we incorporate a dual-attention fusion module to enhance the fusion between the two branches. This module combines convolution-based spatial and channel-attention mechanisms, which are adept at capturing local information, with cross-attention mechanisms that excel at capturing long-distance relationships. Our model achieves state-of-the-art performance on both the NYUv2 depth and SUN-RGBD depth datasets. Additionally, our ablation studies confirm the effectiveness of the designed modules. Full article
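A compact sketch of the dual-attention fusion idea described above: SE-style channel attention and a convolutional spatial attention gate the guidance (RGB) features, and cross-attention then mixes them with the depth-branch features. All sizes are illustrative, and this is not the published module.

```python
import torch
import torch.nn as nn

class DualAttentionFusion(nn.Module):
    def __init__(self, channels=64, num_heads=4):
        super().__init__()
        self.channel_attn = nn.Sequential(               # SE-style channel attention
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(channels, channels // 4, 1), nn.ReLU(),
            nn.Conv2d(channels // 4, channels, 1), nn.Sigmoid(),
        )
        self.spatial_attn = nn.Sequential(               # convolutional spatial attention
            nn.Conv2d(channels, 1, kernel_size=7, padding=3), nn.Sigmoid(),
        )
        self.cross_attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, depth_feat, rgb_feat):
        # Local gating of the guidance (RGB) branch.
        rgb = rgb_feat * self.channel_attn(rgb_feat)
        rgb = rgb * self.spatial_attn(rgb)
        # Long-range mixing: depth tokens query the gated RGB tokens.
        b, c, h, w = depth_feat.shape
        q = depth_feat.flatten(2).transpose(1, 2)        # (B, HW, C)
        kv = rgb.flatten(2).transpose(1, 2)
        fused, _ = self.cross_attn(q, kv, kv)
        return depth_feat + fused.transpose(1, 2).reshape(b, c, h, w)

fusion = DualAttentionFusion()
out = fusion(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))
print(out.shape)  # torch.Size([1, 64, 32, 32])
```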