Search Results (330)

Search Parameters:
Keywords = small object detection layer

20 pages, 17200 KiB  
Article
What Is Beyond Hyperbola Detection and Characterization in Ground-Penetrating Radar Data?—Implications from the Archaeological Site of Goting, Germany
by Tina Wunderlich, Bente S. Majchczack, Dennis Wilken, Martin Segschneider and Wolfgang Rabbel
Remote Sens. 2024, 16(21), 4080; https://doi.org/10.3390/rs16214080 - 31 Oct 2024
Abstract
Hyperbolae in radargrams are caused by a variety of small subsurface objects. The analysis of their curvature enables the determination of propagation velocity in the subsurface, which is important for exact time-to-depth conversion and migration and also yields information on the water content of the soil. Using deep learning methods and fitting (DLF) algorithms, it is possible to automatically detect and analyze large numbers of hyperbolae in 3D Ground-Penetrating Radar (GPR) datasets. As a result, a 3D velocity model can be established. Combining the hyperbola locations and the 3D velocity model with reflection depth sections and timeslices leads to improved archaeological interpretation due to (1) correct time-to-depth conversion through migration with the 3D velocity model, (2) creation of depthslices following the topography, (3) evaluation of the spatial distribution of hyperbolae, and (4) derivation of a 3D water content model of the site. In an exemplary study, we applied DLF to a 3D GPR dataset from the multi-phased (2nd to 12th century CE) archaeological site of Goting on the island of Föhr, Northern Germany. Using RetinaNet, we detected 38,490 hyperbolae in an area of 1.76 ha and created a 3D velocity model. The velocities ranged from approximately 0.12 m/ns at the surface to 0.07 m/ns at approximately 3 m depth in the vertical direction; in the lateral direction, the maximum velocity variation was ±0.048 m/ns. The 2D-migrated radargrams and subsequently created depthslices revealed the remains of a longhouse, which was not known beforehand and had not been visible in the unmigrated timeslices. We found hyperbola apex points aligned along strong linear reflections; they can be interpreted as stones contained in ditch fills. The hyperbola points help to differentiate between ditches and processing artifacts that have a similar appearance to the ditches in time-/depthslices. From the derived 3D water content model, we could identify the thickness of the archaeologically relevant layer across the whole site. The layer contains a lot of humus and has a high water-retention capability, leading to a higher water content compared to the underlying glacial moraine sand, which is well drained.
(This article belongs to the Special Issue Advanced Ground-Penetrating Radar (GPR) Technologies and Applications)
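As background for the velocity analysis described above: a point scatterer produces a two-way traveltime t(x) = (2/v) * sqrt((x - x0)^2 + d^2), so fitting picked traveltimes recovers apex position, depth, and velocity. A minimal sketch with synthetic data (illustrative only, not the authors' DLF pipeline; all parameter values are assumptions):

```python
import numpy as np
from scipy.optimize import curve_fit

def twt(x, x0, d, v):
    """Two-way traveltime (ns) of a point scatterer at lateral position
    x0 (m) and depth d (m), for a propagation velocity v (m/ns)."""
    return 2.0 * np.sqrt((x - x0) ** 2 + d ** 2) / v

# Synthetic hyperbola: scatterer at x0 = 5 m, d = 1.2 m, v = 0.10 m/ns, plus noise
x = np.linspace(3.0, 7.0, 81)
t_obs = twt(x, 5.0, 1.2, 0.10) + np.random.normal(0.0, 0.2, x.size)

# Fit apex position, depth, and velocity from the picked hyperbola branch
(p_x0, p_d, p_v), _ = curve_fit(twt, x, t_obs, p0=(4.5, 1.0, 0.12))
print(f"x0={p_x0:.2f} m  depth={p_d:.2f} m  v={p_v:.3f} m/ns")
```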

26 pages, 7432 KiB  
Article
Research on Deep Learning Detection Model for Pedestrian Objects in Complex Scenes Based on Improved YOLOv7
by Jun Hu, Yongqi Zhou, Hao Wang, Peng Qiao and Wenwei Wan
Sensors 2024, 24(21), 6922; https://doi.org/10.3390/s24216922 - 29 Oct 2024
Abstract
Objective: Pedestrian detection is essential to environment perception in intelligent robots and autonomous driving, and is key to ensuring their safe operation. Methods: In response to the characteristics of pedestrian objects occupying a small image area, diverse poses, complex scenes, and severe occlusion, this paper proposes an improved pedestrian detection method based on the YOLOv7 model, which adopts the Convolutional Block Attention Module (CBAM) attention mechanism and Deformable ConvNets v2 (DCNv2) in the two Efficient Layer Aggregation Network (ELAN) modules of the backbone feature extraction network. In addition, the detection head is replaced with a Dynamic Head (DyHead) detector head with an attention mechanism, which effectively excludes unnecessary background information around the pedestrian so that the model learns more concentrated feature representations. Results: Compared with the original model, the log-average miss rate of the improved YOLOv7 model is significantly reduced on both the CityPersons and INRIA datasets. Conclusions: The improved YOLOv7 model proposed in this paper achieves good performance gains across different pedestrian detection problems and provides an important reference for pedestrian detection in complex scenes with small, occluded, and overlapping objects.
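For reference, a minimal PyTorch sketch of the CBAM mechanism named above (channel attention followed by spatial attention); the reduction ratio and spatial kernel size are common defaults, not necessarily the paper's configuration:

```python
import torch
import torch.nn as nn

class CBAM(nn.Module):
    """Convolutional Block Attention Module: channel then spatial attention."""
    def __init__(self, channels, reduction=16, spatial_kernel=7):  # common defaults (assumption)
        super().__init__()
        self.mlp = nn.Sequential(  # shared MLP for avg- and max-pooled statistics
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )
        self.spatial = nn.Conv2d(2, 1, spatial_kernel,
                                 padding=spatial_kernel // 2, bias=False)

    def forward(self, x):
        # Channel attention: squeeze spatial dims with avg + max pooling
        ca = torch.sigmoid(self.mlp(x.mean((2, 3), keepdim=True))
                           + self.mlp(x.amax((2, 3), keepdim=True)))
        x = x * ca
        # Spatial attention: squeeze channel dim with avg + max
        sa = torch.sigmoid(self.spatial(torch.cat(
            [x.mean(1, keepdim=True), x.amax(1, keepdim=True)], dim=1)))
        return x * sa

feats = torch.randn(1, 64, 80, 80)
print(CBAM(64)(feats).shape)  # torch.Size([1, 64, 80, 80])
```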

16 pages, 4833 KiB  
Article
BGF-YOLOv10: Small Object Detection Algorithm from Unmanned Aerial Vehicle Perspective Based on Improved YOLOv10
by Junhui Mei and Wenqiu Zhu
Sensors 2024, 24(21), 6911; https://doi.org/10.3390/s24216911 - 28 Oct 2024
Abstract
With the rapid development of deep learning, unmanned aerial vehicles (UAVs) have acquired intelligent perception capabilities, demonstrating efficient data collection across various fields. In UAV perspective scenarios, captured images are typically high-resolution and often contain small, unevenly distributed objects, which makes object detection in UAV imagery more challenging than conventional detection tasks. To address this issue, we propose a lightweight object detection algorithm, BGF-YOLOv10, specifically designed for small object detection and based on an improved version of YOLOv10n. First, we introduce a novel YOLOv10 architecture tailored for small objects, incorporating BoTNet and variants of C2f and C3 in the backbone, along with an additional small object detection head, to enhance detection performance for small objects. Second, we embed GhostConv into both the backbone and head, effectively reducing the number of parameters by nearly half. Finally, we insert a Patch Expanding Layer module in the neck to restore the feature spatial resolution. Experimental results on the VisDrone-DET2019 and UAVDT datasets demonstrate that our method significantly improves detection accuracy compared to YOLO series networks. Moreover, compared to other state-of-the-art networks, our approach achieves a substantial reduction in the number of parameters.
(This article belongs to the Section Remote Sensors)
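For reference, a compact sketch of the GhostConv idea used above: a dense "primary" convolution produces half the output channels, and cheap depthwise operations synthesize the remaining "ghost" maps, roughly halving parameters. Channel sizes here are illustrative assumptions:

```python
import torch
import torch.nn as nn

class GhostConv(nn.Module):
    """Ghost convolution: a dense primary conv generates half the output
    channels; a cheap depthwise conv synthesizes the other half."""
    def __init__(self, c_in, c_out, k=1, s=1):
        super().__init__()
        c_half = c_out // 2
        self.primary = nn.Sequential(
            nn.Conv2d(c_in, c_half, k, s, k // 2, bias=False),
            nn.BatchNorm2d(c_half), nn.SiLU())
        self.cheap = nn.Sequential(  # depthwise 5x5 on the primary maps
            nn.Conv2d(c_half, c_half, 5, 1, 2, groups=c_half, bias=False),
            nn.BatchNorm2d(c_half), nn.SiLU())

    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)

x = torch.randn(1, 64, 40, 40)
print(GhostConv(64, 128)(x).shape)  # torch.Size([1, 128, 40, 40])
```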

24 pages, 6467 KiB  
Article
YOLO-DHGC: Small Object Detection Using Two-Stream Structure with Dense Connections
by Lihua Chen, Lumei Su, Weihao Chen, Yuhan Chen, Haojie Chen and Tianyou Li
Sensors 2024, 24(21), 6902; https://doi.org/10.3390/s24216902 - 28 Oct 2024
Abstract
Small object detection, which is frequently applied in defect detection, medical imaging, and security surveillance, often suffers from low accuracy due to limited feature information and blurred details. This paper proposes a small object detection method named YOLO-DHGC, which employs a two-stream structure with dense connections. Firstly, a novel backbone network, DenseHRNet, is introduced. It innovatively combines a dense connection mechanism with high-resolution feature map branches, effectively enhancing feature reuse and cross-layer fusion, thereby obtaining high-level semantic information from the image. Secondly, a two-stream structure based on an edge-gated branch is designed. It uses higher-level information from the regular detection stream to eliminate irrelevant interference remaining in the early processing stages of the edge-gated stream, allowing it to focus on information related to shape boundaries and accurately capture the morphological features of small objects. To assess the effectiveness of the proposed YOLO-DHGC method, we conducted experiments on several public datasets and a self-constructed dataset. Notably, a defect detection accuracy of 96.3% was achieved on the Market-PCB public dataset, demonstrating the effectiveness of our method in detecting small object defects for industrial applications.
(This article belongs to the Special Issue Image Processing and Analysis for Object Detection: 2nd Edition)
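As a minimal illustration of the dense connection mechanism underlying DenseHRNet (a generic dense block sketch, not the paper's DenseHRNet; growth rate and depth are assumptions):

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Dense connectivity: each layer consumes the concatenation of all
    preceding feature maps, encouraging feature reuse and cross-layer fusion."""
    def __init__(self, c_in, growth=32, n_layers=4):  # illustrative sizes
        super().__init__()
        self.layers = nn.ModuleList()
        for i in range(n_layers):
            self.layers.append(nn.Sequential(
                nn.Conv2d(c_in + i * growth, growth, 3, padding=1, bias=False),
                nn.BatchNorm2d(growth), nn.ReLU(inplace=True)))

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))
        return torch.cat(feats, dim=1)  # c_in + n_layers * growth channels

print(DenseBlock(64)(torch.randn(1, 64, 40, 40)).shape)  # [1, 192, 40, 40]
```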

16 pages, 21131 KiB  
Article
GCS-YOLOv8: A Lightweight Face Extractor to Assist Deepfake Detection
by Ruifang Zhang, Bohan Deng, Xiaohui Cheng and Hong Zhao
Sensors 2024, 24(21), 6781; https://doi.org/10.3390/s24216781 - 22 Oct 2024
Abstract
To address the issues of target feature blurring and increased false detections caused by high compression rates in deepfake videos, as well as the high computational resource requirements of existing face extractors, we propose GCS-YOLOv8, a lightweight face extractor that assists deepfake detection. Firstly, we employ the HGStem module for initial downsampling to address the issue of false detections of small non-face objects in deepfake videos, thereby improving detection accuracy. Secondly, we introduce the C2f-GDConv module to mitigate the low-FLOPs pitfall while reducing the model's parameters, thereby lightening the network. Additionally, we add a new P6 large-target detection layer to expand the receptive field and capture multi-scale features, solving the problem of detecting large-scale faces in low-compression deepfake videos. We also design a cross-scale feature fusion module called CCFG (CNN-based Cross-Scale Feature Fusion with GDConv), which integrates features from different scales to enhance the model's adaptability to scale variations while reducing network parameters, addressing the high computational resource requirements of traditional face extractors. Furthermore, we improve the detection head by utilizing group normalization and shared convolution, simplifying face detection while maintaining detection performance. The training dataset was also refined by removing low-accuracy and low-resolution labels, which reduced the false detection rate. Experimental results demonstrate that, compared to YOLOv8, this face extractor achieves AP values of 0.942, 0.927, and 0.812 on the WiderFace dataset's Easy, Medium, and Hard subsets, representing improvements of 1.1%, 1.3%, and 3.7%, respectively. The model's parameters and FLOPs are only 1.68 MB and 3.5 G, reflecting reductions of 44.2% and 56.8%, making it more effective and lightweight for extracting faces from deepfake videos.
(This article belongs to the Section Intelligent Sensors)
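A small sketch of the detection head idea mentioned above, where one GroupNorm convolution stack is shared across pyramid levels instead of per-level branches; channel counts and output layout are illustrative assumptions, not the paper's exact head:

```python
import torch
import torch.nn as nn

class SharedGNHead(nn.Module):
    """Head variant: a single conv stack with GroupNorm is shared across
    all pyramid levels, reducing parameters versus per-level branches."""
    def __init__(self, channels, num_outputs, num_groups=16):  # sizes assumed
        super().__init__()
        self.shared = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.GroupNorm(num_groups, channels), nn.SiLU())
        self.pred = nn.Conv2d(channels, num_outputs, 1)

    def forward(self, pyramid):  # list of [B, C, Hi, Wi] maps
        return [self.pred(self.shared(p)) for p in pyramid]

head = SharedGNHead(channels=128, num_outputs=4 + 1)  # box + objectness (assumed layout)
levels = [torch.randn(1, 128, s, s) for s in (80, 40, 20)]
print([o.shape for o in head(levels)])
```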

17 pages, 6863 KiB  
Article
YOLO-GE: An Attention Fusion Enhanced Underwater Object Detection Algorithm
by Qiming Li and Hongwei Shi
J. Mar. Sci. Eng. 2024, 12(10), 1885; https://doi.org/10.3390/jmse12101885 - 21 Oct 2024
Abstract
Underwater object detection is a challenging task with profound implications for fields such as aquaculture, marine ecological protection, and maritime rescue operations. The presence of numerous small aquatic organisms in the underwater environment often leads to missed detections and false positives. Additionally, factors such as water quality result in weak target features, which adversely affect the extraction of target feature information, and the lack of illumination underwater causes image blur and low contrast, further increasing the difficulty of the detection task. To address these issues, we propose a novel underwater object detection algorithm called YOLO-GE (GCNet-EMA). First, we introduce an image enhancement module to mitigate the impact of underwater image quality issues on the detection task. Second, a high-resolution feature layer is added to the network to reduce missed detections and false positives for small targets. Third, we propose GEBlock, an attention-based fusion module that captures long-range contextual information and suppresses noise from lower-level feature layers. Finally, we combine an adaptive spatial fusion module with the detection head to filter out conflicting feature information from different feature layers. Experiments on the UTDAC2020, DUO, and RUOD datasets show that the proposed method achieves optimal detection accuracy.
(This article belongs to the Section Ocean Engineering)
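A minimal sketch of adaptive spatial fusion in the spirit described above: per-pixel softmax weights decide how much each resized feature level contributes at every location (an ASFF-style illustration, not the paper's exact module):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveSpatialFusion(nn.Module):
    """ASFF-style fusion: learn per-pixel softmax weights so each location
    decides how much each (resized) feature level contributes."""
    def __init__(self, channels, n_levels=3):
        super().__init__()
        self.weight = nn.Conv2d(channels * n_levels, n_levels, 1)

    def forward(self, levels):
        # Resize every level to the first (highest-resolution) level's size
        base = levels[0].shape[-2:]
        feats = [F.interpolate(f, size=base, mode="nearest") for f in levels]
        w = torch.softmax(self.weight(torch.cat(feats, dim=1)), dim=1)
        return sum(f * w[:, i:i + 1] for i, f in enumerate(feats))

levels = [torch.randn(1, 64, s, s) for s in (40, 20, 10)]
print(AdaptiveSpatialFusion(64)(levels).shape)  # torch.Size([1, 64, 40, 40])
```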

14 pages, 2370 KiB  
Article
AMW-YOLOv8n: Road Scene Object Detection Based on an Improved YOLOv8
by Donghao Wu, Chao Fang, Xiaogang Zheng, Jue Liu, Shengchun Wang and Xinyu Huang
Electronics 2024, 13(20), 4121; https://doi.org/10.3390/electronics13204121 - 19 Oct 2024
Abstract
This study introduces an improved YOLOv8 model tailored for detecting objects in road scenes. To overcome the limitations of standard convolution operations in adapting to varying targets, we introduce Adaptive Kernel Convolution (AKConv). AKConv dynamically adjusts the convolution kernel's shape and size, enhancing the backbone network's feature extraction capabilities and improving feature representation across different scales. Additionally, we employ a Multi-Scale Dilated Attention (MSDA) mechanism to focus on key target features, further enhancing feature representation. To address the challenge posed by YOLOv8's large downsampling factor, which limits the learning of small target features in deeper feature maps, we add a small target detection layer. Finally, to improve model training efficiency, we introduce a regression loss function with a Wise-IoU dynamic non-monotonic focusing mechanism. With these enhancements, our improved YOLOv8 model excels in road scene object detection tasks, achieving a 5.6 percentage point improvement in average precision over the original YOLOv8n on real road datasets.
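For orientation, a sketch of the Wise-IoU v1 core: the IoU loss is scaled by a distance-based attention term computed from the smallest enclosing box. The v3 dynamic non-monotonic focusing factor mentioned above is omitted here; the box layout is an assumption:

```python
import torch

def wise_iou_v1(pred, target, eps=1e-7):
    """Wise-IoU v1 sketch: IoU loss scaled by a distance-based attention
    term from the smallest enclosing box (denominator detached, as in
    the WIoU formulation). Boxes are (x1, y1, x2, y2), shape [N, 4]."""
    x1 = torch.maximum(pred[:, 0], target[:, 0])
    y1 = torch.maximum(pred[:, 1], target[:, 1])
    x2 = torch.minimum(pred[:, 2], target[:, 2])
    y2 = torch.minimum(pred[:, 3], target[:, 3])
    inter = (x2 - x1).clamp(0) * (y2 - y1).clamp(0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    iou = inter / (area_p + area_t - inter + eps)

    # Squared distance between box centers
    dx = (pred[:, 0] + pred[:, 2] - target[:, 0] - target[:, 2]) / 2
    dy = (pred[:, 1] + pred[:, 3] - target[:, 1] - target[:, 3]) / 2
    # Diagonal of the smallest enclosing box, gradients detached
    cw = torch.maximum(pred[:, 2], target[:, 2]) - torch.minimum(pred[:, 0], target[:, 0])
    ch = torch.maximum(pred[:, 3], target[:, 3]) - torch.minimum(pred[:, 1], target[:, 1])
    r_wiou = torch.exp((dx ** 2 + dy ** 2) / (cw ** 2 + ch ** 2 + eps).detach())
    return r_wiou * (1.0 - iou)

p = torch.tensor([[10., 10., 50., 50.]], requires_grad=True)
t = torch.tensor([[12., 14., 48., 52.]])
print(wise_iou_v1(p, t))
```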

17 pages, 4447 KiB  
Article
Sugarcane-YOLO: An Improved YOLOv8 Model for Accurate Identification of Sugarcane Seed Sprouts
by Fujie Zhang, Defeng Dong, Xiaoyi Jia, Jiawen Guo and Xiaoning Yu
Agronomy 2024, 14(10), 2412; https://doi.org/10.3390/agronomy14102412 - 18 Oct 2024
Abstract
Sugarcane is a crop that propagates through seed sprouts on nodes. Accurate identification of sugarcane seed sprouts is crucial for sugarcane planting and the development of intelligent sprout-cutting equipment. This paper proposes a sugarcane seed sprout recognition method based on the YOLOv8s model, adding the simple attention mechanism (SimAM) module to the neck network of YOLOv8s and the spatial-depth convolution (SPD-Conv) to the tail convolution part. Meanwhile, the E-IoU loss function is chosen to increase the model's regression speed. Additionally, a small-object detection layer, P2, is incorporated into the feature pyramid network (FPN), and the large-object detection layer, P5, is eliminated to further improve the model's recognition accuracy and speed. Each improvement is then tested and analyzed, and the effectiveness of the improved modules is verified, yielding the Sugarcane-YOLO model. On the sugarcane seed sprout dataset, Sugarcane-YOLO performed better and was more balanced in accuracy and detection speed than other mainstream models, making it the most suitable model for seed sprout recognition by automatic sugarcane-cutting equipment. Experimental results showed that Sugarcane-YOLO achieved a mAP50 value of 99.05%, a mAP75 value of 81.3%, a mAP50-95 value of 71.61%, a precision of 97.42%, and a recall of 98.63%.
(This article belongs to the Section Precision and Digital Agriculture)
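For reference, SimAM is parameter-free and can be sketched in a few lines: each activation is weighted by the sigmoid of an inverse "energy" measuring how distinct it is from its channel mean (e_lambda is the usual stability constant, an assumption here):

```python
import torch

def simam(x, e_lambda=1e-4):
    """SimAM: parameter-free attention that weights each activation by an
    inverse energy measuring its distinctness from the channel mean."""
    _, _, h, w = x.shape
    n = h * w - 1
    d = (x - x.mean(dim=(2, 3), keepdim=True)).pow(2)  # squared deviation
    v = d.sum(dim=(2, 3), keepdim=True) / n            # per-channel variance
    e_inv = d / (4 * (v + e_lambda)) + 0.5             # inverse energy
    return x * torch.sigmoid(e_inv)

x = torch.randn(1, 32, 20, 20)
print(simam(x).shape)  # torch.Size([1, 32, 20, 20])
```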

13 pages, 2709 KiB  
Article
Enhanced Vehicle Logo Detection Method Based on Self-Attention Mechanism for Electric Vehicle Application
by Shuo Yang, Yisu Liu, Ziyue Liu, Changhua Xu and Xueting Du
World Electr. Veh. J. 2024, 15(10), 467; https://doi.org/10.3390/wevj15100467 - 14 Oct 2024
Abstract
Vehicle logo detection plays a crucial role in various computer vision applications, such as vehicle classification and detection. In this research, we propose an improved vehicle logo detection method leveraging the self-attention mechanism. Our feature-sampling structure integrates multiple attention mechanisms and bidirectional feature aggregation to enhance the discriminative power of the detection model. Specifically, we introduce a multi-head attention module for multi-scale feature fusion to capture multi-scale contextual information effectively. Moreover, we incorporate a bidirectional aggregation mechanism to facilitate information exchange between different layers of the detection network. Experimental results on the benchmark VLD-45 dataset demonstrate that our proposed method outperforms baseline models in terms of both detection accuracy and efficiency, achieving a state-of-the-art result of 90.3% mAP. Our method also improves AP by 10% for difficult samples, such as HAVAL and LAND ROVER, and provides a new detection framework for small-size objects with potential applications in various fields.
(This article belongs to the Special Issue Deep Learning Applications for Electric Vehicles)
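A minimal sketch of multi-head self-attention over tokens pooled from several feature scales, in the spirit of the fusion module above (token layout and sizes are illustrative assumptions, not the paper's module):

```python
import torch
import torch.nn as nn

class MHAFusion(nn.Module):
    """Self-attention over a fused multi-scale token sequence: feature maps
    are flattened to tokens so every location can attend to all scales."""
    def __init__(self, channels, heads=4):  # head count assumed
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, feats):  # list of [B, C, Hi, Wi]
        tokens = torch.cat([f.flatten(2).transpose(1, 2) for f in feats], dim=1)
        out, _ = self.attn(tokens, tokens, tokens)
        return self.norm(tokens + out)  # [B, sum(Hi*Wi), C]

feats = [torch.randn(1, 64, s, s) for s in (20, 10)]
print(MHAFusion(64)(feats).shape)  # torch.Size([1, 500, 64])
```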

20 pages, 3290 KiB  
Article
Vehicle and Pedestrian Detection Based on Improved YOLOv7-Tiny
by Zhen Liang, Wei Wang, Ruifeng Meng, Hongyu Yang, Jinlei Wang, He Gao, Biao Li and Jungeng Fan
Electronics 2024, 13(20), 4010; https://doi.org/10.3390/electronics13204010 - 12 Oct 2024
Abstract
To improve the detection accuracy of vehicles and pedestrians in traffic scenes using object detection algorithms, this paper presents modifications, compression, and deployment of the single-stage algorithm YOLOv7-tiny. In the model improvement section: firstly, to address the problem of missed detections of small objects, shallower feature layer information is incorporated into the original feature fusion branch, forming a four-scale detection head; secondly, a Multi-Stage Feature Fusion (MSFF) module is proposed to fully integrate shallow, middle, and deep feature information and extract more comprehensive small object information. In the model compression section: the Layer-Adaptive Magnitude-based Pruning (LAMP) algorithm and the Torch-Pruning library are combined, setting different pruning rates for the improved model. In the model deployment section: the V7-tiny-P2-MSFF model, pruned by 45% using LAMP, is deployed on the embedded platform NVIDIA Jetson AGX Xavier. Experimental results show that the improved and pruned model achieves a 12.3% increase in mAP@0.5 compared to the original model, with parameter volume, computation volume, and model size reduced by 76.74%, 7.57%, and 70.94%, respectively. Moreover, the single-image inference speed of the pruned and quantized model deployed on Xavier is 9.5 ms.
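For orientation, the LAMP score behind the pruning step can be sketched directly: weights are ranked by squared magnitude, and each weight's score is its squared magnitude divided by the sum over all weights not smaller than it; the lowest global scores are pruned. This is a from-scratch sketch of the scoring rule, not the Torch-Pruning API used in the paper:

```python
import torch

def lamp_scores(weight):
    """LAMP (layer-adaptive magnitude) scores: each weight's squared
    magnitude normalized by the sum of squared magnitudes of all weights
    that are not smaller than it."""
    w2 = weight.detach().flatten().pow(2)
    sorted_w2, idx = torch.sort(w2)  # ascending
    # Suffix sums: for each sorted position, sum of this and all larger weights
    suffix = torch.flip(torch.cumsum(torch.flip(sorted_w2, [0]), 0), [0])
    scores = torch.empty_like(w2)
    scores[idx] = sorted_w2 / suffix
    return scores.view_as(weight)

w = torch.randn(16, 8, 3, 3)
s = lamp_scores(w)
mask = s > torch.quantile(s.flatten(), 0.45)  # e.g. prune 45% of weights (assumed rate)
print(f"kept {mask.float().mean():.2%} of weights")
```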

20 pages, 6554 KiB  
Article
An Efficient UAV Image Object Detection Algorithm Based on Global Attention and Multi-Scale Feature Fusion
by Rui Qian and Yong Ding
Electronics 2024, 13(20), 3989; https://doi.org/10.3390/electronics13203989 - 10 Oct 2024
Abstract
Object detection technology holds significant promise in unmanned aerial vehicle (UAV) applications. However, traditional methods face challenges in detecting denser, smaller, and more complex targets within UAV aerial images. To address issues such as target occlusion and dense small objects, this paper proposes a multi-scale object detection algorithm based on YOLOv5s. A novel feature extraction module, DCNCSPELAN4, which combines CSPNet and ELAN, is introduced to enhance the receptive field of feature extraction while maintaining network efficiency. Additionally, a lightweight Vision Transformer module, the CloFormer Block, is integrated to provide the network with a global receptive field. Moreover, the algorithm incorporates a three-scale feature fusion (TFE) module and a scale sequence feature fusion (SSFF) module in the neck network to effectively leverage multi-scale spatial information across different feature maps. To address dense small objects, an additional small object detection head was added to the detection layer. The original large object detection head was removed to reduce computational load. The proposed algorithm has been evaluated through ablation experiments and compared with other state-of-the-art methods on the VisDrone2019 and AU-AIR datasets. The results demonstrate that our algorithm outperforms other baseline methods in terms of both accuracy and speed. Compared to the YOLOv5s baseline model, the enhanced algorithm achieves improvements of 12.4% and 8.4% in AP50 and AP metrics, respectively, with only a marginal parameter increase of 0.3 M. These experiments validate the effectiveness of our algorithm for object detection in drone imagery.
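As one concrete reading of scale sequence feature fusion, resized pyramid levels can be stacked along a new "scale" axis and fused with a 3D convolution that treats scale like a short sequence; this is a hedged sketch of that idea, not necessarily the paper's SSFF implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SSFFSketch(nn.Module):
    """Scale-sequence fusion sketch: pyramid levels are resized, stacked
    along a new scale axis, and fused with a 3D convolution."""
    def __init__(self, channels, n_levels=3):
        super().__init__()
        self.fuse = nn.Conv3d(channels, channels,
                              kernel_size=(n_levels, 3, 3), padding=(0, 1, 1))

    def forward(self, levels):
        size = levels[0].shape[-2:]  # target: highest-resolution level
        stack = torch.stack(
            [F.interpolate(f, size=size, mode="nearest") for f in levels], dim=2)
        return self.fuse(stack).squeeze(2)  # back to [B, C, H, W]

levels = [torch.randn(1, 64, s, s) for s in (80, 40, 20)]
print(SSFFSketch(64)(levels).shape)  # torch.Size([1, 64, 80, 80])
```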

24 pages, 12126 KiB  
Article
Efficient Optimized YOLOv8 Model with Extended Vision
by Qi Zhou, Zhou Wang, Yiwen Zhong, Fenglin Zhong and Lijin Wang
Sensors 2024, 24(20), 6506; https://doi.org/10.3390/s24206506 - 10 Oct 2024
Abstract
In the field of object detection, enhancing algorithm performance in complex scenarios represents a fundamental technological challenge. To address this issue, this paper presents an efficient optimized YOLOv8 model with extended vision (YOLO-EV), which optimizes the performance of the YOLOv8 model through a series of innovative improvement measures and strategies. First, we propose a multi-branch group-enhanced fusion attention (MGEFA) module and integrate it into YOLO-EV, which significantly boosts the model’s feature extraction capabilities. Second, we enhance the existing spatial pyramid pooling fast (SPPF) layer by integrating large scale kernel attention (LSKA), improving the model’s efficiency in processing spatial information. Additionally, we replace the traditional IOU loss function with the Wise-IOU loss function, thereby enhancing localization accuracy across various target sizes. We also introduce a P6 layer to augment the model’s detection capabilities for multi-scale targets. Through network structure optimization, we achieve higher computational efficiency, ensuring that YOLO-EV consumes fewer computational resources than YOLOv8s. In the validation section, preliminary tests on the VOC12 dataset demonstrate YOLO-EV’s effectiveness in standard object detection tasks. Moreover, YOLO-EV has been applied to the CottonWeedDet12 and CropWeed datasets, which are characterized by complex scenes, diverse weed morphologies, significant occlusions, and numerous small targets. Experimental results indicate that YOLO-EV exhibits superior detection accuracy in these complex agricultural environments compared to the original YOLOv8s and other state-of-the-art models, effectively identifying and locating various types of weeds, thus demonstrating its significant practical application potential.
(This article belongs to the Section Smart Agriculture)
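For context, the SPPF layer that the paper augments with LSKA chains three max-pools to emulate parallel 5/9/13 pooling windows at lower cost; a plain SPPF sketch follows (without the LSKA attention; channel sizes are assumptions):

```python
import torch
import torch.nn as nn

class SPPF(nn.Module):
    """Spatial pyramid pooling - fast (as in YOLOv5/YOLOv8): three chained
    max-pools approximate parallel pooling at several window sizes."""
    def __init__(self, c_in, c_out, k=5):
        super().__init__()
        c_mid = c_in // 2
        self.cv1 = nn.Conv2d(c_in, c_mid, 1)
        self.cv2 = nn.Conv2d(c_mid * 4, c_out, 1)
        self.pool = nn.MaxPool2d(k, stride=1, padding=k // 2)

    def forward(self, x):
        x = self.cv1(x)
        y1 = self.pool(x)
        y2 = self.pool(y1)
        return self.cv2(torch.cat([x, y1, y2, self.pool(y2)], dim=1))

print(SPPF(256, 256)(torch.randn(1, 256, 20, 20)).shape)  # [1, 256, 20, 20]
```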

25 pages, 27763 KiB  
Article
Improved Multi-Size, Multi-Target and 3D Position Detection Network for Flowering Chinese Cabbage Based on YOLOv8
by Yuanqing Shui, Kai Yuan, Mengcheng Wu and Zuoxi Zhao
Plants 2024, 13(19), 2808; https://doi.org/10.3390/plants13192808 - 7 Oct 2024
Abstract
Accurately detecting the maturity and 3D position of flowering Chinese cabbage (Brassica rapa var. chinensis) in natural environments is vital for autonomous robot harvesting in unstructured farms. The challenge lies in dense planting, small flower buds, similar colors, and occlusions. This study proposes a YOLOv8-Improved network integrated with the ByteTrack tracking algorithm to achieve multi-object detection and 3D positioning of flowering Chinese cabbage plants in fields. In this study, C2F-MLCA is created by adding a lightweight Mixed Local Channel Attention (MLCA) with spatial awareness capability to the C2F module of YOLOv8, which improves the extraction of spatial feature information in the backbone network. In addition, a P2 detection layer is added to the neck network, and BiFPN is used instead of PAN to enhance multi-scale feature fusion and small target detection. Wise-IoU in combination with Inner-IoU is adopted as a new loss function to optimize the network for samples of different quality and bounding boxes of different sizes. Lastly, ByteTrack is integrated for video tracking, and RGB-D camera depth data are used to estimate cabbage positions. The experimental results show that YOLOv8-Improved achieves a precision (P) of 86.5% and a recall (R) of 86.0% in detecting the maturity of flowering Chinese cabbage. Its mAP50 and mAP75 reach 91.8% and 61.6%, respectively, representing improvements of 2.9% and 4.7% over the original network, while the number of parameters is reduced by 25.43%. In summary, the improved YOLOv8 algorithm demonstrates high robustness and real-time detection performance, providing strong technical support for automated harvesting management.
(This article belongs to the Section Plant Modeling)
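The 3D positioning step can be illustrated with the standard pinhole back-projection from a detection's pixel center and the RGB-D depth value; the intrinsics below are illustrative assumptions, not the paper's calibration:

```python
import numpy as np

def pixel_to_3d(u, v, depth_m, fx, fy, cx, cy):
    """Back-project a detection's pixel center (u, v) to camera-frame
    coordinates using the depth value and pinhole intrinsics."""
    z = depth_m
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])

# Illustrative intrinsics (assumption, not the paper's camera calibration)
fx = fy = 615.0
cx, cy = 320.0, 240.0
print(pixel_to_3d(350, 200, depth_m=0.85, fx=fx, fy=fy, cx=cx, cy=cy))
```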

12 pages, 34384 KiB  
Article
Improved Small Object Detection Algorithm CRL-YOLOv5
by Zhiyuan Wang, Shujun Men, Yuntian Bai, Yutong Yuan, Jiamin Wang, Kanglei Wang and Lei Zhang
Sensors 2024, 24(19), 6437; https://doi.org/10.3390/s24196437 - 4 Oct 2024
Abstract
Detecting small objects in images poses significant challenges due to their limited pixel representation and the difficulty in extracting sufficient features, often leading to missed or false detections. To address these challenges and enhance detection accuracy, this paper presents an improved small object detection algorithm, CRL-YOLOv5. The proposed approach integrates the Convolutional Block Attention Module (CBAM) attention mechanism into the C3 module of the backbone network, which enhances the localization accuracy of small objects. Additionally, the Receptive Field Block (RFB) module is introduced to expand the model’s receptive field, thereby fully leveraging contextual information. Furthermore, the network architecture is restructured to include an additional detection layer specifically for small objects, allowing for deeper feature extraction from shallow layers. When tested on the VisDrone2019 small object dataset, CRL-YOLOv5 achieved an mAP50 of 39.2%, representing a 5.4% improvement over the original YOLOv5, effectively boosting the detection precision for small objects in images.
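A simplified sketch of the Receptive Field Block idea referenced above: parallel branches with increasing dilation rates imitate receptive fields of different sizes before fusion (branch widths and dilation rates are assumptions):

```python
import torch
import torch.nn as nn

class RFBLite(nn.Module):
    """Simplified Receptive Field Block: parallel branches with increasing
    dilation rates mimic different receptive field sizes, then fuse."""
    def __init__(self, c_in, c_out, dilations=(1, 3, 5)):  # rates assumed
        super().__init__()
        c_b = c_out // len(dilations)
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(c_in, c_b, 1, bias=False),
                nn.Conv2d(c_b, c_b, 3, padding=d, dilation=d, bias=False),
                nn.BatchNorm2d(c_b), nn.ReLU(inplace=True))
            for d in dilations)
        self.fuse = nn.Conv2d(c_b * len(dilations), c_out, 1)

    def forward(self, x):
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

print(RFBLite(128, 128)(torch.randn(1, 128, 40, 40)).shape)  # [1, 128, 40, 40]
```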

23 pages, 48499 KiB  
Article
TTPRNet: A Real-Time and Precise Tea Tree Pest Recognition Model in Complex Tea Garden Environments
by Yane Li, Ting Chen, Fang Xia, Hailin Feng, Yaoping Ruan, Xiang Weng and Xiaoxing Weng
Agriculture 2024, 14(10), 1710; https://doi.org/10.3390/agriculture14101710 - 29 Sep 2024
Abstract
The accurate identification of tea tree pests is crucial for tea production, as it directly impacts yield and quality. In natural tea garden environments, identifying pests is challenging due to their small size, similarity in color to tea trees, and complex backgrounds. To address this issue, we propose TTPRNet, a multi-scale recognition model designed for real tea garden environments. TTPRNet introduces the ConvNext architecture into the backbone network to enhance the global feature learning capabilities and reduce the parameters, and it incorporates the coordinate attention mechanism into the feature output layer to improve the representation ability for different scales. Additionally, GSConv is employed in the neck network to reduce redundant information and enhance the effectiveness of the attention modules. The NWD loss function is used to focus on the similarity between multi-scale pests, improving recognition accuracy. The results show that TTPRNet achieves a recall of 91% and a mAP of 92.8%, representing 7.1% and 4% improvements over the original model, respectively. TTPRNet outperforms existing object detection models in recall, mAP, and recognition speed, meeting real-time requirements. Furthermore, the model integrates a counting function, enabling precise tallying of pest numbers and types and thus offering practical solutions for accurate identification in complex field conditions.
(This article belongs to the Section Digital Agriculture)
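For orientation, the NWD similarity referenced above models boxes as 2D Gaussians and normalizes their 2-Wasserstein distance with an exponential; a sketch under common conventions (the constant c is dataset-dependent and assumed here):

```python
import torch

def nwd(pred, target, c=12.8, eps=1e-7):
    """Normalized Wasserstein Distance sketch: boxes (cx, cy, w, h) are
    modeled as 2D Gaussians; their 2-Wasserstein distance is normalized
    with exp(-W2 / c). The constant c is an assumption."""
    w2_sq = ((pred[:, 0] - target[:, 0]) ** 2 +
             (pred[:, 1] - target[:, 1]) ** 2 +
             ((pred[:, 2] - target[:, 2]) / 2) ** 2 +
             ((pred[:, 3] - target[:, 3]) / 2) ** 2)
    return torch.exp(-torch.sqrt(w2_sq + eps) / c)

p = torch.tensor([[20., 20., 8., 6.]])
t = torch.tensor([[22., 21., 7., 6.]])
print(nwd(p, t))  # similarity in (0, 1], ~0.84 here; 1 - nwd can serve as a loss
```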
