A small object detection model in aerial images based on CPDD-YOLOv8

Abstract Aerial images can cover a wide area and capture rich scene information. These images are often taken from a high altitude and contain many small objects. It is difficult to detect small objects accurately because their features are not obvious and are susceptible to background interference....

Bibliographic Details
Main Authors: Jingyang Wang, Jiayao Gao, Bo Zhang
Format: Article
Language: English
Published: Nature Portfolio 2025-01-01
Series: Scientific Reports
Subjects: YOLOv8; C2fGAM; P2; DSC2f; Small object detection
Online Access: https://doi.org/10.1038/s41598-024-84938-4
author Jingyang Wang
Jiayao Gao
Bo Zhang
collection DOAJ
description Abstract Aerial images cover wide areas and capture rich scene information. Because they are usually taken from a high altitude, they contain many small objects, which are difficult to detect accurately: their features are indistinct and susceptible to background interference. CPDD-YOLOv8 is proposed to improve the performance of small object detection. First, we propose the C2fGAM structure, which integrates the Global Attention Mechanism (GAM) into the C2f structure of the backbone so that the model can better capture the overall semantics of an image. Second, a detection layer named P2 is added to extract shallow features. Third, a new DSC2f structure is proposed, which replaces the first standard Conv of the Bottleneck in the C2f structure with Dynamic Snake Convolution (DSConv), so that the model can adapt to different inputs more effectively. Finally, the Dynamic Head (DyHead), which integrates multiple attention mechanisms, is used in the head to assign different weights to different feature layers. To evaluate CPDD-YOLOv8, we carry out ablation and comparison experiments on the VisDrone2019 dataset. The ablation experiments show that all the improved and added modules in CPDD-YOLOv8 are effective. The comparison experiments suggest that the mAP of CPDD-YOLOv8 is higher than that of the seven other comparison models. The model's mAP@0.5 reaches 41%, which is 6.9% higher than that of YOLOv8, and its small object detection rate is improved by 13.1%. The generalizability of the CPDD-YOLOv8 model is verified on the WiderPerson, VOC_MASK and SHWD datasets.
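The abstract reports an mAP@0.5 of 41% that is "6.9% higher" than YOLOv8, but it does not say whether the 6.9 is an absolute gain in percentage points or a relative increase. The short sketch below is illustrative arithmetic only. It works out the YOLOv8 baseline implied by each reading. The function and variable names are hypothetical, and neither result should be taken as a figure from the paper.

```python
# Figures quoted in the abstract (mAP@0.5 on VisDrone2019).
CPDD_MAP50 = 41.0      # percent, as reported
REPORTED_GAIN = 6.9    # "6.9% higher than that of YOLOv8"


def baseline_if_absolute(model_map: float, gain_points: float) -> float:
    """Implied baseline if the gain is measured in percentage points."""
    return model_map - gain_points


def baseline_if_relative(model_map: float, gain_percent: float) -> float:
    """Implied baseline if the gain is a relative (multiplicative) increase."""
    return model_map / (1.0 + gain_percent / 100.0)


# Round to one decimal place to match the precision used in the abstract.
absolute = round(baseline_if_absolute(CPDD_MAP50, REPORTED_GAIN), 1)
relative = round(baseline_if_relative(CPDD_MAP50, REPORTED_GAIN), 1)

print(f"Baseline if gain is in percentage points: {absolute}%")
print(f"Baseline if gain is relative: {relative}%")
```

The two readings imply noticeably different baselines, so the full text should be consulted before quoting the improvement.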
format Article
id doaj-art-de34628601ca4f0db5f732e6aa339af1
institution Kabale University
issn 2045-2322
language English
publishDate 2025-01-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj-art-de34628601ca4f0db5f732e6aa339af1
2025-01-05T12:17:45Z
eng
Nature Portfolio
Scientific Reports
2045-2322
2025-01-01
vol. 15, no. 1, pp. 1-16
10.1038/s41598-024-84938-4
A small object detection model in aerial images based on CPDD-YOLOv8
Jingyang Wang (School of Information Science and Engineering, Hebei University of Science and Technology)
Jiayao Gao (School of Information Science and Engineering, Hebei University of Science and Technology)
Bo Zhang (School of Cyberspace Security, Hebei University of Engineering Science)
https://doi.org/10.1038/s41598-024-84938-4
YOLOv8
C2fGAM
P2; DSC2f
Small object detection
title A small object detection model in aerial images based on CPDD-YOLOv8
topic YOLOv8
C2fGAM
P2; DSC2f
Small object detection
url https://doi.org/10.1038/s41598-024-84938-4