YOLO-DroneMS: Multi-Scale Object Detection Network for Unmanned Aerial Vehicle (UAV) Images

In recent years, research on Unmanned Aerial Vehicles (UAVs) has developed rapidly. Compared to traditional remote-sensing images, UAV images exhibit complex backgrounds, high resolution, and large differences in object scales. Therefore, UAV object detection is an essential yet challenging task. Th...

Bibliographic Details
Main Authors: Xueqiang Zhao, Yangbo Chen
Format: Article
Language: English
Published: MDPI AG 2024-10-01
Series: Drones
Subjects: drone images; LSKA; DySample; iRMB-DRB; WIoU
Online Access: https://www.mdpi.com/2504-446X/8/11/609
_version_ 1846153714329452544
author Xueqiang Zhao
Yangbo Chen
collection DOAJ
description In recent years, research on Unmanned Aerial Vehicles (UAVs) has developed rapidly. Compared to traditional remote-sensing images, UAV images exhibit complex backgrounds, high resolution, and large differences in object scales. Therefore, UAV object detection is an essential yet challenging task. This paper proposes a multi-scale object detection network, namely YOLO-DroneMS (You Only Look Once for Drone Multi-Scale Object), for UAV images. Targeting the pivotal connection between the backbone and neck, the Large Separable Kernel Attention (LSKA) mechanism is adopted with the Spatial Pyramid Pooling-Fast (SPPF) module, where weighted processing of multi-scale feature maps is performed to focus more on features. In addition, Attentional Scale Sequence Fusion DySample (ASF-DySample) is introduced to perform attentional scale sequence fusion and dynamic upsampling, which conserves resources. Then, the faster cross-stage partial network bottleneck with two convolutions (named C2f) in the backbone is optimized using the Inverted Residual Mobile Block and Dilated Reparam Block (iRMB-DRB), which balances the advantages of dynamic global modeling and static local information fusion. This optimization effectively increases the model’s receptive field, enhancing its capability for downstream tasks. By replacing the original CIoU with WIoUv3, the model prioritizes anchor boxes of superior quality, dynamically adjusting weights to enhance detection performance for small objects. Experimental findings on the VisDrone2019 dataset demonstrate that at an Intersection over Union (IoU) threshold of 0.5, YOLO-DroneMS achieves a 3.6% increase in mAP@50 compared to the YOLOv8n model. Moreover, YOLO-DroneMS exhibits improved detection speed, increasing the number of frames per second (FPS) from 78.7 to 83.3. The enhanced model supports diverse target scales and achieves high recognition rates, making it well-suited for drone-based object detection tasks, particularly in scenarios involving multiple object clusters.
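
The abstract reports mAP@50, a metric that counts a predicted box as correct when its IoU with a ground-truth box reaches 0.5. The following is a minimal sketch of that overlap test, not the authors' evaluation code. The `(x1, y1, x2, y2)` corner format and the function names `iou` and `is_true_positive` are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Boxes are assumed to be (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union = sum of both areas minus the doubly counted overlap.
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """Hypothetical helper: a prediction matches when IoU >= threshold."""
    return iou(pred_box, gt_box) >= threshold


if __name__ == "__main__":
    gt = (0, 0, 10, 10)
    print(is_true_positive((2, 0, 12, 10), gt))  # overlap 80 / union 120
    print(is_true_positive((5, 0, 15, 10), gt))  # overlap 50 / union 150
```

A full mAP@50 computation would additionally rank predictions by confidence and average precision over classes; this sketch covers only the per-box matching step.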
format Article
id doaj-art-36c4e488b5bf45ed89bb9c460bc988c3
institution Kabale University
issn 2504-446X
language English
publishDate 2024-10-01
publisher MDPI AG
record_format Article
series Drones
spelling doaj-art-36c4e488b5bf45ed89bb9c460bc988c3
2024-11-26T18:00:33Z
eng
MDPI AG
Drones
2504-446X
2024-10-01
8
11
609
10.3390/drones8110609
YOLO-DroneMS: Multi-Scale Object Detection Network for Unmanned Aerial Vehicle (UAV) Images
Xueqiang Zhao [0]
Yangbo Chen [1]
[0] School of Geography and Planning, Sun Yat-sen University, Guangzhou 510275, China
[1] School of Geography and Planning, Sun Yat-sen University, Guangzhou 510275, China
https://www.mdpi.com/2504-446X/8/11/609
drone images
LSKA
DySample
iRMB-DRB
WIoU
title YOLO-DroneMS: Multi-Scale Object Detection Network for Unmanned Aerial Vehicle (UAV) Images
topic drone images
LSKA
DySample
iRMB-DRB
WIoU
url https://www.mdpi.com/2504-446X/8/11/609