End to end polysemantic cooperative mixed task trainer for UAV target detection

Bibliographic Details
Main Authors: Xueying Liao, Xingran Guo, Askar Rozi, Haizheng Yu, Abdukerim Haji
Format: Article
Language: English
Published: Nature Portfolio 2024-11-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-024-81201-8
Description
Summary: With the rapid advancement and application of Unmanned Aerial Vehicles (UAVs), target detection in urban scenes has made significant progress. Achieving precise 3D reconstruction from oblique imagery is essential for accurate urban object detection in UAV images. However, challenges persist due to low detection accuracy caused by subtle target features, complex backgrounds, and the prevalence of small targets. To address these issues, we introduce the Polysemantic Cooperative Detection Transformer (Pc-DETR), a novel end-to-end UAV image target detection network. Our primary innovation, the Polysemantic Transformer (PoT) Backbone, enhances visual representation by leveraging contextual information to guide a dynamic attention matrix. This matrix, formed through convolutions, captures both static and dynamic features, resulting in superior detection. Additionally, we propose the Polysemantic Cooperative Mixed-Task Training scheme, which employs multiple auxiliary heads for diverse label assignments, boosting the encoder’s learning capacity. This approach customizes queries and optimizes training efficiency without increasing inference costs. Comparative experiments show that Pc-DETR achieves a 3% improvement in detection accuracy over the current state-of-the-art MFEFNet, setting a new benchmark in UAV image detection and advancing methodologies for intelligent UAV surveillance systems.
ISSN: 2045-2322
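
Note: The summary mentions auxiliary heads with "diverse label assignments" but does not say which assignment rules are used. As a rough illustration only, the sketch below contrasts two rules commonly seen in detection training: a one-to-one assignment (each ground-truth box matched to a single prediction, as in DETR-style detectors) and a one-to-many assignment (every prediction above an IoU threshold counts as positive). The function names, the 0.5 threshold, and the brute-force matching are assumptions made for this example and are not taken from the paper.

```python
from itertools import permutations


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def one_to_one(preds, gts):
    """Hypothetical one-to-one rule: each ground truth gets exactly one
    prediction, chosen to maximize total IoU. Brute force over permutations,
    so only suitable for tiny toy inputs."""
    best, best_score = (), -1.0
    for perm in permutations(range(len(preds)), len(gts)):
        score = sum(iou(preds[p], gts[g]) for g, p in enumerate(perm))
        if score > best_score:
            best, best_score = perm, score
    # Map prediction index -> ground-truth index.
    return {p: g for g, p in enumerate(best)}


def one_to_many(preds, gts, thresh=0.5):
    """Hypothetical one-to-many rule: a prediction is positive for its
    best-overlapping ground truth whenever that IoU reaches `thresh`."""
    if not gts:
        return {}
    assigned = {}
    for p, pred_box in enumerate(preds):
        scores = [iou(pred_box, gt_box) for gt_box in gts]
        g = max(range(len(gts)), key=scores.__getitem__)
        if scores[g] >= thresh:
            assigned[p] = g
    return assigned


if __name__ == "__main__":
    gts = [(0, 0, 10, 10)]
    preds = [(0, 0, 10, 10), (1, 1, 10, 10), (20, 20, 30, 30)]
    print(one_to_one(preds, gts))   # a single positive prediction
    print(one_to_many(preds, gts))  # two positives for the same target
```

In this toy case the one-to-many rule yields more positive samples per target than the one-to-one rule, which is the general sense in which extra heads with different assignments can supply additional training signal. Because such heads would be used only during training, dropping them at inference is consistent with the summary's statement that inference cost does not increase.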