A Dynamic Cascade Cross-Modal Coassisted Network for AAV Image Object Detection

Accurate detection of small objects plays an important role in applications of autonomous aerial vehicles (AAVs). However, existing methods mainly extract comprehensive features from unimodal images, which yield very limited distinguishable features for objects, especially small ones...

Bibliographic Details
Main Authors: Shu Tian, Li Wang, Lin Cao, Lihong Kang, Xian Sun, Jing Tian, Xiangwei Xing, Bo Shen, Chunzhuo Fan, Kangning Du, Chong Fu, Ye Zhang
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Subjects:
Online Access: https://ieeexplore.ieee.org/document/10797700/
collection DOAJ
description Accurate detection of small objects plays an important role in applications of autonomous aerial vehicles (AAVs). However, existing methods mainly extract comprehensive features from unimodal images, which yield very limited distinguishable features for objects, especially small ones. To address this issue, we propose a dynamic cascade cross-modal coassisted network, which integrates multimodal image fusion and fine-grained feature learning to generate powerful object semantic representations. Specifically, we design a multimodal high-order interaction module to achieve collaborative interaction of spatial details and channel dependencies between modalities, thereby enhancing object discrimination. To preserve multimodal fine-grained details, we devise a scale-adaptive dynamic feature prompt module, which dynamically drives the backbone network to capture feature degradation clues. Meanwhile, to maintain the spatial correlation of multimodal cross-scale features and improve the quality of feature fusion, we incorporate a global collaborative enhancement module into the feature pyramid network to improve detection accuracy across multiple scales. Extensive experiments on multimodal datasets show that our method achieves favorable performance, surpassing other state-of-the-art methods.
format Article
id doaj-art-c026b5fce01a4053a10e3ff8d73d3a66
institution Kabale University
issn 1939-1404
2151-1535
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
spelling doaj-art-c026b5fce01a4053a10e3ff8d73d3a66
timestamp 2025-01-16T00:00:23Z
language eng
publisher IEEE
journal IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
issn 1939-1404, 2151-1535
published 2025-01-01
volume 18
pages 2749-2765
doi 10.1109/JSTARS.2024.3516783
document number 10797700
title A Dynamic Cascade Cross-Modal Coassisted Network for AAV Image Object Detection
authors
0 Shu Tian, https://orcid.org/0000-0003-2275-5853, Beijing Information Science and Technology University, Beijing, China
1 Li Wang, https://orcid.org/0009-0006-2685-2756, Beijing Information Science and Technology University, Beijing, China
2 Lin Cao, https://orcid.org/0000-0003-0875-1549, Beijing Information Science and Technology University, Beijing, China
3 Lihong Kang, Beijing Remote Sensing Information Institute, Beijing, China
4 Xian Sun, https://orcid.org/0000-0002-0038-9816, Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China
5 Jing Tian, Beijing Remote Sensing Information Institute, Beijing, China
6 Xiangwei Xing, Beijing Remote Sensing Information Institute, Beijing, China
7 Bo Shen, 15th Research Institute of China Electronics Technology Group Corporation, Beijing, China
8 Chunzhuo Fan, https://orcid.org/0000-0001-9980-9595, Beijing Remote Sensing Information Institute, Beijing, China
9 Kangning Du, https://orcid.org/0000-0002-2998-757X, Beijing Information Science and Technology University, Beijing, China
10 Chong Fu, https://orcid.org/0000-0002-4549-744X, Northeastern University, Liaoning, China
11 Ye Zhang, https://orcid.org/0000-0001-8721-4535, Harbin Institute of Technology, Harbin, China
title A Dynamic Cascade Cross-Modal Coassisted Network for AAV Image Object Detection
topic All-weather object detection
high-order interaction
multimodal fusion
autonomous aerial vehicles (AAV) aerial imagery
url https://ieeexplore.ieee.org/document/10797700/