A Dynamic Cascade Cross-Modal Coassisted Network for AAV Image Object Detection
Accurate detection of small objects plays an important role in applications of autonomous aerial vehicles (AAVs). However, current methods mainly extract features from unimodal images, which yield very few distinguishing features for objects, especially small ones...
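Main Authors: | Shu Tian, Li Wang, Lin Cao, Lihong Kang, Xian Sun, Jing Tian, Xiangwei Xing, Bo Shen, Chunzhuo Fan, Kangning Du, Chong Fu, Ye Zhang |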
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2025-01-01 |
Series: | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
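Subjects: | All-weather object detection; high-order interaction; multimodal fusion; autonomous aerial vehicles (AAV) aerial imagery |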
Online Access: | https://ieeexplore.ieee.org/document/10797700/ |
_version_ | 1841533436363276288 |
---|---|
author | Shu Tian, Li Wang, Lin Cao, Lihong Kang, Xian Sun, Jing Tian, Xiangwei Xing, Bo Shen, Chunzhuo Fan, Kangning Du, Chong Fu, Ye Zhang |
author_facet | Shu Tian, Li Wang, Lin Cao, Lihong Kang, Xian Sun, Jing Tian, Xiangwei Xing, Bo Shen, Chunzhuo Fan, Kangning Du, Chong Fu, Ye Zhang |
author_sort | Shu Tian |
collection | DOAJ |
description | Accurate detection of small objects plays an important role in applications of autonomous aerial vehicles (AAVs). However, current methods mainly extract features from unimodal images, which yield very few distinguishing features for objects, especially small ones. To address this issue, we propose a dynamic cascade cross-modal coassisted network, which integrates multimodal image fusion and fine-grained feature learning to generate powerful semantic representations of objects. Specifically, we design a multimodal high-order interaction module that enables collaborative interaction of spatial details and channel dependencies between modalities, thereby enhancing object discrimination. To preserve multimodal fine-grained details, we devise a scale-adaptive dynamic feature prompt module, which dynamically prompts the backbone network to capture feature degradation clues. Meanwhile, to maintain the spatial correlation of multimodal cross-scale features and improve the quality of feature fusion, we integrate a global collaborative enhancement module into the feature pyramid network to improve detection accuracy across multiple scales. Extensive experiments on multimodal datasets show that our method achieves favorable performance, surpassing other state-of-the-art methods. |
format | Article |
id | doaj-art-c026b5fce01a4053a10e3ff8d73d3a66 |
institution | Kabale University |
issn | 1939-1404 2151-1535 |
language | English |
publishDate | 2025-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
spelling | doaj-art-c026b5fce01a4053a10e3ff8d73d3a66; 2025-01-16T00:00:23Z; eng; IEEE; IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing; ISSN 1939-1404, 2151-1535; 2025-01-01; vol. 18, pp. 2749-2765; DOI 10.1109/JSTARS.2024.3516783; document 10797700; A Dynamic Cascade Cross-Modal Coassisted Network for AAV Image Object Detection; Shu Tian (https://orcid.org/0000-0003-2275-5853; Beijing Information Science and Technology University, Beijing, China); Li Wang (https://orcid.org/0009-0006-2685-2756; Beijing Information Science and Technology University, Beijing, China); Lin Cao (https://orcid.org/0000-0003-0875-1549; Beijing Information Science and Technology University, Beijing, China); Lihong Kang (Beijing Remote Sensing Information Institute, Beijing, China); Xian Sun (https://orcid.org/0000-0002-0038-9816; Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China); Jing Tian (Beijing Remote Sensing Information Institute, Beijing, China); Xiangwei Xing (Beijing Remote Sensing Information Institute, Beijing, China); Bo Shen (15th Research Institute of China Electronics Technology Group Corporation, Beijing, China); Chunzhuo Fan (https://orcid.org/0000-0001-9980-9595; Beijing Remote Sensing Information Institute, Beijing, China); Kangning Du (https://orcid.org/0000-0002-2998-757X; Beijing Information Science and Technology University, Beijing, China); Chong Fu (https://orcid.org/0000-0002-4549-744X; Northeastern University, Liaoning, China); Ye Zhang (https://orcid.org/0000-0001-8721-4535; Harbin Institute of Technology, Harbin, China); https://ieeexplore.ieee.org/document/10797700/; keywords: All-weather object detection, high-order interaction, multimodal fusion, autonomous aerial vehicles (AAV) aerial imagery |
title | A Dynamic Cascade Cross-Modal Coassisted Network for AAV Image Object Detection |
title_full | A Dynamic Cascade Cross-Modal Coassisted Network for AAV Image Object Detection |
title_fullStr | A Dynamic Cascade Cross-Modal Coassisted Network for AAV Image Object Detection |
title_full_unstemmed | A Dynamic Cascade Cross-Modal Coassisted Network for AAV Image Object Detection |
title_short | A Dynamic Cascade Cross-Modal Coassisted Network for AAV Image Object Detection |
title_sort | dynamic cascade cross modal coassisted network for aav image object detection |
topic | All-weather object detection; high-order interaction; multimodal fusion; autonomous aerial vehicles (AAV) aerial imagery |
url | https://ieeexplore.ieee.org/document/10797700/ |