CAIL: Cross-Modal Vehicle Reidentification in Aerial Images Using the Centroid-Aligned Implicit Learning Network
With the rapid development of autonomous aerial vehicles (AAV) remote sensing equipment, multimodal image data in the remote sensing field have exploded in recent years. In order to effectively alleviate the differences between the three modalities of SAR, visible light, and infrared, we proposed a v...
Main Authors: | Haoran Gao, Yiming Yan, Yanming He, Jianzheng Zhou, Zhengning Zhang, Yunchao Yang |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2025-01-01 |
Series: | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
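Subjects: | Centroid alignment loss; implicit learning; multimodal aerial images; autonomous aerial vehicles (AAV); vehicle reidentification (Re-ID) |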
Online Access: | https://ieeexplore.ieee.org/document/10783041/ |
_version_ | 1841542608346677248 |
---|---|
author | Haoran Gao, Yiming Yan, Yanming He, Jianzheng Zhou, Zhengning Zhang, Yunchao Yang |
author_sort | Haoran Gao |
collection | DOAJ |
description | With the rapid development of autonomous aerial vehicles (AAV) remote sensing equipment, multimodal image data in the remote sensing field have exploded in recent years. In order to effectively alleviate the differences between the three modalities of SAR, visible light, and infrared, we proposed a vehicle reidentification (Re-ID) task based on multimodal aerial images. Compared with traditional Re-ID tasks based on fixed optical cameras, synthetic aperture radar (SAR) has the advantage of being unaffected by lighting and weather conditions, and can provide additional information. However, there are significant geometric distortions and radiometric differences between optical images and SAR images, which limit the effectiveness of multimodal image matching. To address the above issues, we propose a novel centroid-aligned implicit learning (CAIL) network to achieve cross-modal Re-ID. Specifically, we employ a multilevel channel fusion (MTT) module to enhance the adaptability of multimodal encoding channels to dimensional changes, thereby extracting the implicit features from different modalities. Furthermore, by integrating the MTT module into the modality multiple implicit learning (MIL) module, we reduce the modal differences between multimodal images, thus achieving effective alignment between them. Additionally, to optimize CAIL, we propose a modality centroid alignment (MCA) loss to enhance the intraclass feature aggregation capability of multimodal data. MCA dynamically aggregates centroid features by taking modality differences into account, and adopts joint optimization to reduce anomalies in the metric learning process. Our proposed approach achieves significant and satisfactory performance on a cross-modal aerial images dataset, in terms of both mAP and rank-1 accuracy. |
format | Article |
id | doaj-art-3c31d82c0e40459cbd96b4abd0544e2a |
institution | Kabale University |
issn | 1939-1404 2151-1535 |
language | English |
publishDate | 2025-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
spelling | doaj-art-3c31d82c0e40459cbd96b4abd0544e2a; 2025-01-14T00:00:37Z; eng; IEEE; IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing; 1939-1404; 2151-1535; 2025-01-01; 18; 2577; 2588; 10.1109/JSTARS.2024.3512579; 10783041; CAIL: Cross-Modal Vehicle Reidentification in Aerial Images Using the Centroid-Aligned Implicit Learning Network; Haoran Gao 0 https://orcid.org/0009-0006-1602-5028; Yiming Yan 1 https://orcid.org/0000-0003-0751-7726; Yanming He 2; Jianzheng Zhou 3 https://orcid.org/0009-0007-7673-7526; Zhengning Zhang 4 https://orcid.org/0000-0001-6569-4101; Yunchao Yang 5 https://orcid.org/0009-0009-9083-0625; College of Information and Communication Engineering, Harbin Engineering University, Harbin, China; College of Information and Communication Engineering, Harbin Engineering University, Harbin, China; Space Star Technology Company Ltd., Beijing, China; Harbin Space Star Data System Technology Co., Ltd., Harbin, China; Space Star Technology Company Ltd., Beijing, China; State Key Laboratory of Space-Earth Integrated Information Technology, Beijing Institute of Satellite Information Engineering, Beijing, China; With the rapid development of autonomous aerial vehicles (AAV) remote sensing equipment, multimodal image data in the remote sensing field have exploded in recent years. In order to effectively alleviate the differences between the three modalities of SAR, visible light, and infrared, we proposed a vehicle reidentification (Re-ID) task based on multimodal aerial images. Compared with traditional Re-ID tasks based on fixed optical cameras, synthetic aperture radar (SAR) has the advantage of being unaffected by lighting and weather conditions, and can provide additional information. However, there are significant geometric distortions and radiometric differences between optical images and SAR images, which limit the effectiveness of multimodal image matching. To address the above issues, we propose a novel centroid-aligned implicit learning (CAIL) network to achieve cross-modal Re-ID. Specifically, we employ a multilevel channel fusion (MTT) module to enhance the adaptability of multimodal encoding channels to dimensional changes, thereby extracting the implicit features from different modalities. Furthermore, by integrating the MTT module into the modality multiple implicit learning (MIL) module, we reduce the modal differences between multimodal images, thus achieving effective alignment between them. Additionally, to optimize CAIL, we propose a modality centroid alignment (MCA) loss to enhance the intraclass feature aggregation capability of multimodal data. MCA dynamically aggregates centroid features by taking modality differences into account, and adopts joint optimization to reduce anomalies in the metric learning process. Our proposed approach achieves significant and satisfactory performance on a cross-modal aerial images dataset, in terms of both mAP and rank-1 accuracy.; https://ieeexplore.ieee.org/document/10783041/; Centroid alignment loss; implicit learning; multimodal aerial images; autonomous aerial vehicles (AAV); vehicle reidentification (Re-ID) |
title | CAIL: Cross-Modal Vehicle Reidentification in Aerial Images Using the Centroid-Aligned Implicit Learning Network |
topic | Centroid alignment loss; implicit learning; multimodal aerial images; autonomous aerial vehicles (AAV); vehicle reidentification (Re-ID) |
url | https://ieeexplore.ieee.org/document/10783041/ |