CAIL: Cross-Modal Vehicle Reidentification in Aerial Images Using the Centroid-Aligned Implicit Learning Network

With the rapid development of autonomous aerial vehicles (AAV) remote sensing equipment, multimodal image data in the remote sensing field have exploded in recent years. In order to effectively alleviate the differences between the three modalities of SAR, visible light, and infrared, we proposed a v...

Bibliographic Details
Main Authors: Haoran Gao, Yiming Yan, Yanming He, Jianzheng Zhou, Zhengning Zhang, Yunchao Yang
Format: Article
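Language: English
Published: IEEE 2025-01-01
Series: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Subjects: Centroid alignment loss; implicit learning; multimodal aerial images; autonomous aerial vehicles (AAV); vehicle reidentification (Re-ID)
Online Access: https://ieeexplore.ieee.org/document/10783041/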
collection DOAJ
description With the rapid development of autonomous aerial vehicle (AAV) remote sensing equipment, the volume of multimodal image data in the remote sensing field has grown sharply in recent years. To alleviate the differences between the three modalities of SAR, visible light, and infrared, we propose a vehicle reidentification (Re-ID) task based on multimodal aerial images. Compared with traditional Re-ID tasks based on fixed optical cameras, synthetic aperture radar (SAR) has the advantage of being unaffected by lighting and weather conditions, and can provide additional information. However, there are significant geometric distortions and radiometric differences between optical images and SAR images, which limit the effectiveness of multimodal image matching. To address these issues, we propose a novel centroid-aligned implicit learning (CAIL) network to achieve cross-modal Re-ID. Specifically, we employ a multilevel channel fusion (MTT) module to enhance the adaptability of multimodal encoding channels to dimensional changes, thereby extracting the implicit features from different modalities. Furthermore, by integrating the MTT module into the modality multiple implicit learning (MIL) module, we reduce the modal differences between multimodal images and thus align them effectively. Additionally, to optimize CAIL, we propose a modality centroid alignment (MCA) loss to enhance the intraclass feature aggregation capability of multimodal data. MCA dynamically aggregates centroid features by taking modality differences into account, and adopts joint optimization to reduce anomalies in the metric learning process. Our proposed approach achieves strong performance on a cross-modal aerial image dataset in terms of both mAP and rank-1 accuracy.
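The description above states that the MCA loss aggregates centroid features per modality to tighten intraclass clusters, but this record does not give its formula. The following is therefore only a minimal sketch of the general idea of centroid alignment, not the authors' MCA loss: for each vehicle identity it computes one feature centroid per modality and averages the pairwise distances between those centroids. The function names, the Euclidean distance, and the uniform averaging are all assumptions made for illustration; the dynamic, modality-aware weighting mentioned in the description is not modeled.

```python
from collections import defaultdict
from math import sqrt


def centroid(vectors):
    """Mean of a non-empty list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid_alignment_loss(features, identities, modalities):
    """Hypothetical centroid-alignment objective (illustration only).

    For every identity, build one centroid per modality, then return the
    mean pairwise distance between centroids of the same identity.
    Identities observed in a single modality contribute nothing.
    """
    # groups[identity][modality] -> list of feature vectors
    groups = defaultdict(lambda: defaultdict(list))
    for feat, pid, mod in zip(features, identities, modalities):
        groups[pid][mod].append(feat)

    total, pairs = 0.0, 0
    for per_modality in groups.values():
        cents = [centroid(vecs) for vecs in per_modality.values()]
        for i in range(len(cents)):
            for j in range(i + 1, len(cents)):
                total += euclidean(cents[i], cents[j])
                pairs += 1
    return total / pairs if pairs else 0.0


# Toy batch: one vehicle (id 7) seen in two modalities.
feats = [[0.0, 0.0], [2.0, 0.0], [1.0, 3.0], [1.0, 5.0]]
ids = [7, 7, 7, 7]
mods = ["visible", "visible", "sar", "sar"]
loss = centroid_alignment_loss(feats, ids, mods)
```

In this toy batch the visible centroid is (1, 0) and the SAR centroid is (1, 4), so the value returned is their distance. Minimizing such a quantity pulls same-identity features from different modalities toward a shared center, which is the intuition behind centroid-based alignment in general.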
format Article
id doaj-art-3c31d82c0e40459cbd96b4abd0544e2a
institution Kabale University
issn 1939-1404
2151-1535
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
spelling doaj-art-3c31d82c0e40459cbd96b4abd0544e2a
indexed 2025-01-14T00:00:37Z
volume 18
pages 2577-2588
doi 10.1109/JSTARS.2024.3512579
ieee_document_id 10783041
author_details Haoran Gao (https://orcid.org/0009-0006-1602-5028), College of Information and Communication Engineering, Harbin Engineering University, Harbin, China
Yiming Yan (https://orcid.org/0000-0003-0751-7726), College of Information and Communication Engineering, Harbin Engineering University, Harbin, China
Yanming He, Space Star Technology Company Ltd., Beijing, China
Jianzheng Zhou (https://orcid.org/0009-0007-7673-7526), Harbin Space Star Data System Technology Co., Ltd., Harbin, China
Zhengning Zhang (https://orcid.org/0000-0001-6569-4101), Space Star Technology Company Ltd., Beijing, China
Yunchao Yang (https://orcid.org/0009-0009-9083-0625), State Key Laboratory of Space-Earth Integrated Information Technology, Beijing Institute of Satellite Information Engineering, Beijing, China
topic Centroid alignment loss
implicit learning
multimodal aerial images
autonomous aerial vehicles (AAV)
vehicle reidentification (Re-ID)
url https://ieeexplore.ieee.org/document/10783041/