INRNet: Neighborhood Re-Ranking-Based Method for Pedestrian Text-Image Retrieval
The Pedestrian Text-Image Retrieval task aims to retrieve the target pedestrian image based on a textual description. The primary challenge lies in mapping two heterogeneous modalities (visual and textual descriptions) into a unified feature space. Previous approaches have focused on global or local matching. However, global matching methods are prone to weak alignment, while local matching methods can produce ambiguous matches. To address these issues, we introduce the Implicit Neighborhood Re-ranking Network (INRNet), which uses a bilateral feature extractor to learn global image-text matching knowledge and leverages nearest neighbors as prior knowledge to mine positive samples. Specifically, the bilateral feature extractor extracts features from both texts and pedestrian images, and a Similarity Distribution Matching (SDM) method establishes a preliminary global text-image alignment. Subsequently, a Neighborhood Data Construction Mechanism (NDCM) restructures the data for the re-ranking task. Finally, the restructured data is fed into our Implicit Neighborhood Inference (INI) module, which uses nearest-neighbor intersection to optimize retrieval performance. Extensive experiments show that the proposed method achieves superior performance on three public datasets.
Main Authors: | Kehao Wang, Yuhui Wang, Lian Xue, Qifeng Li |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2025-01-01 |
Series: | IEEE Access |
Subjects: | Pedestrian text-image retrieval; nearest-neighbor data; re-ranking; multimodal retrieval |
Online Access: | https://ieeexplore.ieee.org/document/10818620/ |
_version_ | 1841557020919988224 |
---|---|
author | Kehao Wang Yuhui Wang Lian Xue Qifeng Li |
author_facet | Kehao Wang Yuhui Wang Lian Xue Qifeng Li |
author_sort | Kehao Wang |
collection | DOAJ |
description | The Pedestrian Text-Image Retrieval task aims to retrieve the target pedestrian image based on a textual description. The primary challenge lies in mapping two heterogeneous modalities (visual and textual descriptions) into a unified feature space. Previous approaches have focused on global or local matching. However, global matching methods are prone to weak alignment, while local matching methods can produce ambiguous matches. To address these issues, we introduce the Implicit Neighborhood Re-ranking Network (INRNet), which uses a bilateral feature extractor to learn global image-text matching knowledge and leverages nearest neighbors as prior knowledge to mine positive samples. Specifically, the bilateral feature extractor extracts features from both texts and pedestrian images, and a Similarity Distribution Matching (SDM) method establishes a preliminary global text-image alignment. Subsequently, a Neighborhood Data Construction Mechanism (NDCM) restructures the data for the re-ranking task. Finally, the restructured data is fed into our Implicit Neighborhood Inference (INI) module, which uses nearest-neighbor intersection to optimize retrieval performance. Extensive experiments show that the proposed method achieves superior performance on three public datasets. |
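The description mentions a Similarity Distribution Matching (SDM) step for preliminary global alignment. Below is a minimal sketch of an SDM-style loss, assuming the commonly used formulation (KL divergence between the softmax-normalized image-text similarity distribution and the true identity-match distribution); the exact variant INRNet uses, the temperature value, and all function names here are assumptions, not taken from the paper.

```python
import numpy as np

def sdm_loss(img_feats, txt_feats, pids, tau=0.02, eps=1e-8):
    """Sketch of a Similarity Distribution Matching (SDM) style loss.

    img_feats, txt_feats: (B, D) L2-normalized feature matrices.
    pids: (B,) identity labels; pairs sharing a pid count as true matches.
    tau: softmax temperature (value assumed, not from the paper).
    """
    sims = img_feats @ txt_feats.T / tau                  # (B, B) scaled cosine similarities
    labels = (pids[:, None] == pids[None, :]).astype(float)
    q = labels / labels.sum(axis=1, keepdims=True)        # true match distribution per row

    def kl_row(logits):
        # KL(p || q) averaged over rows, with numerically stable softmax
        logits = logits - logits.max(axis=1, keepdims=True)
        p = np.exp(logits)
        p /= p.sum(axis=1, keepdims=True)
        return np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=1))

    # symmetric image-to-text and text-to-image terms (q is symmetric here)
    return kl_row(sims) + kl_row(sims.T)
```

Well-aligned image-text pairs concentrate the predicted similarity distribution on true matches, driving the KL terms toward zero; mismatched pairs place probability mass on zero-probability entries and are penalized heavily.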
format | Article |
id | doaj-art-70cd2ca0333b463495cbeb2ab5fc6b37 |
institution | Kabale University |
issn | 2169-3536 |
language | English |
publishDate | 2025-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj-art-70cd2ca0333b463495cbeb2ab5fc6b37 2025-01-07T00:01:38Z eng IEEE, IEEE Access, ISSN 2169-3536, 2025-01-01, vol. 13, pp. 1470-1480, DOI 10.1109/ACCESS.2024.3518535, article 10818620. INRNet: Neighborhood Re-Ranking-Based Method for Pedestrian Text-Image Retrieval. Kehao Wang (https://orcid.org/0000-0001-9843-8104), School of Information Engineering, Wuhan University of Technology, Wuhan, China; Yuhui Wang, School of Information Engineering, Wuhan University of Technology, Wuhan, China; Lian Xue, School of Artificial Intelligence, Wuhan Technology and Business University, Wuhan, China; Qifeng Li, School of Information Engineering, Wuhan University of Technology, Wuhan, China. https://ieeexplore.ieee.org/document/10818620/ Pedestrian text-image retrieval; nearest-neighbor data; re-ranking; multimodal retrieval |
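The abstract describes re-ranking by nearest-neighbor intersection. The paper's INI module is learned; the sketch below is only a hand-crafted illustration of the underlying idea, re-scoring each query-gallery pair by how much the gallery item's neighborhood overlaps the query's top-k retrievals. The function name, the mixing weight `alpha`, and the gallery-gallery similarity approximation are all assumptions for illustration.

```python
import numpy as np

def nn_intersection_rerank(sim, k=5, alpha=0.5):
    """Toy nearest-neighbor-intersection re-ranking (hypothetical stand-in
    for the learned INI module).

    sim: (Q, G) query-gallery similarity matrix.
    Returns a refined (Q, G) score matrix mixing the original similarity
    with a neighborhood-overlap score.
    """
    Q, G = sim.shape
    # top-k gallery neighbors of each query
    q_nn = np.argsort(-sim, axis=1)[:, :k]
    # top-k gallery neighbors of each gallery item; gallery-gallery
    # similarity is approximated here via shared query responses
    # (a real system would compare gallery features directly)
    gg = sim.T @ sim
    g_nn = np.argsort(-gg, axis=1)[:, :k]

    refined = sim.copy()
    for q in range(Q):
        q_set = set(q_nn[q])
        for g in range(G):
            overlap = len(q_set & set(g_nn[g])) / k   # intersection ratio in [0, 1]
            refined[q, g] = (1 - alpha) * sim[q, g] + alpha * overlap
    return refined
```

The intuition, consistent with the abstract, is that a gallery image whose own nearest neighbors intersect the query's top-k retrievals is more likely a true positive than one that merely scores high in isolation.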
spellingShingle | Kehao Wang Yuhui Wang Lian Xue Qifeng Li INRNet: Neighborhood Re-Ranking-Based Method for Pedestrian Text-Image Retrieval IEEE Access Pedestrian text-image retrieval nearest-neighbor data re-ranking multimodal retrieval |
title | INRNet: Neighborhood Re-Ranking-Based Method for Pedestrian Text-Image Retrieval |
title_full | INRNet: Neighborhood Re-Ranking-Based Method for Pedestrian Text-Image Retrieval |
title_fullStr | INRNet: Neighborhood Re-Ranking-Based Method for Pedestrian Text-Image Retrieval |
title_full_unstemmed | INRNet: Neighborhood Re-Ranking-Based Method for Pedestrian Text-Image Retrieval |
title_short | INRNet: Neighborhood Re-Ranking-Based Method for Pedestrian Text-Image Retrieval |
title_sort | inrnet neighborhood re ranking based method for pedestrian text image retrieval |
topic | Pedestrian text-image retrieval nearest-neighbor data re-ranking multimodal retrieval |
url | https://ieeexplore.ieee.org/document/10818620/ |
work_keys_str_mv | AT kehaowang inrnetneighborhoodrerankingbasedmethodforpedestriantextimageretrieval AT yuhuiwang inrnetneighborhoodrerankingbasedmethodforpedestriantextimageretrieval AT lianxue inrnetneighborhoodrerankingbasedmethodforpedestriantextimageretrieval AT qifengli inrnetneighborhoodrerankingbasedmethodforpedestriantextimageretrieval |