A Patch-Level Region-Aware Module with a Multi-Label Framework for Remote Sensing Image Captioning
Recent Transformer-based works can generate high-quality captions for remote sensing images (RSIs). However, these methods generally feed global or grid visual features to a Transformer-based captioning model for associating cross-modal information, which limits performance. In this work, we investi...
| Main Authors: | Yunpeng Li, Xiangrong Zhang, Tianyang Zhang, Guanchun Wang, Xinlin Wang, Shuo Li |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2024-10-01 |
| Series: | Remote Sensing |
| Online Access: | https://www.mdpi.com/2072-4292/16/21/3987 |
Similar Items
- Thangka image captioning model with Salient Attention and Local Interaction Aggregator
  by: Wenjin Hu, et al.
  Published: (2024-11-01)
- Novel Advance Image Caption Generation Utilizing Vision Transformer and Generative Adversarial Networks
  by: Shourya Tyagi, et al.
  Published: (2024-11-01)
- Remote Sensing Image Change Captioning Using Multi-Attentive Network with Diffusion Model
  by: Yue Yang, et al.
  Published: (2024-11-01)
- A Multi-Label Image Classification Method based on Label Correlation Learning Network
  by: WANG Lufang, et al.
  Published: (2024-11-01)
- A Review of Deep Learning-Based Remote Sensing Image Caption: Methods, Models, Comparisons and Future Directions
  by: Ke Zhang, et al.
  Published: (2024-11-01)