A Patch-Level Region-Aware Module with a Multi-Label Framework for Remote Sensing Image Captioning

Recent Transformer-based works can generate high-quality captions for remote sensing images (RSIs). However, these methods generally feed global or grid visual features to a Transformer-based captioning model for associating cross-modal information, which limits performance. In this work, we investi...

Bibliographic Details
Main Authors: Yunpeng Li, Xiangrong Zhang, Tianyang Zhang, Guanchun Wang, Xinlin Wang, Shuo Li
Format: Article
Language: English
Published: MDPI AG 2024-10-01
Series: Remote Sensing
Online Access: https://www.mdpi.com/2072-4292/16/21/3987