MFEAM: Multi-View Feature Enhanced Attention Model for Image Captioning
Image captioning plays a crucial role in aligning visual content with natural language, serving as a key step toward effective cross-modal understanding. The Transformer has become the dominant language model for image captioning. However, existing Transformer-based models seldom highlight important features from...
| Main Authors: | Yang Cui, Juan Zhang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-07-01 |
| Series: | Applied Sciences |
| Online Access: | https://www.mdpi.com/2076-3417/15/15/8368 |
Similar Items
- Chinese Image Captioning Based on Deep Fusion Feature and Multi-Layer Feature Filtering Block
  by: Xi Yang, et al.
  Published: (2025-01-01)
- Enhanced group relation learning via aligned attention masking for fashion product captioning
  by: Yuhao Tang, et al.
  Published: (2025-08-01)
- Enabling High-Level Worker-Centric Semantic Understanding of Onsite Images Using Visual Language Models with Attention Mechanism and Beam Search Strategy
  by: Hui Deng, et al.
  Published: (2025-03-01)
- A novel image captioning model with visual-semantic similarities and visual representations re-weighting
  by: Alaa Thobhani, et al.
  Published: (2024-09-01)
- AFNE-Net: Semantic Segmentation of Remote Sensing Images via Attention-Based Feature Fusion and Neighborhood Feature Enhancement
  by: Ke Li, et al.
  Published: (2025-07-01)