AFT-SAM: Adaptive Fusion Transformer with a Sparse Attention Mechanism for Audio–Visual Speech Recognition

To address the severe information redundancy, complex inter-modal interaction, and difficult multimodal fusion that audio–visual speech recognition systems face when handling complex multimodal information, this paper proposes an adaptive fusion transformer algorithm (A...

Bibliographic Details
Main Authors: Na Che, Yiming Zhu, Haiyan Wang, Xianwei Zeng, Qinsheng Du
Format: Article
Language: English
Published: MDPI AG 2024-12-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/15/1/199

Similar Items