AFT-SAM: Adaptive Fusion Transformer with a Sparse Attention Mechanism for Audio–Visual Speech Recognition
To address the severe information redundancy, complex inter-modal interaction, and difficult multimodal fusion that audio–visual speech recognition systems face when handling complex multimodal information, this paper proposes an adaptive fusion transformer algorithm (A...
Main Authors: | Na Che, Yiming Zhu, Haiyan Wang, Xianwei Zeng, Qinsheng Du |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2024-12-01 |
Series: | Applied Sciences |
Online Access: | https://www.mdpi.com/2076-3417/15/1/199 |
Similar Items
- Objective assessment of communication speech interference effect based on feature fusion
  by: Yun LIN, et al.
  Published: (2023-03-01)
- End-to-end audiovisual speech recognition based on attention fusion of SDBN and BLSTM
  by: Yiming WANG, et al.
  Published: (2019-12-01)
- Dual-feature speech emotion recognition fusion algorithm based on wavelet scattering transform and MFCC
  by: YING Na, et al.
  Published: (2024-05-01)
- Metaphor recognition based on cross-modal multi-level information fusion
  by: Qimeng Yang, et al.
  Published: (2024-12-01)
- Parkinson’s Disease Prediction: An Attention-Based Multimodal Fusion Framework Using Handwriting and Clinical Data
  by: Sabrina Benredjem, et al.
  Published: (2024-12-01)