Multimodal fusion: A study on speech-text emotion recognition with the integration of deep learning
Recognition of various human emotions holds significant value in numerous real-world scenarios. This paper focuses on the multimodal fusion of speech and text for emotion recognition. A 39-dimensional Mel-frequency cepstral coefficient (MFCC) vector was used as the speech emotion feature. A 300-dimensional...
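The 39-dimensional MFCC feature mentioned in the abstract is conventionally the 13 static coefficients stacked with their delta and delta-delta derivatives (13 × 3 = 39). Below is a minimal sketch of that extraction using `librosa`; the function name `extract_mfcc_39`, the 13-coefficient split, and all framing parameters are illustrative assumptions, not the paper's exact configuration.

```python
import librosa
import numpy as np

def extract_mfcc_39(wav_path: str) -> np.ndarray:
    """Sketch: 39-dim MFCC features (13 static + 13 delta + 13 delta-delta).

    Assumes the common 13-coefficient convention; the paper's actual
    frame length, hop size, and sample rate are not specified here.
    """
    y, sr = librosa.load(wav_path, sr=None)             # keep native sample rate
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)  # 13 static coefficients
    delta = librosa.feature.delta(mfcc)                 # first-order derivatives
    delta2 = librosa.feature.delta(mfcc, order=2)       # second-order derivatives
    # Stack along the feature axis -> shape (39, num_frames)
    return np.concatenate([mfcc, delta, delta2], axis=0)
```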
| Main Authors: | Yanan Shang, Tianqi Fu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2024-12-01 |
| Series: | Intelligent Systems with Applications |
| Online Access: | http://www.sciencedirect.com/science/article/pii/S2667305324001108 |
Similar Items

- Chinese Mathematical Knowledge Entity Recognition Based on Linguistically Motivated Bidirectional Encoder Representation from Transformers
  by: Wei Song, et al.
  Published: (2025-01-01)
- DropBlock based bimodal hybrid neural network for wireless communication modulation recognition
  by: Yan GAO, et al.
  Published: (2022-05-01)
- Polish Speech and Text Emotion Recognition in a Multimodal Emotion Analysis System
  by: Kamil Skowroński, et al.
  Published: (2024-11-01)
- Research on Medical Text Parsing Method Based on BiGRU-BiLSTM Multi-Task Learning
  by: Yunli Fan, et al.
  Published: (2024-11-01)
- A hybrid deep learning framework for short-term load forecasting with improved data cleansing and preprocessing techniques
  by: Muhammad Sajid Iqbal, et al.
  Published: (2024-12-01)