SlowFast-TCN: A Deep Learning Approach for Visual Speech Recognition
Visual Speech Recognition (VSR), commonly referred to as automated lip-reading, is an emerging technology that interprets speech by visually analyzing lip movements. One challenge in VSR, known as the homopheme problem, arises when visually distinct words produce similar lip movements. Visemes are the basic vi...
Main Authors: | Nicole Yah Yie Ha, Lee-Yeng Ong, Meng-Chew Leow |
---|---|
Format: | Article |
Language: | English |
Published: | Ital Publication, 2024-12-01 |
Series: | Emerging Science Journal |
Online Access: | https://ijournalse.org/index.php/ESJ/article/view/2670 |
Similar Items
- Classification of Speech Emotion State Based on Feature Map Fusion of TCN and Pretrained CNN Model From Korean Speech Emotion Data
  by: A-Hyeon Jo, et al.
  Published: (2025-01-01)
- JEP-KD: Joint-Embedding Predictive Architecture Based Knowledge Distillation for Visual Speech Recognition
  by: Chang Sun, et al.
  Published: (2024-01-01)
- Analysis for speech and esthetics in sixty consecutive patients with cleft lip and palate
  by: Mahantesh S Shiraganvi, et al.
  Published: (2011-10-01)
- LipBengal: Pioneering Bengali lip-reading dataset for pronunciation mapping through lip gestures
  by: Md. Tanvir Rahman Sahed, et al.
  Published: (2025-02-01)
- Deep Transfer Learning for Lip Reading Based on NASNetMobile Pretrained Model in Wild Dataset
  by: Ashwaq Waleed Abdul Ameer, et al.
  Published: (2025-01-01)