Automatic Speech Recognition: A survey of deep learning techniques and approaches
Main Authors: | , , |
---|---|
Format: | Article |
Language: | English |
Published: | KeAi Communications Co., Ltd. 2025-12-01 |
Series: | International Journal of Cognitive Computing in Engineering |
Subjects: | |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2666307424000573 |
Summary: | Significant research has been conducted over the last decade on the application of machine learning to speech processing, particularly speech recognition. In recent years, deep learning models have shown promising results across speech-related applications. With the emergence of end-to-end models, deep learning has revolutionized the field of Automatic Speech Recognition (ASR). A recent surge in transfer learning-based models and attention-based approaches on large datasets has given ASR further impetus. This paper provides a thorough review of studies conducted since 2010 and an extensive comparison of the state-of-the-art methods now used in this research area, with a special focus on deep learning models and an analysis of contemporary approaches for both monolingual and multilingual models. Deep learning approaches are data dependent, and their accuracy varies across datasets. In this paper, we also analyze the various models on publicly accessible speech datasets to understand model performance across diverse datasets for practical deployment. The study also highlights research findings, open challenges, and a way forward that may serve as a starting point for academics interested in open-source ASR research, particularly in mitigating data dependency and improving generalizability across low-resource languages, speaker variability, and noise conditions. |
---|---|
ISSN: | 2666-3074 |