qArI: A Hybrid CTC/Attention-Based Model for Quran Recitation Recognition Using Bidirectional LSTMP in an End-to-End Architecture

The accurate speech recognition of the Holy Quran is crucial for maintaining the traditional recitation styles and pronunciations, which helps in preserving the authenticity of the Quranic teachings and ensuring their accurate transmission across generations. Though the application of freshly develo...

Bibliographic Details
Main Authors:	Sumayya Alfadhli, Hajar Alharbi, Asma Cherif
Format:	Article
Language:	English
Published:	IEEE 2024-01-01
Series:	IEEE Access
Subjects:	Acoustic models, attention, bidirectional LSTMP, CTC, language model, Quran recitation
Online Access:	https://ieeexplore.ieee.org/document/10589392/

_version_	1841546142460936192
author	Sumayya Alfadhli, Hajar Alharbi, Asma Cherif
author_sort	Sumayya Alfadhli
collection	DOAJ
description	Accurate speech recognition of the Holy Quran is crucial for maintaining the traditional recitation styles and pronunciations, which helps preserve the authenticity of the Quranic teachings and ensures their accurate transmission across generations. Although applying recently developed models to spoken and written Arabic and non-Arabic speech recognition has yielded highly accurate results, research on the Holy Quran is still in its early stages. Indeed, speech recognition of the Holy Quran presents several challenges, including language complexity and the absence of a comprehensive dataset. This research aims to improve the accuracy of speech recognition models for the recital of the Holy Quran. A new dataset, the comprehensive Quranic dataset version 1 (CQDV1), was created to serve the Holy Quran speech recognition (HQSR) field. The dataset is publicly available for use by other researchers and includes recitations of the entire Quran (114 suras, recited by 35 reciters), based on the Hafs from Asim narration. The study explores the development of a speech recognition model for the accurate recital of the Holy Quran. The model combines a connectionist temporal classification (CTC)/attention loss function with a bidirectional long short-term memory with projections (BLSTMP) architecture and a token-based recurrent neural network language model (RNNLM), using the CQDV1 dataset. The results achieved were a token error rate (TER) of 6.4%, a word error rate (WER) of 10.4%, and a sentence error rate (SER) of 55.3% with $\lambda = 0.2$.
format	Article
id	doaj-art-8f14fa6b8b9c4784b832e000a6ec028e
institution	Kabale University
issn	2169-3536
language	English
publishDate	2024-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj-art-8f14fa6b8b9c4784b832e000a6ec028e | 2025-01-11T00:00:41Z | eng | IEEE | IEEE Access | 2169-3536 | 2024-01-01 | 12 | 95762 | 95777 | 10.1109/ACCESS.2024.3425273 | 10589392 | qArI: A Hybrid CTC/Attention-Based Model for Quran Recitation Recognition Using Bidirectional LSTMP in an End-to-End Architecture | Sumayya Alfadhli (0), https://orcid.org/0000-0002-1384-5083 | Hajar Alharbi (1) | Asma Cherif (2) | (0) Department of Computer Science, Adham University College, Umm Al-Qura University, Makkah, Saudi Arabia | (1) Department of Computer Science, Adham University College, Umm Al-Qura University, Makkah, Saudi Arabia | (2) Department of Information Technology, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia | Accurate speech recognition of the Holy Quran is crucial for maintaining the traditional recitation styles and pronunciations, which helps preserve the authenticity of the Quranic teachings and ensures their accurate transmission across generations. Although applying recently developed models to spoken and written Arabic and non-Arabic speech recognition has yielded highly accurate results, research on the Holy Quran is still in its early stages. Indeed, speech recognition of the Holy Quran presents several challenges, including language complexity and the absence of a comprehensive dataset. This research aims to improve the accuracy of speech recognition models for the recital of the Holy Quran. A new dataset, the comprehensive Quranic dataset version 1 (CQDV1), was created to serve the Holy Quran speech recognition (HQSR) field. The dataset is publicly available for use by other researchers and includes recitations of the entire Quran (114 suras, recited by 35 reciters), based on the Hafs from Asim narration. The study explores the development of a speech recognition model for the accurate recital of the Holy Quran. The model combines a connectionist temporal classification (CTC)/attention loss function with a bidirectional long short-term memory with projections (BLSTMP) architecture and a token-based recurrent neural network language model (RNNLM), using the CQDV1 dataset. The results achieved were a token error rate (TER) of 6.4%, a word error rate (WER) of 10.4%, and a sentence error rate (SER) of 55.3% with $\lambda = 0.2$. | https://ieeexplore.ieee.org/document/10589392/ | Acoustic models, attention, bidirectional LSTMP, CTC, language model, Quran recitation
title	qArI: A Hybrid CTC/Attention-Based Model for Quran Recitation Recognition Using Bidirectional LSTMP in an End-to-End Architecture
topic	Acoustic models, attention, bidirectional LSTMP, CTC, language model, Quran recitation
url	https://ieeexplore.ieee.org/document/10589392/
work_keys_str_mv	AT sumayyaalfadhli qariahybridctcattentionbasedmodelforquranrecitationrecognitionusingbidirectionallstmpinanendtoendarchitecture AT hajaralharbi qariahybridctcattentionbasedmodelforquranrecitationrecognitionusingbidirectionallstmpinanendtoendarchitecture AT asmacherif qariahybridctcattentionbasedmodelforquranrecitationrecognitionusingbidirectionallstmpinanendtoendarchitecture
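
The description in this record reports a combined CTC/attention loss with λ = 0.2 and results as word and sentence error rates. The record does not say how λ enters the loss or how the rates were scored, so the following is only a minimal sketch under stated assumptions: it assumes λ weights the CTC term in the common interpolation λ·L_CTC + (1 − λ)·L_att, and that WER is the word-level edit distance divided by the number of reference words, with SER the share of sentences containing at least one error. The function names `joint_loss`, `edit_distance`, and `error_rates` are hypothetical and are not taken from the article.

```python
from typing import List, Sequence, Tuple


def joint_loss(ctc_loss: float, att_loss: float, lam: float = 0.2) -> float:
    """Interpolate the two losses. ASSUMPTION: lam weights the CTC term."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * ctc_loss + (1.0 - lam) * att_loss


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion of a reference token
                            curr[j - 1] + 1,      # insertion of a hypothesis token
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rates(pairs: List[Tuple[str, str]]) -> Tuple[float, float]:
    """Return (WER, SER) in percent for (reference, hypothesis) pairs."""
    word_errors = sum(edit_distance(r.split(), h.split()) for r, h in pairs)
    ref_words = sum(len(r.split()) for r, _ in pairs)
    wrong_sentences = sum(r.split() != h.split() for r, h in pairs)
    return 100.0 * word_errors / ref_words, 100.0 * wrong_sentences / len(pairs)
```

Under these assumptions, a small λ such as 0.2 lets the attention term dominate training while the CTC term acts as a regularizer toward monotonic alignment. A single wrong word makes a whole sentence count as an error, which is one reason a sentence error rate can sit far above the word error rate on long utterances.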


Similar Items