Pretraining Enhanced RNN Transducer
Recurrent neural network transducer (RNN-T) is an important branch of current end-to-end automatic speech recognition (ASR). Various promising approaches have been designed to boost the RNN-T architecture; however, few studies have exploited the effectiveness of pretraining methods in this framework. In thi...
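For readers unfamiliar with the transducer factorization the abstract refers to, the sketch below shows a minimal RNN-T skeleton in PyTorch: an acoustic encoder, a label prediction network, and a joint network scoring the alignment lattice. All class names, layer sizes, and hyperparameters are illustrative assumptions and do not reflect the authors' model.

```python
# Minimal, illustrative RNN-T skeleton (not the paper's model): the standard
# transducer split into encoder, prediction network, and joint network.
import torch
import torch.nn as nn

class RNNTransducer(nn.Module):
    def __init__(self, n_mels=80, vocab_size=1000, hidden=512):
        super().__init__()
        # Acoustic encoder: consumes acoustic frames (e.g., log-Mel features).
        self.encoder = nn.LSTM(n_mels, hidden, num_layers=2, batch_first=True)
        # Prediction network: consumes previously emitted labels; this LM-like
        # component is a natural place to plug in a pretrained model.
        self.embed = nn.Embedding(vocab_size, hidden)
        self.predictor = nn.LSTM(hidden, hidden, num_layers=1, batch_first=True)
        # Joint network: one logit vector per (time, label) lattice point,
        # with an extra output for the blank symbol.
        self.joint = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.Tanh(),
                                   nn.Linear(hidden, vocab_size + 1))

    def forward(self, feats, labels):
        enc, _ = self.encoder(feats)                   # (B, T, H)
        pred, _ = self.predictor(self.embed(labels))   # (B, U, H)
        # Broadcast over the (T, U) lattice and score every alignment point.
        enc = enc.unsqueeze(2)                          # (B, T, 1, H)
        pred = pred.unsqueeze(1)                        # (B, 1, U, H)
        joint_in = torch.cat([enc.expand(-1, -1, pred.size(2), -1),
                              pred.expand(-1, enc.size(1), -1, -1)], dim=-1)
        return self.joint(joint_in)                     # (B, T, U, vocab+1)
```

In a full system the (B, T, U, vocab+1) logits would be trained with the transducer loss, which marginalizes over all monotonic alignments; that step is omitted here.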
Main Authors: Junyu Lu, Rongzhong Lian, Di Jiang, Yuanfeng Song, Zhiyang Su, Victor Junqiu Wei, Lin Yang
Format: Article
Language: English
Published: Tsinghua University Press, 2024-12-01
Series: CAAI Artificial Intelligence Research
Online Access: https://www.sciopen.com/article/10.26599/AIR.2024.9150039
Similar Items
- Classification of Speech Emotion State Based on Feature Map Fusion of TCN and Pretrained CNN Model From Korean Speech Emotion Data
  by: A-Hyeon Jo, et al.
  Published: (2025-01-01)
- Explainable Self-Supervised Dynamic Neuroimaging Using Time Reversal
  by: Zafar Iqbal, et al.
  Published: (2025-01-01)
- Exploring Fragment Adding Strategies to Enhance Molecule Pretraining in AI-Driven Drug Discovery
  by: Zhaoxu Meng, et al.
  Published: (2024-09-01)
- Building Arabic Speech Recognition System Using HuBERT Model and Studying the Sources of Errors [Arabic]
  by: Rima Sbih, et al.
  Published: (2025-01-01)
- ZeST: A Zero-Resourced Speech-to-Speech Translation Approach for Unknown, Unpaired, and Untranscribed Languages
  by: Luan Thanh Nguyen, et al.
  Published: (2025-01-01)