3
Speech Perception as a Function of the Number of Channels and Channel Interaction in Cochlear Implant Simulation
Published 2023-12-01 “…Results: In the first experiment, participants scored 57.93%, 80.97%, 83.59%, 91.03%, and 95.45% under 8, 12, 16, and 22-channel vocoder and non-vocoder conditions, respectively. …”
Article -
4
Tibetan–Chinese speech-to-speech translation based on discrete units
Published 2025-01-01 “…Leveraging HuBERT model to extract discrete units of target speech, we develop a speech-to-unit translation (S2UT) model using an encoder-decoder architecture which subsequently generates speech output through a unit-based vocoder. By employing SSL and utilizing discrete representations as training targets, our approach effectively captures linguistic differences, facilitating direct translation between the two languages. …”
Article -
5
Time Series Classification of Raw Voice Waveforms for Parkinson's Disease Detection Using Generative Adversarial Network-Driven Data Augmentation
Published 2025-01-01 “…This article also implements a data augmentation solution. Big Vocoder Slicing Adversarial Network (BigVSAN) is used to generate synthetic voice data that mimics the characteristics of real patients and healthy subjects. …”
Article -
6
GOLF: A Singing Voice Synthesiser with Glottal Flow Wavetables and LPC Filters
Published 2024-12-01 “…We show it is competitive with state-of-the-art singing voice vocoders, requiring fewer synthesis parameters and less memory to train, and runs an order of magnitude faster for inference. …”
Article -
7
End-to-End Speech Synthesis for Tibetan Multidialect
Published 2021-01-01 “…Thirdly, two dialect-specific WaveNet vocoders are combined with the feature prediction network, which synthesizes the Mel spectrum of Lhasa-Ü-Tsang and Amdo pastoral dialect into time-domain waveform, respectively. …”
Article