NeuroVoz: a Castillian Spanish corpus of parkinsonian speech

Abstract The screening of Parkinson’s Disease (PD) through speech is hindered by a notable lack of publicly available datasets in different languages. This fact limits the reproducibility and further exploration of existing research. To address this gap, this manuscript presents the NeuroVoz corpus...

Bibliographic Details
Main Authors: Janaína Mendes-Laureano, Jorge A. Gómez-García, Alejandro Guerrero-López, Elisa Luque-Buzo, Julián D. Arias-Londoño, Francisco J. Grandas-Pérez, Juan I. Godino-Llorente
Format: Article
Language: English
Published: Nature Portfolio 2024-12-01
Series: Scientific Data
Online Access: https://doi.org/10.1038/s41597-024-04186-z
collection DOAJ
description Abstract The screening of Parkinson’s Disease (PD) through speech is hindered by a notable lack of publicly available datasets in different languages. This fact limits the reproducibility and further exploration of existing research. To address this gap, this manuscript presents the NeuroVoz corpus consisting of 112 native Castilian-Spanish speakers, including 58 healthy controls and 54 individuals with PD, all recorded in ON state. The corpus showcases a diverse array of speech tasks: sustained vowels; diadochokinetic tests; 16 Listen-and-Repeat utterances; and spontaneous monologues. The dataset is also complemented with subjective assessments of voice quality performed by an expert according to the GRBAS scale (Grade/Roughness/Breathiness/Asthenia/Strain), as well as annotations with a thorough examination of phonation quality, intensity, speed, resonance, intelligibility, and prosody. The corpus offers a substantial resource for the exploration of the impact of PD on speech. This data set has already supported several studies, achieving a benchmark accuracy of 89% for the screening of PD. Despite these advances, the broader challenge of conducting a language-agnostic, cross-corpora analysis of Parkinsonian speech patterns remains open.
id doaj-art-899a21a5d6b64d49beb59b36bd2b8bc1
institution Kabale University
issn 2052-4463
spelling doaj-art-899a21a5d6b64d49beb59b36bd2b8bc1; 2024-12-22T12:14:39Z; eng
Nature Portfolio. Scientific Data (ISSN 2052-4463), 2024-12-01, vol. 11, no. 1, pp. 1-14. https://doi.org/10.1038/s41597-024-04186-z
NeuroVoz: a Castillian Spanish corpus of parkinsonian speech
Author affiliations:
Janaína Mendes-Laureano (0): Escuela Técnica Superior de Ingenieros de Telecomunicación, Universidad Politécnica de Madrid
Jorge A. Gómez-García (1): Escuela Técnica Superior de Ingenieros de Telecomunicación, Universidad Politécnica de Madrid
Alejandro Guerrero-López (2): Escuela Técnica Superior de Ingenieros de Telecomunicación, Universidad Politécnica de Madrid
Elisa Luque-Buzo (3): Department of Neurology, Hospital General Universitario Gregorio Marañón
Julián D. Arias-Londoño (4): Escuela Técnica Superior de Ingenieros de Telecomunicación, Universidad Politécnica de Madrid
Francisco J. Grandas-Pérez (5): Department of Neurology, Hospital General Universitario Gregorio Marañón
Juan I. Godino-Llorente (6): Escuela Técnica Superior de Ingenieros de Telecomunicación, Universidad Politécnica de Madrid