Ability of Human Auditory Perception to Distinguish Human-Imitated Speech

Distinguishing human-imitated speech from genuine speech presents a significant challenge for listeners due to their natural resemblance. Human auditory perception (HAP) has been widely studied to understand its mechanisms, with HAP-based acoustic features and metrics applied in various applications...

Bibliographic Details
Main Authors:	Khalid Zaman, Kai Li, Islam J. A. M. Samiul, Yasufumi Uezu, Shunsuke Kidani, Masashi Unoki
Format:	Article
Language:	English
Published:	IEEE 2025-01-01
Series:	IEEE Access
Subjects:	Human-imitated speech, human auditory perception, timbral features, human listeners
Online Access:	https://ieeexplore.ieee.org/document/10829923/

author	Khalid Zaman, Kai Li, Islam J. A. M. Samiul, Yasufumi Uezu, Shunsuke Kidani, Masashi Unoki
collection	DOAJ
description	Distinguishing human-imitated speech from genuine speech presents a significant challenge for listeners due to their natural resemblance. Human auditory perception (HAP) has been widely studied to understand its mechanisms, with HAP-based acoustic features and metrics applied in various applications to assess sound quality and discriminate sound events. Leveraging these insights, this study specifically aims to evaluate HAP’s effectiveness in differentiating genuine from imitated speech through a systematic subject test. To this end, the study applies HAP to the task of distinguishing genuine from imitated speech, using a specially developed dataset of human-imitated speech, due to the lack of comparable publicly available datasets. A three-phase, human-centered approach was used to evaluate HAP ability, and participants achieved an average accuracy of 70.10% in distinguishing genuine from imitated speech in the final test. Additionally, a feasibility study was conducted using two feature sets for machine classification; among the timbral features, boominess and depth performed best with accuracies of 62% and 60%, respectively, while general features like Mel-spectrograms reached 51%. These results underscore the importance of auditory-related features in effectively detecting imitated speech.
id	doaj-art-f14ffc37130f4927b96bd918c2a33008
institution	Kabale University
issn	2169-3536
spelling	doaj-art-f14ffc37130f4927b96bd918c2a33008; 2025-01-14T00:02:27Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2025-01-01; vol. 13, pp. 6225-6236; DOI 10.1109/ACCESS.2025.3526631; document 10829923
	Title: Ability of Human Auditory Perception to Distinguish Human-Imitated Speech
	Authors: Khalid Zaman (https://orcid.org/0009-0004-0809-7537), Kai Li, Islam J. A. M. Samiul, Yasufumi Uezu (https://orcid.org/0009-0006-0273-5782), Shunsuke Kidani (https://orcid.org/0000-0001-6491-9540), Masashi Unoki (https://orcid.org/0000-0002-6605-2052)
	Affiliation (all six authors): Graduate School of Advanced Science and Technology, Japan Advanced Institute of Science and Technology, Nomi, Ishikawa, Japan

Similar Items