Synthetic speech detection method using texture feature based on circumferential local ternary pattern
To further improve the accuracy of synthetic speech detection, a synthetic speech detection method using texture features based on the circumferential local ternary pattern (CLTP) was proposed. The method extracted the texture information from the speech spectrogram using the CLTP and applied it...
Main Authors: | Honghui JIN, Zhihua JIAN, Man YANG, Chao WU |
---|---|
Format: | Article |
Language: | zho |
Published: | Beijing Xintong Media Co., Ltd, 2023-06-01 |
Series: | Dianxin kexue |
Subjects: | speaker verification; synthetic speech detection; CLTP; deep residual network |
Online Access: | http://www.telecomsci.com/zh/article/doi/10.11959/j.issn.1000-0801.2023121/ |
_version_ | 1841530832564518912 |
---|---|
author | Honghui JIN, Zhihua JIAN, Man YANG, Chao WU |
collection | DOAJ |
description | To further improve the accuracy of synthetic speech detection, a detection method using texture features based on the circumferential local ternary pattern (CLTP) is proposed. The method extracts texture information from the speech spectrogram using the CLTP and uses it as the feature representation of speech. A deep residual network serves as the back-end classifier that decides whether the speech is genuine or spoofed. Experimental results on the ASVspoof 2019 dataset show that the proposed method reduces the equal error rate (EER) by 54.29% and 2.15% compared with the traditional constant Q cepstral coefficient (CQCC) and linear predictive cepstral coefficient (LPCC) features, respectively, and by 17.14% compared with local ternary pattern (LTP) texture features. The CLTP takes into account both the differences between the central pixel and the peripheral pixels of a neighborhood and the differences among the peripheral pixels themselves. It can therefore capture more texture information from the speech spectrogram and improve the accuracy of synthetic speech detection. |
format | Article |
id | doaj-art-71259d0f51754d50be82dd3c01317747 |
institution | Kabale University |
issn | 1000-0801 |
language | zho |
publishDate | 2023-06-01 |
publisher | Beijing Xintong Media Co., Ltd |
record_format | Article |
series | Dianxin kexue |
title | Synthetic speech detection method using texture feature based on circumferential local ternary pattern |
topic | speaker verification; synthetic speech detection; CLTP; deep residual network |
url | http://www.telecomsci.com/zh/article/doi/10.11959/j.issn.1000-0801.2023121/ |
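
The abstract describes the CLTP as combining two kinds of comparison: central pixel versus peripheral pixels, and peripheral pixels versus each other. The following is a minimal Python sketch of that idea, not the authors' implementation. The centre-versus-neighbour part follows the standard local ternary pattern (split into an upper and a lower binary code). The "circumferential" part, which compares each neighbour with the next one around the ring, is an assumption made for illustration, since this record does not give the paper's exact definition. The names `RING`, `ternary`, `encode`, and `ltp_codes`, the 3x3 neighbourhood, and the threshold `t` are all hypothetical.

```python
import numpy as np

# Offsets (dy, dx) of the 8 neighbours, walked clockwise from the top-left pixel.
RING = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def ternary(diff, t):
    """Map differences to {-1, 0, +1} using tolerance t (standard LTP rule)."""
    return np.where(diff >= t, 1, np.where(diff <= -t, -1, 0))


def encode(states):
    """Split ternary states of shape (8, H, W) into upper and lower 8-bit codes."""
    weights = (1 << np.arange(8)).reshape(8, 1, 1)
    upper = ((states == 1) * weights).sum(axis=0)   # bits set where state is +1
    lower = ((states == -1) * weights).sum(axis=0)  # bits set where state is -1
    return upper, lower


def ltp_codes(img, t):
    """Return ((radial_upper, radial_lower), (circ_upper, circ_lower)).

    radial: each neighbour compared with the centre pixel (standard LTP).
    circ:   each neighbour compared with the next neighbour on the ring
            (ASSUMED reading of "circumferential"; not taken from the paper).
    Border pixels are skipped, so outputs have shape (H - 2, W - 2).
    """
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    center = img[1:h - 1, 1:w - 1]
    ring = np.stack([img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx] for dy, dx in RING])
    radial = ternary(ring - center, t)
    circ = ternary(np.roll(ring, -1, axis=0) - ring, t)
    return encode(radial), encode(circ)


# Toy 3x3 "spectrogram" patch with a single interior pixel.
patch = np.array([[9, 5, 1],
                  [5, 5, 5],
                  [1, 5, 9]])
(radial_up, radial_lo), (circ_up, circ_lo) = ltp_codes(patch, t=2)
print(radial_up[0, 0], radial_lo[0, 0], circ_up[0, 0], circ_lo[0, 0])
```

In a detection pipeline of the kind the abstract outlines, code maps like these would be computed over the whole spectrogram and passed to the classifier; how the paper pools or arranges them before the residual network is not stated in this record.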