Speaker verification method based on cross-domain attentive feature fusion

To address the lack of structural information among speech signal samples in the front-end acoustic features of speaker verification systems, a speaker verification method based on cross-domain attentive feature fusion is proposed. First, a feature extraction method based on the graph...

Bibliographic Details
Main Authors: Zhen YANG, Tianlang WANG, Haiyan GUO, Tingting WANG
Format: Article
Language: Chinese (zho)
Published: Editorial Department of Journal on Communications 2023-08-01
Series: Tongxin xuebao
Subjects:
Online Access: http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2023142/
collection DOAJ
description To address the lack of structural information among speech signal samples in the front-end acoustic features of speaker verification systems, a speaker verification method based on cross-domain attentive feature fusion is proposed. First, a feature extraction method based on graph signal processing (GSP) is proposed to extract the structural information of speech signals: each sample point in a speech signal frame is regarded as a graph node to construct the speech graph signal, and the graph frequency information of the speech signal is extracted through the graph Fourier transform and filter banks. Then, an attentive feature fusion network combining a residual neural network and a squeeze-and-excitation block is proposed to fuse the features in the traditional time-frequency domain with those in the graph frequency domain, improving speaker verification performance. Finally, experiments are carried out on the VoxCeleb, SITW, and CN-Celeb datasets. The experimental results show that the proposed method outperforms the baseline ECAPA-TDNN model in terms of equal error rate (EER) and minimum detection cost function (min-DCF).
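The graph-domain feature extraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not say how the speech graph is built or how the filter banks are shaped, so the unweighted path graph, the equal-width rectangular bands, and every function name below are assumptions chosen only to make the idea concrete.

```python
import numpy as np

def path_graph_laplacian(n):
    """Laplacian L = D - A of an unweighted path graph on n nodes.
    The path topology is an assumption; the record does not specify the graph."""
    adjacency = np.zeros((n, n))
    idx = np.arange(n - 1)
    adjacency[idx, idx + 1] = 1.0   # connect each sample to its successor
    adjacency[idx + 1, idx] = 1.0   # keep the graph undirected
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency

def graph_fourier_transform(frame, laplacian):
    """Project a frame onto the Laplacian eigenvectors (graph Fourier basis)."""
    # eigh returns eigenvalues in ascending order, i.e. low to high graph frequency
    eigenvalues, eigenvectors = np.linalg.eigh(laplacian)
    spectrum = eigenvectors.T @ frame
    return eigenvalues, eigenvectors, spectrum

def band_log_energies(spectrum, num_bands):
    """Log energy in equal-width graph-frequency bands (hypothetical filter bank)."""
    bands = np.array_split(spectrum ** 2, num_bands)
    return np.log(np.array([band.sum() for band in bands]) + 1e-10)

# Stand-in for one speech frame: 64 random samples, one graph node per sample.
rng = np.random.default_rng(0)
frame = rng.standard_normal(64)

laplacian = path_graph_laplacian(frame.size)
eigenvalues, eigenvectors, spectrum = graph_fourier_transform(frame, laplacian)
features = band_log_energies(spectrum, num_bands=8)

# The inverse transform recovers the frame because the basis is orthonormal.
reconstructed = eigenvectors @ spectrum
```

In this sketch the eight band energies play the role of a graph-frequency feature vector for one frame; a fusion network such as the one the abstract describes would combine features of this kind with conventional time-frequency features.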
id doaj-art-aec70d9e9e824d62a8eda2b4c2da65d6
institution Kabale University
issn 1000-436X
topic speaker verification
graph signal processing
attentive feature fusion