Small-size spectral features for machine learning in voice signal analysis and classification tasks

Bibliographic Details
Main Authors: D. S. Likhachov, M. I. Vashkevich, N. A. Petrovsky, E. S. Azarov
Format: Article
Language: Russian
Published: National Academy of Sciences of Belarus, the United Institute of Informatics Problems 2023-03-01
Series: Informatika
Online Access: https://inf.grid.by/jour/article/view/1234
Description
Summary: Objectives. The aim is to develop a method for calculating small-sized spectral features that increases the efficiency of existing machine learning systems for analyzing and classifying voice signals. Methods. Spectral features are extracted using a generative approach, which involves calculating a discrete Fourier spectrum for a sequence of samples generated using an autoregressive model of the input voice signal. The generated sequence processed by the discrete Fourier transform takes into account the periodicity of the transform and thereby increases the accuracy of spectral estimation of the analyzed signal. Results. A generative method for calculating spectral features intended for use in machine learning systems for the analysis and classification of voice signals is proposed and described. An experimental analysis of the accuracy and stability of the spectrum representation of a test signal with a known spectral composition was carried out using spectral envelopes. The envelopes were calculated using the proposed generative method and using the discrete Fourier transform with different analysis windows (rectangular and Hann). The analysis showed that spectral envelopes obtained using the proposed method represent the spectrum of the test signal more accurately according to the minimum squared error criterion. The effectiveness of voice signal classification with the proposed features was compared with that of features based on mel-frequency cepstral coefficients. A diagnostic system for amyotrophic lateral sclerosis was used as the baseline test system to evaluate the effectiveness of the proposed approach in practice. Conclusion. The experimental results showed a significant increase in classification accuracy when using the proposed approach for calculating features compared with features based on mel-frequency cepstral coefficients.
ISSN:1816-0301
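
Illustrative sketch: The generative idea described in the summary (fit an autoregressive model to a voice frame, generate a sequence from that model, then take its discrete Fourier transform) might look roughly like the Python code below. The summary does not say how the model is estimated or how the sequence is generated, so everything specific here is an assumption: the Yule-Walker (autocorrelation) estimate, the unit-impulse excitation, the model order, the generated length, and the function names ar_coefficients, generate_sequence and generative_spectral_features. It is not the authors' implementation.

```python
import numpy as np


def ar_coefficients(frame, order):
    """Estimate AR coefficients with the autocorrelation (Yule-Walker) method.

    Assumed model: x[n] ~ sum_k a[k] * x[n-1-k]. The estimator is an assumption.
    """
    x = np.asarray(frame, dtype=float)
    x = x - x.mean()
    n = len(x)
    # Biased autocorrelation estimates for lags 0..order.
    r = np.array([np.dot(x[:n - k], x[k:]) for k in range(order + 1)]) / n
    # Toeplitz normal equations R a = r[1:].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R += 1e-9 * r[0] * np.eye(order)  # tiny regularisation for numerical safety
    return np.linalg.solve(R, r[1:])


def generate_sequence(a, length):
    """Generate `length` samples from the AR model driven by a unit impulse."""
    order = len(a)
    y = np.zeros(length)
    for n in range(length):
        past = y[max(0, n - order):n][::-1]  # y[n-1], y[n-2], ...
        excitation = 1.0 if n == 0 else 0.0
        y[n] = excitation + np.dot(a[:len(past)], past)
    return y


def generative_spectral_features(frame, order=12, n_gen=64):
    """Log-magnitude DFT (in dB) of a short sequence generated by the AR model.

    A small n_gen keeps the feature vector small: n_gen // 2 + 1 values.
    """
    a = ar_coefficients(frame, order)
    y = generate_sequence(a, n_gen)
    magnitude = np.abs(np.fft.rfft(y))
    return 20.0 * np.log10(magnitude + 1e-12)


if __name__ == "__main__":
    # Hypothetical test frame: a 1 kHz tone sampled at 8 kHz with light noise.
    rng = np.random.default_rng(0)
    t = np.arange(400) / 8000.0
    frame = np.sin(2 * np.pi * 1000.0 * t) + 0.01 * rng.standard_normal(400)
    features = generative_spectral_features(frame, order=4, n_gen=64)
    print(features.shape, int(np.argmax(features)))
```

In this sketch the length of the feature vector is set by the length of the generated sequence and not by the length of the analyzed frame, which is one plausible reading of "small-size" features. A windowed discrete Fourier transform of the raw frame (rectangular or Hann window) would be the baseline to compare against.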