DeCGAN: Speech Enhancement Algorithm for Air Traffic Control

Air traffic control (ATC) communication is susceptible to speech noise interference, which undermines the quality of civil aviation speech. To resolve this problem, we propose a speech enhancement model, termed DeCGAN, based on the DeConformer generative adversarial network. The model’s generator, t...

Bibliographic Details
Main Authors: Haijun Liang, Yimin He, Hanwen Chang, Jianguo Kong
Format: Article
Language: English
Published: MDPI AG 2025-04-01
Series: Algorithms
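Subjects: speech enhancement; air traffic control; DeConformer; deep learning; generative adversarial network (GAN)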
Online Access: https://www.mdpi.com/1999-4893/18/5/245
_version_ 1849327745036713984
author Haijun Liang
Yimin He
Hanwen Chang
Jianguo Kong
author_facet Haijun Liang
Yimin He
Hanwen Chang
Jianguo Kong
author_sort Haijun Liang
collection DOAJ
description Air traffic control (ATC) communication is susceptible to noise interference, which degrades the quality of civil aviation speech. To address this problem, we propose a speech enhancement model, termed DeCGAN, based on the DeConformer generative adversarial network. The model’s generator, the DeConformer module, combines a time-frequency channel attention (TFC-SA) module with a deformable convolution-based feedforward neural network (DeConv-FFN) to capture both long-range dependencies and local features of speech signals. The outputs of two branches, the mask decoder and the complex decoder, are combined to produce the enhanced speech signal. An evaluation metric discriminator is then used to derive speech quality scores, and adversarial training is applied to generate higher-quality speech. Experiments on the ATC dataset compared DeCGAN with other speech enhancement models. The proposed model is competitive with existing models, achieving a perceptual evaluation of speech quality (PESQ) score of 3.31 and a short-time objective intelligibility (STOI) value of 0.96.
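The description says the mask decoder and complex decoder outputs are combined, but it does not say how. The sketch below is an assumption for illustration only: it uses a scheme common in two-branch spectrogram enhancers, where the mask scales the noisy magnitude, the noisy phase is reused, and the complex branch contributes an additive residual. The names `combine_bin` and `combine_branches` are hypothetical and do not come from the paper.

```python
import cmath


def combine_bin(noisy_bin, mask, residual):
    """Combine the two branch outputs for one time-frequency bin.

    ASSUMED scheme (not stated in the record): masked magnitude with the
    noisy phase, plus the complex decoder's residual.
    """
    magnitude, phase = cmath.polar(noisy_bin)    # split into |X| and angle(X)
    masked = cmath.rect(mask * magnitude, phase)  # scale magnitude, keep phase
    return masked + residual                      # add the complex-branch term


def combine_branches(noisy_spec, masks, residuals):
    """Apply combine_bin to every bin of a (freq x time) nested list."""
    return [
        [combine_bin(x, m, r) for x, m, r in zip(row_x, row_m, row_r)]
        for row_x, row_m, row_r in zip(noisy_spec, masks, residuals)
    ]


# Toy values standing in for network outputs (hypothetical data).
noisy = [[3 + 4j, 1 - 1j], [0.5j, -2 + 0j]]
masks = [[0.5, 1.0], [0.0, 0.25]]
residuals = [[1j, 0j], [0.1 + 0j, 0j]]
enhanced = combine_branches(noisy, masks, residuals)
```

With a mask of 1 and a zero residual the bin passes through unchanged, which is a quick sanity check on any implementation of this kind of combination.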
format Article
id doaj-art-8b0dd578086a4a3399fbd0a6d033bbb4
institution Kabale University
issn 1999-4893
language English
publishDate 2025-04-01
publisher MDPI AG
record_format Article
series Algorithms
spelling doaj-art-8b0dd578086a4a3399fbd0a6d033bbb4
2025-08-20T03:47:48Z
eng
MDPI AG
Algorithms
1999-4893
2025-04-01
Volume 18, Issue 5, Article 245
10.3390/a18050245
DeCGAN: Speech Enhancement Algorithm for Air Traffic Control
Haijun Liang, Key Laboratory of Flight Techniques and Flight Safety, Civil Aviation Flight University of China, Jianyang 641400, China
Yimin He, Key Laboratory of Flight Techniques and Flight Safety, Civil Aviation Flight University of China, Jianyang 641400, China
Hanwen Chang, Key Laboratory of Flight Techniques and Flight Safety, Civil Aviation Flight University of China, Jianyang 641400, China
Jianguo Kong, Key Laboratory of Flight Techniques and Flight Safety, Civil Aviation Flight University of China, Jianyang 641400, China
https://www.mdpi.com/1999-4893/18/5/245
spellingShingle Haijun Liang
Yimin He
Hanwen Chang
Jianguo Kong
DeCGAN: Speech Enhancement Algorithm for Air Traffic Control
Algorithms
speech enhancement
air traffic control
DeConformer
deep learning
generative adversarial network (GAN)
title DeCGAN: Speech Enhancement Algorithm for Air Traffic Control
title_sort decgan speech enhancement algorithm for air traffic control
topic speech enhancement
air traffic control
DeConformer
deep learning
generative adversarial network (GAN)
url https://www.mdpi.com/1999-4893/18/5/245
work_keys_str_mv AT haijunliang decganspeechenhancementalgorithmforairtrafficcontrol
AT yiminhe decganspeechenhancementalgorithmforairtrafficcontrol
AT hanwenchang decganspeechenhancementalgorithmforairtrafficcontrol
AT jianguokong decganspeechenhancementalgorithmforairtrafficcontrol