DeCGAN: Speech Enhancement Algorithm for Air Traffic Control
Air traffic control (ATC) communication is susceptible to speech noise interference, which undermines the quality of civil aviation speech. To resolve this problem, we propose a speech enhancement model, termed DeCGAN, based on the DeConformer generative adversarial network. The model’s generator, t...
| Main Authors: | Haijun Liang, Yimin He, Hanwen Chang, Jianguo Kong |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-04-01 |
| Series: | Algorithms |
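| Subjects: | speech enhancement; air traffic control; DeConformer; deep learning; generative adversarial network (GAN) |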
| Online Access: | https://www.mdpi.com/1999-4893/18/5/245 |

| _version_ | 1849327745036713984 |
|---|---|
| author | Haijun Liang; Yimin He; Hanwen Chang; Jianguo Kong |
| author_facet | Haijun Liang; Yimin He; Hanwen Chang; Jianguo Kong |
| author_sort | Haijun Liang |
| collection | DOAJ |
| description | Air traffic control (ATC) communication is susceptible to noise interference, which degrades the quality of civil aviation speech. To address this problem, we propose a speech enhancement model, termed DeCGAN, based on the DeConformer generative adversarial network. The model’s generator, the DeConformer module, combines a time-frequency channel attention (TFC-SA) module and a deformable convolution-based feedforward neural network (DeConv-FFN), effectively capturing both long-range dependencies and local features of speech signals. The outputs of two branches, the mask decoder and the complex decoder, were combined to produce an enhanced speech signal. An evaluation metric discriminator was then used to derive speech quality evaluation scores, and adversarial training was applied to generate higher-quality speech. Experiments compared DeCGAN with other speech enhancement models on the ATC dataset. The results show that the proposed model is highly competitive with existing models: DeCGAN achieved a perceptual evaluation of speech quality (PESQ) score of 3.31 and a short-time objective intelligibility (STOI) value of 0.96. |
| format | Article |
| id | doaj-art-8b0dd578086a4a3399fbd0a6d033bbb4 |
| institution | Kabale University |
| issn | 1999-4893 |
| language | English |
| publishDate | 2025-04-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Algorithms |
| spelling | doaj-art-8b0dd578086a4a3399fbd0a6d033bbb4; 2025-08-20T03:47:48Z; eng; MDPI AG; Algorithms; ISSN 1999-4893; 2025-04-01; vol. 18, no. 5, art. 245; DOI 10.3390/a18050245; DeCGAN: Speech Enhancement Algorithm for Air Traffic Control; Haijun Liang, Yimin He, Hanwen Chang, Jianguo Kong (all: Key Laboratory of Flight Techniques and Flight Safety, Civil Aviation Flight University of China, Jianyang 641400, China); https://www.mdpi.com/1999-4893/18/5/245; speech enhancement; air traffic control; DeConformer; deep learning; generative adversarial network (GAN) |
| title | DeCGAN: Speech Enhancement Algorithm for Air Traffic Control |
| topic | speech enhancement; air traffic control; DeConformer; deep learning; generative adversarial network (GAN) |
| url | https://www.mdpi.com/1999-4893/18/5/245 |
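
The description states that the outputs of the mask decoder and the complex decoder are combined into the enhanced signal, but this record does not say how. The sketch below shows one common way such a two-branch fusion can be done, and it is an assumption rather than the authors' confirmed method: the mask scales the noisy magnitude, the noisy phase is reused, and the complex branch adds a correction per time-frequency bin. The function name `fuse_decoder_outputs` and its parameters are hypothetical.

```python
import cmath


def fuse_decoder_outputs(noisy_spec, mask, complex_residual):
    """Fuse two decoder branches for a flat list of time-frequency bins.

    Assumed (not from the record) fusion rule:
        enhanced = mask * |noisy| * exp(j * phase(noisy)) + complex_residual

    noisy_spec:       complex STFT bins of the noisy signal
    mask:             non-negative floats, hypothetical mask-decoder output
    complex_residual: complex values, hypothetical complex-decoder output
    """
    if not (len(noisy_spec) == len(mask) == len(complex_residual)):
        raise ValueError("all inputs must have the same number of bins")

    enhanced = []
    for y, m, r in zip(noisy_spec, mask, complex_residual):
        magnitude, phase = cmath.polar(y)           # split bin into |y| and angle
        masked = cmath.rect(m * magnitude, phase)   # scale magnitude, keep noisy phase
        enhanced.append(masked + r)                 # add the complex-branch correction
    return enhanced


if __name__ == "__main__":
    noisy = [3 + 4j, 1 + 0j, 0 - 2j]
    mask = [0.5, 1.0, 0.0]
    residual = [0j, 0.1 + 0.1j, 0.5 + 0j]
    for value in fuse_decoder_outputs(noisy, mask, residual):
        print(value)
```

In a real model the inputs would be tensors over time and frequency and the result would be passed through an inverse STFT. The per-bin arithmetic is kept in plain Python here only to make the fusion rule easy to inspect.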