Minimum effort adaptation of automatic speech recognition system in air traffic management
Advancements in Automatic Speech Recognition (ASR) technology are exemplified by ubiquitous voice assistants such as Siri and Alexa. Researchers have been exploring the application of ASR to Air Traffic Management (ATM) systems. Initial prototypes used ASR to pre-fill aircraft radar labels and...
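Main Authors: | Mrinmoy Bhattacharjee, Petr Motlicek, Srikanth Madikeri, Hartmut Helmke, Oliver Ohneiser, Matthias Kleinert, Heiko Ehr |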
---|---|
Format: | Article |
Language: | English |
Published: | TU Delft OPEN Publishing, 2025-01-01 |
Series: | European Journal of Transport and Infrastructure Research |
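Subjects: | Speech Recognition; Model Adaptation; Integration of prior knowledge; Customization of model; Rare-word integration |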
Online Access: | https://journals.open.tudelft.nl/ejtir/article/view/7531 |
_version_ | 1841556332499435520 |
---|---|
author | Mrinmoy Bhattacharjee; Petr Motlicek; Srikanth Madikeri; Hartmut Helmke; Oliver Ohneiser; Matthias Kleinert; Heiko Ehr |
collection | DOAJ |
description | Advancements in Automatic Speech Recognition (ASR) technology are exemplified by ubiquitous voice assistants such as Siri and Alexa. Researchers have been exploring the application of ASR to Air Traffic Management (ATM) systems. Initial prototypes used ASR to pre-fill aircraft radar labels and achieved a technology readiness level just before industrialization (TRL6). However, accurately recognizing infrequently used but highly informative domain-specific vocabulary remains an issue. This vocabulary includes waypoint names specific to each airspace region and unique airline designators, e.g., “dexon” or “pobeda”. Traditionally, open-source ASR toolkits or large pre-trained models require substantial domain-specific transcribed speech data to adapt to specialized vocabularies. Typically, however, a “universal” ASR engine capable of reliably recognizing a core dictionary of several hundred frequently used words suffices for ATM applications. The challenge lies in dynamically integrating the additional, less frequently used region-specific words, which are crucial for maintaining clear communication within the ATM environment. This paper proposes a novel approach that facilitates the dynamic integration of these new and specific word entities into the existing universal ASR system. This paves the way for “plug-and-play” customization with minimal expert intervention and eliminates the need for extensive fine-tuning of the universal ASR model. The proposed approach improves the recognition of these region-specific words by a factor of ≈7 (F1-score from 10% to 70%) for all rare words and ≈5 (F1-score from 13% to 64%) for waypoints. |
format | Article |
id | doaj-art-18f68c71e9ea4e43a006847a68ea9084 |
institution | Kabale University |
issn | 1567-7141 |
language | English |
publishDate | 2025-01-01 |
publisher | TU Delft OPEN Publishing |
record_format | Article |
series | European Journal of Transport and Infrastructure Research |
spelling | doaj-art-18f68c71e9ea4e43a006847a68ea9084; 2025-01-07T09:43:21Z; eng; TU Delft OPEN Publishing; European Journal of Transport and Infrastructure Research; 1567-7141; 2025-01-01; 24; 4; 10.59490/ejtir.2024.24.4.7531; Minimum effort adaptation of automatic speech recognition system in air traffic management; Mrinmoy Bhattacharjee (Idiap Research Institute, Switzerland); Petr Motlicek (Idiap Research Institute); Srikanth Madikeri (Idiap Research Institute); Hartmut Helmke (Institute of Flight Guidance, German Aerospace Center (DLR) Braunschweig, Germany); Oliver Ohneiser (Institute of Flight Guidance, German Aerospace Center (DLR) Braunschweig, Germany); Matthias Kleinert (Institute of Flight Guidance, German Aerospace Center (DLR) Braunschweig, Germany); Heiko Ehr (Institute of Flight Guidance, German Aerospace Center (DLR) Braunschweig, Germany); https://journals.open.tudelft.nl/ejtir/article/view/7531; Speech Recognition; Model Adaptation; Integration of prior knowledge; Customization of model; Rare-word integration |
title | Minimum effort adaptation of automatic speech recognition system in air traffic management |
topic | Speech Recognition; Model Adaptation; Integration of prior knowledge; Customization of model; Rare-word integration |
url | https://journals.open.tudelft.nl/ejtir/article/view/7531 |