An air combat maneuver decision-making approach using coupled reward in deep reinforcement learning
Abstract In the domain of unmanned air combat, achieving efficient autonomous maneuvering decisions presents challenges. Deep Reinforcement Learning (DRL) is one approach to tackling this problem. The final performance of a DRL algorithm is directly affected by the design of the reward funct...
| Main Authors: | Jian Yang, Liangpei Wang, Jiale Han, Changdi Chen, Yinlong Yuan, Zhu Liang Yu, Guoli Yang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Springer, 2025-06-01 |
| Series: | Complex & Intelligent Systems |
| Subjects: | Air combat; Maneuver decision-making; Deep reinforcement learning (DRL); Coupled reward |
| Online Access: | https://doi.org/10.1007/s40747-025-01992-9 |
| _version_ | 1849341656564760576 |
|---|---|
| author | Jian Yang Liangpei Wang Jiale Han Changdi Chen Yinlong Yuan Zhu Liang Yu Guoli Yang |
| author_facet | Jian Yang Liangpei Wang Jiale Han Changdi Chen Yinlong Yuan Zhu Liang Yu Guoli Yang |
| author_sort | Jian Yang |
| collection | DOAJ |
| description | Abstract In the domain of unmanned air combat, achieving efficient autonomous maneuvering decisions presents challenges. Deep Reinforcement Learning (DRL) is one approach to tackling this problem. The final performance of a DRL algorithm is directly affected by the design of its reward functions. However, unreasonable reward weights degrade both model performance and convergence speed. Therefore, a method named Coupled Reward-Deep Reinforcement Learning (CR-DRL) is introduced to address this problem. Specifically, we propose a novel coupled-weight reward function for DRL within the air combat framework. The novel reward function integrates angle and distance so that our DRL maneuver decision model can be trained faster and perform better than models that use conventional reward functions. Additionally, we establish a new competitive training framework designed to enhance the performance of our model against personalized opponents. The experimental results show that our CR-DRL model outperforms the traditional model that uses fixed-weight reward functions in this training framework, with a 6.3% increase in average reward in fixed scenarios and a 22.8% increase in changeable scenarios. Moreover, the performance of our model continually improves as the number of iterations increases, ultimately yielding a certain degree of generalization against similar opponents. Finally, we develop a simulation environment that supports real-time air combat based on Unity3D, called Airfightsim, to demonstrate the performance of the proposed algorithm. |
| format | Article |
| id | doaj-art-84b47cfc62f94e5189799a8c9ab797cf |
| institution | Kabale University |
| issn | 2199-4536 2198-6053 |
| language | English |
| publishDate | 2025-06-01 |
| publisher | Springer |
| record_format | Article |
| series | Complex & Intelligent Systems |
| spelling | doaj-art-84b47cfc62f94e5189799a8c9ab797cf2025-08-20T03:43:34ZengSpringerComplex & Intelligent Systems2199-45362198-60532025-06-0111811710.1007/s40747-025-01992-9An air combat maneuver decision-making approach using coupled reward in deep reinforcement learningJian Yang0Liangpei Wang1Jiale Han2Changdi Chen3Yinlong Yuan4Zhu Liang Yu5Guoli Yang6College of Automation Science and Engineering, South China University of TechnologyCollege of Automation Science and Engineering, South China University of TechnologyCollege of Automation Science and Engineering, South China University of TechnologyCollege of Automation Science and Engineering, South China University of TechnologySchool of Electrical Engineering, Nantong UniversityCollege of Automation Science and Engineering, South China University of TechnologyDepartment of Big Data Intelligence, Advanced Institute of Big DataAbstract In the domain of unmanned air combat, achieving efficient autonomous maneuvering decisions presents challenges. Deep Reinforcement Learning (DRL) is one approach to tackling this problem. The final performance of a DRL algorithm is directly affected by the design of its reward functions. However, unreasonable reward weights degrade both model performance and convergence speed. Therefore, a method named Coupled Reward-Deep Reinforcement Learning (CR-DRL) is introduced to address this problem. Specifically, we propose a novel coupled-weight reward function for DRL within the air combat framework. The novel reward function integrates angle and distance so that our DRL maneuver decision model can be trained faster and perform better than models that use conventional reward functions. Additionally, we establish a new competitive training framework designed to enhance the performance of our model against personalized opponents.
The experimental results show that our CR-DRL model outperforms the traditional model that uses fixed-weight reward functions in this training framework, with a 6.3% increase in average reward in fixed scenarios and a 22.8% increase in changeable scenarios. Moreover, the performance of our model continually improves as the number of iterations increases, ultimately yielding a certain degree of generalization against similar opponents. Finally, we develop a simulation environment that supports real-time air combat based on Unity3D, called Airfightsim, to demonstrate the performance of the proposed algorithm.https://doi.org/10.1007/s40747-025-01992-9Air combatManeuver decision-makingDeep reinforcement learning (DRL)Coupled reward |
| spellingShingle | Jian Yang Liangpei Wang Jiale Han Changdi Chen Yinlong Yuan Zhu Liang Yu Guoli Yang An air combat maneuver decision-making approach using coupled reward in deep reinforcement learning Complex & Intelligent Systems Air combat Maneuver decision-making Deep reinforcement learning (DRL) Coupled reward |
| title | An air combat maneuver decision-making approach using coupled reward in deep reinforcement learning |
| title_full | An air combat maneuver decision-making approach using coupled reward in deep reinforcement learning |
| title_fullStr | An air combat maneuver decision-making approach using coupled reward in deep reinforcement learning |
| title_full_unstemmed | An air combat maneuver decision-making approach using coupled reward in deep reinforcement learning |
| title_short | An air combat maneuver decision-making approach using coupled reward in deep reinforcement learning |
| title_sort | air combat maneuver decision making approach using coupled reward in deep reinforcement learning |
| topic | Air combat Maneuver decision-making Deep reinforcement learning (DRL) Coupled reward |
| url | https://doi.org/10.1007/s40747-025-01992-9 |
| work_keys_str_mv | AT jianyang anaircombatmaneuverdecisionmakingapproachusingcoupledrewardindeepreinforcementlearning AT liangpeiwang anaircombatmaneuverdecisionmakingapproachusingcoupledrewardindeepreinforcementlearning AT jialehan anaircombatmaneuverdecisionmakingapproachusingcoupledrewardindeepreinforcementlearning AT changdichen anaircombatmaneuverdecisionmakingapproachusingcoupledrewardindeepreinforcementlearning AT yinlongyuan anaircombatmaneuverdecisionmakingapproachusingcoupledrewardindeepreinforcementlearning AT zhuliangyu anaircombatmaneuverdecisionmakingapproachusingcoupledrewardindeepreinforcementlearning AT guoliyang anaircombatmaneuverdecisionmakingapproachusingcoupledrewardindeepreinforcementlearning AT jianyang aircombatmaneuverdecisionmakingapproachusingcoupledrewardindeepreinforcementlearning AT liangpeiwang aircombatmaneuverdecisionmakingapproachusingcoupledrewardindeepreinforcementlearning AT jialehan aircombatmaneuverdecisionmakingapproachusingcoupledrewardindeepreinforcementlearning AT changdichen aircombatmaneuverdecisionmakingapproachusingcoupledrewardindeepreinforcementlearning AT yinlongyuan aircombatmaneuverdecisionmakingapproachusingcoupledrewardindeepreinforcementlearning AT zhuliangyu aircombatmaneuverdecisionmakingapproachusingcoupledrewardindeepreinforcementlearning AT guoliyang aircombatmaneuverdecisionmakingapproachusingcoupledrewardindeepreinforcementlearning |
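The abstract above contrasts a coupled-weight reward, which integrates angle and distance, with conventional fixed-weight reward functions. The record does not give the paper's actual formulation, so the sketch below is purely illustrative: the function names, the preferred-range parameters `d_opt` and `sigma`, and the multiplicative coupling are all assumptions chosen to show the general idea, not the authors' method.

```python
import math

def coupled_reward(angle_deg: float, distance_m: float,
                   d_opt: float = 1000.0, sigma: float = 500.0) -> float:
    """Hypothetical coupled reward: the angle and distance terms are
    combined multiplicatively, so the effective weight on the angle
    advantage varies with distance instead of being fixed."""
    # Angle advantage in [0, 1]: 1.0 when pointing straight at the target.
    r_angle = 1.0 - angle_deg / 180.0
    # Distance advantage in (0, 1]: peaks at the assumed preferred range d_opt.
    r_dist = math.exp(-((distance_m - d_opt) ** 2) / (2.0 * sigma ** 2))
    return r_angle * r_dist

def fixed_weight_reward(angle_deg: float, distance_m: float,
                        w1: float = 0.5, w2: float = 0.5) -> float:
    """Fixed-weight baseline: the same two terms summed with constant
    weights w1 and w2, the design the abstract argues against tuning by hand."""
    r_angle = 1.0 - angle_deg / 180.0
    r_dist = math.exp(-((distance_m - 1000.0) ** 2) / (2.0 * 500.0 ** 2))
    return w1 * r_angle + w2 * r_dist
```

Under this sketch, a good angle at a poor distance (or vice versa) yields a small coupled reward, whereas the fixed-weight sum can still report a moderate value; that difference is one plausible reading of why coupling the terms gives the learner a cleaner signal.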