Real-Time Policy Optimization for UAV Swarms Based on Evolution Strategies
Multi-agent decision-making faces many challenges such as non-stationarity and sparse rewards, while the complexity and randomness of the real environment further complicate policy development. This paper addresses the high-dimensional policy optimization problems of unmanned aerial vehicle (UAV) sw...
| Main Authors: | Zeyu Chen, Haiying Liu, Guohua Liu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2024-10-01 |
| Series: | Drones |
| Subjects: | UAV swarm; reinforcement learning; evolution strategies; real-time optimization |
| Online Access: | https://www.mdpi.com/2504-446X/8/11/619 |
| _version_ | 1846153764179804160 |
|---|---|
| author | Zeyu Chen; Haiying Liu; Guohua Liu |
| author_facet | Zeyu Chen; Haiying Liu; Guohua Liu |
| author_sort | Zeyu Chen |
| collection | DOAJ |
| description | Multi-agent decision-making faces many challenges such as non-stationarity and sparse rewards, while the complexity and randomness of the real environment further complicate policy development. This paper addresses the high-dimensional policy optimization problems of unmanned aerial vehicle (UAV) swarms. By modeling the problem scenario as a Markov decision process, a real-time policy optimization algorithm based on evolution strategy (ES) pre-training is proposed. This approach combines decision-time planning with background planning to evaluate and integrate different sets of policy parameters in a temporal context. In the experimental phase, the policy network is trained using both ES and REINFORCE algorithms on a constructed simulation platform. Comparative experiments demonstrate the effectiveness of using ES for policy pre-training. Finally, the proposed real-time policy optimization algorithm further improves the performance of the swarm by approximately 10% in simulations, offering a feasible solution for adversarial games between swarms and extending the research scope of evolutionary algorithms. |
| format | Article |
| id | doaj-art-e9b9796d86014c948079ca2d2a6e624c |
| institution | Kabale University |
| issn | 2504-446X |
| language | English |
| publishDate | 2024-10-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Drones |
| spelling | doaj-art-e9b9796d86014c948079ca2d2a6e624c; 2024-11-26T18:00:35Z; eng; MDPI AG; Drones; ISSN 2504-446X; 2024-10-01; vol. 8, no. 11, art. 619; DOI 10.3390/drones8110619; Real-Time Policy Optimization for UAV Swarms Based on Evolution Strategies; Zeyu Chen (School of Mathematics, Southeast University, Nanjing 211102, China); Haiying Liu (College of Astronautics, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China); Guohua Liu (School of Mathematics, Southeast University, Nanjing 211102, China); https://www.mdpi.com/2504-446X/8/11/619; UAV swarm; reinforcement learning; evolution strategies; real-time optimization |
| title | Real-Time Policy Optimization for UAV Swarms Based on Evolution Strategies |
| topic | UAV swarm; reinforcement learning; evolution strategies; real-time optimization |
| url | https://www.mdpi.com/2504-446X/8/11/619 |
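
The description field names evolution strategies (ES) for policy pre-training, but this record gives no algorithmic detail. As a rough illustration only, the following is a minimal sketch of one common, generic ES update (Gaussian perturbations with mirrored sampling and standardized returns). It is not the authors' implementation: the names `es_step`, `toy_fitness`, and `train`, all hyperparameter values, and the quadratic objective standing in for a swarm-simulation return are assumptions made for this example.

```python
import numpy as np

def es_step(theta, fitness, rng, pop_size=50, sigma=0.1, lr=0.02):
    """One generic ES update (hypothetical helper, not from the paper)."""
    half = pop_size // 2
    eps = rng.standard_normal((half, theta.size))
    eps = np.concatenate([eps, -eps])          # mirrored (antithetic) perturbations
    # Evaluate each perturbed parameter vector; in a real setting this
    # would be an episode return from a simulator.
    returns = np.array([fitness(theta + sigma * e) for e in eps])
    # Standardize returns so the step is insensitive to reward scale.
    adv = (returns - returns.mean()) / (returns.std() + 1e-8)
    grad = eps.T @ adv / (len(eps) * sigma)    # score-function gradient estimate
    return theta + lr * grad                   # ascent on expected fitness

def toy_fitness(theta):
    # Assumed stand-in objective with its maximum at theta = 1.
    return -np.sum((theta - 1.0) ** 2)

def train(steps=200, dim=5, seed=0):
    rng = np.random.default_rng(seed)
    theta = np.zeros(dim)
    for _ in range(steps):
        theta = es_step(theta, toy_fitness, rng)
    return theta

if __name__ == "__main__":
    theta = train()
    print("final fitness:", toy_fitness(theta))
```

ES only needs fitness evaluations of perturbed parameters, not backpropagated gradients, which is one reason it is often considered when rewards are sparse; whether that is the motivation in this article should be checked against the full text at the URL listed in the record.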