Dual-timescale hierarchical MADDPG for Multi-UAV cooperative search
Abstract Cooperative exploration conducted by multiple unmanned aerial vehicles (UAVs) facilitates parallelized reconnaissance over expansive territories, thereby optimizing the efficiency of target localization. This study investigates the challenge of coordinated search for sparsely located, initi...
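| Main Authors: | Jiancheng Liu, Siwen Wei, Bo Li, Tuo Wang, Wanlong Qi, Xingye Han, Gang Hou, Ke Li, Yuqing Lin, Dingrui Xue, Kexin Wang |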
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Springer 2025-07-01 |
| Series: | Journal of King Saud University: Computer and Information Sciences |
| Subjects: | Multi-agent reinforcement learning; Markov decision process; Multi-UAV cooperative search |
| Online Access: | https://doi.org/10.1007/s44443-025-00156-6 |
| _version_ | 1849235705560039424 |
|---|---|
| author | Jiancheng Liu Siwen Wei Bo Li Tuo Wang Wanlong Qi Xingye Han Gang Hou Ke Li Yuqing Lin Dingrui Xue Kexin Wang |
| author_facet | Jiancheng Liu Siwen Wei Bo Li Tuo Wang Wanlong Qi Xingye Han Gang Hou Ke Li Yuqing Lin Dingrui Xue Kexin Wang |
| author_sort | Jiancheng Liu |
| collection | DOAJ |
| description | Abstract Cooperative exploration conducted by multiple unmanned aerial vehicles (UAVs) facilitates parallelized reconnaissance over expansive territories, thereby optimizing the efficiency of target localization. This study investigates the challenge of coordinated search for sparsely located, initially undiscovered stationary targets by a fleet of UAVs constrained by limited perceptual capabilities. Effective resolution of this issue is pivotal for attaining rapid situational awareness in expansive, time-sensitive missions such as disaster mitigation and strategic intelligence gathering. Nonetheless, prevailing methodologies for multi-UAV search frequently encounter limitations in concurrently achieving exhaustive spatial coverage and elevated target acquisition efficacy. To overcome these deficiencies, this study introduces a dual-timescale hierarchical reinforcement learning paradigm tailored for collaborative multi-UAV search missions. The proposed Dual-Timescale Hierarchical Multi-Agent Deep Deterministic Policy Gradient (DTH-MADDPG) architecture incorporates a high-level strategic controller and an array of low-level decentralized agents, thereby enabling temporally stratified policy optimization. This framework facilitates a more nuanced equilibrium between macro-scale environmental coverage and micro-scale target identification than monolithic architectures. Empirical evaluations within simulated operational environments reveal that DTH-MADDPG markedly surpasses contemporary benchmark algorithms, demonstrating superior scalability, accelerated convergence rates, and heightened resilience. |
| format | Article |
| id | doaj-art-b2c9f903656c4a429178b4bdc9ff124d |
| institution | Kabale University |
| issn | 1319-1578 2213-1248 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | Springer |
| record_format | Article |
| series | Journal of King Saud University: Computer and Information Sciences |
| spelling | doaj-art-b2c9f903656c4a429178b4bdc9ff124d 2025-08-20T04:02:41Z eng Springer Journal of King Saud University: Computer and Information Sciences 1319-1578 2213-1248 2025-07-01 37 6 1 17 10.1007/s44443-025-00156-6 Dual-timescale hierarchical MADDPG for Multi-UAV cooperative search Jiancheng Liu 0; Siwen Wei 1; Bo Li 2; Tuo Wang 3; Wanlong Qi 4; Xingye Han 5; Gang Hou 6; Ke Li 7; Yuqing Lin 8; Dingrui Xue 9; Kexin Wang 10 Northwest Institute of Mechanical and Electrical Engineering; School of Computer Science and Technology, Xidian University; Northwest Institute of Mechanical and Electrical Engineering; Northwest Institute of Mechanical and Electrical Engineering; Northwest Institute of Mechanical and Electrical Engineering; Northwest Institute of Mechanical and Electrical Engineering; Northwest Institute of Mechanical and Electrical Engineering; Northwest Institute of Mechanical and Electrical Engineering; School of Information Engineering, Chang’an University; School of Information Engineering, Chang’an University; School of Computer Science and Technology, Xidian University Abstract Cooperative exploration conducted by multiple unmanned aerial vehicles (UAVs) facilitates parallelized reconnaissance over expansive territories, thereby optimizing the efficiency of target localization. This study investigates the challenge of coordinated search for sparsely located, initially undiscovered stationary targets by a fleet of UAVs constrained by limited perceptual capabilities. Effective resolution of this issue is pivotal for attaining rapid situational awareness in expansive, time-sensitive missions such as disaster mitigation and strategic intelligence gathering. Nonetheless, prevailing methodologies for multi-UAV search frequently encounter limitations in concurrently achieving exhaustive spatial coverage and elevated target acquisition efficacy. To overcome these deficiencies, this study introduces a dual-timescale hierarchical reinforcement learning paradigm tailored for collaborative multi-UAV search missions. The proposed Dual-Timescale Hierarchical Multi-Agent Deep Deterministic Policy Gradient (DTH-MADDPG) architecture incorporates a high-level strategic controller and an array of low-level decentralized agents, thereby enabling temporally stratified policy optimization. This framework facilitates a more nuanced equilibrium between macro-scale environmental coverage and micro-scale target identification than monolithic architectures. Empirical evaluations within simulated operational environments reveal that DTH-MADDPG markedly surpasses contemporary benchmark algorithms, demonstrating superior scalability, accelerated convergence rates, and heightened resilience. https://doi.org/10.1007/s44443-025-00156-6 Multi-agent reinforcement learning; Markov decision process; Multi-UAV cooperative search |
| spellingShingle | Jiancheng Liu Siwen Wei Bo Li Tuo Wang Wanlong Qi Xingye Han Gang Hou Ke Li Yuqing Lin Dingrui Xue Kexin Wang Dual-timescale hierarchical MADDPG for Multi-UAV cooperative search Journal of King Saud University: Computer and Information Sciences Multi-agent reinforcement learning Markov decision process Multi-UAV cooperative search |
| title | Dual-timescale hierarchical MADDPG for Multi-UAV cooperative search |
| title_sort | dual timescale hierarchical maddpg for multi uav cooperative search |
| topic | Multi-agent reinforcement learning Markov decision process Multi-UAV cooperative search |
| url | https://doi.org/10.1007/s44443-025-00156-6 |