Large-scale post-disaster user distributed coverage optimization based on multi-agent reinforcement learning
In order to quickly restore emergency communication services for large-scale post-disaster users, a distributed intellicise coverage optimization architecture based on multi-agent reinforcement learning (RL) was proposed, which could address the significant differences and dynamics of communication...
Main Authors: | Wenjun XU, Silei WU, Fengyu WANG, Lan LIN, Guojun LI, Zhi ZHANG |
---|---|
Format: | Article |
Language: | zho |
Published: | Editorial Department of Journal on Communications, 2022-08-01 |
Series: | Tongxin xuebao |
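Subjects: | emergency communication; coverage optimization; multi-agent reinforcement learning; distributed training |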
Online Access: | http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2022131/ |
_version_ | 1841540001946402816 |
---|---|
author | Wenjun XU Silei WU Fengyu WANG Lan LIN Guojun LI Zhi ZHANG |
author_facet | Wenjun XU Silei WU Fengyu WANG Lan LIN Guojun LI Zhi ZHANG |
author_sort | Wenjun XU |
collection | DOAJ |
description | To quickly restore emergency communication services for large-scale post-disaster users, a distributed intellicise coverage optimization architecture based on multi-agent reinforcement learning (RL) was proposed. It addresses the significant differences and dynamics of communication services caused by a large number of access users, as well as the difficulty of expansion caused by centralized algorithms. Specifically, a distributed k-sums clustering algorithm that considers the service differences of users was designed in the network characterization layer, which allows each unmanned aerial vehicle base station (UAV-BS) to adjust the local networking natively and simply and to obtain the states of the cluster centers for multi-agent RL. In the trajectory control layer, multi-agent soft actor-critic (MASAC) with a distributed-training-distributed-execution structure was designed so that UAV-BSs control their trajectories as intelligent nodes. Furthermore, ensemble learning and curriculum learning were integrated to improve the stability and convergence speed of the training process. The simulation results show that the proposed distributed k-sums algorithm is superior to k-means in terms of average load efficiency and clustering balance, and that the MASAC-based trajectory control algorithm effectively reduces communication interruptions and improves spectrum efficiency, outperforming the existing RL algorithms. |
format | Article |
id | doaj-art-eeac286b09364340911754262dcfd08a |
institution | Kabale University |
issn | 1000-436X |
language | zho |
publishDate | 2022-08-01 |
publisher | Editorial Department of Journal on Communications |
record_format | Article |
series | Tongxin xuebao |
spelling | doaj-art-eeac286b09364340911754262dcfd08a 2025-01-14T06:28:53Z zho Editorial Department of Journal on Communications Tongxin xuebao 1000-436X 2022-08-01 4311659392121 Large-scale post-disaster user distributed coverage optimization based on multi-agent reinforcement learning Wenjun XU Silei WU Fengyu WANG Lan LIN Guojun LI Zhi ZHANG To quickly restore emergency communication services for large-scale post-disaster users, a distributed intellicise coverage optimization architecture based on multi-agent reinforcement learning (RL) was proposed. It addresses the significant differences and dynamics of communication services caused by a large number of access users, as well as the difficulty of expansion caused by centralized algorithms. Specifically, a distributed k-sums clustering algorithm that considers the service differences of users was designed in the network characterization layer, which allows each unmanned aerial vehicle base station (UAV-BS) to adjust the local networking natively and simply and to obtain the states of the cluster centers for multi-agent RL. In the trajectory control layer, multi-agent soft actor-critic (MASAC) with a distributed-training-distributed-execution structure was designed so that UAV-BSs control their trajectories as intelligent nodes. Furthermore, ensemble learning and curriculum learning were integrated to improve the stability and convergence speed of the training process. The simulation results show that the proposed distributed k-sums algorithm is superior to k-means in terms of average load efficiency and clustering balance, and that the MASAC-based trajectory control algorithm effectively reduces communication interruptions and improves spectrum efficiency, outperforming the existing RL algorithms. http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2022131/ emergency communication coverage optimization multi-agent reinforcement learning distributed training |
spellingShingle | Wenjun XU Silei WU Fengyu WANG Lan LIN Guojun LI Zhi ZHANG Large-scale post-disaster user distributed coverage optimization based on multi-agent reinforcement learning Tongxin xuebao emergency communication coverage optimization multi-agent reinforcement learning distributed training |
title | Large-scale post-disaster user distributed coverage optimization based on multi-agent reinforcement learning |
title_full | Large-scale post-disaster user distributed coverage optimization based on multi-agent reinforcement learning |
title_fullStr | Large-scale post-disaster user distributed coverage optimization based on multi-agent reinforcement learning |
title_full_unstemmed | Large-scale post-disaster user distributed coverage optimization based on multi-agent reinforcement learning |
title_short | Large-scale post-disaster user distributed coverage optimization based on multi-agent reinforcement learning |
title_sort | large scale post disaster user distributed coverage optimization based on multi agent reinforcement learning |
topic | emergency communication; coverage optimization; multi-agent reinforcement learning; distributed training |
url | http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2022131/ |
work_keys_str_mv | AT wenjunxu largescalepostdisasteruserdistributedcoverageoptimizationbasedonmultiagentreinforcementlearning AT sileiwu largescalepostdisasteruserdistributedcoverageoptimizationbasedonmultiagentreinforcementlearning AT fengyuwang largescalepostdisasteruserdistributedcoverageoptimizationbasedonmultiagentreinforcementlearning AT lanlin largescalepostdisasteruserdistributedcoverageoptimizationbasedonmultiagentreinforcementlearning AT guojunli largescalepostdisasteruserdistributedcoverageoptimizationbasedonmultiagentreinforcementlearning AT zhizhang largescalepostdisasteruserdistributedcoverageoptimizationbasedonmultiagentreinforcementlearning |
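
The description field compares the proposed distributed k-sums algorithm against k-means on average load efficiency and clustering balance. The record does not define k-sums or either metric, so the sketch below only illustrates the k-means baseline and one possible way to quantify balance. It is not the authors' method. The names `kmeans`, `cluster_loads`, and `load_imbalance`, the per-user `demands` values, and the use of the coefficient of variation as a balance measure are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def assign(points: Sequence[Point], centers: Sequence[Point]) -> List[int]:
    """Assign each user position to the index of its nearest center."""
    return [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points]


def kmeans(points: Sequence[Point], init_centers: Sequence[Point],
           iters: int = 20) -> Tuple[List[Point], List[int]]:
    """Plain Lloyd's k-means on 2-D points, started from the given centers."""
    centers = list(init_centers)
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                # Move the center to the mean position of its users.
                new_centers.append((sum(x for x, _ in members) / len(members),
                                    sum(y for _, y in members) / len(members)))
            else:
                # An empty cluster keeps its previous center.
                new_centers.append(centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    # Recompute labels so they always match the returned centers.
    return centers, assign(points, centers)


def cluster_loads(labels: Sequence[int], demands: Sequence[float], k: int) -> List[float]:
    """Sum the (hypothetical) service demand of the users in each cluster."""
    loads = [0.0] * k
    for lab, d in zip(labels, demands):
        loads[lab] += d
    return loads


def load_imbalance(loads: Sequence[float]) -> float:
    """Assumed balance measure: coefficient of variation of cluster loads (0 = balanced)."""
    mean = sum(loads) / len(loads)
    if mean == 0:
        return 0.0
    var = sum((x - mean) ** 2 for x in loads) / len(loads)
    return math.sqrt(var) / mean


# Two spatial groups of users with very different service demands.
users = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
demands = [1, 1, 1, 5, 5, 5]
centers, labels = kmeans(users, [(0, 0), (10, 10)])
loads = cluster_loads(labels, demands, k=2)
print(labels, loads, round(load_imbalance(loads), 3))
```

Because k-means groups users by position alone, the two clusters here end up with very unequal total demand. That is the kind of imbalance a clustering method that accounts for service differences would aim to reduce.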