Large-scale post-disaster user distributed coverage optimization based on multi-agent reinforcement learning

To quickly restore emergency communication services for large numbers of post-disaster users, a distributed intellicise coverage optimization architecture based on multi-agent reinforcement learning (RL) is proposed. It addresses the significant differences and dynamics of communication...


Bibliographic Details
Main Authors: Wenjun XU, Silei WU, Fengyu WANG, Lan LIN, Guojun LI, Zhi ZHANG
Format: Article
Language: zho
Published: Editorial Department of Journal on Communications 2022-08-01
Series: Tongxin xuebao
Subjects:
Online Access: http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2022131/
collection DOAJ
description To quickly restore emergency communication services for large numbers of post-disaster users, a distributed intellicise coverage optimization architecture based on multi-agent reinforcement learning (RL) is proposed. The architecture addresses two problems: the significant differences and dynamics in communication services caused by a large number of access users, and the difficulty of scaling centralized algorithms. Specifically, in the network characterization layer, a distributed k-sums clustering algorithm that accounts for service differences among users is designed, which allows each unmanned aerial vehicle base station (UAV-BS) to adjust its local networking natively and simply and to obtain the cluster-center states used by multi-agent RL. In the trajectory control layer, a multi-agent soft actor-critic (MASAC) algorithm with a distributed-training-distributed-execution structure is designed so that each UAV-BS controls its trajectory as an intelligent node. Furthermore, ensemble learning and curriculum learning are integrated to improve the stability and convergence speed of training. Simulation results show that the proposed distributed k-sums algorithm outperforms k-means in average load efficiency and clustering balance, and that the MASAC-based trajectory control algorithm effectively reduces communication interruptions and improves spectrum efficiency, outperforming existing RL algorithms.
id doaj-art-eeac286b09364340911754262dcfd08a
institution Kabale University
issn 1000-436X
topic emergency communication
coverage optimization
multi-agent reinforcement learning
distributed training