Deep Reinforcement Learning Algorithm with Long Short-Term Memory Network for Optimizing Unmanned Aerial Vehicle Information Transmission
The optimization of information transmission in unmanned aerial vehicles (UAVs) is essential for enhancing their operational efficiency across various applications. This issue is framed as a mixed-integer nonconvex optimization challenge, which traditional optimization algorithms and reinforcement l...
Main Authors: | Yufei He, Ruiqi Hu, Kewei Liang, Yonghong Liu, Zhiyuan Zhou |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG 2024-12-01 |
Series: | Mathematics |
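Subjects: | unmanned aerial vehicle (UAV); deep reinforcement learning (DRL); long short-term memory (LSTM); optimal control; nonconvex optimization |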
Online Access: | https://www.mdpi.com/2227-7390/13/1/46 |
_version_ | 1841549156822286336 |
---|---|
author | Yufei He Ruiqi Hu Kewei Liang Yonghong Liu Zhiyuan Zhou |
author_sort | Yufei He |
collection | DOAJ |
description | The optimization of information transmission in unmanned aerial vehicles (UAVs) is essential for enhancing their operational efficiency across various applications. This issue is framed as a mixed-integer nonconvex optimization challenge, which traditional optimization algorithms and reinforcement learning (RL) methods often struggle to address effectively. In this paper, we propose a novel deep reinforcement learning algorithm that utilizes a hybrid discrete–continuous action space. To address the long-term dependency issues inherent in UAV operations, we incorporate a long short-term memory (LSTM) network. Our approach accounts for the specific flight constraints of fixed-wing UAVs and employs a continuous policy network to facilitate real-time flight path planning. A non-sparse reward function is designed to maximize data collection from internet of things (IoT) devices, thus guiding the UAV to optimize its operational efficiency. Experimental results demonstrate that the proposed algorithm yields near-optimal flight paths and significantly improves data collection capabilities, compared to conventional heuristic methods, achieving an improvement of up to 10.76%. Validation through simulations confirms the effectiveness and practicality of the proposed approach in real-world scenarios. |
format | Article |
id | doaj-art-890fa7c094e442f69a128c6f48ea4737 |
institution | Kabale University |
issn | 2227-7390 |
language | English |
publishDate | 2024-12-01 |
publisher | MDPI AG |
record_format | Article |
series | Mathematics |
spelling | doaj-art-890fa7c094e442f69a128c6f48ea4737; 2025-01-10T13:18:05Z; eng; MDPI AG; Mathematics; 2227-7390; 2024-12-01; 13; 1; 46; 10.3390/math13010046; Deep Reinforcement Learning Algorithm with Long Short-Term Memory Network for Optimizing Unmanned Aerial Vehicle Information Transmission; Yufei He [0]; Ruiqi Hu [1]; Kewei Liang [2]; Yonghong Liu [3]; Zhiyuan Zhou [4]; Polytechnic Institute, Zhejiang University, Hangzhou 310015, China; Department of Applied Mathematics, Hong Kong Polytechnic University, Hong Kong, China; School of Mathematical Sciences, Zhejiang University, Hangzhou 310058, China; School of Mathematical Sciences, Zhejiang University, Hangzhou 310058, China; Applied Mathematics, Beijing Normal University—Hong Kong Baptist University United International College, Zhuhai 519087, China; The optimization of information transmission in unmanned aerial vehicles (UAVs) is essential for enhancing their operational efficiency across various applications. This issue is framed as a mixed-integer nonconvex optimization challenge, which traditional optimization algorithms and reinforcement learning (RL) methods often struggle to address effectively. In this paper, we propose a novel deep reinforcement learning algorithm that utilizes a hybrid discrete–continuous action space. To address the long-term dependency issues inherent in UAV operations, we incorporate a long short-term memory (LSTM) network. Our approach accounts for the specific flight constraints of fixed-wing UAVs and employs a continuous policy network to facilitate real-time flight path planning. A non-sparse reward function is designed to maximize data collection from internet of things (IoT) devices, thus guiding the UAV to optimize its operational efficiency. Experimental results demonstrate that the proposed algorithm yields near-optimal flight paths and significantly improves data collection capabilities, compared to conventional heuristic methods, achieving an improvement of up to 10.76%. Validation through simulations confirms the effectiveness and practicality of the proposed approach in real-world scenarios.; https://www.mdpi.com/2227-7390/13/1/46; unmanned aerial vehicle (UAV); deep reinforcement learning (DRL); long short-term memory (LSTM); optimal control; nonconvex optimization |
title | Deep Reinforcement Learning Algorithm with Long Short-Term Memory Network for Optimizing Unmanned Aerial Vehicle Information Transmission |
topic | unmanned aerial vehicle (UAV); deep reinforcement learning (DRL); long short-term memory (LSTM); optimal control; nonconvex optimization |
url | https://www.mdpi.com/2227-7390/13/1/46 |