Path Planning of Mobile Robot in Dynamic Obstacle Avoidance Environment Based on Deep Reinforcement Learning

Bibliographic Details
Main Authors: Qingfeng Zhang, Wenpeng Ma, Qingchun Zheng, Xiaofan Zhai, Wenqian Zhang, Tianchang Zhang, Shuo Wang
Format: Article
Language: English
Published: IEEE 2024-01-01
Series: IEEE Access
Subjects: Intrinsic curiosity module; LSTM network; mobile robot; path planning; reinforcement learning
Online Access: https://ieeexplore.ieee.org/document/10769446/
author Qingfeng Zhang
Wenpeng Ma
Qingchun Zheng
Xiaofan Zhai
Wenqian Zhang
Tianchang Zhang
Shuo Wang
author_sort Qingfeng Zhang
collection DOAJ
description In this study, to address the issues faced by mobile robots in complex environments, namely sparse rewards caused by limited effective experience, slow learning efficiency in the early stages of training, and poor obstacle avoidance in environments with dynamic obstacles, the authors proposed a new path planning algorithm for mobile robots that introduces an Intrinsic Curiosity Module (ICM) and Long Short-Term Memory (LSTM) into the Proximal Policy Optimization (PPO) algorithm. The ICM provides intrinsic rewards in addition to external rewards, accelerating convergence in the early stages of training, and the Actor-Critic network is optimized with an LSTM-based neural network to improve the avoidance of dynamic obstacles. Simulation experiments were conducted in Gazebo in scenarios featuring both static and dynamic obstacles, with the TurtleBot3 mobile robot used for experimental verification. The experiments demonstrate that, compared to traditional algorithms, the proposed algorithm converges significantly faster in environments with sparse rewards and the robot finds more target points within a single episode, indicating more effective path planning; it also avoids obstacles in various states. Finally, the effectiveness of the algorithm was validated with a physical TurtleBot3 robot in real-world scenarios. Results show that, compared to Deep Deterministic Policy Gradient (DDPG), PPO, LSTM-PPO, and ICM-PPO, the path planning success rate of the proposed algorithm increases by 9%, 7.2%, 4%, and 4.8%, respectively, in the most complex simulation environment, and by 20%, 16%, 10%, and 12% in the physical environment.
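The description sketches two mechanisms: the ICM turns a forward model's prediction error on next-state features into an intrinsic reward that is added to the environment's external reward, and the PPO Actor-Critic network gains an LSTM layer so the policy carries memory of recent observations of moving obstacles. The following Python/PyTorch sketch is a minimal illustration of those two ideas, not the authors' implementation; the network sizes, the reward weight eta, and the 24-dimensional state (e.g., a laser scan) are all illustrative assumptions.

import torch
import torch.nn as nn

class ICM(nn.Module):
    """Minimal Intrinsic Curiosity Module: curiosity = forward-model error."""
    def __init__(self, state_dim, action_dim, feat_dim=32, eta=0.1):
        super().__init__()
        self.eta = eta  # assumed weight scaling the intrinsic reward
        self.encode = nn.Sequential(nn.Linear(state_dim, feat_dim), nn.ReLU())
        # Forward model: predict phi(s') from phi(s) and the action taken.
        self.forward_model = nn.Linear(feat_dim + action_dim, feat_dim)
        # Inverse model: predicts the action from phi(s) and phi(s'); during
        # training it shapes the encoder, but it plays no role in the reward.
        self.inverse_model = nn.Linear(2 * feat_dim, action_dim)

    def intrinsic_reward(self, s, a, s_next):
        phi, phi_next = self.encode(s), self.encode(s_next)
        pred = self.forward_model(torch.cat([phi, a], dim=-1))
        # Poorly predicted (novel) transitions earn a larger curiosity bonus.
        return self.eta * 0.5 * (pred - phi_next).pow(2).sum(dim=-1)

class LSTMActorCritic(nn.Module):
    """PPO actor-critic whose shared trunk is an LSTM over observations."""
    def __init__(self, state_dim, action_dim, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(state_dim, hidden, batch_first=True)
        self.actor = nn.Linear(hidden, action_dim)  # action logits / means
        self.critic = nn.Linear(hidden, 1)          # state-value estimate

    def forward(self, obs_seq, hc=None):
        out, hc = self.lstm(obs_seq, hc)  # hc carries memory between calls
        return self.actor(out), self.critic(out), hc

# Reward shaping in the style the abstract describes: the total reward fed
# to PPO is the external reward plus the ICM's curiosity bonus.
icm = ICM(state_dim=24, action_dim=2)
net = LSTMActorCritic(state_dim=24, action_dim=2)
s, a, s2 = torch.randn(1, 24), torch.randn(1, 2), torch.randn(1, 24)
r_total = torch.tensor([1.0]) + icm.intrinsic_reward(s, a, s2)
logits, value, hc = net(s.unsqueeze(1))  # add a sequence dim for the LSTM

Keeping the hidden state hc across environment steps is what lets the recurrent policy react to obstacle motion that a single observation cannot reveal.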
format Article
id doaj-art-512df81fc2624bbeb59d1936dc10bbd8
institution Kabale University
issn 2169-3536
language English
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-512df81fc2624bbeb59d1936dc10bbd8
2024-12-21T00:01:00Z
eng
IEEE
IEEE Access
2169-3536
2024-01-01
Volume 12, pages 189136-189152
DOI: 10.1109/ACCESS.2024.3507016
Document ID: 10769446
Path Planning of Mobile Robot in Dynamic Obstacle Avoidance Environment Based on Deep Reinforcement Learning
Qingfeng Zhang (https://orcid.org/0009-0004-6895-5353)
Wenpeng Ma (https://orcid.org/0000-0002-9714-2544)
Qingchun Zheng
Xiaofan Zhai
Wenqian Zhang
Tianchang Zhang
Shuo Wang
Affiliation (all authors): Tianjin Key Laboratory for Advanced Mechatronic System Design and Intelligent Control, School of Mechanical Engineering, Tianjin University of Technology, Tianjin, China
https://ieeexplore.ieee.org/document/10769446/
Intrinsic curiosity module; LSTM network; mobile robot; path planning; reinforcement learning
title Path Planning of Mobile Robot in Dynamic Obstacle Avoidance Environment Based on Deep Reinforcement Learning
topic Intrinsic curiosity module
LSTM network
mobile robot
path planning
reinforcement learning
url https://ieeexplore.ieee.org/document/10769446/