Path Planning of Mobile Robot in Dynamic Obstacle Avoidance Environment Based on Deep Reinforcement Learning
In this study, to address the issues faced by mobile robots in complex environments, such as sparse rewards caused by limited effective experience, slow learning efficiency in the early stages of training, and poor obstacle avoidance in the presence of dynamic obstacles, the authors propose a new path planning algorithm that introduces an Intrinsic Curiosity Module (ICM) and Long Short-Term Memory (LSTM) into the Proximal Policy Optimization (PPO) algorithm (an illustrative code sketch of this combination follows the record below; the full abstract appears in the description field).
Saved in:
| Main Authors: | Qingfeng Zhang, Wenpeng Ma, Qingchun Zheng, Xiaofan Zhai, Wenqian Zhang, Tianchang Zhang, Shuo Wang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2024-01-01 |
| Series: | IEEE Access |
| Subjects: | Intrinsic curiosity module; LSTM network; mobile robot; path planning; reinforcement learning |
| Online Access: | https://ieeexplore.ieee.org/document/10769446/ |
| _version_ | 1846113859271655424 |
|---|---|
| author | Qingfeng Zhang; Wenpeng Ma; Qingchun Zheng; Xiaofan Zhai; Wenqian Zhang; Tianchang Zhang; Shuo Wang |
| author_facet | Qingfeng Zhang; Wenpeng Ma; Qingchun Zheng; Xiaofan Zhai; Wenqian Zhang; Tianchang Zhang; Shuo Wang |
| author_sort | Qingfeng Zhang |
| collection | DOAJ |
| description | In this study, to address the issues faced by mobile robots in complex environments, such as sparse rewards caused by limited effective experience, slow learning efficiency in the early stages of training, and poor obstacle avoidance performance in environments with dynamic obstacles, the authors proposed a new path planning algorithm for mobile robots by introducing an Intrinsic Curiosity Module (ICM) and Long Short-Term Memory (LSTM) into the Proximal Policy Optimization (PPO) algorithm. The ICM provided intrinsic rewards in addition to external rewards, accelerating initial convergence, and the Actor-Critic network was optimized with an LSTM-based neural network to improve the avoidance of dynamic obstacles. Various simulation experiments were then conducted in Gazebo in scenarios featuring both static and dynamic obstacles, with the TurtleBot3 mobile robot used for experimental verification. The experiments demonstrate that, compared to traditional algorithms, the proposed algorithm converges significantly faster in environments with sparse rewards, and the robot reaches more target points within a single episode, indicating a more effective path planning capability. It can also avoid obstacles in various states. Finally, the effectiveness of the algorithm was validated on a physical TurtleBot3 robot in real-world scenarios. Results show that, compared to Deep Deterministic Policy Gradient (DDPG), PPO, LSTM-PPO and ICM-PPO, the success rate of the proposed algorithm in path planning increases by 9%, 7.2%, 4% and 4.8%, respectively, in the most complex simulation environment, and by 20%, 16%, 10% and 12%, respectively, in the physical environment. |
| format | Article |
| id | doaj-art-512df81fc2624bbeb59d1936dc10bbd8 |
| institution | Kabale University |
| issn | 2169-3536 |
| language | English |
| publishDate | 2024-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | Record ID: doaj-art-512df81fc2624bbeb59d1936dc10bbd8 (indexed 2024-12-21T00:01:00Z). Language: eng. Publisher: IEEE. Series: IEEE Access (ISSN 2169-3536). Published 2024-01-01, vol. 12, pp. 189136-189152. DOI: 10.1109/ACCESS.2024.3507016; IEEE document 10769446. Title: Path Planning of Mobile Robot in Dynamic Obstacle Avoidance Environment Based on Deep Reinforcement Learning. Authors: Qingfeng Zhang (https://orcid.org/0009-0004-6895-5353); Wenpeng Ma (https://orcid.org/0000-0002-9714-2544); Qingchun Zheng; Xiaofan Zhai; Wenqian Zhang; Tianchang Zhang; Shuo Wang; all authors are with the Tianjin Key Laboratory for Advanced Mechatronic System Design and Intelligent Control, School of Mechanical Engineering, Tianjin University of Technology, Tianjin, China. Abstract: as given in the description field above. Online access: https://ieeexplore.ieee.org/document/10769446/. Keywords: Intrinsic curiosity module; LSTM network; mobile robot; path planning; reinforcement learning. |
| spellingShingle | Qingfeng Zhang; Wenpeng Ma; Qingchun Zheng; Xiaofan Zhai; Wenqian Zhang; Tianchang Zhang; Shuo Wang; Path Planning of Mobile Robot in Dynamic Obstacle Avoidance Environment Based on Deep Reinforcement Learning; IEEE Access; Intrinsic curiosity module; LSTM network; mobile robot; path planning; reinforcement learning |
| title | Path Planning of Mobile Robot in Dynamic Obstacle Avoidance Environment Based on Deep Reinforcement Learning |
| title_full | Path Planning of Mobile Robot in Dynamic Obstacle Avoidance Environment Based on Deep Reinforcement Learning |
| title_fullStr | Path Planning of Mobile Robot in Dynamic Obstacle Avoidance Environment Based on Deep Reinforcement Learning |
| title_full_unstemmed | Path Planning of Mobile Robot in Dynamic Obstacle Avoidance Environment Based on Deep Reinforcement Learning |
| title_short | Path Planning of Mobile Robot in Dynamic Obstacle Avoidance Environment Based on Deep Reinforcement Learning |
| title_sort | path planning of mobile robot in dynamic obstacle avoidance environment based on deep reinforcement learning |
| topic | Intrinsic curiosity module; LSTM network; mobile robot; path planning; reinforcement learning |
| url | https://ieeexplore.ieee.org/document/10769446/ |
| work_keys_str_mv | AT qingfengzhang pathplanningofmobilerobotindynamicobstacleavoidanceenvironmentbasedondeepreinforcementlearning AT wenpengma pathplanningofmobilerobotindynamicobstacleavoidanceenvironmentbasedondeepreinforcementlearning AT qingchunzheng pathplanningofmobilerobotindynamicobstacleavoidanceenvironmentbasedondeepreinforcementlearning AT xiaofanzhai pathplanningofmobilerobotindynamicobstacleavoidanceenvironmentbasedondeepreinforcementlearning AT wenqianzhang pathplanningofmobilerobotindynamicobstacleavoidanceenvironmentbasedondeepreinforcementlearning AT tianchangzhang pathplanningofmobilerobotindynamicobstacleavoidanceenvironmentbasedondeepreinforcementlearning AT shuowang pathplanningofmobilerobotindynamicobstacleavoidanceenvironmentbasedondeepreinforcementlearning |
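The abstract describes ICM-LSTM-PPO only at a high level. As a purely illustrative aid, the following minimal PyTorch sketch shows the two ingredients it names: an LSTM-based Actor-Critic trunk and an Intrinsic Curiosity Module whose forward-model prediction error is added to the external reward. Every class name, layer size, and the curiosity scale `eta` here is an assumption made for this example; the PPO update itself is omitted, and this is not the authors' implementation.

```python
# Illustrative sketch only (not the authors' code): the two components the
# abstract names for ICM-LSTM-PPO. Layer sizes, the curiosity scale eta,
# and all names are assumptions made for this example.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LSTMActorCritic(nn.Module):
    """Actor-Critic whose shared trunk is an LSTM over observation sequences."""
    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 128):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden, batch_first=True)
        self.actor = nn.Linear(hidden, act_dim)    # action logits
        self.critic = nn.Linear(hidden, 1)         # state-value estimate

    def forward(self, obs_seq, state=None):
        # obs_seq: (batch, time, obs_dim); state carries LSTM memory between calls
        out, state = self.lstm(obs_seq, state)
        h = out[:, -1]                             # hidden state at the last step
        return self.actor(h), self.critic(h).squeeze(-1), state

class ICM(nn.Module):
    """Intrinsic Curiosity Module: forward-model prediction error as a bonus reward."""
    def __init__(self, obs_dim: int, act_dim: int, feat: int = 64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, feat), nn.ReLU())
        self.forward_model = nn.Sequential(
            nn.Linear(feat + act_dim, feat), nn.ReLU(), nn.Linear(feat, feat))
        self.inverse_model = nn.Sequential(
            nn.Linear(2 * feat, feat), nn.ReLU(), nn.Linear(feat, act_dim))

    @torch.no_grad()
    def intrinsic_reward(self, obs, action_onehot, next_obs, eta: float = 0.01):
        phi, phi_next = self.encoder(obs), self.encoder(next_obs)
        phi_pred = self.forward_model(torch.cat([phi, action_onehot], dim=-1))
        # Curiosity bonus: scaled squared error of the forward model's prediction
        return eta * 0.5 * (phi_pred - phi_next).pow(2).sum(dim=-1)

# Usage: the reward fed to PPO is external + intrinsic
obs_dim, act_dim = 24, 5              # e.g. laser-scan features, discrete actions (assumed)
icm = ICM(obs_dim, act_dim)
ac = LSTMActorCritic(obs_dim, act_dim)
obs, next_obs = torch.randn(8, obs_dim), torch.randn(8, obs_dim)
act = F.one_hot(torch.randint(0, act_dim, (8,)), act_dim).float()
logits, value, mem = ac(obs.unsqueeze(1))   # sequence length 1 for illustration
r_total = torch.randn(8) + icm.intrinsic_reward(obs, act, next_obs)
```

In a full training loop, `r_total` would replace the environment reward in PPO's advantage estimation, and the ICM's forward and inverse models would be trained alongside the policy; none of those hyperparameters or implementation details are given in this record.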