Dynamic Path Planning for Vehicles Based on Causal State-Masking Deep Reinforcement Learning

Bibliographic Details
Main Authors: Xia Hua, Tengteng Zhang, Jun Cao
Format: Article
Language: English
Published: MDPI AG 2025-03-01
Series: Algorithms
Online Access: https://www.mdpi.com/1999-4893/18/3/146
Description
Summary: Dynamic path planning enables vehicles to autonomously navigate in unknown or continuously changing environments, thereby reducing reliance on fixed maps. Deep reinforcement learning (DRL), with its superior performance in handling high-dimensional state spaces and complex dynamic environments, has been widely applied to dynamic path planning. Traditional DRL methods are prone to capturing unnecessary noise information and irrelevant features during the training process, leading to instability and decreased adaptability of models in complex dynamic environments. To address this challenge, we propose a dynamic path-planning method based on our Causal State-Masking Twin-delayed Deep Deterministic Policy Gradient (CSM-TD3) algorithm. CSM-TD3 integrates a causal inference mechanism by introducing dynamic state masks and intervention mechanisms, allowing the policy network to focus on genuine causal features for decision optimization and thereby enhancing the convergence speed and generalization capabilities of the agent. Furthermore, causal state-masking DRL allows the system to learn the optimal mask configurations through backpropagation, enabling the model to adaptively adjust the causal features of interest. Extensive experimental results demonstrate that this method significantly enhances the convergence of the TD3 algorithm and effectively improves its performance in dynamic path planning.
ISSN: 1999-4893
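
Illustrative note: the Summary says the mask configuration is learned through backpropagation, but this record gives no implementation details. The following is only a minimal sketch of that general idea, not the authors' CSM-TD3. It assumes a soft sigmoid gate per state feature, a linear predictor standing in for the policy network, a sparsity penalty on the gates, and hand-written gradients; the names `train_mask`, `SAMPLES`, and all hyperparameters are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_mask(samples, steps=300, lr=0.1, sparsity=0.05):
    """Jointly fit linear weights and per-feature mask logits.

    Loss: mean((sum_i w_i * g_i * s_i - y)^2) + sparsity * sum_i g_i,
    where g_i = sigmoid(logit_i) is a soft gate on state feature i.
    Gradients are written out by hand (what autodiff would compute).
    """
    n = len(samples[0][0])
    weights = [0.5] * n   # assumed initial weights
    logits = [0.0] * n    # every gate starts at 0.5
    for _ in range(steps):
        gates = [sigmoid(l) for l in logits]
        grad_w = [0.0] * n
        grad_l = [0.0] * n
        for state, target in samples:
            pred = sum(w * g * s for w, g, s in zip(weights, gates, state))
            err = pred - target
            for i in range(n):
                dgate = gates[i] * (1.0 - gates[i])  # d sigmoid / d logit
                grad_w[i] += 2.0 * err * gates[i] * state[i]
                grad_l[i] += 2.0 * err * weights[i] * state[i] * dgate
        for i in range(n):
            dgate = gates[i] * (1.0 - gates[i])
            g_w = grad_w[i] / len(samples)
            g_l = grad_l[i] / len(samples) + sparsity * dgate
            weights[i] -= lr * g_w
            logits[i] -= lr * g_l
    return weights, logits


# Toy data (assumption): the target depends only on feature 0;
# feature 1 is uncorrelated with the target and acts as a distractor.
SAMPLES = [
    ([1.0, 0.5], 2.0),
    ([-1.0, 0.5], -2.0),
    ([1.0, -0.5], 2.0),
    ([-1.0, -0.5], -2.0),
]

if __name__ == "__main__":
    weights, logits = train_mask(SAMPLES)
    gates = [sigmoid(l) for l in logits]
    print("gates:", [round(g, 3) for g in gates])
```

In this toy setting the gate on the informative feature opens while the gate on the distractor closes, which is the qualitative behaviour the Summary attributes to learned state masks. A real agent would apply such a gate to the state vector before the actor and critic networks and let an autodiff framework compute the gradients.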