A Fast Adaptive AUV Control Policy Based on Progressive Networks with Context Information

Deep reinforcement learning models have the advantage of being able to control nonlinear systems in an end-to-end manner. However, reinforcement learning controllers trained in simulation environments often perform poorly with real robots and are unable to cope with situations where the dynamics of...

Bibliographic Details
Main Authors: Chunhui Xu, Tian Fang, Desheng Xu, Shilin Yang, Qifeng Zhang, Shuo Li
Format: Article
Language: English
Published: MDPI AG 2024-11-01
Series: Journal of Marine Science and Engineering
Subjects: reinforcement learning; AUV; intelligent control; progressive network; FARPPO
Online Access: https://www.mdpi.com/2077-1312/12/12/2159
author Chunhui Xu
Tian Fang
Desheng Xu
Shilin Yang
Qifeng Zhang
Shuo Li
collection DOAJ
description Deep reinforcement learning models have the advantage of being able to control nonlinear systems in an end-to-end manner. However, reinforcement learning controllers trained in simulation environments often perform poorly with real robots and are unable to cope with situations where the dynamics of the controlled object change. In this paper, we propose a DRL control algorithm that combines progressive networks and context as a depth tracking controller for AUVs. Firstly, an embedding network that maps interaction history sequence data onto latent variables is connected to the input of the policy network, and the context generated by the network gives the DRL agent the ability to adapt to the environment online. Then, the model can be rapidly adapted to a new dynamic environment, which was represented by the presence of generalized force disturbances and changes in the mass of the AUV, through a two-stage training mechanism based on progressive neural networks. The results showed that the proposed algorithm was able to improve the robustness of the controller to environmental disturbances and achieve fast adaptation when there were differences in the dynamics.
format Article
id doaj-art-b01c834a635c4c868e254320194e46cc
institution Kabale University
issn 2077-1312
language English
publishDate 2024-11-01
publisher MDPI AG
record_format Article
series Journal of Marine Science and Engineering
spelling doaj-art-b01c834a635c4c868e254320194e46cc
Timestamp: 2024-12-27T14:33:07Z
Language: eng
Publisher: MDPI AG
Journal: Journal of Marine Science and Engineering
ISSN: 2077-1312
Published: 2024-11-01
Volume 12, Issue 12, Article 2159
DOI: 10.3390/jmse12122159
Title: A Fast Adaptive AUV Control Policy Based on Progressive Networks with Context Information
Authors: Chunhui Xu, Tian Fang, Desheng Xu, Shilin Yang, Qifeng Zhang, Shuo Li
Affiliation (all authors): State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
title A Fast Adaptive AUV Control Policy Based on Progressive Networks with Context Information
topic reinforcement learning
AUV
intelligent control
progressive network
FARPPO
url https://www.mdpi.com/2077-1312/12/12/2159
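
The description above outlines two ideas: an embedding network that turns interaction history into a latent context fed to the policy, and a progressive network in which a new column reuses features from a previously trained one. The sketch below illustrates only that data flow and is not the authors' FARPPO implementation. All names (`embed_history`, `Column`, `progressive_policy`), the dimensions, the mean-pooling encoder, and the random untrained weights are assumptions made for illustration; the record does not specify the actual architecture or training procedure.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; a stand-in for trained parameters."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def embed_history(history, W_embed):
    """Map a non-empty list of (state, action) pairs to a latent context.

    Assumption: mean pooling plus one tanh layer replaces whatever
    sequence encoder the paper actually uses.
    """
    flat = [list(s) + list(a) for s, a in history]
    pooled = [sum(col) / len(flat) for col in zip(*flat)]
    return [math.tanh(v) for v in matvec(W_embed, pooled)]


class Column:
    """One column of a progressive network with a single hidden layer."""

    def __init__(self, in_dim, hidden, out_dim, rng, lateral_dim=0):
        self.W_in = rand_matrix(hidden, in_dim, rng)
        # Lateral weights exist only for columns added after the first.
        self.W_lat = rand_matrix(hidden, lateral_dim, rng) if lateral_dim else None
        self.W_out = rand_matrix(out_dim, hidden, rng)

    def forward(self, x, lateral=None):
        pre = matvec(self.W_in, x)
        if self.W_lat is not None and lateral is not None:
            # Add features from the earlier (frozen) column.
            pre = [p + l for p, l in zip(pre, matvec(self.W_lat, lateral))]
        hidden = [math.tanh(p) for p in pre]
        action = [math.tanh(v) for v in matvec(self.W_out, hidden)]
        return hidden, action


def progressive_policy(state, history, W_embed, base, adapt):
    """Context-conditioned action from the new column."""
    z = embed_history(history, W_embed)      # latent context
    x = list(state) + z                      # policy input = state + context
    h_base, _ = base.forward(x)              # frozen column: features only
    _, action = adapt.forward(x, lateral=h_base)
    return action


# Hypothetical sizes: 3-D state (e.g. depth error and rates), 1-D action.
rng = random.Random(0)
STATE_DIM, ACTION_DIM, CONTEXT_DIM, HIDDEN = 3, 1, 2, 8
W_embed = rand_matrix(CONTEXT_DIM, STATE_DIM + ACTION_DIM, rng)
base = Column(STATE_DIM + CONTEXT_DIM, HIDDEN, ACTION_DIM, rng)
adapt = Column(STATE_DIM + CONTEXT_DIM, HIDDEN, ACTION_DIM, rng, lateral_dim=HIDDEN)

history = [([0.1 * t, 0.0, -0.05 * t], [0.2]) for t in range(5)]
action = progressive_policy([0.5, 0.0, -0.1], history, W_embed, base, adapt)
print(action)
```

In a two-stage scheme of the kind the description mentions, the first column would be trained in the original environment and then held fixed, and only the second column and its lateral weights would be updated in the new dynamics. This sketch shows the forward pass only.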