A Fast Adaptive AUV Control Policy Based on Progressive Networks with Context Information
Deep reinforcement learning models have the advantage of being able to control nonlinear systems in an end-to-end manner. However, reinforcement learning controllers trained in simulation environments often perform poorly with real robots and are unable to cope with situations where the dynamics of...
| Main Authors: | Chunhui Xu, Tian Fang, Desheng Xu, Shilin Yang, Qifeng Zhang, Shuo Li | 
|---|---|
| Format: | Article | 
| Language: | English | 
| Published: | MDPI AG, 2024-11-01 | 
| Series: | Journal of Marine Science and Engineering | 
| Subjects: | reinforcement learning; AUV; intelligent control; progressive network; FARPPO | 
| Online Access: | https://www.mdpi.com/2077-1312/12/12/2159 | 
| _version_ | 1846104163946070016 | 
|---|---|
| author | Chunhui Xu; Tian Fang; Desheng Xu; Shilin Yang; Qifeng Zhang; Shuo Li | 
| collection | DOAJ | 
| description | Deep reinforcement learning models have the advantage of being able to control nonlinear systems in an end-to-end manner. However, reinforcement learning controllers trained in simulation environments often perform poorly with real robots and are unable to cope with situations where the dynamics of the controlled object change. In this paper, we propose a DRL control algorithm that combines progressive networks and context as a depth tracking controller for AUVs. Firstly, an embedding network that maps interaction history sequence data onto latent variables is connected to the input of the policy network, and the context generated by the network gives the DRL agent the ability to adapt to the environment online. Then, the model can be rapidly adapted to a new dynamic environment, which was represented by the presence of generalized force disturbances and changes in the mass of the AUV, through a two-stage training mechanism based on progressive neural networks. The results showed that the proposed algorithm was able to improve the robustness of the controller to environmental disturbances and achieve fast adaptation when there were differences in the dynamics. | 
| format | Article | 
| id | doaj-art-b01c834a635c4c868e254320194e46cc | 
| institution | Kabale University | 
| issn | 2077-1312 | 
| language | English | 
| publishDate | 2024-11-01 | 
| publisher | MDPI AG | 
| record_format | Article | 
| series | Journal of Marine Science and Engineering | 
| title | A Fast Adaptive AUV Control Policy Based on Progressive Networks with Context Information | 
| topic | reinforcement learning; AUV; intelligent control; progressive network; FARPPO | 
| url | https://www.mdpi.com/2077-1312/12/12/2159 | 
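
The description above names two mechanisms: an embedding network that turns interaction history into a latent context fed to the policy input, and a progressive network that adapts the policy in a second training stage. The following is a minimal forward-pass sketch of how those two pieces might fit together. It is not the authors' implementation: the layer sizes, the mean-pooled encoder, and all function and variable names (`encode_context`, `progressive_policy`, `col1`, `col2`, `lateral`) are hypothetical, and training (for example with PPO) is omitted.

```python
import random

def linear(weights, bias, x):
    """Dense layer: W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def relu(v):
    return [max(0.0, a) for a in v]

def init(rng, n_out, n_in):
    """Small random weights and zero biases (illustrative only)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def encode_context(enc, history):
    """Embed each (state, action) step and mean-pool over the history.
    Mean pooling is an assumption; the source does not specify the encoder."""
    steps = [relu(linear(*enc, s + a)) for s, a in history]
    return [sum(col) / len(steps) for col in zip(*steps)]

def progressive_policy(col1, col2, lateral, head, state, context):
    """Two-column progressive forward pass. Column 1 holds stage-one
    weights; column 2 is new and also receives column 1's hidden
    features through a lateral connection."""
    x = state + context                      # context appended to the policy input
    h1 = relu(linear(*col1, x))              # features from the stage-one column
    own = linear(*col2, x)                   # new column's own pre-activation
    lat = linear(*lateral, h1)               # lateral contribution from column 1
    h2 = relu([a + b for a, b in zip(own, lat)])
    return linear(*head, h2)

# Hypothetical sizes: 3 state variables, 1 control output, 2 latent dims.
STATE, ACTION, LATENT, HIDDEN = 3, 1, 2, 4
rng = random.Random(0)
enc = init(rng, LATENT, STATE + ACTION)
col1 = init(rng, HIDDEN, STATE + LATENT)
col2 = init(rng, HIDDEN, STATE + LATENT)
lateral = init(rng, HIDDEN, HIDDEN)
head = init(rng, ACTION, HIDDEN)

history = [([0.1, 0.0, -0.2], [0.5]), ([0.2, 0.1, -0.1], [0.4])]
context = encode_context(enc, history)
action = progressive_policy(col1, col2, lateral, head, [0.3, 0.1, 0.0], context)
```

In the usual progressive-network setup, only the new column, the lateral connection, and the output head would be updated during the second stage, while the stage-one column stays fixed; whether the article follows exactly this split is not stated in this record.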
 
       