Actor-critic algorithm with incremental dual natural policy gradient

Existing algorithms for continuous action spaces fail to consider how to select the optimal action or how to exploit knowledge of the action space, so an efficient actor-critic algorithm was proposed by improving the natural gradient. The objective of the proposed algorithm is to maximize the...


Bibliographic Details
Main Authors: Peng ZHANG, Quan LIU, Shan ZHONG, Jian-wei ZHAI, Wei-sheng QIAN
Format: Article
Language: zho
Published: Editorial Department of Journal on Communications 2017-04-01
Series:Tongxin xuebao
Subjects:
Online Access: http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2017089/
_version_ 1841539528351809536
author Peng ZHANG
Quan LIU
Shan ZHONG
Jian-wei ZHAI
Wei-sheng QIAN
author_facet Peng ZHANG
Quan LIU
Shan ZHONG
Jian-wei ZHAI
Wei-sheng QIAN
author_sort Peng ZHANG
collection DOAJ
description Existing algorithms for continuous action spaces fail to consider how to select the optimal action or how to exploit knowledge of the action space, so an efficient actor-critic algorithm was proposed by improving the natural gradient. The objective of the proposed algorithm is to maximize the expected return. The upper and lower bounds of the action range are weighted to obtain the optimal action, and both bounds are approximated by linear functions. The problem of obtaining the optimal action is thereby transformed into learning two policy parameter vectors. To speed up learning, an incremental Fisher information matrix and eligibility traces for both bounds were designed. Simulation results on three reinforcement learning problems show that, compared with other representative methods for continuous action spaces, the proposed algorithm converges faster and more stably.
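The abstract's key mechanics — an action formed as a weighted combination of linearly approximated upper and lower bounds, learned as two policy parameter vectors with an incrementally maintained Fisher information matrix and per-bound eligibility traces — can be sketched as follows. This is a reconstruction from the abstract alone, not the authors' published pseudocode; the class name, the Gaussian exploration policy, the fixed mixing weight `mix`, and all variable names are illustrative assumptions.

```python
import numpy as np

# Hedged sketch: a Gaussian policy whose mean mixes two linearly
# approximated action bounds, updated by a natural policy gradient with a
# Sherman-Morrison incremental inverse Fisher matrix and eligibility traces.
class DualNaturalActorCritic:
    def __init__(self, n_features, alpha=0.01, beta=0.1, gamma=0.99,
                 lam=0.9, sigma=0.2, mix=0.5):
        self.n = n_features
        self.alpha, self.beta = alpha, beta    # actor / critic step sizes
        self.gamma, self.lam = gamma, lam      # discount / trace decay
        self.sigma, self.mix = sigma, mix      # exploration noise / bound weight
        self.theta_u = np.zeros(n_features)    # upper-bound parameters
        self.theta_l = np.zeros(n_features)    # lower-bound parameters
        self.v = np.zeros(n_features)          # linear critic weights
        self.F_inv = np.eye(2 * n_features)    # incremental inverse Fisher
        self.e_pi = np.zeros(2 * n_features)   # eligibility over both bounds
        self.e_v = np.zeros(n_features)        # critic eligibility

    def mean_action(self, x):
        # Optimal action as a weighted mix of the two linear bounds
        return self.mix * (self.theta_u @ x) + (1 - self.mix) * (self.theta_l @ x)

    def action(self, x, explore=True):
        m = self.mean_action(x)
        return np.random.normal(m, self.sigma) if explore else m

    def update(self, x, a, r, x_next, done):
        # TD error from the linear critic
        delta = r + (0.0 if done else self.gamma * (self.v @ x_next)) - self.v @ x
        # Score of the Gaussian policy w.r.t. the stacked vector [theta_u; theta_l]
        score = (a - self.mean_action(x)) / self.sigma ** 2 * np.concatenate(
            [self.mix * x, (1 - self.mix) * x])
        # Accumulating eligibilities for the critic and for both bounds
        self.e_v = self.gamma * self.lam * self.e_v + x
        self.e_pi = self.gamma * self.lam * self.e_pi + score
        # Sherman-Morrison rank-1 update keeps the inverse Fisher incremental
        Fg = self.F_inv @ score
        self.F_inv -= np.outer(Fg, Fg) / (1.0 + score @ Fg)
        # Natural-gradient step for both parameter vectors, TD step for critic
        nat = self.F_inv @ self.e_pi
        self.theta_u += self.alpha * delta * nat[:self.n]
        self.theta_l += self.alpha * delta * nat[self.n:]
        self.v += self.beta * delta * self.e_v
```

Maintaining the inverse Fisher matrix with a rank-1 Sherman-Morrison update avoids re-inverting a matrix at every step, which is presumably what makes the incremental variant cheap enough for online learning.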
format Article
id doaj-art-a7c5f9298dbb44828af3d720c4972c3e
institution Kabale University
issn 1000-436X
language zho
publishDate 2017-04-01
publisher Editorial Department of Journal on Communications
record_format Article
series Tongxin xuebao
spelling doaj-art-a7c5f9298dbb44828af3d720c4972c3e2025-01-14T07:12:06ZzhoEditorial Department of Journal on CommunicationsTongxin xuebao1000-436X2017-04-013816617759709336Actor-critic algorithm with incremental dual natural policy gradientPeng ZHANGQuan LIUShan ZHONGJian-wei ZHAIWei-sheng QIANExisting algorithms for continuous action spaces fail to consider how to select the optimal action or how to exploit knowledge of the action space, so an efficient actor-critic algorithm was proposed by improving the natural gradient. The objective of the proposed algorithm is to maximize the expected return. The upper and lower bounds of the action range are weighted to obtain the optimal action, and both bounds are approximated by linear functions. The problem of obtaining the optimal action is thereby transformed into learning two policy parameter vectors. To speed up learning, an incremental Fisher information matrix and eligibility traces for both bounds were designed. Simulation results on three reinforcement learning problems show that, compared with other representative methods for continuous action spaces, the proposed algorithm converges faster and more stably.http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2017089/reinforcement learningnatural gradientactor-criticcontinuous space
spellingShingle Peng ZHANG
Quan LIU
Shan ZHONG
Jian-wei ZHAI
Wei-sheng QIAN
Actor-critic algorithm with incremental dual natural policy gradient
Tongxin xuebao
reinforcement learning
natural gradient
actor-critic
continuous space
title Actor-critic algorithm with incremental dual natural policy gradient
title_full Actor-critic algorithm with incremental dual natural policy gradient
title_fullStr Actor-critic algorithm with incremental dual natural policy gradient
title_full_unstemmed Actor-critic algorithm with incremental dual natural policy gradient
title_short Actor-critic algorithm with incremental dual natural policy gradient
title_sort actor critic algorithm with incremental dual natural policy gradient
topic reinforcement learning
natural gradient
actor-critic
continuous space
url http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2017089/
work_keys_str_mv AT pengzhang actorcriticalgorithmwithincrementaldualnaturalpolicygradient
AT quanliu actorcriticalgorithmwithincrementaldualnaturalpolicygradient
AT shanzhong actorcriticalgorithmwithincrementaldualnaturalpolicygradient
AT jianweizhai actorcriticalgorithmwithincrementaldualnaturalpolicygradient
AT weishengqian actorcriticalgorithmwithincrementaldualnaturalpolicygradient