Actor-critic algorithm with incremental dual natural policy gradient
Main Authors:
Format: Article
Language: Chinese (zho)
Published: Editorial Department of Journal on Communications, 2017-04-01
Series: Tongxin xuebao (Journal on Communications)
Subjects:
Online Access: http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2017089/
Summary: Existing algorithms for continuous action spaces fail to consider how the optimal action is selected and how knowledge of the action space can be exploited, so an efficient actor-critic algorithm was proposed by improving the natural gradient. The objective of the proposed algorithm is to maximize the expected return. The upper and lower bounds of the action range are weighted to obtain the optimal action, and both bounds are approximated by linear functions. The problem of obtaining the optimal action is thereby transformed into learning two policy parameter vectors. To speed up learning, an incremental Fisher information matrix and eligibility traces for both bounds were designed. On three reinforcement learning problems, simulation results show that, compared with other representative continuous-action methods, the proposed algorithm converges faster and more stably.
ISSN: 1000-436X
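
The abstract describes the method only at a high level. Below is a minimal Python sketch of one plausible reading of it: a Gaussian policy whose mean weights two linearly approximated action bounds, with an eligibility trace per bound parameter vector and an incrementally maintained inverse Fisher matrix (here via a Sherman-Morrison rank-one update). The class and parameter names (`DualBoundNAC`, `phi`, `alpha`, `beta`, `lam`, `sigma`) and the equal 0.5/0.5 bound weighting are illustrative assumptions, not details taken from the paper.

```python
# A hedged sketch of a dual-bound natural actor-critic, assuming a Gaussian
# policy whose mean is the midpoint of two learned linear action bounds.
import numpy as np

class DualBoundNAC:
    def __init__(self, n_features, alpha=0.01, beta=0.1,
                 gamma=0.99, lam=0.9, sigma=0.3):
        self.theta_lo = np.zeros(n_features)   # lower-bound approximator weights
        self.theta_hi = np.zeros(n_features)   # upper-bound approximator weights
        self.v = np.zeros(n_features)          # linear critic (state-value) weights
        self.e_lo = np.zeros(n_features)       # eligibility trace, lower bound
        self.e_hi = np.zeros(n_features)       # eligibility trace, upper bound
        self.e_v = np.zeros(n_features)        # eligibility trace, critic
        # Inverse Fisher matrix over both policy vectors, kept incrementally.
        self.F_inv = np.eye(2 * n_features)
        self.alpha, self.beta = alpha, beta
        self.gamma, self.lam, self.sigma = gamma, lam, sigma

    def act(self, phi, rng):
        # Weight the two linearly approximated bounds to form the greedy
        # action, then explore around it with Gaussian noise.
        lo, hi = self.theta_lo @ phi, self.theta_hi @ phi
        mean = 0.5 * (lo + hi)                 # assumed equal weighting
        return rng.normal(mean, self.sigma), mean

    def update(self, phi, a, mean, r, phi_next, done):
        # One-step TD error from the linear critic.
        v_next = 0.0 if done else self.v @ phi_next
        delta = r + self.gamma * v_next - self.v @ phi
        # Score of the Gaussian policy w.r.t. each bound vector: since the
        # mean is 0.5*(theta_lo + theta_hi) @ phi, both partial derivatives
        # equal (a - mean) / (2 * sigma^2) * phi.
        g = (a - mean) / (2.0 * self.sigma ** 2)
        score = np.concatenate([g * phi, g * phi])
        # Sherman-Morrison rank-one update keeps the Fisher inverse incremental.
        Fs = self.F_inv @ score
        self.F_inv -= np.outer(Fs, Fs) / (1.0 + score @ Fs)
        # Decay and accumulate the eligibility traces.
        self.e_v = self.gamma * self.lam * self.e_v + phi
        self.e_lo = self.gamma * self.lam * self.e_lo + g * phi
        self.e_hi = self.gamma * self.lam * self.e_hi + g * phi
        # Critic step, then natural-gradient step on both policy vectors.
        self.v += self.beta * delta * self.e_v
        nat = self.F_inv @ np.concatenate([self.e_lo, self.e_hi])
        n = len(phi)
        self.theta_lo += self.alpha * delta * nat[:n]
        self.theta_hi += self.alpha * delta * nat[n:]
```

The Sherman-Morrison step is one common way to maintain a Fisher-matrix inverse incrementally, which matches the abstract's "incremental Fisher information matrix" in spirit only; the paper's actual update rule, bound weighting, and trace definitions may differ.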