Gradient descent Sarsa(λ) algorithm based on the adaptive potential function shaping reward mechanism

In reinforcement learning tasks with continuous state spaces, algorithms usually suffer from poor initial performance and slow convergence. To solve these problems, a potential function shaping reward mechanism was proposed to improve the reinforcement learning algo...
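
The shaping reward idea in the abstract can be illustrated with a minimal sketch. It is not the authors' ANRBF-GD-Sarsa(λ) implementation: it assumes the standard potential-based form F(s, s') = γΦ(s') − Φ(s), a potential built from a fixed (non-adaptive) normalized Gaussian RBF layer over a scalar state, and a linear Q estimate updated by gradient-descent Sarsa(λ) with accumulating traces. All function names, centers, widths, and weights are hypothetical.

```python
import math

def normalized_rbf(state, centers, width):
    """Normalized Gaussian RBF activations for a scalar state (they sum to 1)."""
    raw = [math.exp(-((state - c) ** 2) / (2.0 * width ** 2)) for c in centers]
    total = sum(raw)
    return [r / total for r in raw]

def potential(state, centers, width, weights):
    """Potential Phi(s) as a weighted sum of normalized RBF activations."""
    feats = normalized_rbf(state, centers, width)
    return sum(w * f for w, f in zip(weights, feats))

def shaping_reward(s, s_next, gamma, phi):
    """Potential-based shaping term F(s, s') = gamma * Phi(s') - Phi(s)."""
    return gamma * phi(s_next) - phi(s)

def sarsa_lambda_step(theta, trace, feats, feats_next, reward, alpha, gamma, lam):
    """One gradient-descent Sarsa(lambda) update for a linear Q estimate.

    feats and feats_next are the feature vectors of (s, a) and (s', a').
    Returns the updated (theta, trace), using accumulating traces.
    """
    q = sum(t * f for t, f in zip(theta, feats))
    q_next = sum(t * f for t, f in zip(theta, feats_next))
    delta = reward + gamma * q_next - q            # TD error
    trace = [gamma * lam * e + f for e, f in zip(trace, feats)]
    theta = [t + alpha * delta * e for t, e in zip(theta, trace)]
    return theta, trace

# Hypothetical setup: three RBF centers on [0, 1], potential rising toward s = 1.
centers, width, weights = [0.0, 0.5, 1.0], 0.25, [0.0, 0.5, 1.0]
phi = lambda s: potential(s, centers, width, weights)

# The learner is trained on the environment reward plus the shaping term.
env_reward = -1.0
shaped = env_reward + shaping_reward(0.2, 0.4, 0.95, phi)
theta, trace = sarsa_lambda_step(
    theta=[0.0, 0.0, 0.0], trace=[0.0, 0.0, 0.0],
    feats=normalized_rbf(0.2, centers, width),
    feats_next=normalized_rbf(0.4, centers, width),
    reward=shaped, alpha=0.1, gamma=0.95, lam=0.9,
)
```

In this sketch the shaping term only changes the reward passed to the update, so the Sarsa(λ) step itself stays unchanged; an adaptive variant would additionally adjust the centers, widths, or weights of the potential during learning.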

Bibliographic Details
Main Authors: Fei XIAO, Quan LIU, Qi-ming FU, Hong-kun SUN, Long GAO
Format: Article
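Language: Chinese (zho)
Published: Editorial Department of Journal on Communications 2013-01-01
Series: Tongxin xuebao
Subjects: reinforcement learning; Sarsa(λ); gradient descent; potential function; shaping reward
Online Access: http://www.joconline.com.cn/zh/article/doi/1000-436X(2013)01-0077-12/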
collection DOAJ
description In reinforcement learning tasks with continuous state spaces, algorithms usually suffer from poor initial performance and slow convergence. To solve these problems, a potential function shaping reward mechanism was proposed to improve reinforcement learning algorithms. This mechanism adaptively propagates model knowledge to the learner in the form of an additional reward signal, so that initial performance and convergence speed are effectively improved. In view of both the good performance and the existing problems of the radial basis function (RBF) network, an adaptive normalized RBF (ANRBF) network was put forward as the potential function that generates the shaping rewards. A gradient descent (GD) algorithm named ANRBF-GD-Sarsa(λ) was proposed based on the ANRBF network. The convergence of the ANRBF-GD-Sarsa(λ) algorithm was analyzed theoretically. Extensive experiments demonstrate the good initial performance and high convergence speed of the proposed algorithm.
format Article
id doaj-art-c2f3497b1e744b6b9711b5b1ebad8be7
institution Kabale University
issn 1000-436X
language zho
publishDate 2013-01-01
publisher Editorial Department of Journal on Communications
record_format Article
series Tongxin xuebao
title Gradient descent Sarsa(λ) algorithm based on the adaptive potential function shaping reward mechanism
topic reinforcement learning
Sarsa(λ)
gradient descent
potential function
shaping reward
url http://www.joconline.com.cn/zh/article/doi/1000-436X(2013)01-0077-12/