Gradient descent Sarsa(λ) algorithm based on the adaptive potential function shaping reward mechanism
In reinforcement learning tasks with continuous state spaces, algorithms often suffer from poor initial performance and slow convergence. To address these problems, a potential function shaping reward mechanism is proposed to improve the reinforcement learning algo...
Main Authors: | Fei XIAO, Quan LIU, Qi-ming FU, Hong-kun SUN, Long GAO |
---|---|
Format: | Article |
Language: | zho |
Published: | Editorial Department of Journal on Communications, 2013-01-01 |
Series: | Tongxin xuebao |
Subjects: | reinforcement learning; Sarsa(λ); gradient descent; potential function; shaping reward |
Online Access: | http://www.joconline.com.cn/zh/article/doi/1000-436X(2013)01-0077-12/ |
_version_ | 1841539878569902080 |
---|---|
author | Fei XIAO, Quan LIU, Qi-ming FU, Hong-kun SUN, Long GAO |
collection | DOAJ |
description | In reinforcement learning tasks with continuous state spaces, algorithms often suffer from poor initial performance and slow convergence. To address these problems, a potential function shaping reward mechanism is proposed to improve reinforcement learning algorithms. The mechanism adaptively propagates model knowledge to the learner in the form of an additional reward signal, so that both initial performance and convergence speed improve. Given the good performance and known limitations of the radial basis function (RBF) network, an adaptive normalized RBF (ANRBF) network is proposed as the potential function that generates the shaping rewards. Based on the ANRBF network, a gradient descent (GD) algorithm named ANRBF-GD-Sarsa(λ) is proposed, and its convergence is analyzed theoretically. Extensive experiments show the good initial performance and fast convergence of the proposed algorithm. |
format | Article |
id | doaj-art-c2f3497b1e744b6b9711b5b1ebad8be7 |
institution | Kabale University |
issn | 1000-436X |
language | zho |
publishDate | 2013-01-01 |
publisher | Editorial Department of Journal on Communications |
record_format | Article |
series | Tongxin xuebao |
title | Gradient descent Sarsa(λ) algorithm based on the adaptive potential function shaping reward mechanism |
topic | reinforcement learning; Sarsa(λ); gradient descent; potential function; shaping reward |
url | http://www.joconline.com.cn/zh/article/doi/1000-436X(2013)01-0077-12/ |
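
The abstract in this record describes a potential function that generates shaping rewards for a gradient descent Sarsa(λ) learner. As a rough illustration only, the sketch below combines a normalized Gaussian RBF potential with the standard potential-based shaping term F = γΦ(s') − Φ(s) and a linear Sarsa(λ) update with accumulating traces. It is not the authors' ANRBF-GD-Sarsa(λ) algorithm: the adaptive part of the ANRBF network is omitted, and every name here (`nrbf_features`, `potential`, `shaping_reward`, `sarsa_lambda_step`, the centers, widths, and weights) is a hypothetical choice made for the example.

```python
import math


def nrbf_features(state, centers, width):
    """Normalized Gaussian RBF activations (non-negative, summing to 1).

    Assumes the state is close enough to some center that the raw
    activations do not all underflow to zero.
    """
    acts = [
        math.exp(-sum((s - c) ** 2 for s, c in zip(state, center)) / (2.0 * width ** 2))
        for center in centers
    ]
    total = sum(acts)
    return [a / total for a in acts]


def potential(state, centers, width, weights):
    """Phi(s): weighted sum of normalized RBF features (weights are illustrative)."""
    feats = nrbf_features(state, centers, width)
    return sum(w * f for w, f in zip(weights, feats))


def shaping_reward(phi_s, phi_next, gamma):
    """Potential-based shaping term F = gamma * Phi(s') - Phi(s)."""
    return gamma * phi_next - phi_s


def sarsa_lambda_step(theta, traces, feats, feats_next, reward, shaping,
                      alpha, gamma, lam):
    """One linear gradient-descent Sarsa(lambda) update, non-terminal step.

    feats and feats_next are the feature vectors of (s, a) and (s', a').
    theta and traces are updated in place; the TD error is returned.
    """
    q = sum(t * f for t, f in zip(theta, feats))
    q_next = sum(t * f for t, f in zip(theta, feats_next))
    delta = reward + shaping + gamma * q_next - q
    for i in range(len(theta)):
        traces[i] = gamma * lam * traces[i] + feats[i]  # accumulating trace
        theta[i] += alpha * delta * traces[i]
    return delta


# Usage: a 1-D state space with two RBF centers; the potential grows toward x = 1.
centers = [(0.0,), (1.0,)]
weights = [0.0, 1.0]
phi_s = potential((0.0,), centers, 0.5, weights)
phi_next = potential((1.0,), centers, 0.5, weights)
bonus = shaping_reward(phi_s, phi_next, gamma=0.9)

theta, traces = [0.0, 0.0], [0.0, 0.0]
td_error = sarsa_lambda_step(theta, traces, [1.0, 0.0], [0.0, 1.0],
                             reward=1.0, shaping=bonus,
                             alpha=0.1, gamma=0.9, lam=0.8)
```

In this toy setup, moving toward the region where the potential is higher yields a positive shaping bonus, which is one way prior knowledge could speed up early learning; how the potential is adapted during learning is specific to the paper and is not reproduced here.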