Function approximation method based on weights gradient descent in reinforcement learning
Function approximation has gained significant attention in reinforcement learning research because it effectively addresses problems with large-scale or continuous state and action spaces. Although gradient-descent-based function approximation is one of the most widely used approaches in reinforcement learning, it requires careful tuning of the step-size parameter: an inappropriate value can lead to slow convergence, unstable convergence, or even divergence. To address these issues, the temporal-difference (TD) algorithm with function approximation was improved. The weight-update rule was enhanced by combining the least-squares method with gradient descent, yielding the proposed weights gradient descent (WGD) method. Least squares were used to compute the weights; combining the ideas of TD and gradient descent, the error between these weights and the current ones was obtained, and this error was used to update the weights directly. Updating the weights in this new manner effectively reduces the algorithm's consumption of computing resources and enhances other gradient-descent-based function approximation algorithms. The WGD method is widely applicable in various gradient-descent-based reinforcement learning algorithms. The results show that WGD can adjust parameters within a wider space, effectively reducing the possibility of divergence, and achieves better performance while improving the algorithm's convergence speed.
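The abstract describes computing a least-squares weight solution and then using the error between those weights and the current weights as the update direction. As a rough illustration only (the paper's exact formulas are not reproduced in this record), a minimal sketch of such a scheme, using an LSTD-style least-squares target, might look like the following; all function names, parameters, and the regularization term are hypothetical:

```python
import numpy as np

def lstd_weights(features, next_features, rewards, gamma=0.95, reg=1e-3):
    """Least-squares TD solution: solve A w = b, where
    A = sum_t phi_t (phi_t - gamma * phi_{t+1})^T and b = sum_t r_t phi_t.
    A small ridge term `reg` keeps A invertible (an assumption, not from the paper)."""
    dim = features.shape[1]
    A = reg * np.eye(dim)
    b = np.zeros(dim)
    for phi, phi_next, r in zip(features, next_features, rewards):
        A += np.outer(phi, phi - gamma * phi_next)
        b += r * phi
    return np.linalg.solve(A, b)

def wgd_update(w, features, next_features, rewards, lr=0.1, gamma=0.95):
    """WGD-style step (sketch): take the error between the least-squares
    weights and the current weights, and move the weights along that error."""
    w_ls = lstd_weights(features, next_features, rewards, gamma=gamma)
    return w + lr * (w_ls - w)
```

Under this sketch, repeated calls to `wgd_update` shrink the weight error geometrically toward the least-squares solution, which matches the abstract's claim of faster, more stable convergence than tuning a raw TD step size.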
Main Authors: | Xiaoyan QIN, Yuhan LIU, Yunlong XU, Bin LI |
---|---|
Format: | Article |
Language: | English |
Published: | POSTS&TELECOM PRESS Co., LTD, 2023-08-01 |
Series: | 网络与信息安全学报 (Chinese Journal of Network and Information Security) |
Subjects: | function approximation; reinforcement learning; gradient descent; least-squares; weights gradient descent |
Online Access: | http://www.cjnis.com.cn/thesisDetails#10.11959/j.issn.2096-109x.2023050 |
author | Xiaoyan QIN Yuhan LIU Yunlong XU Bin LI |
collection | DOAJ |
description | Function approximation has gained significant attention in reinforcement learning research because it effectively addresses problems with large-scale or continuous state and action spaces. Although gradient-descent-based function approximation is one of the most widely used approaches in reinforcement learning, it requires careful tuning of the step-size parameter: an inappropriate value can lead to slow convergence, unstable convergence, or even divergence. To address these issues, the temporal-difference (TD) algorithm with function approximation was improved. The weight-update rule was enhanced by combining the least-squares method with gradient descent, yielding the proposed weights gradient descent (WGD) method. Least squares were used to compute the weights; combining the ideas of TD and gradient descent, the error between these weights and the current ones was obtained, and this error was used to update the weights directly. Updating the weights in this new manner effectively reduces the algorithm's consumption of computing resources and enhances other gradient-descent-based function approximation algorithms. The WGD method is widely applicable in various gradient-descent-based reinforcement learning algorithms. The results show that WGD can adjust parameters within a wider space, effectively reducing the possibility of divergence, and achieves better performance while improving the algorithm's convergence speed. |
format | Article |
id | doaj-art-7b09051c0c7d4edfb58afbf55cbc9f0f |
institution | Kabale University |
issn | 2096-109X |
language | English |
publishDate | 2023-08-01 |
publisher | POSTS&TELECOM PRESS Co., LTD |
record_format | Article |
series | 网络与信息安全学报 |
title | Function approximation method based on weights gradient descent in reinforcement learning |
topic | function approximation; reinforcement learning; gradient descent; least-squares; weights gradient descent |
url | http://www.cjnis.com.cn/thesisDetails#10.11959/j.issn.2096-109x.2023050 |