TD algorithm based on double-layer fuzzy partitioning
When dealing with continuous-space problems, traditional Q-iteration algorithms based on lookup tables or function approximation converge slowly and have difficulty producing a continuous policy. To overcome these weaknesses, an on-policy TD algorithm named DFP-OPTD was proposed based on double-la...
Main Authors: | Xiang MU, Quan LIU, Qi-ming FU, Hong-kun SUN, Xin ZHOU |
---|---|
Format: | Article |
Language: | zho |
Published: | Editorial Department of Journal on Communications, 2013-10-01 |
Series: | Tongxin xuebao |
Subjects: | |
Online Access: | http://www.joconline.com.cn/zh/article/doi/10.3969/j.issn.1000-436x.2013.10.011/ |
Similar Items
- Function approximation method based on weights gradient descent in reinforcement learning
  by: Xiaoyan QIN, et al.
  Published: (2023-08-01)
- Gradient descent Sarsa(λ) algorithm based on the adaptive potential function shaping reward mechanism
  by: Fei XIAO, et al.
  Published: (2013-01-01)
- Comparison of the efficiency of zero and first order minimization methods in neural networks
  by: E. A. Gubareva, et al.
  Published: (2022-12-01)
- Research of the Adaptive Fuzzy PID Control System of New Type Double Conical Continuously Variable Transmission
  by: Li Jingkui, et al.
  Published: (2016-01-01)
- Deep Reinforcement Learning-Based Task Partitioning Ratio Decision Mechanism in High-Speed Rail Environments with Mobile Edge Computing Server
  by: Seolwon Koo, et al.
  Published: (2025-01-01)