Online hierarchical reinforcement learning based on interrupting Option
Aiming at dealing with volume of big data,an on-line updating algorithm,named by Macro-Q with in-place updating (MQIU),which was based on Macro-Q algorithm and takes advantage of in-place updating approach,was proposed.The MQIU algorithm updates both the value function of abstract action and the val...
Saved in:
Main Authors: | , , , , |
---|---|
Format: | Article |
Language: | zho |
Published: |
Editorial Department of Journal on Communications
2016-06-01
|
Series: | Tongxin xuebao |
Subjects: | |
Online Access: | http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2016117/ |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Aiming at dealing with volume of big data,an on-line updating algorithm,named by Macro-Q with in-place updating (MQIU),which was based on Macro-Q algorithm and takes advantage of in-place updating approach,was proposed.The MQIU algorithm updates both the value function of abstract action and the value function of primitive action,and hence speeds up the convergence rate.By introducing the interruption mechanism,a model-free interrupting Macro-Q Option learning algorithm(IMQ),which was based on hierarchical reinforcement learning,was also introduced to order to handle the variability which was hard to process by the conventional Markov decision process model and abstract action so that IMQ was able to learn and improve control strategies in a dynamic environment.Simulations verify the MQIU algorithm speeds up the convergence rate so that it is able to do with the larger scale of data,and the IMQ algorithm solves the task faster with a stable learning performance. |
---|---|
ISSN: | 1000-436X |