Online hierarchical reinforcement learning based on interrupting Option
To cope with the volume of big data, an online updating algorithm named Macro-Q with in-place updating (MQIU), based on the Macro-Q algorithm and an in-place updating approach, is proposed. The MQIU algorithm updates both the value function of abstract action and the val...
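The abstract says that MQIU updates the value functions of both abstract actions (options) and primitive actions. The record does not give the exact update rule, so the following is only a minimal tabular sketch under stated assumptions: a standard one-step Q-learning update for each primitive step, and the usual Macro-Q (SMDP) update when an option terminates. The names `primitive_update`, `option_update`, and `go_to_door`, and the values of `alpha` and `gamma`, are hypothetical.

```python
from collections import defaultdict


def primitive_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """One-step Q-learning update for a primitive action, written in place."""
    best_next = max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])


def option_update(Q, s, o, rewards, s_end, actions, alpha=0.1, gamma=0.9):
    """Macro-Q (SMDP) update for an option that ran for len(rewards) steps."""
    k = len(rewards)
    # Discounted sum of the rewards collected while the option was running.
    discounted = sum(gamma**i * r for i, r in enumerate(rewards))
    best_next = max(Q[(s_end, b)] for b in actions)
    Q[(s, o)] += alpha * (discounted + gamma**k * best_next - Q[(s, o)])


# Shared table over primitive actions and one hypothetical option.
Q = defaultdict(float)
actions = ["left", "right", "go_to_door"]

# Assumed trajectory: the option "go_to_door" starts in state 0 and
# executes two primitive steps, each given as (state, action, reward, next_state).
trajectory = [(0, "right", 0.0, 1), (1, "right", 1.0, 2)]

# Assumption: primitive values are updated at every step of the option...
for s, a, r, s_next in trajectory:
    primitive_update(Q, s, a, r, s_next, actions)

# ...and the option value is updated once, when the option terminates.
option_update(Q, 0, "go_to_door", [r for _, _, r, _ in trajectory], 2, actions)
```

Keeping both kinds of value in one table means a single experience improves the estimates at both levels, which is one plausible reading of why the abstract reports faster convergence.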
Main Authors: | Fei ZHU, Zhi-peng XU, Quan LIU, Yu-chen FU, Hui WANG |
---|---|
Format: | Article |
Language: | zho |
Published: | Editorial Department of Journal on Communications, 2016-06-01 |
Series: | Tongxin xuebao |
Subjects: | big data; reinforcement learning; hierarchical reinforcement learning; Option; online learning |
Online Access: | http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2016117/ |
_version_ | 1841539566219034624 |
---|---|
author | Fei ZHU; Zhi-peng XU; Quan LIU; Yu-chen FU; Hui WANG |
author_facet | Fei ZHU; Zhi-peng XU; Quan LIU; Yu-chen FU; Hui WANG |
author_sort | Fei ZHU |
collection | DOAJ |
description | To cope with the volume of big data, an online updating algorithm named Macro-Q with in-place updating (MQIU) is proposed. It is based on the Macro-Q algorithm and uses an in-place updating approach. MQIU updates both the value function of abstract actions and the value function of primitive actions, and hence speeds up convergence. By introducing an interruption mechanism, a model-free interrupting Macro-Q Option learning algorithm (IMQ), based on hierarchical reinforcement learning, is also introduced in order to handle variability that is hard to process with the conventional Markov decision process model and abstract actions, so that IMQ can learn and improve control strategies in a dynamic environment. Simulations verify that MQIU speeds up convergence, so that it can handle larger-scale data, and that IMQ solves the task faster with stable learning performance. |
format | Article |
id | doaj-art-3cc2437dc4f14f9795864b5c325d0ed7 |
institution | Kabale University |
issn | 1000-436X |
language | zho |
publishDate | 2016-06-01 |
publisher | Editorial Department of Journal on Communications |
record_format | Article |
series | Tongxin xuebao |
title | Online hierarchical reinforcement learning based on interrupting Option |
topic | big data; reinforcement learning; hierarchical reinforcement learning; Option; online learning |
url | http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2016117/ |
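The description mentions that IMQ adds an interruption mechanism to Option learning but does not state the interruption condition. As a hedged illustration only, the sketch below uses the rule commonly associated with the options framework: interrupt the running option when, in the current state, another option has a strictly higher estimated value. Whether IMQ uses exactly this test is an assumption, and `should_interrupt`, `choose_greedy`, `opt_a`, and `opt_b` are hypothetical names.

```python
def should_interrupt(Q, state, current_option, options):
    """Return True if some option looks strictly better than continuing."""
    best = max(Q.get((state, o), 0.0) for o in options)
    return Q.get((state, current_option), 0.0) < best


def choose_greedy(Q, state, options):
    """Pick the option with the highest estimated value in this state."""
    return max(options, key=lambda o: Q.get((state, o), 0.0))


# Hypothetical value table: in state "s1", opt_b is estimated to be better.
Q = {("s1", "opt_a"): 0.2, ("s1", "opt_b"): 0.7}
options = ["opt_a", "opt_b"]

current = "opt_a"
# Assumed check performed at each step while an option is executing.
if should_interrupt(Q, "s1", current, options):
    current = choose_greedy(Q, "s1", options)
```

Checking at every step lets the agent abandon an option whose estimated value has dropped because the environment changed, which matches the dynamic-environment motivation given in the description.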