Online hierarchical reinforcement learning based on interrupting Option

To cope with large volumes of data, an online updating algorithm named Macro-Q with in-place updating (MQIU) is proposed; it is based on the Macro-Q algorithm and takes advantage of in-place updating. The MQIU algorithm updates both the value function of abstract actions and the val...

Bibliographic Details
Main Authors: Fei ZHU, Zhi-peng XU, Quan LIU, Yu-chen FU, Hui WANG
Format: Article
Language: zho
Published: Editorial Department of Journal on Communications 2016-06-01
Series: Tongxin xuebao
Subjects: big data; reinforcement learning; hierarchical reinforcement learning; Option; online learning
Online Access: http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2016117/
author Fei ZHU
Zhi-peng XU
Quan LIU
Yu-chen FU
Hui WANG
collection DOAJ
description To cope with large volumes of data, an online updating algorithm named Macro-Q with in-place updating (MQIU) is proposed; it is based on the Macro-Q algorithm and takes advantage of in-place updating. MQIU updates both the value function of abstract actions and the value function of primitive actions, which speeds up convergence. By introducing an interruption mechanism, a model-free interrupting Macro-Q Option learning algorithm (IMQ), based on hierarchical reinforcement learning, is also introduced in order to handle variability that is hard to process with the conventional Markov decision process model and abstract actions, so that IMQ can learn and improve control strategies in a dynamic environment. Simulations verify that MQIU speeds up convergence, so that it can handle larger-scale data, and that IMQ solves the task faster with stable learning performance.
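The two ideas in the description can be illustrated with a small tabular sketch. This is not the authors' implementation: the record gives no pseudocode, so the code below assumes the textbook Macro-Q (SMDP Q-learning) update for the abstract action, a one-step Q-learning update applied in place to each primitive action taken while the option runs, and the common interruption rule of abandoning an option when some other choice has a strictly higher estimated value in the current state. The names `Option`, `run_option`, `chain_step`, `ALPHA`, and `GAMMA` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

GAMMA = 0.9  # discount factor (assumed value)
ALPHA = 0.1  # learning rate (assumed value)


class Option:
    """Hypothetical abstract action: a fixed policy plus a termination set."""

    def __init__(self, name, policy, terminal_states):
        self.name = name
        self.policy = policy  # dict: state -> primitive action
        self.terminal_states = set(terminal_states)


def greedy_value(Q, state, choices):
    """Best estimated value in `state` over primitive actions and options."""
    return max(Q[(state, c)] for c in choices)


def run_option(env_step, Q, state, option, choices, interrupt=True):
    """Run `option` from `state`, updating the table Q in place.

    env_step(state, action) must return (next_state, reward).
    """
    start, total, discount = state, 0.0, 1.0
    while True:
        action = option.policy[state]
        next_state, reward = env_step(state, action)
        # In-place one-step update of the primitive action just taken.
        target = reward + GAMMA * greedy_value(Q, next_state, choices)
        Q[(state, action)] += ALPHA * (target - Q[(state, action)])
        total += discount * reward
        discount *= GAMMA
        state = next_state
        if state in option.terminal_states or state not in option.policy:
            break  # natural termination of the option
        if interrupt and greedy_value(Q, state, choices) > Q[(state, option.name)]:
            break  # interruption: another choice now looks strictly better
    # Macro-Q update of the abstract action over the whole executed segment.
    target = total + discount * greedy_value(Q, state, choices)
    Q[(start, option.name)] += ALPHA * (target - Q[(start, option.name)])
    return state


def chain_step(state, action):
    """Toy chain 0 -> 1 -> 2 -> 3 with reward 1.0 on reaching state 3."""
    next_state = state + 1
    return next_state, (1.0 if next_state == 3 else 0.0)


Q = defaultdict(float)
go_right = Option("go_right", {0: "right", 1: "right", 2: "right"}, {3})
choices = ["right", "go_right"]
first_end = run_option(chain_step, Q, 0, go_right, choices)
second_end = run_option(chain_step, Q, 0, go_right, choices)
```

On this toy chain the first run lets the option finish at state 3, while the second run is interrupted at state 2, because the primitive action there has already acquired a higher estimate than the option itself. A practical agent would also need an exploration policy and a rule for choosing the next option after an interruption, both of which are omitted here.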
format Article
id doaj-art-3cc2437dc4f14f9795864b5c325d0ed7
institution Kabale University
issn 1000-436X
language zho
publishDate 2016-06-01
publisher Editorial Department of Journal on Communications
record_format Article
series Tongxin xuebao
spelling doaj-art-3cc2437dc4f14f9795864b5c325d0ed7; 2025-01-14T06:55:34Z; zho; Editorial Department of Journal on Communications; Tongxin xuebao; 1000-436X; 2016-06-01; 37657459701570; Online hierarchical reinforcement learning based on interrupting Option; Fei ZHU; Zhi-peng XU; Quan LIU; Yu-chen FU; Hui WANG; http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2016117/; big data; reinforcement learning; hierarchical reinforcement learning; Option; online learning
title Online hierarchical reinforcement learning based on interrupting Option
topic big data
reinforcement learning
hierarchical reinforcement learning
Option
online learning
url http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2016117/