Optimized algorithm for value iteration based on topological sequence backups
In order to improve the convergence performance, an optimized value iteration based on topological sequence backups, VI-TS, is proposed. The key idea of VI-TS is to circumvent the problem of unnecessary backups by dividing an MDP into strongly-connected components and solving these components in top...
Main Authors: | Wei HUANG, Quan LIU, Hong-kun SUN, Qi-ming FU, Xiao-ke ZHOU |
---|---|
Format: | Article |
Language: | zho |
Published: | Editorial Department of Journal on Communications, 2014-08-01 |
Series: | Tongxin xuebao |
Subjects: | reinforcement learning; value iteration; topological sequence; VI-TS |
Online Access: | http://www.joconline.com.cn/zh/article/doi/10.3969/j.issn.1000-436x.2014.08.008/ |
_version_ | 1841539209482993664 |
---|---|
author | Wei HUANG; Quan LIU; Hong-kun SUN; Qi-ming FU; Xiao-ke ZHOU |
author_facet | Wei HUANG; Quan LIU; Hong-kun SUN; Qi-ming FU; Xiao-ke ZHOU |
author_sort | Wei HUANG |
collection | DOAJ |
description | To improve the convergence performance of value iteration, an optimized algorithm based on topological sequence backups, VI-TS, is proposed. The key idea of VI-TS is to avoid unnecessary backups: after detecting the structure of the MDP, it divides the MDP into strongly connected components and solves these components in topological order. The experimental results show that VI-TS has better convergence performance and is more robust to state space growth when applied to classical planning scenarios. |
format | Article |
id | doaj-art-854c7d522cd846db8b73edf8edb73bcb |
institution | Kabale University |
issn | 1000-436X |
language | zho |
publishDate | 2014-08-01 |
publisher | Editorial Department of Journal on Communications |
record_format | Article |
series | Tongxin xuebao |
title | Optimized algorithm for value iteration based on topological sequence backups |
topic | reinforcement learning value iteration topological sequence VI-TS |
url | http://www.joconline.com.cn/zh/article/doi/10.3969/j.issn.1000-436x.2014.08.008/ |
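
The description's central idea (split the MDP into strongly connected components, then solve the components in topological order) can be illustrated with a small sketch. This is not the authors' VI-TS implementation, whose details are not given in this record. It is a minimal illustration under stated assumptions: the MDP is a dictionary mapping each state to its actions, each action is a list of hypothetical `(probability, next_state, reward)` tuples, every successor state also appears as a key, and a state with no actions is terminal. The function names `tarjan_sccs` and `topological_value_iteration` are illustrative only.

```python
def tarjan_sccs(graph):
    """Tarjan's algorithm (recursive, suitable for small graphs).

    Returns the strongly connected components so that each component
    appears only after every component reachable from it.
    """
    index, low = {}, {}
    stack, on_stack = [], set()
    sccs = []
    counter = [0]

    def visit(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph[v]:
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:  # v is the root of a component
            comp = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.append(w)
                if w == v:
                    break
            sccs.append(comp)

    for v in graph:
        if v not in index:
            visit(v)
    return sccs


def topological_value_iteration(mdp, gamma=0.95, eps=1e-8):
    """Run value iteration one component at a time, successors first."""
    # Transition graph: an edge s -> s2 for every outcome with p > 0.
    graph = {
        s: {s2 for outcomes in actions.values() for p, s2, _ in outcomes if p > 0}
        for s, actions in mdp.items()
    }
    values = {s: 0.0 for s in mdp}
    backups = 0
    for comp in tarjan_sccs(graph):
        # Values of all later-reachable components are already final,
        # so only the states inside this component need repeated sweeps.
        while True:
            delta = 0.0
            for s in comp:
                if not mdp[s]:  # terminal state keeps value 0
                    continue
                best = max(
                    sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)
                    for outcomes in mdp[s].values()
                )
                delta = max(delta, abs(best - values[s]))
                values[s] = best
                backups += 1
            if delta < eps:
                break
    return values, backups


# Hypothetical three-state example: s0 and s1 form a cycle, goal is terminal.
example_mdp = {
    "s0": {"go": [(1.0, "s1", 0.0)]},
    "s1": {"back": [(1.0, "s0", 0.0)], "exit": [(1.0, "goal", 1.0)]},
    "goal": {},
}
values, backups = topological_value_iteration(example_mdp, gamma=0.9)
```

Because the component order returned by Tarjan's algorithm places successors first, each component is swept only until its own values settle, and no backups are spent on states whose values cannot yet change or are already final.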