Trajectory Based Prioritized Double Experience Buffer for Sample-Efficient Policy Optimization
Reinforcement learning has recently made great progress in various challenging domains, such as the board game Go and the real-time strategy game StarCraft II. Policy-gradient-based reinforcement learning methods have become mainstream due to their effectiveness and simplicity in both discrete and continuous scenari...
| Main Authors: | Shengxiang Li, Ou Li, Guangyi Liu, Siyuan Ding, Yijie Bai |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2021-01-01 |
| Series: | IEEE Access |
| Subjects: | |
| Online Access: | https://ieeexplore.ieee.org/document/9486881/ |
Similar Items
- Domain Adaptation Using the Replay Buffer: Adaptive Sampling Using Domain-Specific Classifier
  by: Seokmin Kim, et al.
  Published: (2024-01-01)
- Pri-DDQN: learning adaptive traffic signal control strategy through a hybrid agent
  by: Yanliu Zheng, et al.
  Published: (2024-11-01)
- BUFFERING FUNCTION: A GENERAL APPROACH FOR BUFFER BEHAVIOR
  by: André Fernando de Oliveira
  Published: (2020-09-01)
- Push based buffer setting strategy for high density linecard of small buffer size
  by: LI Yu-feng, et al.
  Published: (2008-01-01)
- Concurrent Learning of Control Policy and Unknown Safety Specifications in Reinforcement Learning
  by: Lunet Yifru, et al.
  Published: (2024-01-01)