Trajectory Based Prioritized Double Experience Buffer for Sample-Efficient Policy Optimization

Reinforcement learning has recently made great progress in various challenging domains, such as the board game Go and the MOBA game StarCraft II. Policy-gradient-based reinforcement learning methods have become mainstream due to their effectiveness and simplicity in both discrete and continuous scenari...

Bibliographic Details
Main Authors: Shengxiang Li, Ou Li, Guangyi Liu, Siyuan Ding, Yijie Bai
Format: Article
Language: English
Published: IEEE 2021-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/9486881/