Advantage estimator based on importance sampling

In continuous-action tasks, deep reinforcement learning usually uses a Gaussian distribution as the policy function. To address the problem that a Gaussian policy function slows convergence when its actions are clipped, an importance sampling advantage estimator was proposed. Based on the general advant...

Bibliographic Details
Main Authors:	Quan LIU, Yubin JIANG, Zhihui HU
Format:	Article
Language:	zho
Published:	Editorial Department of Journal on Communications 2019-05-01
Series:	Tongxin xuebao
Subjects:	reinforcement learning; importance sampling; deep reinforcement learning; advantage function
Online Access:	http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2019122/

Similar Items

Deep reinforcement learning-empowered anti-jamming strategy aided by sample information entropy
by: LI Gang, et al.
Published: (2024-09-01)

Enhanced deep deterministic policy gradient algorithm
by: Jianping CHEN, et al.
Published: (2018-11-01)

Survey on reinforcement learning based adaptive bit rate algorithm for mobile video streaming services
by: Li’na DU, et al.
Published: (2021-09-01)

Adversarial method for malicious ELF file detection based on deep reinforcement learning
by: SUN He, et al.
Published: (2024-10-01)

Machine Learning Applications in Energy Harvesting Internet of Things Networks: A Review
by: Olumide Alamu, et al.
Published: (2025-01-01)

GenFedRL: a general federated reinforcement learning framework for deep reinforcement learning agents
by: Biao JIN, et al.
Published: (2023-06-01)

Online hierarchical reinforcement learning based on interrupting Option
by: Fei ZHU, et al.
Published: (2016-06-01)

Fast deep reinforcement learning anti-jamming algorithm based on similar sample generation
by: ZHOU Quan, et al.
Published: (2024-07-01)

IALight: Importance-Aware Multi-Agent Reinforcement Learning for Arterial Traffic Cooperative Control
by: Lu WEI, et al.
Published: (2025-02-01)

ReMAV: Reward Modeling of Autonomous Vehicles for Finding Likely Failure Events
by: Aizaz Sharif, et al.
Published: (2024-01-01)

A survey of neural architecture search
by: Mingjie HE, et al.
Published: (2019-05-01)

Drone Landing and Reinforcement Learning: State-of-Art, Challenges and Opportunities
by: Jose Amendola, et al.
Published: (2024-01-01)

HEVERL – Viewport Estimation Using Reinforcement Learning for 360-degree Video Streaming
by: Nguyen Viet Hung, et al.
Published: (2025-01-01)

A Multi-Robot Collaborative Exploration Method Based on Deep Reinforcement Learning and Knowledge Distillation
by: Rui Wang, et al.
Published: (2025-01-01)

Queue Formation and Obstacle Avoidance Navigation Strategy for Multi-Robot Systems Based on Deep Reinforcement Learning
by: Tianyi Gao, et al.
Published: (2025-01-01)

Multi-Slot Secure Offloading and Resource Management in VEC Networks: A Deep Reinforcement Learning-Based Method
by: Zhen Li, et al.
Published: (2025-01-01)

Safedrive dreamer: Navigating safety–critical scenarios in autonomous driving with world models
by: Haitao Li, et al.
Published: (2025-01-01)

Analysis of anomalous behaviour in network systems using deep reinforcement learning with convolutional neural network architecture
by: Mohammad Hossein Modirrousta, et al.
Published: (2024-12-01)

Overview on intelligent wireless communication technology
by: Yingchang LIANG, et al.
Published: (2020-07-01)

WEIGHTING IMPORTANCE SAMPLING METHOD FOR STRUCTURAL TIME-DEPENDENT FAILURE PROBABILITY FUNCTION ESTIMATION
by: QIAN YuGeng, et al.
Published: (2023-12-01)

Heuristic Sarsa algorithm based on value function transfer
by: Jianping CHEN, et al.
Published: (2018-08-01)

Learning-based locomotion control fusing multimodal perception for a bipedal humanoid robot
by: Chao Ji, et al.
Published: (2025-03-01)

Gradient descent Sarsa(λ) algorithm based on the adaptive potential function shaping reward mechanism
by: Fei XIAO, et al.
Published: (2013-01-01)

Unsupervised data imputation with multiple importance sampling variational autoencoders
by: Shenfen Kuang, et al.
Published: (2025-01-01)

Review: the application of deep reinforcement learning to quantitative trading in financial market
by: XU Bo, et al.
Published: (2024-12-01)

Intelligent routing strategy in the Internet of things based on deep reinforcement learning
by: Ruijin DING, et al.
Published: (2019-06-01)

Quality of service optimization algorithm based on deep reinforcement learning in software defined network
by: Cenhuishan LIAO, et al.
Published: (2023-03-01)

Node selection method in federated learning based on deep reinforcement learning
by: Wenchen HE, et al.
Published: (2021-06-01)

Deep Reinforcement Learning-Based Controller for Field-Oriented Control of SynRM
by: Erdal Kilic
Published: (2025-01-01)

Service chain mapping algorithm based on reinforcement learning
by: Liang WEI, et al.
Published: (2018-01-01)

Research on power efficient autonomous UAV navigation algorithm: an edge intelligence driven approach
by: Chunmin LIN, et al.
Published: (2021-06-01)

Autonomous security analysis and penetration testing model based on attack graph and deep Q-learning network
by: Cheng FAN, et al.
Published: (2023-12-01)

Digital Twin-Empowered Green Mobility Management in Next-Gen Transportation Networks
by: Kubra Duran, et al.
Published: (2024-01-01)

Adaptive pilot design for OFDM based on deep reinforcement learning
by: Qiaoshou LIU, et al.
Published: (2023-09-01)

High-performance directional fuzzing scheme based on deep reinforcement learning
by: Tian XIAO, et al.
Published: (2023-04-01)

Dueling Network Architecture for GNN in the Deep Reinforcement Learning for the Automated ICT System Design
by: Tianchen Zhou, et al.
Published: (2025-01-01)

Relation Between Quantum Advantage in Supervised Learning and Quantum Computational Advantage
by: Jordi Perez-Guijarro, et al.
Published: (2024-01-01)

A Review of Reinforcement Learning for Fixed-Wing Aircraft Control Tasks
by: David J. Richter, et al.
Published: (2024-01-01)

Reinforcement Learning-Based Sequential Control Policy for Multiple Peg-in-Hole Assembly
by: Xinyu Liu, et al.
Published: (2024-10-01)

A Correlation Analysis-Based Structural Load Estimation Method for RC Beams Using Machine Vision and Numerical Simulation
by: Chun Zhang, et al.
Published: (2025-01-01)