Moor: Model-based offline policy optimization with a risk dynamics model
Abstract: Offline reinforcement learning (RL) has been widely used in safety-critical domains by avoiding dangerous and costly online interaction. A significant challenge is addressing uncertainties and risks outside of offline data. Risk-sensitive offline RL attempts to solve this issue by risk aver...
Main Authors: Xiaolong Su, Peng Li, Shaofei Chen
Format: Article
Language: English
Published: Springer, 2024-11-01
Series: Complex & Intelligent Systems
Online Access: https://doi.org/10.1007/s40747-024-01621-x
Similar Items
- Offline prompt reinforcement learning method based on feature extraction
  by: Tianlei Yao, et al.
  Published: (2025-01-01)
- Stealthy data poisoning attack method on offline reinforcement learning in unmanned systems
  by: ZHOU Xue, et al.
  Published: (2024-12-01)
- Reinforcement Learning-Based Autonomous Soccer Agents: A Study in Multi-Agent Coordination and Strategy Development
  by: Biplov Paneru, et al.
  Published: (2025-01-01)
- ANALYSIS OF THE FACTORS INFLUENCING THE OFFLINE LEARNING READINESS DURING THE COVID-19 PANDEMIC
  by: Arlina Dhian Sulistyowati, et al.
  Published: (2023-05-01)
- Reinforcement learning algorithm based on minimum state method and average reward
  by: LIU Quan1, et al.
  Published: (2011-01-01)