Robust quantum control using reinforcement learning from demonstration

Bibliographic Details
Main Authors: Shengyong Li, Yidian Fan, Xiang Li, Xinhui Ruan, Qianchuan Zhao, Zhihui Peng, Re-Bing Wu, Jing Zhang, Pengtao Song
Format: Article
Language: English
Published: Nature Portfolio 2025-07-01
Series: npj Quantum Information
Online Access: https://doi.org/10.1038/s41534-025-01065-2
author Shengyong Li
Yidian Fan
Xiang Li
Xinhui Ruan
Qianchuan Zhao
Zhihui Peng
Re-Bing Wu
Jing Zhang
Pengtao Song
collection DOAJ
description Abstract Quantum control requires high-precision, robust control pulses to ensure optimal system performance. However, control sequences generated from a system model may suffer from model bias, leading to low fidelity. While model-free reinforcement learning (RL) methods have been developed to avoid such bias, training an RL agent from scratch can be time-consuming, often taking hours to gather enough samples for convergence. This challenge has hindered the broad application of RL techniques to larger and more complex quantum control problems, limiting their adaptability. In this work, we use Reinforcement Learning from Demonstration (RLfD) to leverage control sequences generated with system models and further optimize them with RL to avoid model bias. By starting from reasonable control pulse shapes rather than learning from scratch, this approach improves sample efficiency and can significantly reduce training time. As a result, the method can effectively handle pulse shapes discretized into more than 1000 pieces without compromising final fidelity. We have simulated the preparation of several high-fidelity non-classical states using the RLfD method and find that the training process is also more stable with RLfD. In addition, the method is suitable for fast gate calibration using reinforcement learning.
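
The abstract states the RLfD idea only at a high level, so the following is a minimal, self-contained Python sketch of the warm-starting pattern it describes: a pulse obtained from a (possibly biased) system model serves as the demonstration, and a model-free search then fine-tunes it using only the measured fidelity as reward. The toy single-qubit model, the segment count, the deliberate 20% miscalibration, and the accept-if-better random search are illustrative assumptions standing in for the authors' RL agent, not their implementation.

# Minimal sketch of RL-from-demonstration warm-starting for pulse optimization.
# The model, pulse parameters, and update rule are illustrative assumptions,
# not the implementation from the paper.
import numpy as np

N_SEG = 20    # pulse segments (the paper reports handling >1000)
DT = 0.05     # segment duration, arbitrary units
SX = np.array([[0, 1], [1, 0]], dtype=complex)  # Pauli-X drive

def fidelity(pulse):
    """Propagate |0> under a piecewise-constant X drive; return overlap with |1>."""
    psi = np.array([1, 0], dtype=complex)
    for amp in pulse:
        theta = amp * DT
        U = np.cos(theta) * np.eye(2) - 1j * np.sin(theta) * SX  # exp(-i*theta*sx)
        psi = U @ psi
    return float(np.abs(psi[1]) ** 2)

# Demonstration: a model-derived pi-pulse whose amplitude is 20% miscalibrated,
# standing in for bias in the system model.
demo_pulse = np.full(N_SEG, 0.8 * np.pi / (2 * N_SEG * DT))

# Model-free fine-tuning around the demonstration. A simple accept-if-better
# random search stands in for the policy updates of a real RL agent; only the
# measured fidelity (reward) is used, never the model.
rng = np.random.default_rng(0)
pulse, best = demo_pulse.copy(), fidelity(demo_pulse)
for _ in range(500):
    candidate = pulse + 0.02 * rng.standard_normal(N_SEG)
    f = fidelity(candidate)
    if f > best:
        pulse, best = candidate, f

print(f"demonstration fidelity: {fidelity(demo_pulse):.4f}")
print(f"fine-tuned fidelity:    {best:.4f}")

Under these assumptions the fine-tuned pulse recovers the fidelity lost to the miscalibrated demonstration; the advantage claimed for RLfD is that such fine-tuning converges with far fewer samples than training a policy from scratch.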
format Article
id doaj-art-0b51aa034ac647e99f64c930cc091480
institution Kabale University
issn 2056-6387
language English
publishDate 2025-07-01
publisher Nature Portfolio
record_format Article
series npj Quantum Information
author_affiliations Shengyong Li: Department of Automation, Tsinghua University
Yidian Fan: Department of Automation, Tsinghua University
Xiang Li: Institute of Physics, Chinese Academy of Sciences
Xinhui Ruan: Department of Automation, Tsinghua University
Qianchuan Zhao: Department of Automation, Tsinghua University
Zhihui Peng: Department of Physics and Synergetic Innovation Center for Quantum Effects and Applications, Hunan Normal University
Re-Bing Wu: Department of Automation, Tsinghua University
Jing Zhang: School of Automation Science and Engineering, Xi’an Jiaotong University
Pengtao Song: School of Automation Science and Engineering, Xi’an Jiaotong University
title Robust quantum control using reinforcement learning from demonstration
url https://doi.org/10.1038/s41534-025-01065-2