Robust quantum control using reinforcement learning from demonstration
Abstract: Quantum control requires high-precision, robust control pulses to ensure optimal system performance. However, control sequences generated from a system model may suffer from model bias, leading to low fidelity. While model-free reinforcement learning (RL) methods have been developed to avoid such biases, training an RL agent from scratch can be time-consuming, often taking hours to gather enough samples for convergence. This has hindered the broad application of RL techniques to larger and more complex quantum control problems and limited their adaptability. In this work, we use Reinforcement Learning from Demonstration (RLfD) to leverage control sequences generated with system models and further optimize them with RL to avoid model bias. By starting from reasonable control pulse shapes rather than learning from scratch, this approach improves sample efficiency and can significantly reduce training time. The method can thus handle pulse shapes discretized into more than 1000 pieces without compromising final fidelity. We have simulated the preparation of several high-fidelity non-classical states using the RLfD method, and we find that the training process is also more stable with RLfD. In addition, the method is suitable for fast RL-based gate calibration.
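The abstract describes the RLfD workflow only at a high level. As a rough, self-contained illustration of the idea, and not the authors' implementation, the sketch below seeds a pulse with a model-derived demonstration (a flat π-pulse for an idealized qubit) and then fine-tunes it model-free against a fidelity reward, using a simple evolution-strategies update as a stand-in for the paper's RL agent. The toy two-level dynamics, the fixed detuning used to mimic model bias, and all parameter values are assumptions for illustration only.

```python
import numpy as np

# Toy two-level system: drive a |0> -> |1> transfer with a piecewise-constant
# x-drive. The segment amplitudes are the "pulse"; state-transfer fidelity is
# the reward. N_SEG and DT are hypothetical discretization choices.
N_SEG, DT = 20, 0.05

SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def fidelity(pulse, detuning=0.3):
    """Propagate |0> through the 'real' system (detuning plays model bias)."""
    psi = np.array([1, 0], dtype=complex)
    for amp in pulse:
        h = 0.5 * amp * SX + 0.5 * detuning * SZ
        evals, evecs = np.linalg.eigh(h)          # exact 2x2 propagator
        u = evecs @ np.diag(np.exp(-1j * evals * DT)) @ evecs.conj().T
        psi = u @ psi
    return abs(psi[1]) ** 2                        # population in |1>

# Step 1 -- demonstration from the idealized model (which assumes zero
# detuning): a flat pi-pulse, i.e. total area sum(pulse) * DT = pi.
demo = np.full(N_SEG, np.pi / (N_SEG * DT))
print(f"demonstration fidelity on biased system: {fidelity(demo):.4f}")

# Step 2 -- model-free fine-tuning that starts FROM the demonstration
# instead of from scratch: an evolution-strategies update driven only by
# fidelity evaluations (a stand-in for the paper's RL agent).
rng = np.random.default_rng(0)
pulse, n_pop, sigma, lr = demo.copy(), 16, 0.3, 0.5
for _ in range(300):
    noise = rng.normal(size=(n_pop, N_SEG))
    rewards = np.array([fidelity(pulse + sigma * eps) for eps in noise])
    adv = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
    pulse += lr / (n_pop * sigma) * (adv @ noise)  # ES gradient estimate
print(f"fine-tuned fidelity:                     {fidelity(pulse):.4f}")
```

The demonstration provides the warm start: fine-tuning begins near a good pulse, so far fewer reward evaluations are needed than when learning from scratch, which is the sample-efficiency argument the abstract makes.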
| Main Authors: | Shengyong Li, Yidian Fan, Xiang Li, Xinhui Ruan, Qianchuan Zhao, Zhihui Peng, Re-Bing Wu, Jing Zhang, Pengtao Song |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-07-01 |
| Series: | npj Quantum Information |
| ISSN: | 2056-6387 |
| Collection: | DOAJ |
| Online Access: | https://doi.org/10.1038/s41534-025-01065-2 |
Author affiliations:

| Author | Affiliation |
|---|---|
| Shengyong Li | Department of Automation, Tsinghua University |
| Yidian Fan | Department of Automation, Tsinghua University |
| Xiang Li | Institute of Physics, Chinese Academy of Sciences |
| Xinhui Ruan | Department of Automation, Tsinghua University |
| Qianchuan Zhao | Department of Automation, Tsinghua University |
| Zhihui Peng | Department of Physics and Synergetic Innovation Center for Quantum Effects and Applications, Hunan Normal University |
| Re-Bing Wu | Department of Automation, Tsinghua University |
| Jing Zhang | School of Automation Science and Engineering, Xi’an Jiaotong University |
| Pengtao Song | School of Automation Science and Engineering, Xi’an Jiaotong University |