Reward shaping-based deep reinforcement learning for look-ahead dispatch with rolling-horizon
The increasing penetration of renewable energy exacerbates the challenges in designing an effective and adaptable model-driven Look-ahead Dispatch (LAD) method. Recently, deep reinforcement learning (DRL) methods have shown enormous potential in developing a dispatching agent with self-learning ability, a...
| Main Authors: | Hongsheng Xu, Yungui Xu, Ke Wang, Yaping Li, Abdullah Al Ahad |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier 2025-07-01 |
| Series: | International Journal of Electrical Power & Energy Systems |
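| Subjects: | Look-ahead dispatch, Rolling-horizon, Deep reinforcement learning, Reward shaping, Soft actor-critic |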
| Online Access: | http://www.sciencedirect.com/science/article/pii/S0142061525002248 |
| _version_ | 1849321637560713216 |
|---|---|
| author | Hongsheng Xu, Yungui Xu, Ke Wang, Yaping Li, Abdullah Al Ahad |
| collection | DOAJ |
| description | The increasing penetration of renewable energy exacerbates the challenges in designing an effective and adaptable model-driven Look-ahead Dispatch (LAD) method. Recently, deep reinforcement learning (DRL) methods have shown enormous potential in developing a dispatching agent with self-learning ability, owing to their superior generalization, adaptability, and computational efficiency. However, existing DRL-based LAD methods overlook the discounting effect when calculating the immediate total reward for LAD, and they pay little attention to trial-and-error reward design and to expected discounted returns that could reflect the true performance metrics of LAD. Therefore, this paper proposes novel reward shaping (RS)-based DRL algorithms for the rolling-horizon LAD problem. We propose a method for accurately estimating the look-ahead discount factor that best matches different look-ahead horizons (LAHs). Shaped reward functions are designed, and an RS-based regularization is also proposed by employing a potential function. Case studies on the SG 126-bus and IEEE 118-bus systems demonstrate the effectiveness of the proposed improvements, as well as the superiority and adaptability of the improved DRL algorithms in training and testing performance. |
| format | Article |
| id | doaj-art-c4f953377bcc4a6bb2b786d303541b1a |
| institution | Kabale University |
| issn | 0142-0615 |
| language | English |
| publishDate | 2025-07-01 |
| publisher | Elsevier |
| record_format | Article |
| series | International Journal of Electrical Power & Energy Systems |
| spelling | doaj-art-c4f953377bcc4a6bb2b786d303541b1a; 2025-08-20T03:49:41Z; eng; Elsevier; International Journal of Electrical Power & Energy Systems; 0142-0615; 2025-07-01; 168; 110673; 10.1016/j.ijepes.2025.110673; Reward shaping-based deep reinforcement learning for look-ahead dispatch with rolling-horizon; Hongsheng Xu (School of Electrical and Power Engineering, Hohai University, Nanjing 211100, China); Yungui Xu (School of Electrical and Power Engineering, Hohai University, Nanjing 211100, China); Ke Wang (School of Electrical and Power Engineering, Hohai University, Nanjing 211100, China; corresponding author); Yaping Li (Department of Power Automation, China Electric Power Research Institute, Nanjing 210000, China); Abdullah Al Ahad (School of Electrical and Power Engineering, Hohai University, Nanjing 211100, China); http://www.sciencedirect.com/science/article/pii/S0142061525002248; Look-ahead dispatch; Rolling-horizon; Deep reinforcement learning; Reward shaping; Soft actor-critic |
| title | Reward shaping-based deep reinforcement learning for look-ahead dispatch with rolling-horizon |
| topic | Look-ahead dispatch, Rolling-horizon, Deep reinforcement learning, Reward shaping, Soft actor-critic |
| url | http://www.sciencedirect.com/science/article/pii/S0142061525002248 |
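
The description above mentions reward shaping that employs a potential function, but this record does not give the paper's actual reward or potential definitions. As a rough illustration only, the following sketches the generic potential-based shaping rule, r' = r + γ·Φ(s') − Φ(s), which is standard in the reinforcement learning literature. The names `shaped_reward` and `example_potential`, the state layout, and the choice of potential are hypothetical and are not taken from the article.

```python
from typing import Callable, Sequence

State = Sequence[float]


def shaped_reward(
    reward: float,
    state: State,
    next_state: State,
    potential: Callable[[State], float],
    gamma: float,
    done: bool = False,
) -> float:
    """Generic potential-based shaping: r + gamma * phi(s') - phi(s).

    The potential of a terminal state is treated as zero, so the discounted
    shaping terms telescope to -phi(s0) over a finished episode.
    """
    next_phi = 0.0 if done else potential(next_state)
    return reward + gamma * next_phi - potential(state)


def example_potential(state: State) -> float:
    # Hypothetical potential (an assumption, not the paper's): the negative
    # absolute value of the first state entry, read here as a power imbalance.
    return -abs(state[0])
```

For example, `shaped_reward(1.0, [2.0], [1.0], example_potential, gamma=0.9)` adds a bonus for moving toward a smaller imbalance. The discount factor `gamma` is passed in explicitly because the description stresses matching it to the look-ahead horizon; how the article estimates that value is not stated in this record.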