Fast deep reinforcement learning anti-jamming algorithm based on similar sample generation
To improve the learning efficiency of anti-jamming algorithms based on deep reinforcement learning and enable them to adapt more quickly to unknown jamming environments, a fast deep reinforcement learning anti-jamming algorithm based on similar sample generation was proposed. By combining the simila...
Main Authors: | ZHOU Quan, NIU Yingtao |
---|---|
Format: | Article |
Language: | zho |
Published: | Editorial Department of Journal on Communications, 2024-07-01 |
Series: | Tongxin xuebao |
Subjects: | communication anti-jamming; deep reinforcement learning; fast anti-jamming; reliable communication |
Online Access: | http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2024131/ |
_version_ | 1841539178126376960 |
---|---|
author | ZHOU Quan NIU Yingtao |
author_facet | ZHOU Quan NIU Yingtao |
author_sort | ZHOU Quan |
collection | DOAJ |
description | To improve the learning efficiency of anti-jamming algorithms based on deep reinforcement learning and enable them to adapt more quickly to unknown jamming environments, a fast deep reinforcement learning anti-jamming algorithm based on similar sample generation was proposed. By combining the similarity measurement of state-action pairs, derived from bisimulation, with an anti-jamming algorithm grounded in the deep Q-network, this algorithm was able to quickly learn effective multi-domain anti-jamming strategies in unknown, dynamic jamming environments. Specifically, once a transmission action was completed, the proposed algorithm first interacted with the environment using the deep Q-network to acquire actual state-action pairs. It then generated a set of similar state-action pairs based on bisimulation, employing these similar state-action pairs to produce simulated training samples. Through these operations, the algorithm was able to acquire a large number of training samples at each iteration step, thereby significantly accelerating the training process and convergence speed. Simulation results show that under comb sweep jamming and intelligent blocking jamming, the proposed algorithm converges rapidly, and its normalized throughput after convergence is significantly higher than that of the conventional deep Q-network algorithm, the Q-learning algorithm, and the improved Q-learning algorithm based on knowledge reuse. |
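The description above outlines a sample-augmentation loop: after each real transition, similar state-action pairs are generated and turned into simulated training samples for the replay buffer. The following is a minimal, hypothetical sketch of that idea only; the paper's actual bisimulation metric and deep Q-network are not reproduced here, and the channel model, `similar_pairs`, and reward function are invented for illustration.

```python
import random
from collections import deque

# Toy channel-hopping model (hypothetical): states and actions are channel
# indices 0..N_CHANNELS-1; the state is the currently jammed channel.
N_CHANNELS = 10

def reward(jammed_channel, action_channel):
    """Reward 1.0 if the chosen transmit channel avoids the jammer, else 0.0."""
    return 1.0 if action_channel != jammed_channel else 0.0

def similar_pairs(state, action, radius=1):
    """Generate state-action pairs whose states lie within `radius` of the
    observed state -- a crude stand-in for the bisimulation-based
    similarity measurement described in the abstract."""
    pairs = []
    for s in range(max(0, state - radius), min(N_CHANNELS, state + radius + 1)):
        if s != state:
            pairs.append((s, action))
    return pairs

def augment_replay(buffer, transition, radius=1):
    """Store the real transition, then add simulated transitions built from
    similar state-action pairs, so each interaction yields many samples."""
    s, a, r, s_next = transition
    buffer.append((s, a, r, s_next))
    for sim_s, sim_a in similar_pairs(s, a, radius):
        sim_r = reward(sim_s, sim_a)  # reward recomputed for the similar pair
        buffer.append((sim_s, sim_a, sim_r, s_next))
    return buffer

buffer = deque(maxlen=1000)
augment_replay(buffer, (3, 5, 1.0, 4), radius=1)
print(len(buffer))  # → 3: one real transition plus two simulated ones
```

In a full implementation, the augmented buffer would then be sampled for DQN gradient updates, which is where the claimed speed-up in convergence would come from.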
format | Article |
id | doaj-art-257c19f3f5334b019d7a1d3d4a98dfa7 |
institution | Kabale University |
issn | 1000-436X |
language | zho |
publishDate | 2024-07-01 |
publisher | Editorial Department of Journal on Communications |
record_format | Article |
series | Tongxin xuebao |
spellingShingle | ZHOU Quan NIU Yingtao Fast deep reinforcement learning anti-jamming algorithm based on similar sample generation Tongxin xuebao communication anti-jamming deep reinforcement learning fast anti-jamming reliable communication |
title | Fast deep reinforcement learning anti-jamming algorithm based on similar sample generation |
title_full | Fast deep reinforcement learning anti-jamming algorithm based on similar sample generation |
title_fullStr | Fast deep reinforcement learning anti-jamming algorithm based on similar sample generation |
title_full_unstemmed | Fast deep reinforcement learning anti-jamming algorithm based on similar sample generation |
title_short | Fast deep reinforcement learning anti-jamming algorithm based on similar sample generation |
title_sort | fast deep reinforcement learning anti jamming algorithm based on similar sample generation |
topic | communication anti-jamming; deep reinforcement learning; fast anti-jamming; reliable communication |
url | http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2024131/ |
work_keys_str_mv | AT zhouquan fastdeepreinforcementlearningantijammingalgorithmbasedonsimilarsamplegeneration AT niuyingtao fastdeepreinforcementlearningantijammingalgorithmbasedonsimilarsamplegeneration |