Mix-Spectrum for Generalization in Visual Reinforcement Learning
Visual Reinforcement Learning (RL) trains agent policies from images and shows potential for real-world applications. However, the limited diversity of training environments often leads to overfitting, and agents underperform in unseen environments. To address this issue, image augmen...
Main Authors: | Jeong Woon Lee, Hyoseok Hwang |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2025-01-01 |
Series: | IEEE Access |
Subjects: | Deep reinforcement learning; data augmentation; fast Fourier transforms |
Online Access: | https://ieeexplore.ieee.org/document/10833629/ |
_version_ | 1841536204610207744 |
---|---|
author | Jeong Woon Lee; Hyoseok Hwang |
collection | DOAJ |
description | Visual Reinforcement Learning (RL) trains agent policies from images and shows potential for real-world applications. However, the limited diversity of training environments often leads to overfitting, and agents underperform in unseen environments. To address this issue, image augmentation is used in visual RL to increase data diversity, but its effectiveness is limited because it can alter the semantic information of the image. Therefore, we introduce Mix-Spectrum, a straightforward yet highly effective frequency-based augmentation method that maintains the semantic consistency of data and enhances the agent’s focus on semantic information. The proposed method combines two existing methods: mixing the amplitudes of original and reference images, and Random Convolution. Through this combination of established methods, our approach not only retains the advantages of each method but also introduces a novel characteristic that enhances performance. Furthermore, the proposed method can be integrated with any visual RL algorithm, whether off-policy or on-policy. In extensive experiments on the DMControl Generalization Benchmark (DMControl-GB) and Procgen, our method outperforms existing frequency-based, normalization-based, and image augmentation methods in zero-shot generalization. In DMControl-GB, our method improved by 35.5% over the baseline and 15.2% over the second-best method. In Procgen, it achieved improvements of 15.2% and 10.1%, respectively. |
format | Article |
id | doaj-art-bd7ecf6e952e4511bd90a32c1a1198d1 |
institution | Kabale University |
issn | 2169-3536 |
language | English |
publishDate | 2025-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj-art-bd7ecf6e952e4511bd90a32c1a1198d1; 2025-01-15T00:02:58Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2025-01-01; vol. 13, pp. 7939-7950; DOI 10.1109/ACCESS.2025.3526959; 10833629; Mix-Spectrum for Generalization in Visual Reinforcement Learning; Jeong Woon Lee (https://orcid.org/0009-0006-5862-0124), Hyoseok Hwang (https://orcid.org/0000-0003-3241-8455); Department of Software Convergence, Kyung Hee University, Yongin, Gyeonggi, Republic of Korea (both authors); https://ieeexplore.ieee.org/document/10833629/; Deep reinforcement learning; data augmentation; fast Fourier transforms |
title | Mix-Spectrum for Generalization in Visual Reinforcement Learning |
topic | Deep reinforcement learning; data augmentation; fast Fourier transforms |
url | https://ieeexplore.ieee.org/document/10833629/ |
work_keys_str_mv | AT jeongwoonlee mixspectrumforgeneralizationinvisualreinforcementlearning AT hyoseokhwang mixspectrumforgeneralizationinvisualreinforcementlearning |
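
The description names two ingredients: mixing the Fourier amplitudes of an original and a reference image, and Random Convolution. The following NumPy sketch illustrates both under stated assumptions; it is not the authors' implementation. In particular, the record does not say how the reference image is obtained or how the mixing weight is sampled, so the sketch assumes the reference is a randomly convolved copy of the original and draws the weight uniformly. The function names (`random_conv`, `mix_amplitude`, `mix_spectrum_sketch`) and the parameter `max_lam` are hypothetical.

```python
import numpy as np


def random_conv(img, rng, k=3):
    """Filter an (H, W, C) image with one randomly drawn k x k kernel.

    A simple stand-in for Random Convolution: the kernel mixes all input
    channels into all output channels.
    """
    h, w, c = img.shape
    # Random weights with shape (k, k, C_in, C_out), scaled to keep values moderate.
    weight = rng.normal(0.0, 1.0 / (k * np.sqrt(c)), size=(k, k, c, c))
    pad = k // 2
    padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    out = np.zeros((h, w, c), dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            # Shifted view of the image times the channel-mixing matrix for this tap.
            out += padded[dy:dy + h, dx:dx + w, :] @ weight[dy, dx]
    return out


def mix_amplitude(img, ref, lam):
    """Blend the Fourier amplitudes of img and ref; keep the phase of img."""
    fx = np.fft.fft2(img, axes=(0, 1))
    fr = np.fft.fft2(ref, axes=(0, 1))
    amp = (1.0 - lam) * np.abs(fx) + lam * np.abs(fr)
    mixed = amp * np.exp(1j * np.angle(fx))
    # For real inputs the mixed spectrum is Hermitian, so the inverse is real.
    return np.real(np.fft.ifft2(mixed, axes=(0, 1)))


def mix_spectrum_sketch(img, rng, max_lam=0.5):
    """ASSUMPTION: the reference is a randomly convolved copy of img."""
    ref = random_conv(img, rng)
    lam = rng.uniform(0.0, max_lam)  # ASSUMPTION: uniform mixing weight
    return np.clip(mix_amplitude(img, ref, lam), 0.0, 1.0)


rng = np.random.default_rng(0)
image = rng.random((16, 16, 3))  # toy observation with values in [0, 1)
augmented = mix_spectrum_sketch(image, rng)
```

Keeping the phase of the original image is the reason such a scheme can change appearance (color, texture) while leaving spatial layout largely intact, since the phase spectrum carries most of the structural information.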