Mix-Spectrum for Generalization in Visual Reinforcement Learning



Bibliographic Details
Main Authors:	Jeong Woon Lee, Hyoseok Hwang
Format:	Article
Language:	English
Published:	IEEE 2025-01-01
Series:	IEEE Access
Subjects:	Deep reinforcement learning; data augmentation; fast Fourier transforms
Online Access:	https://ieeexplore.ieee.org/document/10833629/

collection	DOAJ
description	Visual Reinforcement Learning (RL) trains agent policies directly from images, which shows potential for real-world applications. However, the limited diversity of the training environment often causes overfitting, so agents underperform in unseen environments. Image augmentation is used in visual RL to increase data diversity, but its effectiveness is limited because it can alter the semantic information of the image. We therefore introduce Mix-Spectrum, a straightforward frequency-based augmentation method that maintains the semantic consistency of the data and strengthens the agent’s focus on semantic information. The proposed method combines two existing methods: mixing the amplitudes of original and reference images, and Random Convolution. This combination retains the advantages of each method and also introduces a novel characteristic that enhances performance. The method can be integrated with any visual RL algorithm, whether off-policy or on-policy. In extensive experiments on the DMControl Generalization Benchmark (DMControl-GB) and Procgen, our method outperforms existing frequency-based, normalization-based, and image augmentation methods in zero-shot generalization. In DMControl-GB, our method improved by 35.5% over the baseline and 15.2% over the second-best method. In Procgen, it achieved improvements of 15.2% and 10.1%, respectively.
id	doaj-art-bd7ecf6e952e4511bd90a32c1a1198d1
institution	Kabale University
issn	2169-3536
spelling	doaj-art-bd7ecf6e952e4511bd90a32c1a1198d1; 2025-01-15T00:02:58Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2025-01-01; vol. 13, pp. 7939-7950; DOI 10.1109/ACCESS.2025.3526959; IEEE document 10833629; Mix-Spectrum for Generalization in Visual Reinforcement Learning; Jeong Woon Lee (https://orcid.org/0009-0006-5862-0124) and Hyoseok Hwang (https://orcid.org/0000-0003-3241-8455), both Department of Software Convergence, Kyung Hee University, Yongin, Gyeonggi, Republic of Korea; https://ieeexplore.ieee.org/document/10833629/; Deep reinforcement learning; data augmentation; fast Fourier transforms
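
The description names two building blocks, Fourier amplitude mixing and Random Convolution, but this record does not say how the paper wires them together. The following is a minimal NumPy sketch, not the authors' implementation. It assumes that the Random Convolution output serves as the reference image, and that the mixed amplitude is recombined with the phase of the original image. The function names, the mixing weight `lam`, and the 3x3 kernel size are all hypothetical choices made for illustration.

```python
import numpy as np


def mix_amplitude(image, reference, lam):
    """Blend the Fourier amplitudes of two H x W x C images.

    The phase of `image` is kept unchanged (an assumption of this sketch).
    """
    f_img = np.fft.fft2(image, axes=(0, 1))
    f_ref = np.fft.fft2(reference, axes=(0, 1))
    # Linear interpolation between the two amplitude spectra.
    amp = (1.0 - lam) * np.abs(f_img) + lam * np.abs(f_ref)
    mixed = amp * np.exp(1j * np.angle(f_img))
    return np.real(np.fft.ifft2(mixed, axes=(0, 1)))


def random_convolution(image, rng, kernel_size=3):
    """Filter an H x W x C image with a freshly sampled random kernel."""
    h, w, c = image.shape
    scale = 1.0 / np.sqrt(kernel_size * kernel_size * c)
    weights = rng.normal(0.0, scale, size=(kernel_size, kernel_size, c, c))
    pad = kernel_size // 2
    padded = np.pad(image, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    out = np.zeros(image.shape, dtype=np.float64)
    for dy in range(kernel_size):
        for dx in range(kernel_size):
            # Shifted H x W x C window times a C x C channel-mixing matrix.
            out += padded[dy:dy + h, dx:dx + w, :] @ weights[dy, dx]
    return out


def mix_spectrum_sketch(image, rng, lam=0.5):
    """Hypothetical combination: the random-conv output is the reference."""
    reference = random_convolution(image, rng)
    return mix_amplitude(image, reference, lam)


rng = np.random.default_rng(0)
frame = rng.random((8, 8, 3))  # stand-in for an RL observation
augmented = mix_spectrum_sketch(frame, rng, lam=0.5)
```

With `lam=0` the sketch returns the input unchanged, and larger values move the amplitude spectrum toward that of the randomly filtered copy while the phase stays that of the original.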
