Exploration-Driven Genetic Algorithms for Hyperparameter Optimisation in Deep Reinforcement Learning
| Main Authors: | , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-02-01 |
| Series: | Applied Sciences |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2076-3417/15/4/2067 |
| Summary: | This paper investigates the application of genetic algorithms (GAs) for hyperparameter optimisation in deep reinforcement learning (RL), focusing on the Deep Q-Learning (DQN) algorithm. The study aims to identify approaches that enhance RL model performance through effective exploration of the configuration space, comparing different GA selection, crossover, and mutation methods on deep RL models. The results indicate that GA techniques emphasising exploration of the configuration space yield significant improvements in optimisation efficiency, reducing training time and enhancing convergence. The most effective GA improved the fitness function value from 68.26 (initial best chromosome) to 979.16 after 200 iterations, demonstrating the efficacy of the proposed approach. Furthermore, variations in specific hyperparameters, such as the learning rate, gamma, and update frequency, were shown to substantially affect the DQN model’s learning ability. These findings suggest that exploration-driven GA strategies outperform GA approaches with limited exploration, underscoring the critical role of selection and crossover methods in enhancing DQN model efficiency and performance. Moreover, a mini case study on the CartPole environment revealed that even a 5% sensor dropout impaired the performance of a GA-optimised RL agent, while a 20% dropout almost entirely halted improvements. |
| ISSN: | 2076-3417 |
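
To make the summarised method concrete, the sketch below shows the kind of GA the abstract describes: tournament selection, uniform crossover, and an exploration-heavy mutation evolving DQN hyperparameters (learning rate, gamma, update frequency). This is not the authors' implementation; the search-space bounds and the `fitness()` placeholder are assumptions made for illustration. In the actual study, fitness would be the return of a DQN agent trained with the candidate hyperparameters.

```python
# Minimal sketch of an exploration-driven GA over DQN hyperparameters.
# NOT the paper's implementation: the search-space bounds and the fitness()
# placeholder below are assumptions made for illustration only.
import random

SEARCH_SPACE = {
    "learning_rate": (1e-5, 1e-2),   # assumed bounds
    "gamma": (0.90, 0.999),          # assumed bounds
    "update_frequency": (1, 50),     # assumed bounds (target-net update steps)
}

def sample(key):
    # Draw a gene uniformly from its range; integer ranges stay integer.
    lo, hi = SEARCH_SPACE[key]
    value = random.uniform(lo, hi)
    return round(value) if isinstance(lo, int) else value

def random_chromosome():
    return {key: sample(key) for key in SEARCH_SPACE}

def fitness(chrom):
    # Placeholder only. In the paper's setting this would be the episode
    # return of a DQN agent trained with these hyperparameter values.
    return -abs(chrom["gamma"] - 0.99) - 100.0 * abs(chrom["learning_rate"] - 1e-3)

def tournament_select(population, scores, k=3):
    # Pick k random contenders and return the fittest one.
    contenders = random.sample(range(len(population)), k)
    return population[max(contenders, key=lambda i: scores[i])]

def uniform_crossover(a, b):
    # Each gene is inherited from either parent with equal probability.
    return {key: random.choice((a[key], b[key])) for key in a}

def mutate(chrom, rate=0.2):
    # Exploration-heavy mutation: resample a gene uniformly from its range.
    return {key: sample(key) if random.random() < rate else value
            for key, value in chrom.items()}

def evolve(pop_size=20, generations=200):
    population = [random_chromosome() for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        scores = [fitness(c) for c in population]
        children = [mutate(uniform_crossover(tournament_select(population, scores),
                                             tournament_select(population, scores)))
                    for _ in range(pop_size - 1)]
        population = children + [best]  # elitism: the best chromosome survives
        best = max(population, key=fitness)
    return best

print(evolve())
```

The sensor-dropout case study can be sketched in the same spirit; the 5% and 20% figures come from the abstract, while the zero-masking model of a dropped sensor is an assumption about how dropout was applied.

```python
import random

def drop_sensors(observation, p=0.05):
    # Hypothetical dropout model: each observation component is zeroed
    # independently with probability p (0.05 or 0.20 in the abstract).
    return [0.0 if random.random() < p else x for x in observation]
```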