A Framework for History-Aware Hyperparameter Optimisation in Reinforcement Learning

Juan Marcelo Parra-Ullauri; Chen Zhen; Antonio García-Domínguez; Nelly Bencomo; Changgang Zheng; Juan Boubeta-Puig; Guadalupe Ortiz; Shufan Yang

arXiv:2303.05186·cs.LG·March 10, 2023·1 cites

Open Access

TL;DR

This paper introduces a history-aware hyperparameter optimization framework for reinforcement learning that adaptively adjusts parameters during training, leading to improved stability and performance in complex systems.

Contribution

It presents a novel framework combining complex event processing and temporal models to enable runtime hyperparameter adjustments based on historical performance analysis.

Findings

01. Enhanced training stability and reward values with history-aware tuning

02. Significant performance improvements over traditional hyperparameter tuning methods

03. Effective resource utilization through parallel processing

Abstract

A Reinforcement Learning (RL) system depends on a set of initial conditions (hyperparameters) that affect the system's performance. However, defining a good choice of hyperparameters is a challenging problem. Hyperparameter tuning often requires manual or automated searches to find optimal values. Nonetheless, a noticeable limitation is the high cost of algorithm evaluation for complex models, making the tuning process computationally expensive and time-consuming. In this paper, we propose a framework based on integrating complex event processing and temporal models, to alleviate these trade-offs. Through this combination, it is possible to gain insights about a running RL system efficiently and unobtrusively based on data stream monitoring and to create abstract representations that allow reasoning about the historical behaviour of the RL system. The obtained knowledge is exploited…
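The idea in the abstract, monitoring a stream of training data and then reasoning over a stored history to adjust hyperparameters at runtime, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `RewardWindow` class is only a stand-in for a complex event processing pattern, the `history` list is a stand-in for a temporal model, and the choice of the exploration rate `epsilon` as the tuned hyperparameter, the adjustment rule, and all default values are assumptions made for illustration.

```python
from collections import deque
from statistics import mean


class RewardWindow:
    """Fixed-size window over per-episode rewards (hypothetical CEP stand-in)."""

    def __init__(self, size):
        self.size = size
        self.values = deque(maxlen=size)

    def push(self, reward):
        self.values.append(reward)

    def full(self):
        return len(self.values) == self.size

    def average(self):
        return mean(self.values)


def adjust_epsilon(epsilon, history, min_gain=0.0, step=0.05, lo=0.01, hi=1.0):
    """Return a new exploration rate from the history of window averages.

    Assumed rule: if the latest window average did not improve on the
    previous one by more than min_gain, explore more; otherwise explore less.
    """
    if len(history) < 2:
        return epsilon  # not enough history to compare yet
    gain = history[-1] - history[-2]
    if gain <= min_gain:
        return min(hi, epsilon + step)
    return max(lo, epsilon - step)


def run(rewards, window_size=3, epsilon=0.5):
    """Consume a reward stream and adapt epsilon after each completed window."""
    window = RewardWindow(window_size)
    history = []  # hypothetical temporal model: one average per window
    for reward in rewards:
        window.push(reward)
        if window.full():
            history.append(window.average())
            epsilon = adjust_epsilon(epsilon, history)
            window.values.clear()  # start the next non-overlapping window
    return epsilon, history


if __name__ == "__main__":
    final_epsilon, averages = run([1, 1, 1, 2, 2, 2, 2, 2, 2])
    print(final_epsilon, averages)
```

In this sketch the agent's training loop is replaced by a plain list of rewards; in a real system the stream would come from the running RL agent, and the adjustment would be fed back into it between episodes.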

Taxonomy

Topics: Evolutionary Algorithms and Applications · Data Stream Mining Techniques · Metaheuristic Optimization Algorithms Research

Methods: Convolution · Dense Connections · Q-Learning · Deep Q-Network