Automatic tuning of hyper-parameters of reinforcement learning algorithms using Bayesian optimization with behavioral cloning
Juan Cruz Barsce, Jorge A. Palombarini, Ernesto C. Martínez

TL;DR
This paper introduces a novel Bayesian optimization approach with behavioral cloning for automatic hyper-parameter tuning in reinforcement learning, reducing the need for manual tuning and improving learning efficiency.
Contribution
It proposes a new meta-learning method that uses behavioral cloning to enhance Bayesian optimization for hyper-parameter tuning in RL, making algorithms more user-independent.
Findings
Reduces the number of state transitions needed for convergence.
Outperforms manual tuning and other optimization methods.
Improves data efficiency and learning speed in RL tasks.
Abstract
Optimal setting of several hyper-parameters in machine learning algorithms is key to making the most of available data. To this end, several methods such as evolutionary strategies, random search, Bayesian optimization and heuristic rules of thumb have been proposed. In reinforcement learning (RL), the information content of data gathered by the learning agent while interacting with its environment is heavily dependent on the setting of many hyper-parameters. Therefore, the user of an RL algorithm has to rely on search-based optimization methods, such as grid search or the Nelder-Mead simplex algorithm, which are very inefficient for most RL tasks, significantly slow down the learning curve and leave the user with the burden of purposefully biasing data gathering. In this work, in order to make an RL algorithm more user-independent, a novel approach for autonomous hyper-parameter setting…
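To make the cost of search-based tuning concrete, the following is a minimal sketch of the grid-search baseline the abstract mentions. It is not the paper's method. The chain environment, the tabular Q-learning agent, the candidate values, and all names (`run_q_learning`, `grid_search`, `N_STATES`, `MAX_STEPS`) are assumptions made for illustration. The objective counts state transitions consumed during learning, so lower is better.

```python
import itertools
import random

N_STATES = 6          # hypothetical chain: states 0..5, state 5 is the goal
ACTIONS = (-1, +1)    # move left or move right
MAX_STEPS = 100       # per-episode cap on state transitions


def run_q_learning(alpha, epsilon, gamma=0.95, episodes=30, seed=0):
    """Run tabular Q-learning and return the total state transitions used."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    transitions = 0
    for _ in range(episodes):
        state = 0
        for _ in range(MAX_STEPS):
            # Epsilon-greedy selection; ties between actions are broken at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            nxt = min(max(state + ACTIONS[a], 0), N_STATES - 1)
            done = nxt == N_STATES - 1
            reward = 1.0 if done else 0.0
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][a] += alpha * (target - q[state][a])
            transitions += 1
            state = nxt
            if done:
                break
    return transitions


def grid_search(alphas, epsilons, seeds=(0, 1, 2)):
    """Evaluate every (alpha, epsilon) pair, averaging the cost over seeds."""
    results = {}
    for alpha, epsilon in itertools.product(alphas, epsilons):
        costs = [run_q_learning(alpha, epsilon, seed=s) for s in seeds]
        results[(alpha, epsilon)] = sum(costs) / len(costs)
    best = min(results, key=results.get)
    return best, results


best, results = grid_search([0.1, 0.5, 0.9], [0.05, 0.2, 0.5])
for (alpha, epsilon), cost in sorted(results.items(), key=lambda kv: kv[1]):
    print(f"alpha={alpha:.1f} epsilon={epsilon:.2f} mean transitions={cost:.1f}")
print("best (alpha, epsilon):", best)
```

Every grid point here costs a full set of learning runs, and the number of points grows multiplicatively with each added hyper-parameter. That is the inefficiency that motivates model-based alternatives such as Bayesian optimization.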
Taxonomy
Topics: Evolutionary Algorithms and Applications · Metaheuristic Optimization Algorithms Research · Advanced Bandit Algorithms Research
