HyperQ-Opt: Q-learning for Hyperparameter Optimization
Md. Tarek Hasan

TL;DR
This paper applies Q-learning, a reinforcement learning technique, to hyperparameter optimization, aiming for search strategies that are more efficient and scalable than traditional methods.
Contribution
It formulates HPO as a Markov Decision Process (MDP) and applies Q-learning, offering a novel value-based framework that outperforms conventional search techniques.
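For reference, the standard tabular Q-learning update is reproduced below. The update rule itself is textbook material; reading the state $s_t$ as the current hyperparameter configuration (or search history), the action $a_t$ as the choice of the next configuration, and the reward $r_{t+1}$ as a validation-score signal is an assumption made here for illustration, not a formulation taken from the paper.

```latex
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]
```

Here $\alpha$ is the learning rate of the Q-update (not a hyperparameter of the model being tuned) and $\gamma$ is the discount factor.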
Findings
Q-learning can effectively optimize hyperparameters within a limited number of trials
Reinforcement learning approaches outperform traditional methods in efficiency
Identifies research gaps and future directions in HPO methods
Abstract
Hyperparameter optimization (HPO) is critical for enhancing the performance of machine learning models, yet it often involves a computationally intensive search across a large parameter space. Traditional approaches such as Grid Search and Random Search suffer from inefficiency and limited scalability, while surrogate models like Sequential Model-based Bayesian Optimization (SMBO) rely heavily on heuristic predictions that can lead to suboptimal results. This paper presents a novel perspective on HPO by formulating it as a sequential decision-making problem and leveraging Q-learning, a reinforcement learning technique, to optimize hyperparameters. The study explores the works of H.S. Jomaa et al. and Qi et al., which model HPO as a Markov Decision Process (MDP) and utilize Q-learning to iteratively refine hyperparameter settings. The approaches are evaluated for their ability to find…
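The idea of treating HPO as a sequential decision problem can be sketched with tabular Q-learning on a toy grid. This is a minimal illustration under stated assumptions, not the method of Jomaa et al. or Qi et al.: the search space, the move-one-step action set, the improvement-based reward, and the synthetic `toy_score` objective are all hypothetical stand-ins chosen to keep the example self-contained.

```python
import random

# Hypothetical discrete search space (illustrative values only).
LEARNING_RATES = [0.001, 0.003, 0.01, 0.03, 0.1]
DEPTHS = [2, 4, 6, 8]

# Assumed action set: move one step along one axis, or stay put.
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1), (0, 0)]


def toy_score(state):
    """Stand-in for a validation score; a real run would train a model here."""
    lr_idx, depth_idx = state
    # Synthetic objective peaked at lr index 2 (0.01) and depth index 2 (6).
    return 1.0 - 0.1 * abs(lr_idx - 2) - 0.1 * abs(depth_idx - 2)


def step(state, action):
    """Apply an action, clamping indices to the grid boundaries."""
    lr_idx = min(max(state[0] + action[0], 0), len(LEARNING_RATES) - 1)
    depth_idx = min(max(state[1] + action[1], 0), len(DEPTHS) - 1)
    return (lr_idx, depth_idx)


def q_learning_hpo(episodes=200, steps_per_episode=6, alpha=0.5, gamma=0.9,
                   epsilon=0.2, seed=0):
    """Run tabular Q-learning over the grid and track the best state seen."""
    rng = random.Random(seed)
    q = {}  # maps (state, action_index) -> estimated return
    best_state, best_score = None, float("-inf")

    for _ in range(episodes):
        # Each episode starts from a random configuration.
        state = (rng.randrange(len(LEARNING_RATES)), rng.randrange(len(DEPTHS)))
        if toy_score(state) > best_score:
            best_state, best_score = state, toy_score(state)
        for _ in range(steps_per_episode):
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q.get((state, i), 0.0))
            next_state = step(state, ACTIONS[a])
            # Assumed reward: change in score caused by the move.
            reward = toy_score(next_state) - toy_score(state)
            best_next = max(q.get((next_state, i), 0.0) for i in range(len(ACTIONS)))
            old = q.get((state, a), 0.0)
            # Standard Q-learning update.
            q[(state, a)] = old + alpha * (reward + gamma * best_next - old)
            state = next_state
            if toy_score(state) > best_score:
                best_state, best_score = state, toy_score(state)
    return best_state, best_score, q


if __name__ == "__main__":
    state, score, _ = q_learning_hpo()
    print("best lr:", LEARNING_RATES[state[0]], "best depth:", DEPTHS[state[1]])
```

In a realistic setting each call to `toy_score` would correspond to a full training run, so scores would normally be cached per configuration and the number of episodes kept far smaller than in this toy.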
Taxonomy
Topics: Machine Learning and Data Classification · Advanced Fiber Optic Sensors · Metaheuristic Optimization Algorithms Research
Methods: Q-Learning · Random Search · Hyper-parameter optimization
