Online Weighted Q-Ensembles for Reduced Hyperparameter Tuning in Reinforcement Learning
Renata Garcia, Wouter Caarls

TL;DR
This paper introduces an online weighted Q-ensemble method for reinforcement learning that reduces hyperparameter tuning effort and improves performance on real robotic systems without requiring multiple simulators.
Contribution
The work presents a novel online weighted Q-ensemble approach that reduces the hyperparameter tuning effort in reinforcement learning for robotics, outperforming traditional ensemble strategies.
Findings
Lower variance in results compared to Q-average ensembles
Superior performance on robotic benchmarks
Reduces the need for extensive hyperparameter tuning
Abstract
Reinforcement learning is a promising paradigm for learning robot control, allowing complex control policies to be learned without requiring a dynamics model. However, even state-of-the-art algorithms can be difficult to tune for optimum performance. We propose employing an ensemble of multiple reinforcement learning agents, each with a different set of hyperparameters, along with a mechanism for choosing the best performing set(s) on-line. In the literature, the ensemble technique is used to improve performance in general, but the current work specifically addresses decreasing the hyperparameter tuning effort. Furthermore, our approach targets on-line learning on a single robotic system, and does not require running multiple simulators in parallel. Although the idea is generic, the Deep Deterministic Policy Gradient was the model chosen, being a representative deep learning…
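The abstract describes an ensemble whose members are weighted on-line according to how well they perform. The sketch below illustrates that general idea only and is not the authors' algorithm: the class name `WeightedQEnsemble`, the softmax-over-negative-TD-error weighting rule, and the parameters `lr` and `temperature` are all assumptions made for illustration. Each member is assumed to supply a Q-value for every candidate action, and the ensemble picks the action with the highest weighted Q-value.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


class WeightedQEnsemble:
    """Hypothetical on-line weighting of an ensemble of critics.

    Assumption (not taken from the paper): each member's weight is a
    softmax over the negative running mean of its squared TD error,
    so members that predict returns poorly lose influence over time.
    """

    def __init__(self, n_members, lr=0.1, temperature=1.0):
        self.err = np.zeros(n_members)  # running mean squared TD error
        self.lr = lr                    # smoothing rate for the running mean
        self.temperature = temperature  # lower values sharpen the weights

    @property
    def weights(self):
        return softmax(-self.err / self.temperature)

    def select_action(self, q_values):
        """q_values[i, j] is member i's Q-value for candidate action j."""
        combined = self.weights @ np.asarray(q_values, dtype=float)
        return int(np.argmax(combined))

    def update(self, td_errors):
        """Fold the latest per-member TD errors into the running mean."""
        sq = np.square(np.asarray(td_errors, dtype=float))
        self.err += self.lr * (sq - self.err)


# Usage: three members, two candidate actions.
ensemble = WeightedQEnsemble(n_members=3)
q = [[1.0, 0.0], [0.0, 0.4], [0.0, 0.4]]
first_choice = ensemble.select_action(q)   # uniform weights at the start
for _ in range(50):
    ensemble.update([5.0, 0.1, 0.1])       # member 0 keeps predicting badly
later_choice = ensemble.select_action(q)   # member 0 now has little say
```

Because only the weights change, every member can keep learning from the same stream of transitions on a single system, which matches the abstract's point about not needing parallel simulators. The concrete weight update would have to be taken from the paper itself.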
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Model Reduction and Neural Networks
