Soft Switching Expert Policies for Controlling Systems with Uncertain Parameters
Junya Ikemoto

TL;DR
This paper introduces a two-stage reinforcement learning method that learns multiple control policies in simulation and adaptively switches among them in real systems to handle uncertain parameters and reduce the reality gap.
Contribution
It presents a novel simulation-based reinforcement learning algorithm with adaptive policy switching to better control systems with uncertain parameters.
Findings
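Is expected to reduce learning complexity compared with approaches that rely on a single policy to address the reality gap.
Adaptively switches among the learned control policies on the real system, based on observations, to cope with uncertain and varying parameters.
Uses an online convex optimization algorithm to perform the policy switching.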
Abstract
This paper proposes a simulation-based reinforcement learning algorithm for controlling systems with uncertain and varying system parameters. While simulators are useful for safely learning control policies, the reality gap remains a major challenge. To alleviate this challenge, we propose a two-stage algorithm. First, multiple control policies are learned for systems with different system parameters in a simulator. Second, for a real system, the control policies are adaptively switched using an online convex optimization algorithm based on observations. This approach is expected to reduce learning complexity compared with existing approaches that rely on a single policy to address the reality gap.
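The second stage described above can be illustrated with a minimal sketch. The abstract does not say which online convex optimization algorithm is used, so this example assumes an exponentiated-gradient (Hedge) update on the probability simplex. The scalar plant x_{t+1} = a x_t + b u_t, the deadbeat "expert" gains, the squared one-step prediction-error loss, and all names (`make_expert`, `hedge_update`, `run`) are hypothetical choices made for illustration, not the paper's method. The experts here are hand-designed stand-ins for policies that would be learned in a simulator in the first stage.

```python
import math

def make_expert(a_k, b=1.0):
    """Hypothetical expert: deadbeat gain designed for nominal parameter a_k."""
    return lambda x: -(a_k / b) * x

def hedge_update(weights, losses, eta):
    """One exponentiated-gradient (Hedge) step; the result stays on the simplex."""
    scaled = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    total = sum(scaled)
    return [s / total for s in scaled]

def run(a_true=1.3, candidates=(0.5, 0.9, 1.3), b=1.0, eta=5.0, steps=50, x0=1.0):
    """Soft switching: apply the weight-averaged action, then reweight experts.

    a_true is the (assumed unknown) parameter of the real system; candidates
    are the parameters the experts were designed for.
    """
    experts = [make_expert(a, b) for a in candidates]
    weights = [1.0 / len(candidates)] * len(candidates)
    x = x0
    for _ in range(steps):
        # Soft switch: convex combination of the experts' actions.
        u = sum(w * pi(x) for w, pi in zip(weights, experts))
        x_next = a_true * x + b * u  # observation from the real system
        # Loss of expert k: squared one-step prediction error of its model.
        losses = [(x_next - (a * x + b * u)) ** 2 for a in candidates]
        weights = hedge_update(weights, losses, eta)
        x = x_next
    return weights, x

if __name__ == "__main__":
    final_weights, final_state = run()
    print("weights:", [round(w, 3) for w in final_weights])
    print("final state:", final_state)
```

In this sketch the weight on the expert whose nominal parameter matches `a_true` grows while the state is away from zero. Once the state has been regulated to zero the prediction errors vanish and the weights stop changing, so a setup with time-varying parameters would need some persistent excitation or a different loss.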
