Model-based Policy Search for Partially Measurable Systems
Fabio Amadio, Alberto Dalla Libera, Ruggero Carli, Daniel Nikovski, Diego Romeres

TL;DR
This paper introduces MC-PILCO4PMS, a model-based reinforcement learning algorithm that effectively handles systems with unmeasurable states by incorporating state observers into the policy optimization process.
Contribution
It presents a novel GP-based MBRL algorithm explicitly modeling state observers for partially measurable systems, enhancing policy learning in such challenging environments.
Findings
Tested successfully in simulation
Effective on two real systems
Outperforms previous GP-based MBRL methods
Abstract
In this paper, we propose a Model-Based Reinforcement Learning (MBRL) algorithm for Partially Measurable Systems (PMS), i.e., systems where the state cannot be directly measured but must be estimated through proper state observers. The proposed algorithm, named Monte Carlo Probabilistic Inference for Learning COntrol for Partially Measurable Systems (MC-PILCO4PMS), relies on Gaussian Processes (GPs) to model the system dynamics and on a Monte Carlo approach to update the policy parameters. Unlike previous GP-based MBRL algorithms, MC-PILCO4PMS explicitly models the presence of state observers during policy optimization, which allows it to deal with PMS. The effectiveness of the proposed algorithm has been tested both in simulation and on two real systems.
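The idea in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a toy mass-spring-damper in which only the position is measured, a finite-difference velocity estimator as the state observer, and a hypothetical `model_step` function standing in for sampling from a learned GP dynamics model. All names, gains, and noise levels are illustrative assumptions. The point it shows is that, inside the simulated rollouts, the policy receives the observer's estimate rather than the true particle state, and the expected cost is approximated by averaging over particles.

```python
import numpy as np

DT = 0.05  # sampling time of the toy system (assumption)

def model_step(x, u, rng):
    """Hypothetical stand-in for sampling from a learned GP dynamics model.
    x: (M, 2) particles [position, velocity]; u: (M,) control inputs."""
    pos, vel = x[:, 0], x[:, 1]
    acc = -pos - 0.1 * vel + u  # toy mass-spring-damper dynamics
    mean = np.stack([pos + DT * vel, vel + DT * acc], axis=1)
    # Additive Gaussian noise mimics the model's predictive uncertainty.
    return mean + 0.01 * rng.standard_normal(x.shape)

def observer(y, y_prev):
    """Toy state observer: measured position plus finite-difference velocity."""
    return np.stack([y, (y - y_prev) / DT], axis=1)

def policy(x_hat, theta):
    """Linear feedback computed from the estimated state, not the true one."""
    return x_hat @ theta

def expected_cost(theta, n_particles=200, horizon=40, seed=0):
    """Monte Carlo estimate of the average squared position over a rollout."""
    rng = np.random.default_rng(seed)
    x = np.tile([1.0, 0.0], (n_particles, 1))
    x = x + 0.01 * rng.standard_normal(x.shape)
    y_prev = x[:, 0] + 0.005 * rng.standard_normal(n_particles)
    cost = 0.0
    for _ in range(horizon):
        # Simulate the noisy measurement and the observer for each particle.
        y = x[:, 0] + 0.005 * rng.standard_normal(n_particles)
        x_hat = observer(y, y_prev)
        u = policy(x_hat, theta)
        x = model_step(x, u, rng)
        cost += np.mean(x[:, 0] ** 2)  # average over particles
        y_prev = y
    return cost / horizon
```

Calling `expected_cost(np.array([-2.0, -0.5]))` returns a scalar cost estimate for that gain vector. The sketch only evaluates the cost; a policy update step would adjust `theta` to reduce it, and that step is omitted here.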
Taxonomy
Topics: Simulation Techniques and Applications · Reinforcement Learning in Robotics · Parallel Computing and Optimization Techniques
