Model-based Policy Search for Partially Measurable Systems

Fabio Amadio; Alberto Dalla Libera; Ruggero Carli; Daniel Nikovski; Diego Romeres

arXiv:2101.08740·cs.RO·January 22, 2021

TL;DR

This paper introduces MC-PILCO4PMS, a model-based reinforcement learning algorithm that effectively handles systems with unmeasurable states by incorporating state observers into the policy optimization process.

Contribution

It presents a novel GP-based MBRL algorithm explicitly modeling state observers for partially measurable systems, enhancing policy learning in such challenging environments.

Findings

01 Successful in simulation tests

02 Effective on real systems

03 Outperforms previous GP-based MBRL methods

Abstract

In this paper, we propose a Model-Based Reinforcement Learning (MBRL) algorithm for Partially Measurable Systems (PMS), i.e., systems where the state cannot be directly measured but must be estimated through suitable state observers. The proposed algorithm, named Monte Carlo Probabilistic Inference for Learning COntrol for Partially Measurable Systems (MC-PILCO4PMS), relies on Gaussian Processes (GPs) to model the system dynamics and on a Monte Carlo approach to update the policy parameters. Compared with previous GP-based MBRL algorithms, MC-PILCO4PMS explicitly models the presence of state observers during policy optimization, which allows it to handle PMS. The effectiveness of the proposed algorithm has been tested both in simulation and on two real systems.
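As a rough illustration of the idea in the abstract, the sketch below simulates a batch of particles and lets the policy act only on what an observer would provide, not on the true simulated state. It is not the authors' implementation: the position/velocity system, the finite-difference velocity observer, the noise levels, the linear policy, and the name `rollout_cost` are all assumptions made for this example, and a hand-written linear model with sampled Gaussian noise stands in for the GP predictive distribution. It only evaluates a Monte Carlo cost estimate; a policy-search method would additionally optimize the policy parameters against such an estimate.

```python
import numpy as np

def rollout_cost(policy_gain, n_particles=200, horizon=30, dt=0.05, seed=0):
    """Monte Carlo estimate of the mean squared position for a toy system.

    Hypothetical setup: position is measured with noise, velocity is not
    measured and is estimated by a finite-difference observer.
    """
    rng = np.random.default_rng(seed)
    # True simulated state of each particle: column 0 = position, 1 = velocity.
    x = np.tile(np.array([1.0, 0.0]), (n_particles, 1))
    prev_meas = x[:, 0] + 0.01 * rng.standard_normal(n_particles)
    total = 0.0
    for _ in range(horizon):
        # Measurement model: noisy reading of the position only.
        meas = x[:, 0] + 0.01 * rng.standard_normal(n_particles)
        # Observer model: finite-difference velocity estimate (an assumption).
        vel_est = (meas - prev_meas) / dt
        prev_meas = meas
        # The policy sees the measurement and the estimate, never the true state.
        u = -(policy_gain[0] * meas + policy_gain[1] * vel_est)
        # Stand-in for a GP prediction: a mean transition plus sampled noise.
        mean_next = x + dt * np.stack([x[:, 1], u], axis=1)
        x = mean_next + 0.005 * rng.standard_normal(x.shape)
        total += np.mean(x[:, 0] ** 2)
    return total / horizon
```

Calling `rollout_cost(np.array([2.0, 1.0]))` and `rollout_cost(np.array([0.0, 0.0]))` compares a stabilizing gain against no control. Because the observer's noise and delay are inside the simulated loop, a gain that looks good with perfect state access can score worse here, which is the effect that modeling the observer during policy optimization is meant to capture.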

Taxonomy

Topics: Simulation Techniques and Applications · Reinforcement Learning in Robotics · Parallel Computing and Optimization Techniques