PAC-Bayes Control: Learning Policies that Provably Generalize to Novel Environments

Anirudha Majumdar; Alec Farid; Anoopkumar Sonar

arXiv:1806.04225·cs.RO·August 27, 2020

TL;DR

This paper introduces a PAC-Bayes framework for learning control policies that provably generalize to novel environments. The learning algorithms minimize a high-probability upper bound on expected cost, using convex optimization or stochastic gradient descent, and are demonstrated on robotic tasks.

Contribution

The paper presents a PAC-Bayes approach to control policy generalization, with learning algorithms for both finite and continuous policy spaces, theoretical generalization guarantees, and applications to robotics.

Findings

01 Successful simulation of obstacle avoidance and grasping policies.

02 Hardware validation on a drone navigating through obstacles.

03 Strong generalization guarantees for neural network policies in robotics.

Abstract

Our goal is to learn control policies for robots that provably generalize well to novel environments given a dataset of example environments. The key technical idea behind our approach is to leverage tools from generalization theory in machine learning by exploiting a precise analogy (which we present in the form of a reduction) between generalization of control policies to novel environments and generalization of hypotheses in the supervised learning setting. In particular, we utilize the Probably Approximately Correct (PAC)-Bayes framework, which allows us to obtain upper bounds that hold with high probability on the expected cost of (stochastic) control policies across novel environments. We propose policy learning algorithms that explicitly seek to minimize this upper bound. The corresponding optimization problem can be solved using convex optimization (Relative Entropy Programming…
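
As a rough illustration of the kind of upper bound the abstract describes, the sketch below evaluates a PAC-Bayes bound for a finite policy space. It assumes the square-root (McAllester/Maurer-style) form of the bound, costs normalized to [0, 1], and N training environments drawn i.i.d. from the same distribution as the novel ones. The function names, the example numbers, and the choice of this particular form are illustrative assumptions, not details taken from this page or from the authors' repository.

```python
import math


def kl_divergence(posterior, prior):
    """KL(posterior || prior) for discrete distributions over a finite policy set."""
    return sum(
        p * math.log(p / q)
        for p, q in zip(posterior, prior)
        if p > 0.0  # terms with zero posterior mass contribute nothing
    )


def pac_bayes_bound(posterior, prior, avg_costs, num_envs, delta):
    """Upper bound on expected cost in novel environments (assumed sqrt form).

    posterior, prior: probabilities over the finite set of policies.
    avg_costs[i]: average cost of policy i over the training environments,
                  assumed to lie in [0, 1].
    num_envs: number of training environments N.
    delta: the bound is meant to hold with probability at least 1 - delta.
    """
    # Expected training cost of the stochastic policy defined by the posterior.
    train_cost = sum(p * c for p, c in zip(posterior, avg_costs))
    # Complexity term: grows with the KL divergence from the prior.
    kl = kl_divergence(posterior, prior)
    slack = math.sqrt(
        (kl + math.log(2.0 * math.sqrt(num_envs) / delta)) / (2.0 * num_envs)
    )
    # Costs are in [0, 1], so the bound is never worse than 1.
    return min(1.0, train_cost + slack)


# Hypothetical numbers: four policies, uniform prior, 100 training environments.
prior = [0.25, 0.25, 0.25, 0.25]
avg_costs = [0.1, 0.2, 0.3, 0.4]
bound_uniform = pac_bayes_bound(prior, prior, avg_costs, num_envs=100, delta=0.01)
bound_greedy = pac_bayes_bound([1.0, 0.0, 0.0, 0.0], prior, avg_costs,
                               num_envs=100, delta=0.01)
```

Comparing `bound_uniform` with `bound_greedy` shows the trade-off a bound-minimizing learner has to balance: shifting posterior mass toward low-cost policies reduces the training cost term but increases the KL term.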

Code & Models

Repositories

irom-lab/PAC-Bayes-Control (Official)