Receding Horizon Curiosity

Matthias Schultheis; Boris Belousov; Hany Abdulsamad; Jan Peters

arXiv:1910.03620·cs.LG·October 10, 2019

TL;DR

This paper introduces a trajectory-optimization-based exploration algorithm for unknown MDPs. Through directed exploration, it improves sample efficiency and model fidelity over intrinsic motivation methods while remaining computationally efficient.

Contribution

The paper presents a novel approximate solution to the optimal exploration problem in unknown MDPs. It combines Bayesian experimental design with trajectory optimization and requires no prior knowledge of the environment.
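
To make the idea of combining Bayesian experimental design with receding-horizon planning concrete, here is a minimal sketch. It is not the authors' implementation: the toy scalar dynamics, the feature map `features`, the Bayesian linear model, and the random-shooting planner (a simple stand-in for a trajectory optimizer) are all assumptions chosen for illustration. The planner scores each candidate action sequence by the expected information gain about the model parameters, executes only the first action, updates the posterior, and replans.

```python
import numpy as np

def features(x, u):
    # Hypothetical feature map for a scalar state x and scalar action u.
    return np.array([x, u, np.sin(x), 1.0])

def info_gain(precision, phis, noise_var):
    # Expected information gain for a linear-Gaussian model:
    # 0.5 * (logdet(posterior precision) - logdet(prior precision)).
    post = precision + sum(np.outer(p, p) for p in phis) / noise_var
    return 0.5 * (np.linalg.slogdet(post)[1] - np.linalg.slogdet(precision)[1])

def plan(x0, mean, precision, horizon, n_candidates, noise_var, rng):
    # Random shooting (an assumed stand-in for trajectory optimization):
    # sample action sequences, roll them out with the posterior-mean model,
    # and keep the sequence with the largest expected information gain.
    best_gain, best_seq = -np.inf, None
    for _ in range(n_candidates):
        seq = rng.uniform(-1.0, 1.0, size=horizon)
        x, phis = x0, []
        for u in seq:
            phi = features(x, u)
            phis.append(phi)
            x = float(mean @ phi)  # predicted next state
        gain = info_gain(precision, phis, noise_var)
        if gain > best_gain:
            best_gain, best_seq = gain, seq
    return best_seq, best_gain

def update(mean, precision, phi, y, noise_var):
    # Conjugate Bayesian linear regression update for one observation.
    new_precision = precision + np.outer(phi, phi) / noise_var
    rhs = precision @ mean + phi * y / noise_var
    return np.linalg.solve(new_precision, rhs), new_precision

def true_step(x, u, rng, noise_var):
    # Toy ground-truth dynamics, unknown to the agent (illustrative only).
    noise = rng.normal(0.0, np.sqrt(noise_var))
    return 0.9 * x + 0.5 * u + 0.2 * np.sin(x) + noise

def explore(n_steps=50, horizon=5, n_candidates=64, noise_var=0.01, seed=0):
    rng = np.random.default_rng(seed)
    mean, precision = np.zeros(4), np.eye(4)  # Gaussian prior N(0, I)
    x = 0.0
    for _ in range(n_steps):
        seq, _ = plan(x, mean, precision, horizon, n_candidates, noise_var, rng)
        u = seq[0]  # receding horizon: execute only the first action
        phi = features(x, u)
        x_next = true_step(x, u, rng, noise_var)
        mean, precision = update(mean, precision, phi, x_next, noise_var)
        x = x_next
    return mean, precision
```

Calling `explore()` returns the posterior mean and precision over the four model weights after 50 interaction steps. In this sketch the model is refit after every step; the abstract describes interleaving episodic exploration with system identification, so the update schedule here is a simplification.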

Findings

01. Faster convergence and higher model fidelity than intrinsic motivation algorithms.

02. Remains computationally efficient compared with recent model-based active exploration methods.

03. Effective directed exploration improves sample efficiency in unknown MDPs.

Abstract

Sample-efficient exploration is crucial not only for discovering rewarding experiences but also for adapting to environment changes in a task-agnostic fashion. A principled treatment of the problem of optimal input synthesis for system identification is provided within the framework of sequential Bayesian experimental design. In this paper, we present an effective trajectory-optimization-based approximate solution of this otherwise intractable problem that models optimal exploration in an unknown Markov decision process (MDP). By interleaving episodic exploration with Bayesian nonlinear system identification, our algorithm takes advantage of the inductive bias to explore in a directed manner, without assuming prior knowledge of the MDP. Empirical evaluations indicate a clear advantage of the proposed algorithm in terms of the rate of convergence and the final model fidelity when…

Code & Models

Repositories

mschulth/rhc (Official)
