Multiple Plans are Better than One: Diverse Stochastic Planning

Mahsa Ghasemi; Evan Scope Crafts; Bo Zhao; Ufuk Topcu

arXiv:2012.15485·cs.RO·January 1, 2021

TL;DR

This paper introduces diverse stochastic planning, which computes a set of diverse, near-optimal policies for a Markov decision process so that a planner can cope with specifications that are complex or private, especially in human-robot interaction.

Contribution

The paper formulates the computation of diverse, near-optimal policies as a constrained nonlinear optimization problem and proposes a solution based on the Frank-Wolfe method, with proven convergence to a stationary point.
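
To make the Frank-Wolfe idea concrete, here is a minimal, generic sketch of the iteration on the probability simplex. It is not the paper's algorithm: the feasible set, the toy objective, and the names `frank_wolfe_simplex`, `grad`, and `target` are all assumptions chosen for illustration. Each step solves a linear subproblem over the feasible set (on the simplex, this just picks the vertex with the smallest gradient entry) and moves toward that vertex, so the iterate stays feasible without any projection.

```python
from typing import Callable, List, Sequence, Tuple


def frank_wolfe_simplex(
    grad: Callable[[Sequence[float]], List[float]],
    x0: Sequence[float],
    num_iters: int = 5000,
    tol: float = 1e-8,
) -> Tuple[List[float], float]:
    """Generic Frank-Wolfe on the probability simplex (illustrative only).

    grad: returns the gradient of a smooth objective at x.
    x0:   a feasible starting point (nonnegative entries summing to 1).
    Returns the final iterate and the last Frank-Wolfe gap that was computed.
    """
    x = list(x0)
    gap = float("inf")
    for k in range(num_iters):
        g = grad(x)
        # Linear minimization oracle: over the simplex, the minimizer of
        # <g, s> is the vertex e_i with the smallest gradient coordinate.
        i = min(range(len(x)), key=lambda j: g[j])
        # Frank-Wolfe gap <g, x - e_i>; it is zero at a stationary point.
        gap = sum(gj * xj for gj, xj in zip(g, x)) - g[i]
        if gap <= tol:
            break
        gamma = 2.0 / (k + 2.0)  # standard diminishing step size
        # Convex combination of x and e_i keeps the iterate in the simplex.
        x = [(1.0 - gamma) * xj for xj in x]
        x[i] += gamma
    return x, gap


# Hypothetical toy objective: f(x) = 0.5 * ||x - target||^2.
target = [0.2, 0.5, 0.3]


def toy_grad(x: Sequence[float]) -> List[float]:
    return [xj - tj for xj, tj in zip(x, target)]


solution, final_gap = frank_wolfe_simplex(toy_grad, [1.0, 0.0, 0.0])
```

The Frank-Wolfe gap doubles as a stopping criterion and as a measure of stationarity, which is why convergence results for nonconvex objectives are typically stated in terms of it.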

Findings

01 The method generates diverse policies in planning problems.

02 The algorithm converges to a stationary point, with theoretical guarantees.

03 The approach is demonstrated in multiple planning scenarios.

Abstract

In planning problems, it is often challenging to fully model the desired specifications. In particular, in human-robot interaction, such difficulty may arise because human preferences are either private or complex to model. Consequently, the resulting objective function can only partially capture the specifications, and optimizing it may lead to poor performance with respect to the true specifications. Motivated by this challenge, we formulate a problem, called diverse stochastic planning, that aims to generate a set of representative (small and diverse) behaviors that are near-optimal with respect to the known objective. In particular, the problem aims to compute a set of diverse and near-optimal policies for systems modeled by a Markov decision process. We cast the problem as a constrained nonlinear optimization for which we propose a solution relying on the Frank-Wolfe…
