Learning Task Specifications from Demonstrations

Marcell Vazquez-Chanlatte; Susmit Jha; Ashish Tiwari; Mark K. Ho; Sanjit A. Seshia

arXiv:1710.03875 · cs.LG · October 30, 2018 · 42 cites

TL;DR

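This paper introduces a method for inferring logical task specifications (Boolean non-Markovian rewards) from demonstrations given by an agent in an uncertain, stochastic environment. Because specifications have well-defined composition rules, the learned sub-tasks can be recombined safely and are easy to interpret, unlike earlier learning-from-demonstration methods that either give no guarantees about recombination or limit the types of composition available.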

Contribution

The paper formulates specification inference as a maximum a posteriori (MAP) inference problem and applies the principle of maximum entropy, which yields an efficient way to identify the most likely logical specification given a set of demonstrations.
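
As a rough illustration of what MAP inference over a candidate pool can look like, the Python sketch below scores each candidate specification and returns the highest-scoring one. Everything named here is hypothetical: the helpers `bernoulli_kl`, `log_score`, and `map_specification`, the three example specifications, and the `p_random` baseline values. The scoring rule (number of demonstrations times the KL divergence between the demonstrators' satisfaction rate and the satisfaction rate expected under random behavior) is an assumed maximum-entropy-style likelihood chosen for illustration, not a formula stated on this page.

```python
import math


def bernoulli_kl(p, q):
    """KL divergence D(Bernoulli(p) || Bernoulli(q)), treating 0*log(0) as 0."""
    eps = 1e-12
    q = min(max(q, eps), 1 - eps)  # keep q away from 0 and 1
    total = 0.0
    if p > 0:
        total += p * math.log(p / q)
    if p < 1:
        total += (1 - p) * math.log((1 - p) / (1 - q))
    return total


def log_score(spec, demos, p_random, log_prior=0.0):
    """Assumed log-posterior (up to a constant) of one specification.

    spec     : predicate mapping a trace (list of states) to True/False
    demos    : list of demonstrated traces
    p_random : assumed probability that random behavior satisfies spec
    """
    p_demo = sum(spec(trace) for trace in demos) / len(demos)
    if p_demo < p_random:
        # Demonstrators do worse than chance: treat spec as ruled out.
        return float("-inf")
    return log_prior + len(demos) * bernoulli_kl(p_demo, p_random)


def map_specification(candidates, demos, p_random):
    """Return the name of the candidate with the highest assumed score."""
    return max(
        candidates,
        key=lambda name: log_score(candidates[name], demos, p_random[name]),
    )


# Hypothetical candidate pool: Boolean properties of whole traces.
candidates = {
    "reach_goal": lambda t: "goal" in t,
    "avoid_lava": lambda t: "lava" not in t,
    "reach_goal_avoid_lava": lambda t: "goal" in t and "lava" not in t,
}

# Hypothetical demonstrations (sequences of visited state labels).
demos = [
    ["start", "dry", "goal"],
    ["start", "goal"],
    ["start", "dry", "dry", "goal"],
]

# Assumed chance of satisfying each specification under random actions.
p_random = {
    "reach_goal": 0.3,
    "avoid_lava": 0.6,
    "reach_goal_avoid_lava": 0.15,
}

best = map_specification(candidates, demos, p_random)
print(best)  # reach_goal_avoid_lava
```

All three candidates are satisfied by every demonstration, so the sketch prefers the conjunction: it is the specification least likely to be satisfied by chance, and therefore the one the demonstrations explain best under the assumed score.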

Findings

01. Learning specifications avoids the problems that arise from ad-hoc composition of rewards.

02. The approach efficiently searches large candidate pools of specifications.

03. The learned sub-task specifications are easier to interpret and to compose.

Abstract

Real-world applications often naturally decompose into several sub-tasks. In many settings (e.g., robotics) demonstrations provide a natural way to specify the sub-tasks. However, most methods for learning from demonstrations either do not provide guarantees that the artifacts learned for the sub-tasks can be safely recombined or limit the types of composition available. Motivated by this deficit, we consider the problem of inferring Boolean non-Markovian rewards (also known as logical trace properties or specifications) from demonstrations provided by an agent operating in an uncertain, stochastic environment. Crucially, specifications admit well-defined composition rules that are typically easy to interpret. In this paper, we formulate the specification inference task as a maximum a posteriori (MAP) probability inference problem, apply the principle of maximum entropy to derive an…

Taxonomy

Topics: Machine Learning and Algorithms · Formal Methods in Verification · Advanced Malware Detection Techniques