Learning Task Specifications from Demonstrations
Marcell Vazquez-Chanlatte, Susmit Jha, Ashish Tiwari, Mark K. Ho, Sanjit A. Seshia

TL;DR
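This paper introduces a method for inferring logical task specifications from demonstrations given in uncertain, stochastic environments. Because specifications have well-defined composition rules, the learned sub-tasks can be combined safely and interpretably, whereas prior approaches either lack guarantees that learned sub-task artifacts can be safely recombined or limit the types of composition available.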
Contribution
It formulates specification inference as a MAP problem using maximum entropy, providing an efficient way to identify likely logical specifications from demonstrations.
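The MAP formulation can be illustrated with a minimal sketch. It assumes each candidate specification is scored by its log prior plus N times the KL divergence between two Bernoulli distributions: the empirical rate at which the demonstrations satisfy the specification, and the probability that random behavior would satisfy it. That likelihood form is an assumption made for this illustration, since the summary here does not spell it out. The names `bernoulli_kl`, `log_score`, and `map_spec`, the gridworld-style labels, and the random-satisfaction probabilities are all hypothetical; in practice those probabilities would be computed from the environment dynamics.

```python
import math


def bernoulli_kl(p, q):
    """KL divergence KL(Bernoulli(p) || Bernoulli(q)) in nats."""
    eps = 1e-12
    q = min(max(q, eps), 1 - eps)  # clamp q to avoid log(0) and division by zero
    total = 0.0
    if p > 0:
        total += p * math.log(p / q)
    if p < 1:
        total += (1 - p) * math.log((1 - p) / (1 - q))
    return total


def log_score(spec, demos, random_sat_prob, log_prior=0.0):
    """Unnormalized log posterior of one candidate (assumed likelihood form)."""
    p_hat = sum(spec(d) for d in demos) / len(demos)  # empirical satisfaction rate
    if p_hat < random_sat_prob:
        # The demonstrator does worse than random on this spec: rule it out.
        return -math.inf
    return log_prior + len(demos) * bernoulli_kl(p_hat, random_sat_prob)


def map_spec(candidates, demos):
    """Return the name of the highest-scoring candidate specification."""
    return max(
        candidates,
        key=lambda name: log_score(candidates[name][0], demos, candidates[name][1]),
    )


# Hypothetical demonstrations: each trace is the sequence of visited cell labels.
demos = [
    ["start", "road", "goal"],
    ["start", "goal"],
    ["start", "road", "road", "goal"],
]

# name -> (specification, made-up probability that random behavior satisfies it)
candidates = {
    "always_true": (lambda t: True, 1.0),
    "avoid_lava": (lambda t: "lava" not in t, 0.5),
    "reach_goal": (lambda t: "goal" in t, 0.3),
    "reach_goal_and_avoid_lava": (lambda t: "goal" in t and "lava" not in t, 0.1),
}

best = map_spec(candidates, demos)
print(best)
```

Under these assumptions, a specification that every demonstration satisfies but that random behavior rarely satisfies scores highest, while the trivially true specification gains essentially nothing from the data.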
Findings
Learning specifications avoids the problems that arise from ad-hoc reward composition.
The approach efficiently searches large candidate pools of specifications.
The learned sub-tasks are more interpretable and easier to compose.
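To make the composability point concrete, here is a small sketch that treats a specification as a Boolean predicate over a whole trace, so that sub-task specifications combine with ordinary logical connectives. The representation (traces as lists of labels) and the helper names `eventually`, `never`, `conj`, `disj`, and `neg` are illustrative assumptions, not the paper's notation.

```python
from typing import Callable, Sequence

Trace = Sequence[str]
Spec = Callable[[Trace], bool]


def eventually(label: str) -> Spec:
    """Specification: the trace visits `label` at some step."""
    return lambda trace: label in trace


def never(label: str) -> Spec:
    """Specification: the trace never visits `label`."""
    return lambda trace: label not in trace


def conj(*specs: Spec) -> Spec:
    """Conjunction: every sub-specification must hold on the trace."""
    return lambda trace: all(s(trace) for s in specs)


def disj(*specs: Spec) -> Spec:
    """Disjunction: at least one sub-specification must hold on the trace."""
    return lambda trace: any(s(trace) for s in specs)


def neg(spec: Spec) -> Spec:
    """Negation of a specification."""
    return lambda trace: not spec(trace)


# Two separately specified sub-tasks, composed into one task.
reach_goal = eventually("goal")
avoid_lava = never("lava")
task = conj(reach_goal, avoid_lava)

print(task(["start", "road", "goal"]))  # satisfies both sub-tasks
print(task(["start", "lava", "goal"]))  # reaches the goal but touches lava
```

Because each specification returns a plain truth value for a trace, the meaning of the composed task follows directly from the meaning of its parts, with no weights to tune.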
Abstract
Real world applications often naturally decompose into several sub-tasks. In many settings (e.g., robotics) demonstrations provide a natural way to specify the sub-tasks. However, most methods for learning from demonstrations either do not provide guarantees that the artifacts learned for the sub-tasks can be safely recombined or limit the types of composition available. Motivated by this deficit, we consider the problem of inferring Boolean non-Markovian rewards (also known as logical trace properties or specifications) from demonstrations provided by an agent operating in an uncertain, stochastic environment. Crucially, specifications admit well-defined composition rules that are typically easy to interpret. In this paper, we formulate the specification inference task as a maximum a posteriori (MAP) probability inference problem, apply the principle of maximum entropy to derive an…
Taxonomy
Topics: Machine Learning and Algorithms · Formal Methods in Verification · Advanced Malware Detection Techniques
