Interpretable Apprenticeship Learning with Temporal Logic Specifications

Daniel Kasenberg; Matthias Scheutz

arXiv:1710.10532·cs.SY·November 2, 2017

TL;DR

This paper introduces a method for inferring linear temporal logic (LTL) specifications from demonstrated agent behavior in Markov Decision Processes (MDPs). It formulates inference as a multiobjective optimization problem and solves it with genetic programming, yielding an interpretable description of the demonstrated behavior.

Contribution

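It addresses the inverse of planning under LTL specifications: inferring an LTL specification from demonstrated trajectories. Candidate specifications are scored with state-based and action-based objective functions built on a notion of "violation cost", and the resulting multiobjective problem is solved with genetic programming.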

Findings

1. Inference of LTL specifications is demonstrated in two simple domains.

2. The multiobjective formulation combines state-based ("what actually happened") and action-based ("what the agent expected to happen") objectives.

3. Genetic programming is used to solve the inverse problem.

Abstract

Recent work has addressed using formulas in linear temporal logic (LTL) as specifications for agents planning in Markov Decision Processes (MDPs). We consider the inverse problem: inferring an LTL specification from demonstrated behavior trajectories in MDPs. We formulate this as a multiobjective optimization problem, and describe state-based ("what actually happened") and action-based ("what the agent expected to happen") objective functions based on a notion of "violation cost". We demonstrate the efficacy of the approach by employing genetic programming to solve this problem in two simple domains.
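To make the multiobjective framing concrete, the following is a minimal sketch and not the authors' implementation. It assumes finite-trace semantics for a small LTL fragment, uses the fraction of demonstrations that falsify a formula as a simplified stand-in for the paper's "violation cost", and adds formula size as a second objective. The names `holds`, `size`, `violation_rate`, and `pareto_front`, the tuple encoding of formulas, and the example propositions `goal` and `hazard` are all hypothetical.

```python
from typing import FrozenSet, List, Sequence, Tuple

# Assumption: formulas are nested tuples, e.g. ("G", ("not", ("ap", "hazard"))).
Formula = tuple
# Assumption: a trace is a finite sequence of sets of true atomic propositions.
Trace = Sequence[FrozenSet[str]]


def holds(f: Formula, trace: Trace, i: int = 0) -> bool:
    """Evaluate formula f at position i under finite-trace semantics."""
    op = f[0]
    if op == "ap":
        return f[1] in trace[i]
    if op == "not":
        return not holds(f[1], trace, i)
    if op == "and":
        return holds(f[1], trace, i) and holds(f[2], trace, i)
    if op == "or":
        return holds(f[1], trace, i) or holds(f[2], trace, i)
    if op == "X":  # next: false at the last position of a finite trace
        return i + 1 < len(trace) and holds(f[1], trace, i + 1)
    if op == "G":  # globally: holds at every remaining position
        return all(holds(f[1], trace, j) for j in range(i, len(trace)))
    if op == "F":  # eventually: holds at some remaining position
        return any(holds(f[1], trace, j) for j in range(i, len(trace)))
    raise ValueError(f"unknown operator: {op}")


def size(f: Formula) -> int:
    """Number of operators and propositions in f (a simple complexity proxy)."""
    return 1 if f[0] == "ap" else 1 + sum(size(g) for g in f[1:])


def violation_rate(f: Formula, traces: Sequence[Trace]) -> float:
    """Simplified stand-in for violation cost: share of traces falsifying f."""
    return sum(not holds(f, t) for t in traces) / len(traces)


def pareto_front(
    candidates: Sequence[Formula], traces: Sequence[Trace]
) -> List[Tuple[float, int, Formula]]:
    """Keep candidates not dominated on (violation_rate, size); lower is better."""
    scored = [(violation_rate(f, traces), size(f), f) for f in candidates]
    front = []
    for v, s, f in scored:
        dominated = any(
            v2 <= v and s2 <= s and (v2 < v or s2 < s) for v2, s2, _ in scored
        )
        if not dominated:
            front.append((v, s, f))
    return front


# Hypothetical demonstrations: every trace reaches "goal"; one visits "hazard".
traces = [
    [frozenset(), frozenset({"goal"})],
    [frozenset(), frozenset(), frozenset({"goal"})],
    [frozenset({"hazard"}), frozenset({"goal"})],
]
eventually_goal = ("F", ("ap", "goal"))
never_hazard = ("G", ("not", ("ap", "hazard")))
both = ("and", eventually_goal, never_hazard)
goal_now = ("ap", "goal")

front = pareto_front([eventually_goal, never_hazard, both, goal_now], traces)
```

In this sketch a genetic-programming loop would supply the `candidates` by mutating and recombining formula trees, and the Pareto front would be the set of specifications reported. A practical objective would also need to penalize formulas that are trivially satisfied by any behavior, which this simplified stand-in does not do.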
