Interpretable Apprenticeship Learning with Temporal Logic Specifications
Daniel Kasenberg, Matthias Scheutz

TL;DR
This paper introduces a method for inferring linear temporal logic (LTL) specifications from demonstrated agent behavior in Markov Decision Processes (MDPs). It casts inference as a multiobjective optimization problem and solves it with genetic programming, yielding an interpretable description of the demonstrated behavior.
Contribution
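It addresses the inverse of planning from LTL specifications: inferring a specification from demonstrated trajectories. The approach combines state-based and action-based objective functions, both built on a notion of "violation cost", with genetic programming as the search procedure.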
Findings
LTL specifications are inferred effectively in two simple domains
Multiobjective optimization balances state-based ("what actually happened") and action-based ("what the agent expected to happen") objectives
Genetic programming is used to solve the inverse problem in both domains
Abstract
Recent work has addressed using formulas in linear temporal logic (LTL) as specifications for agents planning in Markov Decision Processes (MDPs). We consider the inverse problem: inferring an LTL specification from demonstrated behavior trajectories in MDPs. We formulate this as a multiobjective optimization problem, and describe state-based ("what actually happened") and action-based ("what the agent expected to happen") objective functions based on a notion of "violation cost". We demonstrate the efficacy of the approach by employing genetic programming to solve this problem in two simple domains.
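To make the multiobjective framing concrete, here is a minimal Python sketch. It is not the paper's method: it assumes finite-trace LTL semantics, replaces the paper's violation-cost objectives with two stand-ins (the fraction of demonstrations that violate a formula, and formula size), and scores a hand-written candidate list instead of running a genetic programming search. All names (`holds`, `objectives`, `pareto_front`, the `goal` and `hazard` propositions) are hypothetical.

```python
# Formulas are nested tuples (an assumed encoding, not the paper's):
#   ("ap", "p"), ("not", f), ("and", f, g), ("or", f, g),
#   ("X", f), ("G", f), ("F", f)
# A trace is a finite list of states; each state is the set of
# atomic propositions that are true in it.

def holds(f, trace, t=0):
    """Evaluate formula f on a finite trace starting at step t."""
    op = f[0]
    if op == "ap":
        return f[1] in trace[t]
    if op == "not":
        return not holds(f[1], trace, t)
    if op == "and":
        return holds(f[1], trace, t) and holds(f[2], trace, t)
    if op == "or":
        return holds(f[1], trace, t) or holds(f[2], trace, t)
    if op == "X":  # strong "next": false at the last step
        return t + 1 < len(trace) and holds(f[1], trace, t + 1)
    if op == "G":  # "always" over the remaining steps
        return all(holds(f[1], trace, k) for k in range(t, len(trace)))
    if op == "F":  # "eventually" within the remaining steps
        return any(holds(f[1], trace, k) for k in range(t, len(trace)))
    raise ValueError(f"unknown operator: {op}")

def size(f):
    """Number of nodes in the formula tree (stand-in complexity objective)."""
    return 1 + sum(size(arg) for arg in f[1:] if isinstance(arg, tuple))

def objectives(f, demos):
    """Stand-in objectives, both minimized: (violation rate, formula size)."""
    violated = sum(not holds(f, d) for d in demos)
    return (violated / len(demos), size(f))

def dominates(a, b):
    """a Pareto-dominates b: no worse on every objective, and not identical."""
    return all(x <= y for x, y in zip(a, b)) and a != b

def pareto_front(candidates, demos):
    """Keep the candidates whose objective vectors are not dominated."""
    scored = [(f, objectives(f, demos)) for f in candidates]
    return [f for f, s in scored
            if not any(dominates(s2, s) for _, s2 in scored)]

# Two toy demonstrations in which the agent eventually reaches "goal".
demos = [
    [frozenset(), frozenset({"goal"})],
    [frozenset(), frozenset(), frozenset({"goal"})],
]
candidates = [
    ("F", ("ap", "goal")),             # eventually goal
    ("G", ("not", ("ap", "hazard"))),  # never hazard
    ("G", ("ap", "goal")),             # always goal
    ("ap", "goal"),                    # goal holds in the first state
]
front = pareto_front(candidates, demos)
```

In this toy setup the front keeps "eventually goal" (never violated, size 2) and the bare proposition `goal` (always violated, but the smallest formula). That illustrates why a multiobjective search returns a set of trade-offs instead of a single answer. A genetic programming search would evolve the candidate formulas by mutation and crossover instead of taking them from a fixed list.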
