Generating Justifications for Norm-Related Agent Decisions
Daniel Kasenberg, Antonio Roque, Ravenna Thielstrom, Meia Chita-Tegmark, and Matthias Scheutz

TL;DR
This paper introduces a method for generating natural language explanations of agent decisions derived from norm-based reasoning, so that users can ask about the agent's rules, its actions, and the extent to which it violated those rules.
Contribution
It presents a novel approach to translating temporal logic-based norm reasoning into natural language justifications, improving interpretability of agent decisions.
Findings
Human judgment evaluation shows improved intelligibility.
Method enhances understanding of agent behavior.
Approach increases perceived trust in agent decisions.
Abstract
We present an approach to generating natural language justifications of decisions derived from norm-based reasoning. Assuming an agent which maximally satisfies a set of rules specified in an object-oriented temporal logic, the user can ask factual questions (about the agent's rules, actions, and the extent to which the agent violated the rules) as well as "why" questions that require the agent to compare actual behavior to counterfactual trajectories with respect to these rules. To produce natural-sounding explanations, we focus on the subproblem of producing natural language clauses from statements in a fragment of temporal logic, and then describe how to embed these clauses into explanatory sentences. We use a human judgment evaluation on a testbed task to compare our approach to variants in terms of intelligibility, mental model, and perceived trust.
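To make the two-step idea in the abstract concrete (first turn a temporal logic statement into a clause, then embed that clause in an explanatory sentence), here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the formula classes (`Atom`, `Not`, `Always`, `Eventually`), the helper names, the sentence templates, and the example rule are all hypothetical, and only a handful of formula shapes are supported.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical mini-fragment of temporal logic; not the paper's actual logic.
@dataclass(frozen=True)
class Atom:
    subject: str  # e.g. "the agent"
    verb: str     # base form, e.g. "leave"
    obj: str      # e.g. "the store"


@dataclass(frozen=True)
class Not:
    arg: "Formula"


@dataclass(frozen=True)
class Always:
    arg: "Formula"


@dataclass(frozen=True)
class Eventually:
    arg: "Formula"


Formula = Union[Atom, Not, Always, Eventually]


def to_verb_phrase(f: Formula) -> str:
    """Step 1: map a formula to a base-form verb phrase (a clause fragment)."""
    if isinstance(f, Atom):
        return f"{f.verb} {f.obj}"
    if isinstance(f, Not) and isinstance(f.arg, Atom):
        return f"not {f.arg.verb} {f.arg.obj}"
    if isinstance(f, Always):
        inner = f.arg
        # "always not X" reads more naturally as "never X".
        if isinstance(inner, Not) and isinstance(inner.arg, Atom):
            return f"never {inner.arg.verb} {inner.arg.obj}"
        if isinstance(inner, Atom):
            return f"always {inner.verb} {inner.obj}"
    if isinstance(f, Eventually) and isinstance(f.arg, Atom):
        return f"eventually {f.arg.verb} {f.arg.obj}"
    raise ValueError(f"formula shape not covered by this sketch: {f!r}")


def subject_of(f: Formula) -> str:
    """Walk down the unary operators to the atom and return its subject."""
    while not isinstance(f, Atom):
        f = f.arg
    return f.subject


def rule_sentence(f: Formula) -> str:
    """Step 2a: embed the clause in a sentence stating the rule."""
    subject = subject_of(f)
    return f"{subject[0].upper()}{subject[1:]} must {to_verb_phrase(f)}."


def why_sentence(action: str, rule: Formula) -> str:
    """Step 2b: embed the clause in a (hypothetical) 'why' answer template."""
    subject = subject_of(rule)
    return (
        f"{subject[0].upper()}{subject[1:]} chose to {action} because the "
        f"alternative would have violated the rule that {subject} must "
        f"{to_verb_phrase(rule)}."
    )


if __name__ == "__main__":
    rule = Always(Not(Atom("the agent", "leave", "the store")))
    print(rule_sentence(rule))
    print(why_sentence("stay inside", rule))
```

The sketch keeps clause generation (`to_verb_phrase`) separate from sentence embedding (`rule_sentence`, `why_sentence`), mirroring the split the abstract describes. A real system would also need the counterfactual comparison that selects which rule to cite, which is not modeled here.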
