Intention-aware policy graphs: answering what, how, and why in opaque agents

Victor Gimenez-Abalos; Sergio Alvarez-Napagao; Adrian Tormos; Ulises Cortés; Javier Vázquez-Salceda

arXiv:2409.19038·cs.AI·October 2, 2024

TL;DR

This paper introduces a probabilistic graphical model and workflow to interpret and explain the intentions and behaviour of opaque AI agents, enhancing trustworthiness and understanding of emergent actions.

Contribution

It proposes a novel intention-aware policy graph model and an iterative design workflow to improve interpretability and reliability of agent explanations.

Findings

01 Provides a method to compute intentions from partial observations

02 Enables answering 'what', 'how', and 'why' questions about agent behaviour

03 Includes measurements for interpretability and reliability

Abstract

Agents are a special kind of AI-based software in that they interact in complex environments and have increased potential for emergent behaviour. Explaining such emergent behaviour is key to deploying trustworthy AI, but the increasing complexity and opaque nature of many agent implementations make this hard. In this work, we propose a Probabilistic Graphical Model along with a pipeline for designing such a model (by which the behaviour of an agent can be deliberated about) and for computing a robust numerical value for the intentions the agent has at any moment. We contribute measurements that evaluate the interpretability and reliability of the explanations provided, and enable explainability questions such as 'what do you want to do now?' (e.g. deliver soup), 'how do you plan to do it?' (e.g. returning a plan that considers its skills and the world), and 'why would you take this…
