Pragmatic-Pedagogic Value Alignment
Jaime F. Fisac, Monica A. Gates, Jessica B. Hamrick, Chang Liu, Dylan Hadfield-Menell, Malayandi Palaniappan, Dhruv Malik, S. Shankar Sastry, Thomas L. Griffiths, and Anca D. Dragan

TL;DR
This paper introduces a formal approach to value alignment in robotics, combining multi-agent decision theory with cognitive models to enable robots to better understand and adapt to human objectives through pragmatic reasoning.
Contribution
It presents the first formal analysis of value alignment grounded in empirically validated cognitive models, integrating pedagogic reasoning into cooperative inverse reinforcement learning.
Findings
Captures reciprocity in human-robot interactions
Models human pedagogical reasoning about robot learning
Enables robots to interpret human actions pragmatically
Abstract
As intelligent systems gain autonomy and capability, it becomes vital to ensure that their objectives match those of their human users; this is known as the value-alignment problem. In robotics, value alignment is key to the design of collaborative robots that can integrate into human workflows, successfully inferring and adapting to their users' objectives as they go. We argue that a meaningful solution to value alignment must combine multi-agent decision theory with rich mathematical models of human cognition, enabling robots to tap into people's natural collaborative capabilities. We present a solution to the cooperative inverse reinforcement learning (CIRL) dynamic game based on well-established cognitive models of decision making and theory of mind. The solution captures a key reciprocity relation: the human will not plan her actions in isolation, but rather reason pedagogically…
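The reciprocity described in the abstract can be illustrated with a small one-step toy example. This is only a sketch under stated assumptions, not the paper's formulation: it assumes a discrete set of candidate objectives, a Boltzmann (noisily rational) human model, and a pedagogic human who favors actions in proportion to how strongly a literal robot would then believe in her true objective. The function names, the reward table `q_values`, and the rationality parameters `beta` and `teach_beta` are all hypothetical.

```python
import numpy as np

def literal_robot_posterior(prior, q_values, beta=1.0):
    """P(theta | action) for a robot that assumes the human acts in isolation.

    q_values[theta, action] is the (hypothetical) value of each action under
    each candidate objective; the human is assumed Boltzmann-rational.
    """
    likelihood = np.exp(beta * q_values)
    likelihood /= likelihood.sum(axis=1, keepdims=True)  # P(action | theta)
    joint = prior[:, None] * likelihood
    return joint / joint.sum(axis=0, keepdims=True)      # normalize per action

def pedagogic_human_policy(robot_posterior, teach_beta=5.0):
    """P(action | theta) for a human who reasons about the robot's learning.

    Assumption: she prefers actions that raise the literal robot's belief
    in her true objective.
    """
    weights = np.exp(teach_beta * robot_posterior)
    return weights / weights.sum(axis=1, keepdims=True)

def pragmatic_robot_posterior(prior, human_policy):
    """P(theta | action) for a robot that knows the human is being pedagogic."""
    joint = prior[:, None] * human_policy
    return joint / joint.sum(axis=0, keepdims=True)

# Toy problem: two objectives, three actions. Action 1 is equally good under
# both objectives, so it carries no information about which one is true.
prior = np.array([0.5, 0.5])
q_values = np.array([[1.0, 1.0, 0.0],
                     [0.0, 1.0, 1.0]])

literal = literal_robot_posterior(prior, q_values)
policy = pedagogic_human_policy(literal)
pragmatic = pragmatic_robot_posterior(prior, policy)
```

In this toy setup, a robot that interprets action 0 pragmatically ends up more confident in objective 0 than a literal robot would be, because it knows a pedagogic human would avoid the ambiguous action when she wants to be understood.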
