Selective imitation on the basis of reward function similarity
Max Taylor-Davies, Stephanie Droop, Christopher G. Lucas

TL;DR
This paper explores how agents can selectively imitate others based on inferred similarity of reward functions, using group-based inductive biases to improve imitation decisions in multi-agent environments.
Contribution
It introduces an approach in which agents infer the similarity of others' reward functions to their own from sparse data, using group-based inductive biases, and use that similarity to decide whom to imitate.
Findings
Agents can effectively identify similar reward functions with limited data
Group-based inductive biases improve imitation target selection
Selective imitation leads to better adaptation in multi-agent settings
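As a rough illustration of the selection step summarized above, the sketch below picks the agent whose inferred reward function is closest to the imitator's own. It is not the authors' model: it assumes reward functions can be represented as plain weight vectors, uses cosine similarity as a stand-in similarity measure, and takes the inferred reward vectors as given rather than modeling the group-based inference. The names `cosine_similarity`, `choose_imitation_target`, and the example agents are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length reward-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # A zero vector carries no preference information; treat as dissimilar.
        return 0.0
    return dot / (norm_u * norm_v)


def choose_imitation_target(own_reward, inferred_rewards):
    """Return the name of the agent whose inferred reward vector is most
    similar to the imitator's own (assumed representation, not the paper's)."""
    return max(
        inferred_rewards,
        key=lambda name: cosine_similarity(own_reward, inferred_rewards[name]),
    )


# Hypothetical example: three observed agents with already-inferred rewards.
own = [1.0, 0.0, 0.5]
others = {
    "agent_a": [0.9, 0.1, 0.4],   # close to our own preferences
    "agent_b": [-1.0, 0.5, 0.0],  # roughly opposed preferences
    "agent_c": [0.0, 1.0, 0.0],   # cares about something we do not
}
target = choose_imitation_target(own, others)
```

In this toy setup the imitator would copy `agent_a`, whose weights point in nearly the same direction as its own. A fuller treatment would replace the fixed `others` dictionary with reward estimates inferred from observed behavior and group membership.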
Abstract
Imitation is a key component of human social behavior, and is widely used by both children and adults as a way to navigate uncertain or unfamiliar situations. But in an environment populated by multiple heterogeneous agents pursuing different goals or objectives, indiscriminate imitation is unlikely to be an effective strategy -- the imitator must instead determine who is most useful to copy. There are likely many factors that play into these judgements, depending on context and availability of information. Here we investigate the hypothesis that these decisions involve inferences about other agents' reward functions. We suggest that people preferentially imitate the behavior of others they deem to have similar reward functions to their own. We further argue that these inferences can be made on the basis of very sparse or indirect data, by leveraging an inductive bias toward positing…
Taxonomy
Topics: Language and cultural evolution · Action Observation and Synchronization · Evolutionary Game Theory and Cooperation
