Select to Perfect: Imitating desired behavior from large multi-agent data

Tim Franzmeyer; Edith Elkind; Philip Torr; Jakob Foerster; Joao Henriques

arXiv:2405.03735 · cs.LG · May 8, 2024

Open Access

TL;DR

This paper introduces a method for selectively imitating only those agents that contribute positively to a collective desirability score, using the novel concept of Exchange Value to make the imitated behavior safer and more desirable.

Contribution

It proposes the Exchange Value metric to quantify individual agent contributions and develops methods to estimate it from real datasets for better imitation policies.

Findings

01. Exchange Value effectively identifies beneficial agents.

02. Selective imitation improves safety and desirability.

03. The proposed methods outperform baseline imitation approaches.

Abstract

AI agents are commonly trained with large datasets of demonstrations of human behavior. However, not all behaviors are equally safe or desirable. Desired characteristics for an AI agent can be expressed by assigning desirability scores, which we assume are not assigned to individual behaviors but to collective trajectories. For example, in a dataset of vehicle interactions, these scores might relate to the number of incidents that occurred. We first assess the effect of each individual agent's behavior on the collective desirability score, e.g., assessing how likely an agent is to cause incidents. This allows us to selectively imitate agents with a positive effect, e.g., only imitating agents that are unlikely to cause incidents. To enable this, we propose the concept of an agent's Exchange Value, which quantifies an individual agent's contribution to the collective desirability score.…
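
To make the idea of scoring agents from collective outcomes concrete, the following is a minimal sketch, not the paper's Exchange Value estimator (its definition is not given on this page). It assumes a toy dataset in which each trajectory records the participating agents and one collective score (here, the negative number of incidents), and it uses a deliberately simple proxy: the mean score of trajectories that include an agent minus the mean score of those that do not. The names `contribution_scores`, `select_agents`, and the toy data are all hypothetical.

```python
from statistics import mean


def contribution_scores(trajectories):
    """Simplified contribution proxy (an assumption, not the paper's metric).

    trajectories: list of (agent_ids, collective_score) pairs.
    Returns a dict mapping each agent to
    mean(score | agent present) - mean(score | agent absent).
    """
    agents = {agent for group, _ in trajectories for agent in group}
    scores = {}
    for agent in agents:
        present = [s for group, s in trajectories if agent in group]
        absent = [s for group, s in trajectories if agent not in group]
        if not present or not absent:
            # No comparison is possible for this agent; skip it.
            continue
        scores[agent] = mean(present) - mean(absent)
    return scores


def select_agents(scores, threshold=0.0):
    """Keep only agents whose estimated contribution exceeds the threshold."""
    return {agent for agent, value in scores.items() if value > threshold}


# Hypothetical toy data: score = -(number of incidents in the trajectory).
toy = [
    (("a", "b"), -2),
    (("a", "c"), -3),
    (("b", "c"), 0),
    (("b", "d"), 0),
    (("c", "d"), -1),
    (("a", "d"), -2),
]

scores = contribution_scores(toy)
selected = select_agents(scores)

# Demonstrations kept for imitation: trajectories from selected agents only.
imitation_set = [(g, s) for g, s in toy if all(a in selected for a in g)]
```

In this toy setup, an imitation policy would then be trained only on demonstrations from the agents in `selected`. A presence/absence difference like this ignores who an agent was grouped with, which is one reason a more careful measure of individual contribution is needed on real data.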

Taxonomy

Topics: Data Mining Algorithms and Applications