# Two Body Problem: Collaborative Visual Task Completion

**Authors:** Unnat Jain, Luca Weihs, Eric Kolve, Mohammad Rastegari, Svetlana Lazebnik, Ali Farhadi, Alexander Schwing, Aniruddha Kembhavi

arXiv: 1904.05879 · 2019-04-12

## TL;DR

This paper explores how agents can learn to collaborate in visually rich environments directly from pixel data, emphasizing the roles of explicit and implicit communication in completing complex visual tasks.

## Contribution

The paper studies learning to collaborate directly from pixels in AI2-THOR and demonstrates that both explicit communication (messages) and implicit communication (perception of the other agents and the visual world) benefit visual task performance.

## Key findings

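- Both explicit communication (messages) and implicit communication (perceiving the other agents and the visual world) benefit collaborative visual tasks.
- Collaboration can be learned directly from pixels in AI2-THOR, a visually rich environment, rather than only in simple grid worlds.
- Collaboration has inherently visual aspects, which the authors argue should be studied in visually rich environments.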

## Abstract

Collaboration is a necessary skill to perform tasks that are beyond one agent's capabilities. Addressed extensively in both conventional and modern AI, multi-agent collaboration has often been studied in the context of simple grid worlds. We argue that there are inherently visual aspects to collaboration which should be studied in visually rich environments. A key element in collaboration is communication that can be either explicit, through messages, or implicit, through perception of the other agents and the visual world. Learning to collaborate in a visual environment entails learning (1) to perform the task, (2) when and what to communicate, and (3) how to act based on these communications and the perception of the visual world. In this paper we study the problem of learning to collaborate directly from pixels in AI2-THOR and demonstrate the benefits of explicit and implicit modes of communication to perform visual tasks. Refer to our project page for more details: https://prior.allenai.org/projects/two-body-problem

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.05879/full.md

## Figures

36 figures with captions in the complete paper: https://tomesphere.com/paper/1904.05879/full.md

## References

92 references — full list in the complete paper: https://tomesphere.com/paper/1904.05879/full.md

---
Source: https://tomesphere.com/paper/1904.05879