Learning to Guide and to Be Guided in the Architect-Builder Problem
Paul Barde, Tristan Karch, Derek Nowrouzezahrai, Clément Moulin-Frier, Christopher Pal, Pierre-Yves Oudeyer

TL;DR
This paper introduces the Architect-Builder Problem, a formal setting in which an architect guides a builder that has no access to rewards, and proposes ABIG, a method that lets the two agents learn a communication protocol for coordinating on a task and reusing it on new tasks.
Contribution
The paper formalizes the Architect-Builder Problem and proposes ABIG (Architect-Builder Iterated Guiding), a learning framework in which agents develop a shared communication protocol for task guidance without explicit reward signals.
Findings
ABIG enables agents to learn effective guiding communication protocols.
The learned protocols generalize to unseen tasks.
Agents successfully coordinate in 2D construction tasks.
Abstract
We are interested in interactive agents that learn to coordinate, namely a builder, which performs actions but does not know the goal of the task (i.e. has no access to rewards), and an architect, which guides the builder towards the goal of the task. We define and explore a formal setting where artificial agents are equipped with mechanisms that allow them to learn a task while simultaneously evolving a shared communication protocol. Ideally, such learning should rely only on high-level communication priors and be able to handle a large variety of tasks and meanings while deriving communication protocols that can be reused across tasks. We present the Architect-Builder Problem (ABP): an asymmetrical setting in which an architect must learn to guide a builder towards constructing a specific structure. The architect knows the target structure but cannot act in the…
Taxonomy
Topics: AI-based Problem Solving and Planning · Reinforcement Learning in Robotics · Modular Robots and Swarm Intelligence
