Interpretable Emergent Language Using Inter-Agent Transformers

Mannan Bhardwaj

arXiv:2505.02215 · cs.AI · May 6, 2025

TL;DR

This paper introduces Differentiable Inter-Agent Transformers (DIAT), which use self-attention to learn interpretable, symbolic communication protocols in multi-agent reinforcement learning (MARL).

Contribution

DIAT is the first method to leverage self-attention for learning human-understandable communication in multi-agent systems, enhancing interpretability and cooperative performance.

Findings

01. DIAT encodes observations into interpretable vocabularies.

02. DIAT solves cooperative tasks while learning meaningful embeddings.

03. DIAT offers better interpretability than existing communication methods such as RIAL, DIAL, and CommNet.

Abstract

This paper explores the emergence of language in multi-agent reinforcement learning (MARL) using transformers. Existing methods such as RIAL, DIAL, and CommNet enable agent communication but lack interpretability. We propose Differentiable Inter-Agent Transformers (DIAT), which leverage self-attention to learn symbolic, human-understandable communication protocols. Through experiments, DIAT demonstrates the ability to encode observations into interpretable vocabularies and meaningful embeddings, effectively solving cooperative tasks. These results highlight the potential of DIAT for interpretable communication in complex multi-agent environments.
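The abstract describes two ideas: mapping each agent's observation to a symbol from a discrete vocabulary, and using self-attention to combine the resulting messages across agents. The sketch below illustrates only those two ideas in plain Python and is not the DIAT implementation from the paper or its repository. Every name, dimension, and weight in it (`encode_to_token`, `self_attention`, `VOCAB_SIZE`, the random matrices) is a hypothetical stand-in, and the hard argmax is used for a forward pass only; a trainable system would need a differentiable relaxation of that step.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def random_matrix(rows, cols, rng):
    """Random weights standing in for learned parameters (assumption)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def encode_to_token(observation, vocab_proj):
    """Map an observation to a discrete symbol: the index of the largest logit."""
    logits = matvec(vocab_proj, observation)
    return max(range(len(logits)), key=logits.__getitem__)


def self_attention(embeddings, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention, one embedding per agent."""
    queries = [matvec(w_q, e) for e in embeddings]
    keys = [matvec(w_k, e) for e in embeddings]
    values = [matvec(w_v, e) for e in embeddings]
    scale = math.sqrt(len(keys[0]))
    outputs, weights = [], []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        attn = softmax(scores)
        weights.append(attn)
        # Weighted sum of value vectors, computed one dimension at a time.
        outputs.append(
            [sum(a * v[d] for a, v in zip(attn, values)) for d in range(len(values[0]))]
        )
    return outputs, weights


# Hypothetical sizes chosen only for illustration.
OBS_DIM, VOCAB_SIZE, EMBED_DIM, N_AGENTS = 4, 5, 3, 3
rng = random.Random(0)

vocab_proj = random_matrix(VOCAB_SIZE, OBS_DIM, rng)
token_embeddings = random_matrix(VOCAB_SIZE, EMBED_DIM, rng)
w_q, w_k, w_v = (random_matrix(EMBED_DIM, EMBED_DIM, rng) for _ in range(3))

observations = random_matrix(N_AGENTS, OBS_DIM, rng)
tokens = [encode_to_token(obs, vocab_proj) for obs in observations]
messages = [token_embeddings[t] for t in tokens]
mixed, attn_weights = self_attention(messages, w_q, w_k, w_v)

print("tokens sent by each agent:", tokens)
print("attention weights per agent:", attn_weights)
```

In this toy setup the `tokens` list is the part a human can read directly (one symbol per agent), and each row of `attn_weights` shows how much one agent attends to every agent's message, including its own.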

Code & Models

Repositories

mannanb/diat
PyTorch · Official

Taxonomy

Topics: Multi-Agent Systems and Negotiation · Natural Language Processing Techniques · Semantic Web and Ontologies