Know your audience: specializing grounded language models with listener subtraction

Aaditya K. Singh; David Ding; Andrew Saxe; Felix Hill; Andrew K. Lampinen

arXiv:2206.08349 · cs.LG · May 3, 2023 · 1 citation

Open Access

TL;DR

This paper introduces a method for adapting grounded language models to specific audiences by exploiting differences in listener knowledge, enabling context-dependent language specialization without direct supervision.

Contribution

The paper presents a novel contrastive multi-agent training approach that fine-tunes a speaker model to adapt to different listeners' knowledge using rewards only.

Findings

01. Through training, the speaker learns to adapt its descriptions to what each listener knows.

02. The listener-specific language transfers zero-shot to real-world data.

03. The method lets language models specialize without explicit supervision.

Abstract

Effective communication requires adapting to the idiosyncrasies of each communicative context, such as the common ground shared with each partner. Humans demonstrate this ability to specialize to their audience in many contexts, such as the popular game Dixit. We take inspiration from Dixit to formulate a multi-agent image reference game where a (trained) speaker model is rewarded for describing a target image such that one (pretrained) listener model can correctly identify it among distractors, but another listener cannot. To adapt, the speaker must exploit differences in the knowledge it shares with the different listeners. We show that finetuning an attention-based adapter between a CLIP vision encoder and a large language model in this contrastive, multi-agent setting gives rise to context-dependent natural language specialization from rewards only, without direct supervision.…
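
The reward structure described in the abstract can be illustrated with a minimal sketch. The abstract says only that the speaker is rewarded when one listener identifies the target and the other does not; the exact formula below (correctness of listener A minus correctness of listener B) is an assumption suggested by the phrase "listener subtraction" in the title, not the paper's confirmed definition. The function names and the per-image score lists are hypothetical stand-ins for the outputs of the pretrained listener models.

```python
from typing import Sequence


def listener_choice(scores: Sequence[float]) -> int:
    """Return the index of the candidate image a listener scores highest.

    `scores` is a hypothetical list of caption-to-image match scores,
    one per candidate (the target plus the distractors).
    """
    return max(range(len(scores)), key=lambda i: scores[i])


def contrastive_reward(
    scores_a: Sequence[float],
    scores_b: Sequence[float],
    target: int,
) -> float:
    """Assumed 'listener subtraction' reward for a single caption.

    +1.0 when only listener A picks the target, -1.0 when only
    listener B does, and 0.0 when both or neither pick it.
    """
    correct_a = listener_choice(scores_a) == target
    correct_b = listener_choice(scores_b) == target
    return float(correct_a) - float(correct_b)


# Listener A picks image 1 (the target); listener B picks image 0.
reward = contrastive_reward([0.1, 0.7, 0.2], [0.5, 0.3, 0.2], target=1)
```

Under this assumed form, a caption that both listeners understand earns nothing, so the speaker is pushed toward descriptions that rely on knowledge only the intended listener has.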

Taxonomy

Topics: Topic Modeling · Speech and dialogue systems · Multimodal Machine Learning Applications

Methods: Adapter · Contrastive Language-Image Pre-training