Countering Language Drift via Visual Grounding
Jason Lee, Kyunghyun Cho, Douwe Kiela

TL;DR
This paper studies how to keep pre-trained communicating agents from drifting away from natural language when they are fine-tuned with a non-linguistic task reward. It combines a syntactic constraint (language model likelihood) with a semantic constraint (visual grounding) so that messages retain English syntax while still conveying the intended meaning.
Contribution
It recasts translation as a multi-agent communication game and shows that combining a syntactic constraint (language model likelihood) with a semantic constraint (visual grounding) mitigates language drift in pre-trained agents.
Findings
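Agents pretrained to produce natural language can still drift away from it when trained with a non-linguistic reward such as a scalar success metric.
Combining the syntactic (language model likelihood) and semantic (visual grounding) constraints gives the best communication performance among the auxiliary constraints examined.
With both constraints, pre-trained agents retain English syntax while learning to convey the intended meaning accurately.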
Abstract
Emergent multi-agent communication protocols are very different from natural language and not easily interpretable by humans. We find that agents that were initially pretrained to produce natural language can also experience detrimental language drift: when a non-linguistic reward is used in a goal-based task, e.g. some scalar success metric, the communication protocol may easily and radically diverge from natural language. We recast translation as a multi-agent communication game and examine auxiliary training constraints for their effectiveness in mitigating language drift. We show that a combination of syntactic (language model likelihood) and semantic (visual grounding) constraints gives the best communication performance, allowing pre-trained agents to retain English syntax while learning to accurately convey the intended meaning.
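As a rough illustration of the idea in the abstract, the sketch below combines a task-reward term with the two auxiliary constraints in a single scalar loss. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the REINFORCE-style surrogate, the additive weighting, and the default weights `lm_weight` and `ground_weight` are all hypothetical and do not come from the paper.

```python
import math


def reinforce_loss(token_log_probs, reward, baseline=0.0):
    """Task term (assumed REINFORCE surrogate): scale the log-probability
    of the sampled message by the baseline-adjusted scalar reward."""
    return -(reward - baseline) * sum(token_log_probs)


def lm_penalty(lm_token_log_probs):
    """Syntactic constraint: mean negative log-likelihood of the message
    under a (hypothetical) pretrained language model."""
    return -sum(lm_token_log_probs) / len(lm_token_log_probs)


def grounding_penalty(image_scores, target_index):
    """Semantic constraint: cross-entropy of selecting the target image
    among candidates, given message-to-image similarity scores."""
    max_score = max(image_scores)
    # Numerically stable log-sum-exp over the candidate scores.
    log_z = max_score + math.log(sum(math.exp(s - max_score) for s in image_scores))
    return log_z - image_scores[target_index]


def combined_loss(token_log_probs, reward, lm_token_log_probs,
                  image_scores, target_index,
                  lm_weight=0.1, ground_weight=0.1, baseline=0.0):
    """Weighted sum of the task term and both auxiliary constraints.
    The additive form and the weights are assumptions for illustration."""
    return (reinforce_loss(token_log_probs, reward, baseline)
            + lm_weight * lm_penalty(lm_token_log_probs)
            + ground_weight * grounding_penalty(image_scores, target_index))


if __name__ == "__main__":
    loss = combined_loss(
        token_log_probs=[-1.0, -2.0],     # sender's log-probs for its sampled tokens
        reward=1.0,                       # scalar task success signal
        lm_token_log_probs=[-1.0, -3.0],  # same tokens scored by the language model
        image_scores=[0.0, 0.0],          # message-to-image similarities
        target_index=0,
    )
    print(f"combined loss: {loss:.4f}")
```

Setting `lm_weight` or `ground_weight` to zero in this sketch drops the corresponding constraint, which is one simple way to compare a reward-only objective against the constrained variants.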
