Learning to Play Guess Who? and Inventing a Grounded Language as a Consequence
Emilio Jorge, Mikael Kågebäck, Fredrik D. Johansson, Emil Gustavsson

TL;DR
This paper introduces a method where agents learn to communicate and develop a grounded language through interactive gameplay in Guess Who?, demonstrating the emergence of physical concept encoding and multi-step dialogue management.
Contribution
It presents a novel framework combining Deep Recurrent Q-Networks and situated interactions for grounded language learning in a game environment.
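To make the recurrent Q-network idea concrete, here is a minimal sketch in plain Python. It is not the authors' architecture: the Elman-style cell, the layer sizes, the one-hot message encoding, and all names (`RecurrentQCell`, `run_dialogue`, `epsilon_greedy`) are assumptions made for illustration, and training is omitted entirely. It only shows how a hidden state carried across dialogue turns lets the Q-values for the next message depend on everything heard so far.

```python
import math
import random


class RecurrentQCell:
    """Toy Elman-style recurrent Q-function (hypothetical, untrained)."""

    def __init__(self, obs_size, hidden_size, num_actions, seed=0):
        rng = random.Random(seed)

        def mat(rows, cols):
            # Small random weights; a real agent would learn these.
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        self.w_in = mat(hidden_size, obs_size)
        self.w_rec = mat(hidden_size, hidden_size)
        self.w_out = mat(num_actions, hidden_size)
        self.hidden_size = hidden_size

    def initial_state(self):
        return [0.0] * self.hidden_size

    def step(self, obs, hidden):
        # New memory mixes the current observation with the previous memory.
        new_hidden = [
            math.tanh(
                sum(w * x for w, x in zip(self.w_in[i], obs))
                + sum(w * h for w, h in zip(self.w_rec[i], hidden))
            )
            for i in range(self.hidden_size)
        ]
        # One Q-value per possible action (here: per message token).
        q_values = [sum(w * h for w, h in zip(row, new_hidden)) for row in self.w_out]
        return q_values, new_hidden


def epsilon_greedy(q_values, epsilon, rng):
    # Explore with probability epsilon, otherwise pick the best Q-value.
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]


def run_dialogue(cell, received_tokens, vocab_size, epsilon=0.0, seed=0):
    """Feed received message tokens one turn at a time; return chosen replies."""
    rng = random.Random(seed)
    hidden = cell.initial_state()
    chosen = []
    for token in received_tokens:
        q_values, hidden = cell.step(one_hot(token, vocab_size), hidden)
        chosen.append(epsilon_greedy(q_values, epsilon, rng))
    return chosen, hidden


# Usage: a three-turn exchange over a hypothetical 4-token vocabulary.
cell = RecurrentQCell(obs_size=4, hidden_size=8, num_actions=4)
actions, final_hidden = run_dialogue(cell, [0, 2, 1], vocab_size=4)
```

Because `hidden` is threaded through every call to `step`, the reply chosen at the third turn can depend on the first two messages, which is the property a multi-step dialogue needs.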
Findings
Agents successfully encode physical concepts in their language.
Agents learn to maintain multi-step dialogues with memory.
Grounded language emerges from interactive image search tasks.
Abstract
Acquiring your first language is an incredible feat and not easily duplicated. Learning to communicate using nothing but a few pictureless books, a corpus, would likely be impossible even for humans. Nevertheless, this is the dominating approach in most natural language processing today. As an alternative, we propose the use of situated interactions between agents as a driving force for communication, and the framework of Deep Recurrent Q-Networks for evolving a shared language grounded in the provided environment. We task the agents with interactive image search in the form of the game Guess Who?. The images from the game provide a non-trivial environment for the agents to discuss and a natural grounding for the concepts they decide to encode in their communication. Our experiments show that the agents learn not only to encode physical concepts in their words, i.e., grounding, but also…
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques
