An Empirical Study on the Generalization Power of Neural Representations Learned via Visual Guessing Games
Alessandro Suglia, Yonatan Bisk, Ioannis Konstas, Antonio Vergari, Emanuele Bastianelli, Andrea Vanzo, Oliver Lemon

TL;DR
This paper studies how well neural representations learned by playing visual guessing games generalize to downstream NLP tasks such as Visual Question Answering (VQA). It introduces a supervised method and a self-play method that improve both task performance and object representations.
Contribution
It introduces Self-play via Iterated Experience Learning (SPIEL) and demonstrates its effectiveness in improving generalization for VQA tasks.
Findings
On CompGuessWhat?!, in-domain accuracy increased by 7.79 points compared with competitors.
On TDIUC, VQA harmonic average accuracy improved by 5.31 points.
SPIEL enhances fine-grained object representations.
Abstract
Guessing games are a prototypical instance of the "learning by interacting" paradigm. This work investigates how well an artificial agent can benefit from playing guessing games when later asked to perform on novel NLP downstream tasks such as Visual Question Answering (VQA). We propose two ways to exploit playing guessing games: 1) a supervised learning scenario in which the agent learns to mimic successful guessing games and 2) a novel way for an agent to play by itself, called Self-play via Iterated Experience Learning (SPIEL). We evaluate the ability of both procedures to generalize: an in-domain evaluation shows an increased accuracy (+7.79) compared with competitors on the evaluation suite CompGuessWhat?!; a transfer evaluation shows improved performance for VQA on the TDIUC dataset in terms of harmonic average accuracy (+5.31) thanks to more fine-grained object representations…
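The transfer result above is reported as a harmonic average accuracy. As a minimal sketch of what that metric measures, the snippet below computes the harmonic mean of per-question-type accuracies and compares it with the arithmetic mean. The question-type names and accuracy values are hypothetical placeholders chosen for illustration; they are not results from the paper, and the function name `harmonic_average_accuracy` is an assumption rather than part of any evaluation toolkit.

```python
from statistics import harmonic_mean, mean

# Hypothetical per-question-type accuracies in percent.
# These are illustrative values only, not numbers reported in the paper.
per_type_accuracy = {
    "color": 80.0,
    "counting": 50.0,
    "positional_reasoning": 20.0,
}


def harmonic_average_accuracy(acc_by_type):
    """Return the harmonic mean of the per-type accuracies.

    The harmonic mean is dominated by the weakest question types,
    so a model cannot score well by excelling only on the easy ones.
    """
    values = list(acc_by_type.values())
    if not values:
        raise ValueError("need at least one question type")
    return harmonic_mean(values)


arithmetic = mean(per_type_accuracy.values())
harmonic = harmonic_average_accuracy(per_type_accuracy)

print(f"arithmetic mean: {arithmetic:.2f}")
print(f"harmonic mean:   {harmonic:.2f}")
```

With these placeholder values the harmonic mean falls well below the arithmetic mean, because the low score on one question type pulls it down. That sensitivity is why an improvement in the harmonic average suggests gains on the harder question types and not only on the frequent, easy ones.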
