CompGuessWhat?!: A Multi-task Evaluation Framework for Grounded Language Learning
Alessandro Suglia, Ioannis Konstas, Andrea Vanzo, Emanuele Bastianelli, Desmond Elliott, Stella Frank, Oliver Lemon

TL;DR
This paper introduces GROLLA, a comprehensive evaluation framework with three sub-tasks for assessing grounded language learning models, and presents the CompGuessWhat?! dataset to evaluate attribute grounding and generalization in neural representations.
Contribution
It proposes a multi-task evaluation framework and a new dataset to better assess the quality and generalization of grounded language learning models.
Findings
Current models have limited ability to encode object attributes (average F1 44.27).
Models struggle with zero-shot generalization, achieving only 50.06% accuracy.
The framework highlights the need for more expressive and robust representations.
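As an illustration of how an average F1 over predicted object attributes could be computed, here is a minimal sketch. It assumes each object has a set of gold attributes and a set of predicted attributes, and that F1 is computed per object and then averaged. The summary above does not state the averaging scheme used in the paper, so that choice, the function names, and the example attribute names are all hypothetical.

```python
from typing import Sequence, Set


def f1_score(predicted: Set[str], gold: Set[str]) -> float:
    """F1 between the predicted and gold attribute sets of one object."""
    if not predicted and not gold:
        # Nothing to predict and nothing predicted: treat as a perfect match.
        return 1.0
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def mean_f1(predictions: Sequence[Set[str]], golds: Sequence[Set[str]]) -> float:
    """Average of per-object F1 scores (an assumed averaging scheme)."""
    if len(predictions) != len(golds):
        raise ValueError("predictions and golds must have the same length")
    if not predictions:
        return 0.0
    total = sum(f1_score(p, g) for p, g in zip(predictions, golds))
    return total / len(predictions)


# Hypothetical attribute sets, for illustration only.
example_predictions = [{"red", "round"}, {"wooden"}]
example_golds = [{"red", "metal"}, {"wooden"}]
example_mean = mean_f1(example_predictions, example_golds)
```

In this example the first object shares one of two attributes with the gold set in each direction (precision and recall are both 0.5) and the second matches exactly, so the per-object scores are 0.5 and 1.0.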
Abstract
Approaches to Grounded Language Learning typically focus on a single task-based final performance measure that may not depend on desirable properties of the learned hidden representations, such as their ability to predict salient attributes or to generalise to unseen situations. To remedy this, we present GROLLA, an evaluation framework for Grounded Language Learning with Attributes with three sub-tasks: 1) Goal-oriented evaluation; 2) Object attribute prediction evaluation; and 3) Zero-shot evaluation. We also propose a new dataset CompGuessWhat?! as an instance of this framework for evaluating the quality of learned neural representations, in particular concerning attribute grounding. To this end, we extend the original GuessWhat?! dataset by including a semantic layer on top of the perceptual one. Specifically, we enrich the VisualGenome scene graphs associated with the GuessWhat?!…
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Natural Language Processing Techniques
