The red one!: On learning to refer to things based on their discriminative properties
Angeliki Lazaridou, Nghia The Pham, Marco Baroni

TL;DR
This paper introduces a system that, given visual representations of two objects, learns to identify the attributes that distinguish them without direct attribute-level supervision, as a first step towards agents that can communicate about their visual environment.
Contribution
It presents an approach that learns to predict the discriminative attributes of a referent and a context object without explicit attribute-level supervision, and it tests whether the predicted attributes succeed at reference.
Findings
The system successfully identifies discriminative properties between objects.
It learns plausible attribute assignments without direct supervision.
A preliminary experiment confirms the referential success of the predicted discriminative attributes.
Abstract
As a first step towards agents learning to communicate about their visual environment, we propose a system that, given visual representations of a referent (cat) and a context (sofa), identifies their discriminative attributes, i.e., properties that distinguish them (has_tail). Moreover, despite the lack of direct supervision at the attribute level, the model learns to assign plausible attributes to objects (sofa-has_cushion). Finally, we present a preliminary experiment confirming the referential success of the predicted discriminative attributes.
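To make the notion of a discriminative attribute concrete, here is a minimal sketch. It assumes that "properties that distinguish them" means attributes holding for exactly one of the two objects, and it uses hand-written attribute sets. Both are assumptions for illustration only: the system described in the abstract predicts such attributes from visual representations, not from explicit attribute lists, and the function name and attribute sets below are hypothetical.

```python
def discriminative_attributes(referent_attrs, context_attrs):
    """Return the attributes that hold for exactly one of the two objects.

    Assumption: "discriminative" is read as the symmetric difference of the
    two attribute sets. This illustrates the target notion, not the paper's model.
    """
    return set(referent_attrs) ^ set(context_attrs)


# Hypothetical, hand-written attribute sets (not taken from the paper's data).
cat = {"has_tail", "has_legs", "is_furry"}
sofa = {"has_cushion", "has_legs"}

print(sorted(discriminative_attributes(cat, sofa)))
```

Under these assumed sets, has_legs is shared and so drops out, while has_tail, is_furry and has_cushion remain as candidates for telling the cat and the sofa apart.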
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Speech and dialogue systems
