Learning Functional Distributional Semantics with Visual Data
Yinhong Liu, Guy Emerson

TL;DR
This paper introduces a method for training a linguistically interpretable distributional semantics model on grounded visual data from the Visual Genome dataset.
Contribution
It presents an approach to training Functional Distributional Semantics models on visual data, keeping the framework's linguistic interpretability while outperforming previous work on learning semantics from Visual Genome.
Findings
Outperforms previous work on learning semantics from Visual Genome on four external evaluation datasets
Shows that word meanings, modeled as binary classifiers, can be grounded in visual data
Trains on Visual Genome, which is closer to the data encountered in human language acquisition than a large text corpus
Abstract
Functional Distributional Semantics is a recently proposed framework for learning distributional semantics that provides linguistic interpretability. It models the meaning of a word as a binary classifier rather than a numerical vector. In this work, we propose a method to train a Functional Distributional Semantics model with grounded visual data. We train it on the Visual Genome dataset, which is closer to the kind of data encountered in human language acquisition than a large text corpus. On four external evaluation datasets, our model outperforms previous work on learning semantics from Visual Genome.
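To make the "word as a binary classifier" idea concrete, here is a minimal sketch. It is not the paper's model: it assumes each entity is a small feature vector and each word is a logistic classifier over those features. The class name `WordClassifier`, the two-dimensional features, and all weights are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass
from typing import Sequence


@dataclass
class WordClassifier:
    """Hypothetical stand-in for a word's meaning: a function from an
    entity representation to the probability that the word is true of it."""
    weights: Sequence[float]
    bias: float

    def prob_true(self, entity: Sequence[float]) -> float:
        # Logistic regression: sigmoid of a weighted sum of entity features.
        score = sum(w * x for w, x in zip(self.weights, entity)) + self.bias
        return 1.0 / (1.0 + math.exp(-score))

    def applies_to(self, entity: Sequence[float], threshold: float = 0.5) -> bool:
        # Binary decision: is the word true of this entity?
        return self.prob_true(entity) >= threshold


# Toy, hand-picked parameters (assumptions, not learned values).
dog = WordClassifier(weights=[2.0, -1.0], bias=-0.5)

# Toy entity features: [furriness, metallicness] (illustrative only).
furry_entity = [1.0, 0.0]
metal_entity = [0.0, 1.0]

print(dog.applies_to(furry_entity))
print(dog.applies_to(metal_entity))
```

In contrast to a vector-space model, where a word is a point to be compared by similarity, the word here is a function that can be queried directly about a specific entity, which is the sense in which the abstract contrasts a binary classifier with a numerical vector.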
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Natural Language Processing Techniques
