Grounded Textual Entailment

Hoa Trong Vu; Claudio Greco; Aliia Erofeeva; Somayeh Jafaritazehjan; Guido Linders; Marc Tanti; Alberto Testoni; Raffaella Bernardi; Albert Gatt

arXiv:1806.05645 · cs.CL · June 15, 2018 · 1 citation

TL;DR

This paper explores whether adding visual information to textual entailment models improves their performance, using a multimodal version of the SNLI dataset and comparing "blind" models with visually-augmented ones.

Contribution

It introduces a visually-grounded version of the textual entailment task and demonstrates that visual data can enhance model performance, highlighting current limitations.

Findings

1. Visual information improves entailment model accuracy.
2. Current multimodal models do not optimally ground visual context.
3. Error analysis reveals areas for model improvement.

Abstract

Capturing semantic relations between sentences, such as entailment, is a long-standing challenge for computational semantics. Logic-based models analyse entailment in terms of possible worlds (interpretations, or situations) where a premise P entails a hypothesis H iff in all worlds where P is true, H is also true. Statistical models view this relationship probabilistically, addressing it in terms of whether a human would likely infer H from P. In this paper, we wish to bridge these two perspectives, by arguing for a visually-grounded version of the Textual Entailment task. Specifically, we ask whether models can perform better if, in addition to P and H, there is also an image (corresponding to the relevant "world" or "situation"). We use a multimodal version of the SNLI dataset (Bowman et al., 2015) and we compare "blind" and visually-augmented models of textual entailment. We show…
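The logic-based definition in the abstract (P entails H iff H is true in every world where P is true) can be illustrated with a small sketch. This is not the paper's method: it is a toy propositional check, and the atoms, the `entails` function, and the example sentences are all hypothetical names chosen for illustration.

```python
from itertools import product

def entails(premise, hypothesis, atoms):
    """Return True iff hypothesis holds in every world where premise holds.

    A "world" here is an assumption of this sketch: a dict assigning
    True/False to each atomic proposition in `atoms`.
    """
    for values in product([False, True], repeat=len(atoms)):
        world = dict(zip(atoms, values))
        # A counterexample is a world where P is true but H is false.
        if premise(world) and not hypothesis(world):
            return False
    return True

# Hypothetical atoms standing in for sentence meanings.
atoms = ["dog_runs", "dog_is_outside"]

# P: "A dog runs outside."   H: "A dog runs."
premise = lambda w: w["dog_runs"] and w["dog_is_outside"]
hypothesis = lambda w: w["dog_runs"]

p_entails_h = entails(premise, hypothesis, atoms)
h_entails_p = entails(hypothesis, premise, atoms)
print(p_entails_h, h_entails_p)
```

In this toy setting P entails H, but H does not entail P, because there is a world where the dog runs and is not outside. The statistical view described in the abstract replaces this exhaustive check with a judgement about whether a human would likely infer H from P.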

Code & Models

Repositories

claudiogreco/coling18-gte · tf · Official

Taxonomy

Topics: Multimodal Machine Learning Applications · Natural Language Processing Techniques · Topic Modeling