Phrase Grounding by Soft-Label Chain Conditional Random Field

Jiacheng Liu; Julia Hockenmaier

arXiv:1909.00301·cs.CL·September 4, 2019

Open Access · 1 Repo

TL;DR

This paper introduces Soft-Label Chain CRFs for phrase grounding, modeling dependencies among image regions and handling multiple correct labels, achieving state-of-the-art results on Flickr30k Entities.

Contribution

It proposes a novel Soft-Label Chain CRF framework for phrase grounding, enabling end-to-end training and capturing entity dependencies effectively.

Findings

01 The model achieves state-of-the-art performance on Flickr30k Entities.

02 The model benefits from both entity dependency modeling and soft-label training.

03 Soft-Label Chain CRFs improve grounding accuracy.

Abstract

The phrase grounding task aims to ground each entity mention in a given caption of an image to a corresponding region in that image. Although there are clear dependencies between how different mentions of the same caption should be grounded, previous structured prediction methods that aim to capture such dependencies need to resort to approximate inference or non-differentiable losses. In this paper, we formulate phrase grounding as a sequence labeling task where we treat candidate regions as potential labels, and use neural chain Conditional Random Fields (CRFs) to model dependencies among regions for adjacent mentions. In contrast to standard sequence labeling tasks, the phrase grounding task is defined such that there may be multiple correct candidate regions. To address this multiplicity of gold labels, we define so-called Soft-Label Chain CRFs, and present an algorithm that enables…
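The abstract is cut off before it describes the training algorithm, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each mention carries a soft target distribution over its candidate regions, that these targets are independent across mentions, and that the loss is the cross-entropy between the targets and the chain CRF distribution. All names (`unary`, `pairwise`, `soft`, `soft_label_loss`) and all scores are hypothetical and chosen for illustration.

```python
import math


def logsumexp(xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def log_partition(unary, pairwise):
    """Forward algorithm: log of the summed exp-scores of all region sequences.

    unary[t][j]       score of region j for mention t
    pairwise[t][i][j] score of region i at mention t followed by j at t + 1
    """
    alpha = list(unary[0])
    for t in range(1, len(unary)):
        alpha = [
            unary[t][j]
            + logsumexp([alpha[i] + pairwise[t - 1][i][j]
                         for i in range(len(alpha))])
            for j in range(len(unary[t]))
        ]
    return logsumexp(alpha)


def expected_score(unary, pairwise, soft):
    """Expected sequence score when mention t's region is drawn from soft[t].

    Assumes the soft targets are independent across mentions.
    """
    total = 0.0
    for t in range(len(unary)):
        total += sum(q * u for q, u in zip(soft[t], unary[t]))
    for t in range(len(unary) - 1):
        for i, qi in enumerate(soft[t]):
            for j, qj in enumerate(soft[t + 1]):
                total += qi * qj * pairwise[t][i][j]
    return total


def soft_label_loss(unary, pairwise, soft):
    """Cross-entropy between the soft targets and the chain CRF."""
    return log_partition(unary, pairwise) - expected_score(unary, pairwise, soft)


# Toy caption with two mentions and three candidate regions each.
unary = [[2.0, 0.5, 0.1], [0.3, 1.5, 1.4]]
pairwise = [[[0.0, 0.2, -0.1],
             [0.1, 0.0, 0.3],
             [-0.2, 0.4, 0.0]]]
# Mention 1 has one correct region; mention 2 has two equally good ones.
soft = [[1.0, 0.0, 0.0], [0.0, 0.5, 0.5]]
loss = soft_label_loss(unary, pairwise, soft)
```

When every target is one-hot, this loss reduces to the usual chain CRF negative log-likelihood, which is why it serves as a natural illustration of handling multiple correct regions per mention.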

Code & Models

Repositories

liujch1998/SoftLabelCCRF (PyTorch, official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Natural Language Processing Techniques · Topic Modeling

Methods: Conditional Random Field