Query-guided Regression Network with Context Policy for Phrase Grounding

Kan Chen; Rama Kovvuri; Ram Nevatia

arXiv:1708.01676·cs.CV·August 8, 2017

TL;DR

This paper introduces QRC Net, a framework that jointly learns proposal generation, query-guided box regression, and a context policy to improve phrase grounding accuracy.

Contribution

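QRC Net jointly learns a Proposal Generation Network (PGN), a Query-guided Regression Network (QRN), and a Context Policy Network (CPN). Spatial regression removes the ceiling imposed by an independent proposal generator, and the context policy draws on cues from the rest of the description.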

Findings

01

Improves accuracy by 14.25% on Flickr30K Entities and by 17.14% on Referit.

02

Outperforms state-of-the-art methods in phrase grounding tasks.

03

Uses reinforcement learning, through the Context Policy Network, to exploit semantic context in the description.

Abstract

Given a textual description of an image, phrase grounding localizes objects in the image referred by query phrases in the description. State-of-the-art methods address the problem by ranking a set of proposals based on the relevance to each query, which are limited by the performance of independent proposal generation systems and ignore useful cues from context in the description. In this paper, we adopt a spatial regression method to break the performance limit, and introduce reinforcement learning techniques to further leverage semantic context information. We propose a novel Query-guided Regression network with Context policy (QRC Net) which jointly learns a Proposal Generation Network (PGN), a Query-guided Regression Network (QRN) and a Context Policy Network (CPN). Experiments show QRC Net provides a significant improvement in accuracy on two popular datasets: Flickr30K Entities…
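The contrast the abstract draws between ranking proposals and regressing a box can be illustrated with a small sketch. This is not the authors' implementation: the center/size offset parameterization `(dx, dy, dw, dh)`, the helper names `iou`, `apply_offsets`, and `ground_phrase`, and the toy boxes and scores are all assumptions made for illustration. The sketch only shows why a ranking-only method is capped by proposal quality, while a regression step can move the chosen proposal onto the referred object.

```python
import math

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def apply_offsets(box, offsets):
    """Shift and rescale a box by (dx, dy, dw, dh).

    Assumed parameterization: dx, dy move the center in units of box
    width/height; dw, dh scale the size in log space.
    """
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    cx, cy = x1 + 0.5 * w, y1 + 0.5 * h
    dx, dy, dw, dh = offsets
    cx, cy = cx + dx * w, cy + dy * h
    w, h = w * math.exp(dw), h * math.exp(dh)
    return (cx - 0.5 * w, cy - 0.5 * h, cx + 0.5 * w, cy + 0.5 * h)

def ground_phrase(proposals, scores, offsets):
    """Pick the highest-scoring proposal, then refine it by regression.

    Returns (ranked_only_box, refined_box) so the two can be compared.
    """
    best = max(range(len(proposals)), key=lambda i: scores[i])
    return proposals[best], apply_offsets(proposals[best], offsets[best])

# Toy data (hypothetical): neither proposal overlaps the target well.
target = (10.0, 10.0, 50.0, 50.0)
proposals = [(0.0, 0.0, 40.0, 40.0), (60.0, 60.0, 90.0, 90.0)]
scores = [0.9, 0.1]                      # query relevance per proposal
offsets = [(0.25, 0.25, 0.0, 0.0),       # predicted shift for proposal 0
           (0.0, 0.0, 0.0, 0.0)]

ranked, refined = ground_phrase(proposals, scores, offsets)
print(round(iou(ranked, target), 3), round(iou(refined, target), 3))
```

In this toy case the best-ranked proposal overlaps the target with an IoU below 0.5, while the regressed box matches it exactly. In a trained system the scores and offsets would come from networks conditioned on the query phrase; here they are hand-set constants.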
