Progressive Attention Networks for Visual Attribute Prediction

Paul Hongsuck Seo; Zhe Lin; Scott Cohen; Xiaohui Shen; Bohyung Han

arXiv:1606.02393 · cs.CV · August 8, 2018 · 32 cites

TL;DR

This paper introduces a progressive attention model for visual attribute prediction that gradually suppresses irrelevant image regions across multiple layers of a convolutional neural network, and it outperforms traditional attention methods on synthetic and real datasets.

Contribution

It presents a novel progressive attention mechanism that effectively attends to objects of various scales and shapes, incorporating local context for enhanced performance.

Findings

01 Outperforms traditional attention methods on synthetic and real datasets

02 Works especially well when combined with hard attention

03 Improves accuracy in visual attribute prediction tasks

Abstract

We propose a novel attention model that can accurately attend to target objects of various scales and shapes in images. The model is trained to gradually suppress irrelevant regions in an input image via a progressive attentive process over multiple layers of a convolutional neural network. The attentive process in each layer determines whether to pass or block features at certain spatial locations for use in the subsequent layers. The proposed progressive attention mechanism works especially well when combined with hard attention. We further employ local contexts to incorporate neighborhood features of each location and estimate a better attention probability map. Experiments on synthetic and real datasets show that the proposed attention networks outperform traditional attention methods in visual attribute prediction tasks.
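The layer-by-layer gating described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `progressive_attention`, the use of a per-location linear map as a stand-in for each convolutional layer, the linear scoring vectors, and the random weights are all assumptions made for illustration. It only shows the forward pass, in which each layer computes an attention probability per spatial location and then passes (or blocks) features at that location before the next layer. Local context and the training procedure for hard attention are not covered.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def progressive_attention(feat, layer_weights, score_weights, hard=False, rng=None):
    """Hypothetical forward pass of layer-wise attention gating.

    feat:          (H, W, C) feature map.
    layer_weights: list of (C_in, C_out) matrices, one per layer
                   (assumed stand-in for real convolutional layers).
    score_weights: list of (C_out,) vectors that score each location
                   (assumed form of the attention function).
    hard:          if True, sample a binary pass/block mask from the
                   attention probabilities instead of scaling by them.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    gates = []
    for W, v in zip(layer_weights, score_weights):
        # Per-location linear map followed by ReLU.
        feat = np.maximum(feat @ W, 0.0)
        # Attention probability for every spatial location, shape (H, W).
        prob = sigmoid(feat @ v)
        if hard:
            # Hard attention: each location is either passed (1) or blocked (0).
            gate = (rng.random(prob.shape) < prob).astype(feat.dtype)
        else:
            # Soft attention: features are scaled by the probability.
            gate = prob
        # Suppress features before they reach the next layer.
        feat = feat * gate[..., None]
        gates.append(gate)
    return feat, gates


# Toy inputs with random weights (illustrative only).
rng = np.random.default_rng(0)
feat = rng.normal(size=(8, 8, 4))
layer_weights = [rng.normal(size=(4, 4)) for _ in range(3)]
score_weights = [rng.normal(size=4) for _ in range(3)]

soft_out, soft_gates = progressive_attention(feat, layer_weights, score_weights)
hard_out, hard_gates = progressive_attention(
    feat, layer_weights, score_weights, hard=True, rng=rng
)
```

Because every layer multiplies its output by a gate in [0, 1], a location that receives low probability early on contributes little to all later layers, which is the sense in which the suppression is progressive. In the hard variant the gate is sampled, so it is not differentiable, and training it would need a technique suited to stochastic units that this sketch does not include.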

Code & Models

Repositories

hworang77/PAN (Official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Visual Attention and Saliency Detection