Learning Generative Models with Visual Attention

Yichuan Tang; Nitish Srivastava; Ruslan Salakhutdinov

arXiv:1312.6110·cs.CV·February 24, 2015·86 cites

TL;DR

This paper introduces a generative modeling framework that uses visual attention mechanisms to focus on objects within scenes, improving robustness and learning capabilities for complex, unannotated datasets.

Contribution

It proposes a novel attentional generative model that propagates signals from regions of interest to a canonical space, enabling object-centric learning without explicit location annotations.
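The mapping from a region of interest to a canonical space can be illustrated with a 2D similarity transform (scale, rotation, translation). The following is a minimal NumPy sketch, not the authors' implementation: the function names `similarity_matrix` and `attend`, the canonical-to-image direction of the mapping, the bilinear sampling, and the 24x24 patch size are all assumptions made for illustration.

```python
import numpy as np

def similarity_matrix(scale, theta, tx, ty):
    """2x3 matrix mapping canonical (x, y, 1) coordinates to image coordinates.
    Hypothetical parameterization: uniform scale, rotation theta, translation."""
    c, s = scale * np.cos(theta), scale * np.sin(theta)
    return np.array([[c, -s, tx],
                     [s,  c, ty]])

def attend(image, scale, theta, tx, ty, out_size=(24, 24)):
    """Sample an out_size patch from a 2D `image` under a similarity transform,
    using bilinear interpolation. Coordinates outside the image are clamped."""
    h, w = out_size
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    canon = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])  # shape (3, h*w)
    x_img, y_img = similarity_matrix(scale, theta, tx, ty) @ canon

    H, W = image.shape
    x_img = np.clip(x_img, 0, W - 1)
    y_img = np.clip(y_img, 0, H - 1)
    x0 = np.floor(x_img).astype(int)
    y0 = np.floor(y_img).astype(int)
    x1 = np.minimum(x0 + 1, W - 1)
    y1 = np.minimum(y0 + 1, H - 1)
    wx, wy = x_img - x0, y_img - y0

    # Blend the four neighbouring pixels.
    top = (1 - wx) * image[y0, x0] + wx * image[y0, x1]
    bottom = (1 - wx) * image[y1, x0] + wx * image[y1, x1]
    return ((1 - wy) * top + wy * bottom).reshape(h, w)

# Usage: with scale 1 and no rotation the patch is a plain crop at (tx, ty).
scene = np.arange(100 * 100, dtype=float).reshape(100, 100)
patch = attend(scene, scale=1.0, theta=0.0, tx=10, ty=20)
```

Because the patch always has the same size regardless of where the object sits in the scene, a downstream generative model only has to explain the aligned patch and not the background.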

Findings

01 Model robustly attends to face regions of new subjects

02 Can learn generative models of faces from large, unannotated images

03 Uses a graphical model with 2D similarity transformations and Hamiltonian Monte Carlo

Abstract

Attention has long been proposed by psychologists as important for effectively dealing with the enormous sensory stimulus available in the neocortex. Inspired by the visual attention models in computational neuroscience and the need for object-centric data for generative models, we describe a generative learning framework using attentional mechanisms. Attentional mechanisms can propagate signals from a region of interest in a scene to an aligned canonical representation, where generative modeling takes place. By ignoring background clutter, generative models can concentrate their resources on the object of interest. Our model is a proper graphical model where the 2D similarity transformation is a part of the top-down process. A ConvNet is employed to provide good initializations during posterior inference, which is based on Hamiltonian Monte Carlo. Upon learning images of faces, our model…
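For readers unfamiliar with Hamiltonian Monte Carlo, the sketch below shows one generic HMC update with a leapfrog integrator. It is not the paper's inference procedure: the target here is a toy standard Gaussian, and the names `leapfrog`, `hmc_step`, `U`, `grad_U`, the step size, and the number of leapfrog steps are assumptions chosen for illustration. In the setting the abstract describes, the starting point `q` would come from the ConvNet initialization and the energy would be defined by the model's posterior over transformation parameters.

```python
import numpy as np

def leapfrog(q, p, grad_U, step, n_steps):
    """Simulate Hamiltonian dynamics for n_steps with the leapfrog scheme."""
    q, p = q.copy(), p.copy()
    p -= 0.5 * step * grad_U(q)          # initial half step for momentum
    for i in range(n_steps):
        q += step * p                    # full step for position
        if i < n_steps - 1:
            p -= step * grad_U(q)        # full step for momentum
    p -= 0.5 * step * grad_U(q)          # final half step for momentum
    return q, p

def hmc_step(q, U, grad_U, rng, step=0.2, n_steps=10):
    """One HMC update: sample a momentum, integrate, then accept or reject."""
    p0 = rng.standard_normal(q.shape)
    q_new, p_new = leapfrog(q, p0, grad_U, step, n_steps)
    h_old = U(q) + 0.5 * p0 @ p0
    h_new = U(q_new) + 0.5 * p_new @ p_new
    # Metropolis correction for the integrator's discretization error.
    if np.log(rng.random()) < h_old - h_new:
        return q_new
    return q

# Toy target (assumption): standard Gaussian, U(q) = 0.5 * ||q||^2.
U = lambda q: 0.5 * q @ q
grad_U = lambda q: q

rng = np.random.default_rng(0)
q = np.array([3.0, -3.0])                # stand-in for a ConvNet-provided start
samples = []
for _ in range(2000):
    q = hmc_step(q, U, grad_U, rng)
    samples.append(q)
samples = np.array(samples)
```

A good starting point matters because the sampler otherwise spends many updates moving from a poor start toward the high-probability region before its samples become useful.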

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Aesthetic Perception and Analysis