Learning Graphical Models of Images, Videos and Their Spatial Transformations

Brendan J. Frey; Nebojsa Jojic

arXiv:1301.3854 · cs.CV · January 18, 2013 · 34 cites

Open Access

TL;DR

This paper introduces transformation-invariant graphical models for images and videos by incorporating discrete transformation variables, enabling robust clustering, dimensionality reduction, and analysis despite spatial transformations.

Contribution

The paper adds a discrete transformation variable to mixtures of Gaussians, factor analyzers and hidden Markov models, so that these models account for spatial transformations of the input such as translation and shearing.

Findings

01 Face and facial pose clustering

02 Handwritten digit modeling and recognition

03 Video clustering, object tracking, and removal of distractions from video sequences

Abstract

Mixtures of Gaussians, factor analyzers (probabilistic PCA) and hidden Markov models are staples of static and dynamic data modeling and image and video modeling in particular. We show how topographic transformations in the input, such as translation and shearing in images, can be accounted for in these models by including a discrete transformation variable. The resulting models perform clustering, dimensionality reduction and time-series analysis in a way that is invariant to transformations in the input. Using the EM algorithm, these transformation-invariant models can be fit to static data and time series. We give results on filtering microscopy images, face and facial pose clustering, handwritten digit modeling and recognition, video clustering, object tracking, and removal of distractions from video sequences.
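
To make the idea of a discrete transformation variable concrete, here is a minimal sketch of EM for a transformation-invariant mixture of Gaussians on 1-D signals. It is not the authors' implementation and simplifies heavily: the only transformations are cyclic shifts, the noise is isotropic with a fixed variance, and the prior over shifts is uniform. All names (`shift`, `e_step`, `m_step`, `fit`) and the default `var` are assumptions made for illustration.

```python
import math
import random


def shift(v, k):
    """Cyclically shift list v to the right by k positions (k may be negative)."""
    k %= len(v)
    return v[-k:] + v[:-k]


def e_step(data, means, var, log_prior):
    """Posterior over (cluster c, shift k) for each signal.

    Assumes x ~ N(shift(mean_c, k), var * I) and a uniform prior over shifts.
    """
    post = []
    for x in data:
        logp = {}
        for c, mu in enumerate(means):
            for k in range(len(x)):
                pred = shift(mu, k)
                sq = sum((a - b) ** 2 for a, b in zip(x, pred))
                logp[(c, k)] = log_prior[c] - 0.5 * sq / var
        # Normalize in a numerically stable way (log-sum-exp).
        m = max(logp.values())
        z = sum(math.exp(v - m) for v in logp.values())
        post.append({ck: math.exp(v - m) / z for ck, v in logp.items()})
    return post


def m_step(data, post, n_clusters):
    """Re-estimate means from inputs that are un-shifted, weighted by the posterior."""
    dim = len(data[0])
    means, weights = [], []
    for c in range(n_clusters):
        acc = [0.0] * dim
        total = 0.0
        for x, p in zip(data, post):
            for k in range(dim):
                r = p[(c, k)]
                aligned = shift(x, -k)  # undo the hypothesized shift
                acc = [a + r * b for a, b in zip(acc, aligned)]
                total += r
        means.append([a / max(total, 1e-12) for a in acc])
        weights.append(total / len(data))
    return means, weights


def fit(data, n_clusters, var=0.05, n_iter=20, seed=0):
    """Run EM, initializing each mean from a randomly chosen data point."""
    rng = random.Random(seed)
    means = [list(x) for x in rng.sample(data, n_clusters)]
    weights = [1.0 / n_clusters] * n_clusters
    post = []
    for _ in range(n_iter):
        log_prior = [math.log(max(w, 1e-12)) for w in weights]
        post = e_step(data, means, var, log_prior)
        means, weights = m_step(data, post, n_clusters)
    return means, weights, post


if __name__ == "__main__":
    template = [0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0]
    signals = [shift(template, k) for k in range(len(template))]
    learned_means, learned_weights, _ = fit(signals, n_clusters=1)
    print(learned_means[0])
```

Because the E-step sums over every shift for every cluster, its cost grows with the number of transformations. In this sketch the learned mean is only defined up to a cyclic shift, since any shifted copy of a template explains the data equally well.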

Taxonomy

Topics: Image Retrieval and Classification Techniques · Face and Expression Recognition · Advanced Image and Video Retrieval Techniques