Discovering Hidden Factors of Variation in Deep Networks
Brian Cheung, Jesse A. Livezey, Arjun K. Bansal, Bruno A. Olshausen

TL;DR
This paper presents a method that lets deep networks discover and explicitly represent underlying factors of variation in data, beyond those needed for classification, by adding a cross-covariance regularization penalty to autoencoder training.
Contribution
The paper introduces the cross-covariance penalty (XCov), a regularization term that disentangles factors of variation in deep autoencoders, making the learned features easier to interpret and manipulate.
Findings
Disentangles handwriting style from digit class on MNIST, and subject identity from the supervised label on face images (TFD, Multi-PIE)
Enables generation of manipulated data instances
Shows that deep networks can extrapolate hidden variation in the supervised signal
Abstract
Deep learning has enjoyed a great deal of success because of its ability to learn useful features for tasks such as classification. But there has been less exploration in learning the factors of variation apart from the classification signal. By augmenting autoencoders with simple regularization terms during training, we demonstrate that standard deep architectures can discover and explicitly represent factors of variation beyond those relevant for categorization. We introduce a cross-covariance penalty (XCov) as a method to disentangle factors like handwriting style for digits and subject identity in faces. We demonstrate this on the MNIST handwritten digit database, the Toronto Faces Database (TFD) and the Multi-PIE dataset by generating manipulated instances of the data. Furthermore, we demonstrate these deep networks can extrapolate 'hidden' variation in the supervised signal.
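The abstract names the cross-covariance penalty without giving its formula. The sketch below shows one plausible form, assuming the penalty is half the sum of squared entries of the batch cross-covariance matrix between the supervised (class) outputs y and the remaining latent units z. The function name `xcov_penalty`, the argument layout, and the 1/2 scaling are assumptions made for illustration, and plain Python lists are used only to keep the example self-contained.

```python
def xcov_penalty(y, z):
    """Half the sum of squared batch cross-covariances between y and z.

    y: list of N rows, each holding the supervised (class) outputs of one example
    z: list of N rows, each holding the latent outputs of the same example
    Assumed form: 0.5 * sum_ij ( (1/N) * sum_n (y_ni - mean_i) * (z_nj - mean_j) )**2
    """
    n = len(y)
    if n == 0 or n != len(z):
        raise ValueError("y and z must be non-empty and have the same number of rows")

    dim_y, dim_z = len(y[0]), len(z[0])

    # Per-unit means over the batch.
    y_mean = [sum(row[i] for row in y) / n for i in range(dim_y)]
    z_mean = [sum(row[j] for row in z) / n for j in range(dim_z)]

    total = 0.0
    for i in range(dim_y):
        for j in range(dim_z):
            # Cross-covariance between class unit i and latent unit j.
            c_ij = sum(
                (y[k][i] - y_mean[i]) * (z[k][j] - z_mean[j]) for k in range(n)
            ) / n
            total += c_ij ** 2
    return 0.5 * total


# A latent unit that copies the class unit is penalized ...
y_batch = [[0.0], [1.0], [2.0], [3.0]]
print(xcov_penalty(y_batch, [[0.0], [1.0], [2.0], [3.0]]))    # 0.78125

# ... while one with zero covariance with the class unit is not.
print(xcov_penalty(y_batch, [[1.0], [-1.0], [-1.0], [1.0]]))  # 0.0
```

In training, a term of this kind would be added to the classification and reconstruction losses, so that the latent units are pushed toward encoding variation (such as handwriting style) that is linearly uncorrelated with the class outputs.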
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Digital Media Forensic Detection
