Understanding Visual Concepts with Continuation Learning
William F. Whitney, Michael Chang, Tejas Kulkarni, Joshua B. Tenenbaum

TL;DR
This paper presents a neural network architecture and learning algorithm that develop factorized symbolic representations of visual concepts by observing sequential frames, enabling better understanding of variations in images and videos.
Contribution
The authors introduce a novel method for learning symbolic representations from video sequences using a gating mechanism to capture factors of variation.
Findings
Effective on face datasets with 3D transformations
Successfully applied to Atari 2600 game frames
Produces interpretable symbolic representations
Abstract
We introduce a neural network architecture and a learning algorithm to produce factorized symbolic representations. We propose to learn these concepts by observing consecutive frames, letting all the components of the hidden representation except a small discrete set (gating units) be predicted from the previous frame, and letting the factors of variation in the next frame be represented entirely by these discrete gated units (corresponding to symbolic representations). We demonstrate the efficacy of our approach on datasets of faces undergoing 3D transformations and Atari 2600 games.
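The gating idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the encoder has already produced hidden codes for two consecutive frames, it simplifies "predicted from the previous frame" to copying the previous value, and it uses a hard top-k gate. The names `gated_mix`, `gate_logits`, and `k` are hypothetical and chosen only for illustration.

```python
import numpy as np

def gated_mix(h_prev, h_next, gate_logits, k=1):
    """Build a hidden code where only k gated units come from the next frame.

    Assumption (not from the paper): ungated units are copied unchanged
    from the previous frame, and the gate is a hard top-k selection.
    """
    h_prev = np.asarray(h_prev, dtype=float)
    h_next = np.asarray(h_next, dtype=float)
    gate_logits = np.asarray(gate_logits, dtype=float)

    # Select the k units with the largest gate scores.
    top = np.argsort(gate_logits)[-k:]
    gate = np.zeros_like(h_prev)
    gate[top] = 1.0

    # Gated units take the next frame's value; all others keep the previous one.
    mixed = gate * h_next + (1.0 - gate) * h_prev
    return mixed, gate

# Toy example: the two frames differ in a single factor (unit 2).
h_prev = np.array([0.2, -1.0, 0.5, 3.0])
h_next = np.array([0.2, -1.0, 1.7, 3.0])
gate_logits = np.array([0.1, 0.0, 2.5, -0.3])

mixed, gate = gated_mix(h_prev, h_next, gate_logits, k=1)
```

In this toy case the gate selects the one unit that changed between frames, so the mixed code matches the next frame's code while everything else is carried over from the previous frame. A trainable version would need a differentiable gate in place of the hard selection used here.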
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging · Face and Expression Recognition
