Learned Equivariant Rendering without Transformation Supervision

Cinjon Resnick; Or Litany; Hugo Larochelle; Joan Bruna; Kyunghyun Cho

arXiv:2011.05787·cs.CV·November 12, 2020

Open Access

TL;DR

This paper introduces a self-supervised method that learns scene representations from video without explicit transformation supervision. After training, the learned scenes can be manipulated and rendered in real time.

Contribution

It presents a novel framework that leverages object equivariance and background constancy to automatically delineate objects and backgrounds in videos.

Findings

01. Effective on moving MNIST with backgrounds

02. Allows real-time scene manipulation

03. No transformation supervision needed

Abstract

We propose a self-supervised framework to learn scene representations from video that are automatically delineated into objects and background. Our method relies on moving objects being equivariant with respect to their transformation across frames and the background being constant. After training, we can manipulate and render the scenes in real time to create unseen combinations of objects, transformations, and backgrounds. We show results on moving MNIST with backgrounds.
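
The two properties the abstract relies on, object equivariance and a constant background, can be illustrated with a small NumPy sketch. This is not the paper's model. It assumes the transformation is a circular pixel translation, uses a hand-written box filter in place of a learned encoder, and composites with a simple mask. The names `shift`, `encode`, and `render` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def shift(img, dy, dx):
    """Circularly translate a 2D image by (dy, dx) pixels."""
    return np.roll(img, shift=(dy, dx), axis=(0, 1))

def encode(img):
    """Toy stand-in for an encoder: a 3x3 circular box filter.
    It commutes with circular translation, so it is equivariant
    to the transformation assumed in this sketch."""
    out = np.zeros_like(img, dtype=float)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out += shift(img, dy, dx)
    return out / 9.0

def render(obj, background, dy, dx):
    """Composite a translated object over a fixed background."""
    moved = shift(obj, dy, dx)
    mask = moved > 0  # object pixels overwrite the background
    return np.where(mask, moved, background)

rng = np.random.default_rng(0)
obj = np.zeros((8, 8))
obj[2:4, 2:4] = 1.0  # a 2x2 square standing in for a digit
background = rng.uniform(0.1, 0.5, size=(8, 8))

# Equivariance: encoding a shifted object equals shifting the encoding.
equivariant = np.allclose(encode(shift(obj, 1, 2)), shift(encode(obj), 1, 2))

# Two "frames": the object moves, the background stays the same.
frame_a = render(obj, background, 0, 0)
frame_b = render(obj, background, 1, 2)
```

In this toy setting, passing a different `background` or a different `(dy, dx)` to `render` produces a combination that never appeared in the two frames, which is the kind of recombination the abstract describes for the trained model.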

Taxonomy

Topics: Advanced Vision and Imaging · Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis