Video-Specific Autoencoders for Exploring, Editing and Transmitting Videos

Kevin Wang; Deva Ramanan; Aayush Bansal

arXiv:2103.17261 · cs.CV · January 11, 2022 · 1 citation

TL;DR

This paper introduces video-specific autoencoders that learn a representation capturing spatial and temporal features of a video, enabling exploration, editing, and efficient transmission of videos through a unified learned model.

Contribution

The work demonstrates that a simple autoencoder trained on a single video captures that video's properties and supports editing and transmission through latent-space operations, giving one unified approach to problems that prior work treated separately.

Findings

01

Latent codes encode spatial and temporal video features.

02

Autoencoders can project out-of-sample inputs onto the video manifold.

03

Linear operations on latent codes facilitate video visualization and editing.
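
The findings above can be illustrated with a minimal sketch. It assumes a linear autoencoder fitted with PCA as a stand-in for the paper's learned network, and it uses synthetic 64-dimensional "frames" rather than real video. All function names here are hypothetical and are not taken from the authors' code. The sketch shows two things: projecting an out-of-sample input onto the video-specific subspace (encode, then decode), and linearly interpolating between latent codes.

```python
import numpy as np


def fit_linear_autoencoder(frames, latent_dim):
    """Fit a linear autoencoder (PCA) to the flattened frames of one video."""
    mean = frames.mean(axis=0)
    # Rows of vt are orthonormal directions; keep the top latent_dim of them.
    _, _, vt = np.linalg.svd(frames - mean, full_matrices=False)
    return mean, vt[:latent_dim]


def encode(x, mean, basis):
    """Map frame(s) to latent code(s)."""
    return (x - mean) @ basis.T


def decode(z, mean, basis):
    """Map latent code(s) back to frame space."""
    return z @ basis + mean


def project(x, mean, basis):
    """Project an input onto the video-specific subspace: encode, then decode."""
    return decode(encode(x, mean, basis), mean, basis)


def interpolate(z_a, z_b, steps):
    """Linearly interpolate between two latent codes."""
    weights = np.linspace(0.0, 1.0, steps)[:, None]
    return (1.0 - weights) * z_a + weights * z_b


# Synthetic "video": 20 frames that move along a straight line in pixel space.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 20)[:, None]
start, end = rng.normal(size=(2, 64))
frames = (1.0 - t) * start + t * end  # shape (20, 64)

mean, basis = fit_linear_autoencoder(frames, latent_dim=1)
codes = encode(frames, mean, basis)  # shape (20, 1)

# Out-of-sample input: a frame corrupted by noise, pulled back onto the subspace.
noisy = frames[5] + 0.1 * rng.normal(size=64)
projected = project(noisy, mean, basis)

# Linear operation on latent codes: interpolate from the first code to the last.
in_between = decode(interpolate(codes[0], codes[-1], steps=20), mean, basis)
```

In this rank-one toy case, decoding the interpolated codes recovers the original frames exactly. That would not generally hold for a nonlinear autoencoder trained on real video, where the decoded interpolants are only approximations.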

Abstract

We study video-specific autoencoders that allow a human user to explore, edit, and efficiently transmit videos. Prior work has independently looked at these problems (and sub-problems) and proposed different formulations. In this work, we train a simple autoencoder (from scratch) on multiple frames of a specific video. We observe: (1) latent codes learned by a video-specific autoencoder capture spatial and temporal properties of that video; and (2) autoencoders can project out-of-sample inputs onto the video-specific manifold. These two properties allow us to explore, edit, and efficiently transmit a video using one learned representation. For example, linear operations on latent codes allow users to visualize the contents of a video. Associating latent codes of a video and manifold projection enables users to make desired edits. Interpolating latent codes and manifold projection allows…

Taxonomy

Topics: Advanced Vision and Imaging · Video Coding and Compression Technologies · Generative Adversarial Networks and Image Synthesis
