Video-Specific Autoencoders for Exploring, Editing and Transmitting Videos
Kevin Wang, Deva Ramanan, Aayush Bansal

TL;DR
This paper introduces video-specific autoencoders that learn a representation capturing spatial and temporal features of a video, enabling exploration, editing, and efficient transmission of videos through a unified learned model.
Contribution
The work demonstrates that a simple autoencoder trained on a single, specific video captures that video's properties, so that both editing and transmission can be carried out through latent-space operations within one unified model.
Findings
Latent codes encode spatial and temporal video features.
Autoencoders can project out-of-sample inputs onto the video manifold.
Linear operations on latent codes facilitate video visualization and editing.
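As an illustration of what a linear operation on latent codes can look like, the sketch below linearly interpolates between two latent vectors. It is a minimal example and not the authors' implementation: the function name `interpolate_latents` and the toy 2-D vectors are hypothetical, and it assumes an encoder has already mapped two frames to fixed-length latent codes. Decoding each interpolated code back to a frame with the video-specific decoder is left out.

```python
from typing import List

Vector = List[float]


def interpolate_latents(z_start: Vector, z_end: Vector, num_steps: int) -> List[Vector]:
    """Return num_steps codes evenly spaced on the line from z_start to z_end.

    Hypothetical helper for illustration; both endpoints are included.
    """
    if len(z_start) != len(z_end):
        raise ValueError("latent codes must have the same dimensionality")
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")

    codes = []
    for i in range(num_steps):
        t = i / (num_steps - 1)  # t runs from 0.0 to 1.0 inclusive
        # Convex combination of the two codes, computed per dimension.
        codes.append([(1.0 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return codes


# Toy latent codes standing in for the encodings of two frames (assumed values).
z_a = [0.0, 2.0]
z_b = [4.0, 0.0]
path = interpolate_latents(z_a, z_b, 5)
```

In a setup like the one the abstract describes, each code in `path` would be passed through the decoder to obtain an in-between frame; here the codes are only computed, not decoded.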
Abstract
We study video-specific autoencoders that allow a human user to explore, edit, and efficiently transmit videos. Prior work has independently looked at these problems (and sub-problems) and proposed different formulations. In this work, we train a simple autoencoder (from scratch) on multiple frames of a specific video. We observe: (1) latent codes learned by a video-specific autoencoder capture spatial and temporal properties of that video; and (2) autoencoders can project out-of-sample inputs onto the video-specific manifold. These two properties allow us to explore, edit, and efficiently transmit a video using one learned representation. For example, linear operations on latent codes allow users to visualize the contents of a video. Associating latent codes of a video and manifold projection enables users to make desired edits. Interpolating latent codes and manifold projection allows…
Taxonomy
Topics: Advanced Vision and Imaging · Video Coding and Compression Technologies · Generative Adversarial Networks and Image Synthesis
