Strumming to the Beat: Audio-Conditioned Contrastive Video Textures

Medhini Narasimhan; Shiry Ginosar; Andrew Owens; Alexei A. Efros; Trevor Darrell

arXiv:2104.02687·cs.CV·April 7, 2021



TL;DR

This paper presents a contrastive learning-based method for infinite video texture synthesis that can incorporate audio cues, producing diverse, smooth, and synchronized videos from a single input.

Contribution

It introduces a non-parametric, contrastive learning approach for video texture synthesis that extends to audio-conditioned generation without fine-tuning.

Findings

01 Outperforms baselines on perceptual quality

02 Handles diverse input videos effectively

03 Synthesizes audio-visual synchronized videos

Abstract

We introduce a non-parametric approach for infinite video texture synthesis using a representation learned via contrastive learning. We take inspiration from Video Textures, which showed that plausible new videos could be generated from a single one by stitching its frames together in a novel yet consistent order. This classic work, however, was constrained by its use of hand-designed distance metrics, limiting its use to simple, repetitive videos. We draw on recent techniques from self-supervised learning to learn this distance metric, allowing us to compare frames in a manner that scales to more challenging dynamics, and to condition on other data, such as audio. We learn representations for video frames and frame-to-frame transition probabilities by fitting a video-specific model trained using contrastive learning. To synthesize a texture, we randomly sample frames with high…
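
The sampling idea described in the abstract can be illustrated with a minimal sketch. It assumes each frame already has a learned "query" embedding and a learned "target" embedding, and it turns similarities into transition probabilities with a temperature-scaled softmax. The function names, the cosine similarity, the temperature value, and the toy embeddings are all illustrative assumptions, not the authors' implementation; the contrastive training step is omitted entirely.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a)) or 1.0  # guard against zero vectors
    norm_b = math.sqrt(sum(y * y for y in b)) or 1.0
    return dot / (norm_a * norm_b)


def transition_probs(query, targets, temperature=0.1):
    """Softmax over similarities between one query and all candidate targets.

    Hypothetical helper: the temperature and similarity choice are assumptions.
    """
    logits = [cosine(query, t) / temperature for t in targets]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def synthesize(query_embs, target_embs, start, length, temperature=0.1, rng=None):
    """Build a frame order by repeatedly sampling the next frame index.

    query_embs[i] stands in for the learned representation of frame i as the
    current frame; target_embs[j] stands in for frame j as a candidate successor.
    """
    rng = rng or random.Random(0)
    indices = list(range(len(target_embs)))
    order = [start]
    current = start
    for _ in range(length - 1):
        probs = transition_probs(query_embs[current], target_embs, temperature)
        current = rng.choices(indices, weights=probs, k=1)[0]
        order.append(current)
    return order


# Toy 2-D embeddings for a three-frame "video" (made up for illustration).
toy_queries = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
toy_targets = [[-1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
toy_order = synthesize(toy_queries, toy_targets, start=0, length=5)
```

Lowering the temperature concentrates probability on the most similar successor, which favors smooth transitions; raising it spreads probability across more frames, which favors diversity. In this sketch, conditioning on audio would amount to changing how the candidate scores are computed, which is left out here.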


Videos

Strumming to the Beat: Audio-Conditioned Contrastive Video Textures · YouTube

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques · Image Processing and 3D Reconstruction