Video Diffusion Alignment via Reward Gradients

Mihir Prabhudesai; Russell Mendonca; Zheyang Qin; Katerina Fragkiadaki; Deepak Pathak

arXiv:2407.08737·cs.CV·July 12, 2024

TL;DR

This paper introduces a method to efficiently adapt pre-trained video diffusion models to specific tasks by using reward gradients from preference-based reward models, reducing the need for extensive dataset collection.

Contribution

The work backpropagates reward gradients from preference-trained reward models into video diffusion models, so that adaptation does not require collecting a target dataset of videos.

Findings

01 Outperforms gradient-free methods in reward query efficiency

02 Enables effective video diffusion model adaptation with fewer computations

03 Demonstrates versatility across various reward and diffusion models

Abstract

We have made significant progress towards building foundational video diffusion models. As these models are trained using large-scale unsupervised data, it has become crucial to adapt these models to specific downstream tasks. Adapting these models via supervised fine-tuning requires collecting target datasets of videos, which is challenging and tedious. In this work, we utilize pre-trained reward models that are learned via preferences on top of powerful vision discriminative models to adapt video diffusion models. These models contain dense gradient information with respect to generated RGB pixels, which is critical to efficient learning in complex search spaces, such as videos. We show that backpropagating gradients from these reward models to a video diffusion model can allow for compute and sample efficient alignment of the video diffusion model. We show results across a variety of…
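The core idea in the abstract, backpropagating a differentiable reward through the sampling chain into the generator's parameters, can be illustrated with a deliberately tiny sketch. Everything here is an assumption made for illustration and is not the paper's implementation: the "diffusion model" is a scalar linear noise predictor with parameters `w` and `b`, the "reward model" is a quadratic that prefers samples near a hypothetical `target`, and the reverse pass is written by hand rather than with an autodiff library.

```python
import random

def sample(w, b, z, steps=4, h=0.25):
    """Run a toy deterministic sampler from noise z; return final sample and trajectory."""
    xs = [z]
    x = z
    for _ in range(steps):
        # One "denoising" step with a linear noise predictor (w * x + b).
        x = x - h * (w * x + b)
        xs.append(x)
    return x, xs

def reward(x0, target=2.0):
    """Hypothetical differentiable reward: highest when the sample equals target."""
    return -(x0 - target) ** 2

def reward_grad(x0, target=2.0):
    """Gradient of the reward with respect to the generated sample."""
    return -2.0 * (x0 - target)

def backprop(w, b, xs, h=0.25, target=2.0):
    """Reverse-mode pass: d reward / d(w, b), accumulated over every sampling step."""
    g_x = reward_grad(xs[-1], target)
    g_w = 0.0
    g_b = 0.0
    for x_prev in reversed(xs[:-1]):
        # Forward step was: x_next = x_prev - h * (w * x_prev + b)
        g_w += g_x * (-h * x_prev)
        g_b += g_x * (-h)
        g_x = g_x * (1.0 - h * w)  # pass the gradient back to the previous step
    return g_w, g_b

def finetune(iters=300, lr=0.05, batch=16, seed=0):
    """Gradient ascent on the expected reward using reward gradients."""
    rng = random.Random(seed)
    w, b = 1.0, 0.0
    for _ in range(iters):
        gw = 0.0
        gb = 0.0
        for _ in range(batch):
            z = rng.gauss(0.0, 1.0)
            _, xs = sample(w, b, z)
            dw, db = backprop(w, b, xs)
            gw += dw / batch
            gb += db / batch
        w += lr * gw
        b += lr * gb
    return w, b
```

Calling `finetune()` returns updated `(w, b)`; samples drawn with them via `sample(w, b, z)` should score higher under `reward` than samples from the initial parameters. A gradient-free method would see only the scalar reward per sample, whereas this update also uses the direction in which each sample should move, which is the intuition behind the efficiency claim above.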

Code & Models

Repositories

mihirp1998/vader (PyTorch, official)

Taxonomy

Topics: Advanced Image Processing Techniques · Image and Signal Denoising Methods · Generative Adversarial Networks and Image Synthesis

Methods: Diffusion