Video Diffusion Alignment via Reward Gradients
Mihir Prabhudesai, Russell Mendonca, Zheyang Qin, Katerina Fragkiadaki, Deepak Pathak

TL;DR
This paper introduces a method to efficiently adapt pre-trained video diffusion models to specific tasks by using reward gradients from preference-based reward models, reducing the need for extensive dataset collection.
Contribution
The work proposes leveraging reward gradients from preference-based models to align video diffusion models, enabling more efficient adaptation without large labeled datasets.
Findings
Outperforms gradient-free methods in reward query efficiency
Enables effective video diffusion model adaptation with fewer computations
Demonstrates versatility across various reward and diffusion models
Abstract
We have made significant progress towards building foundational video diffusion models. As these models are trained using large-scale unsupervised data, it has become crucial to adapt these models to specific downstream tasks. Adapting these models via supervised fine-tuning requires collecting target datasets of videos, which is challenging and tedious. In this work, we utilize pre-trained reward models that are learned via preferences on top of powerful vision discriminative models to adapt video diffusion models. These models contain dense gradient information with respect to generated RGB pixels, which is critical to efficient learning in complex search spaces, such as videos. We show that backpropagating gradients from these reward models to a video diffusion model can allow for compute and sample efficient alignment of the video diffusion model. We show results across a variety of…
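The core idea in the abstract, backpropagating a reward model's gradient through the sampling process into the generator's parameters, can be sketched on a toy scalar problem. This is a minimal illustration under stated assumptions, not the authors' implementation: the one-parameter "model" `theta`, the four-step update chain in `sample`, the quadratic `reward`, and every hyperparameter are hypothetical stand-ins for a video diffusion model and a learned preference reward. Gradients are written out by hand so the chain rule is visible.

```python
import random


def sample(theta, noise, steps=4, rate=0.5):
    """Toy deterministic sampling chain (hypothetical stand-in for denoising).

    Starts from `noise` and moves x toward `theta` at every step.
    Returns the final sample and its derivative d(sample)/d(theta).
    """
    x, dx_dtheta = noise, 0.0
    for _ in range(steps):
        # Forward step: x <- x + rate * (theta - x)
        x = x + rate * (theta - x)
        # Derivative of the same step with respect to theta (chain rule).
        dx_dtheta = (1.0 - rate) * dx_dtheta + rate
    return x, dx_dtheta


def reward(x, target=3.0):
    """Toy differentiable reward (assumption): peaks when x equals `target`.

    Returns the reward value and its gradient d(reward)/d(x).
    """
    return -(x - target) ** 2, -2.0 * (x - target)


def align(theta=0.0, lr=0.1, iters=200, seed=0):
    """Gradient ascent on the reward, differentiating through the sampler."""
    rng = random.Random(seed)
    for _ in range(iters):
        noise = rng.gauss(0.0, 1.0)
        x, dx_dtheta = sample(theta, noise)
        _, dr_dx = reward(x)
        # d(reward)/d(theta) = d(reward)/d(x) * d(x)/d(theta)
        theta += lr * dr_dx * dx_dtheta
    return theta


if __name__ == "__main__":
    theta = align()
    x, _ = sample(theta, 0.0)
    print(f"theta={theta:.3f}, noiseless sample={x:.3f}")
```

Each update here uses the reward's gradient with respect to the generated sample rather than only its scalar value. With real video models the same chain rule would be applied by an automatic differentiation framework across all pixels and frames.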
Taxonomy
Topics: Advanced Image Processing Techniques · Image and Signal Denoising Methods · Generative Adversarial Networks and Image Synthesis
Methods: Diffusion
