ChopGrad: Pixel-Wise Losses for Latent Video Diffusion via Truncated Backpropagation
Dmitriy Rivkin, Parker Ewen, Lili Gao, Julian Ost, Stefanie Walz, Rasika Kangutkar, Mario Bijelic, Felix Heide

TL;DR
ChopGrad introduces a truncated backpropagation method for video diffusion models, enabling efficient training and fine-tuning with pixel-wise losses on long or high-resolution videos by limiting gradient computation to local frame windows.
Contribution
The paper proposes ChopGrad, a novel truncated backpropagation scheme that reduces memory costs and allows effective pixel-wise loss training for video diffusion models.
Findings
ChopGrad reduces training memory from scaling linearly with the number of video frames (full backpropagation) to constant.
It enables fine-tuning of video diffusion models with pixel-wise losses.
ChopGrad outperforms existing models on various conditional video generation tasks.
Abstract
Recent video diffusion models achieve high-quality generation through recurrent frame processing where each frame generation depends on previous frames. However, this recurrent mechanism means that training such models in the pixel domain incurs prohibitive memory costs, as activations accumulate across the entire video sequence. This fundamental limitation also makes fine-tuning these models with pixel-wise losses computationally intractable for long or high-resolution videos. This paper introduces ChopGrad, a truncated backpropagation scheme for video decoding, limiting gradient computation to local frame windows while maintaining global consistency. We provide a theoretical analysis of this approximation and show that it enables efficient fine-tuning with frame-wise losses. ChopGrad reduces training memory from scaling linearly with the number of video frames (full backpropagation)…
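As a rough illustration of what limiting gradient computation to local frame windows means, the sketch below uses a toy scalar recurrence in place of a real video decoder. Everything in it is an assumption made for illustration and is not taken from the paper: the decoder `h_t = a * h_{t-1} + z_t`, the squared-error loss, and the names `decode`, `latent_grads`, and `window`. It compares the exact gradient with a windowed one and shows only the approximation error, not the memory saving.

```python
def decode(latents, a):
    """Toy recurrent decoder (hypothetical): h_t = a * h_{t-1} + z_t, frame x_t = h_t."""
    frames, h = [], 0.0
    for z in latents:
        h = a * h + z
        frames.append(h)
    return frames


def latent_grads(latents, targets, a, window=None):
    """Gradient of L = sum_t 0.5 * (x_t - y_t)^2 with respect to each latent z_s.

    The exact gradient is sum_{t >= s} (x_t - y_t) * a^(t - s).
    When `window` is set, only terms with t - s < window are kept,
    i.e. gradients are not propagated beyond a local window of frames.
    """
    frames = decode(latents, a)
    residuals = [x - y for x, y in zip(frames, targets)]
    n = len(latents)
    grads = []
    for s in range(n):
        last = n if window is None else min(n, s + window)
        grads.append(sum(residuals[t] * a ** (t - s) for t in range(s, last)))
    return grads


# Illustrative values only; they are not data from the paper.
latents = [1.0, -0.5, 0.25, 2.0, -1.0, 0.5, 0.0, 1.5]
targets = [0.0] * len(latents)
a = 0.3

full = latent_grads(latents, targets, a)
for w in (1, 2, 4, 8):
    approx = latent_grads(latents, targets, a, window=w)
    err = max(abs(f - g) for f, g in zip(full, approx))
    print(f"window={w}: max abs gradient error = {err:.6f}")
```

In this toy setting every dropped term carries a weight `a^(t - s)` with `t - s >= window`, so for `|a| < 1` those weights shrink geometrically as the window grows, and a window covering the whole sequence recovers the exact gradient. Whether a real video decoder behaves similarly is the subject of the paper's theoretical analysis, which this sketch does not reproduce.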
