Brick-Diffusion: Generating Long Videos with Brick-to-Wall Denoising
Yunlong Yuan, Yuanfan Guo, Chunwei Wang, Hang Xu, Li Zhang

TL;DR
Brick-Diffusion is a training-free method that generates long videos with a pre-trained short-video diffusion model. It denoises the latent in segments and shifts the segment boundaries by a stride between iterations, so information is shared across frames.
Contribution
The paper introduces a brick-to-wall denoising strategy that enables long video generation without additional training and targets the degraded fidelity and weak motion dynamics seen in earlier training-free methods.
Findings
Outperforms baseline methods in video fidelity
Enables video generation of arbitrary length
Enables communication between frames across segments
Abstract
Recent advances in diffusion models have greatly improved text-driven video generation. However, training models for long video generation demands significant computational power and extensive data, leading most video diffusion models to be limited to a small number of frames. Existing training-free methods that attempt to generate long videos using pre-trained short video diffusion models often struggle with issues such as insufficient motion dynamics and degraded video fidelity. In this paper, we present Brick-Diffusion, a novel, training-free approach capable of generating long videos of arbitrary length. Our method introduces a brick-to-wall denoising strategy, where the latent is denoised in segments, with a stride applied in subsequent iterations. This process mimics the construction of a staggered brick wall, where each brick represents a denoised segment, enabling communication…
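The staggered segmentation described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `brick_segments` and `brick_to_wall`, the parameters `window` and `stride`, the rule that the offset advances by `stride` each iteration and wraps at `window`, and the `denoise_segment` callable (a stand-in for one step of a short-video diffusion model) are all hypothetical.

```python
def brick_segments(num_frames, window, stride, step):
    """Return (start, end) frame ranges for one denoising iteration.

    Assumption: segment boundaries shift by `stride` frames per iteration
    and wrap around at `window`, like courses in a staggered brick wall.
    """
    if num_frames <= 0 or window <= 0:
        raise ValueError("num_frames and window must be positive")
    offset = (step * stride) % window
    segments = []
    start = 0
    # A non-zero offset yields a shorter leading segment ("half brick").
    end = offset if offset > 0 else window
    while start < num_frames:
        end = min(end, num_frames)
        segments.append((start, end))
        start, end = end, end + window
    return segments


def brick_to_wall(latent, window, stride, num_steps, denoise_segment):
    """Denoise a per-frame latent segment by segment over several iterations.

    `denoise_segment(frames, step)` is a hypothetical callable that returns
    the updated frames for one segment; it must preserve the segment length.
    """
    frames = list(latent)
    for step in range(num_steps):
        for start, end in brick_segments(len(frames), window, stride, step):
            frames[start:end] = denoise_segment(frames[start:end], step)
    return frames


if __name__ == "__main__":
    # Toy stand-in for a denoiser: replace each segment with its mean.
    def mean_segment(seg, step):
        return [sum(seg) / len(seg)] * len(seg)

    for step in range(3):
        print(step, brick_segments(16, window=8, stride=4, step=step))
    print(brick_to_wall([0, 0, 0, 0, 8, 8, 8, 8], 4, 2, 2, mean_segment))
```

With the toy averaging function, the first iteration leaves the two aligned segments unchanged, while the second, shifted iteration covers frames from both of them in one segment. That overlap is the mechanism by which, in this sketch, information crosses segment boundaries.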
Taxonomy
Topics: Cinema and Media Studies
Methods: Diffusion
