MatchDiffusion: Training-free Generation of Match-cuts
Alejandro Pardo, Fabio Pizzati, Tong Zhang, Alexander Pondaven, Philip Torr, Juan Camilo Perez, Bernard Ghanem

TL;DR
MatchDiffusion introduces a training-free, diffusion-model-based method for generating cinematic match-cuts from text prompts, simplifying and democratizing the creative process.
Contribution
MatchDiffusion is the first training-free approach to match-cut generation with diffusion models. It starts two prompts from shared noise and denoises in two stages (joint, then disjoint) to produce scene pairs that share structure and motion but differ in detail.
Findings
Effective in creating visually coherent match-cuts
User studies show high satisfaction with generated videos
Outperforms baseline methods in coherence and diversity
Abstract
Match-cuts are powerful cinematic tools that create seamless transitions between scenes, delivering strong visual and metaphorical connections. However, crafting match-cuts is a challenging, resource-intensive process requiring deliberate artistic planning. In MatchDiffusion, we present the first training-free method for match-cut generation using text-to-video diffusion models. MatchDiffusion leverages a key property of diffusion models: early denoising steps define the scene's broad structure, while later steps add details. Guided by this insight, MatchDiffusion employs "Joint Diffusion" to initialize generation for two prompts from shared noise, aligning structure and motion. It then applies "Disjoint Diffusion", allowing the videos to diverge and introduce unique details. This approach produces visually coherent videos suited for match-cuts. User studies and metrics demonstrate…
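The two-stage schedule described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: `toy_denoiser` is a hypothetical stand-in for a text-conditioned video diffusion model, the prompt vectors stand in for text embeddings, and averaging the two prompt-conditioned predictions during the joint stage is an assumption made for illustration. The step counts and step size are arbitrary.

```python
import numpy as np


def toy_denoiser(x, prompt_vec, t):
    """Hypothetical noise predictor: the offset from a prompt-specific target.

    A real text-to-video diffusion model would go here; `t` is unused in this toy.
    """
    return x - prompt_vec


def match_cut_sketch(prompt_a, prompt_b, num_steps=50, joint_steps=30,
                     step_size=0.1, seed=0):
    """Run a joint stage from shared noise, then a disjoint stage per prompt."""
    rng = np.random.default_rng(seed)
    # Both outputs start from the same noise sample.
    x = rng.standard_normal(prompt_a.shape)

    # Joint stage: one shared latent is guided by both prompts.
    # Averaging the two predictions is an assumption of this sketch.
    for t in range(joint_steps):
        eps = 0.5 * (toy_denoiser(x, prompt_a, t) + toy_denoiser(x, prompt_b, t))
        x = x - step_size * eps

    # Disjoint stage: the shared latent is copied and each copy follows
    # only its own prompt, so the two results can diverge in detail.
    xa, xb = x.copy(), x.copy()
    for t in range(joint_steps, num_steps):
        xa = xa - step_size * toy_denoiser(xa, prompt_a, t)
        xb = xb - step_size * toy_denoiser(xb, prompt_b, t)
    return x, xa, xb


if __name__ == "__main__":
    a, b = np.ones(8), -np.ones(8)
    shared, out_a, out_b = match_cut_sketch(a, b)
    print("distance between outputs:", np.linalg.norm(out_a - out_b))
```

In this toy setting, `joint_steps` controls the trade-off the abstract describes: a larger value keeps the two outputs closer to a common structure, while a smaller value leaves more steps for each output to move toward its own prompt.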
Taxonomy
Topics: Sports Analytics and Performance · Scheduling and Timetabling Solutions
Methods: Diffusion
