SAGE: Structure-Aware Generative Video Transitions between Diverse Clips
Mia Kan, Yilin Liu, Niloy Mitra

TL;DR
SAGE is a structure-aware, zero-shot video transition method. It combines structural guidance with generative synthesis to produce smooth, coherent intermediate frames between diverse clips, and it outperforms classical and generative baselines.
Contribution
The paper presents SAGE, a novel zero-shot approach combining structural guidance with generative synthesis for high-quality video transitions between diverse clips.
Findings
SAGE outperforms classical and generative baselines in quantitative metrics.
User studies favor SAGE for producing coherent transitions.
The method effectively handles large semantic and temporal gaps.
Abstract
Video transitions aim to synthesize intermediate frames between two clips, but naive approaches such as linear blending introduce artifacts that limit professional use or break temporal coherence. Traditional techniques (cross-fades, morphing, frame interpolation) and recent generative inbetweening methods can produce high-quality, plausible intermediates, but they struggle to bridge diverse clips involving large temporal gaps or significant semantic differences, leaving a gap for content-aware and visually coherent transitions. We address this challenge by drawing on artistic workflows, distilling strategies such as aligning silhouettes and interpolating salient features to preserve structure and perceptual continuity. Building on these strategies, we propose SAGE (Structure-Aware Generative vidEo transitions) as a simple yet effective zero-shot approach that combines structural…
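For reference, the linear blending that the abstract calls a naive approach can be sketched in a few lines. This illustrates that baseline only, not SAGE itself; the function name `cross_fade` and the representation of a frame as a flat list of pixel intensities are assumptions made for the example.

```python
def cross_fade(frame_a, frame_b, num_frames):
    """Linearly blend two frames into `num_frames` intermediate frames.

    Hypothetical helper for illustration: each frame is a flat list of
    pixel intensities, and both frames must have the same length.
    """
    if len(frame_a) != len(frame_b):
        raise ValueError("frames must have the same number of pixels")
    intermediates = []
    for i in range(1, num_frames + 1):
        # Blend weight t moves from just above 0 to just below 1,
        # so the two endpoint frames themselves are not repeated.
        t = i / (num_frames + 1)
        blended = [(1 - t) * a + t * b for a, b in zip(frame_a, frame_b)]
        intermediates.append(blended)
    return intermediates


if __name__ == "__main__":
    last_of_clip_a = [0.0, 100.0]
    first_of_clip_b = [100.0, 0.0]
    frames = cross_fade(last_of_clip_a, first_of_clip_b, 3)
    print(frames[1])  # middle frame: [50.0, 50.0]
```

Every pixel is averaged independently of image content, so when the two clips show different objects or layouts the middle frames are a ghosted superposition of both. That is the kind of artifact the abstract attributes to naive blending.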
