DynamicScaler: Seamless and Scalable Video Generation for Panoramic Scenes
Jinxiu Liu, Shaoheng Lin, Yinxiao Li, Ming-Hsuan Yang

TL;DR
DynamicScaler is a novel diffusion-based method that enables seamless, scalable, and high-quality panoramic video generation for immersive AR/VR applications, maintaining coherence across arbitrary scene sizes.
Contribution
It introduces a spatially scalable diffusion model with an Offset Shifting Denoiser and Global Motion Guidance for coherent panoramic dynamic scene synthesis.
Findings
Achieves superior content and motion quality in panoramic videos.
Maintains constant VRAM usage regardless of output resolution.
Provides a training-free, efficient solution for dynamic scene creation.
Abstract
The increasing demand for immersive AR/VR applications and spatial intelligence has heightened the need to generate high-quality scene-level and 360° panoramic video. However, most video diffusion models are constrained by limited resolution and aspect ratio, which restricts their applicability to scene-level dynamic content synthesis. In this work, we propose DynamicScaler, which addresses these challenges by enabling spatially scalable and panoramic dynamic scene synthesis that preserves coherence across panoramic scenes of arbitrary size. Specifically, we introduce an Offset Shifting Denoiser, which facilitates efficient, synchronous, and coherent denoising of panoramic dynamic scenes via a fixed-resolution diffusion model applied through a seamlessly rotating window. This ensures seamless boundary transitions and consistency across the entire panoramic space, accommodating varying…
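The rotating-window idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`toy_denoiser`, `roll_columns`, `offset_shift_denoise`), the `shift` parameter, the non-overlapping window layout, and the reduction to a single 2D frame are all assumptions made for illustration. The sketch shows only the general pattern of applying a fixed-width model to a circular panorama while moving the window boundaries at each step, so that no seam stays in the same place and the largest input the model ever sees is independent of panorama width.

```python
def toy_denoiser(window, step):
    """Hypothetical stand-in for one step of a fixed-resolution diffusion model."""
    return [[0.9 * value for value in row] for row in window]


def roll_columns(grid, shift):
    """Circularly shift every row by `shift` columns (positive shifts right)."""
    width = len(grid[0])
    k = shift % width
    return [row[width - k:] + row[:width - k] for row in grid]


def offset_shift_denoise(latent, window_width, num_steps, shift,
                         denoise_fn=toy_denoiser):
    """Denoise a circular panorama with fixed-width windows whose
    boundaries move by `shift` columns at every step (assumed schedule)."""
    height, width = len(latent), len(latent[0])
    if width % window_width != 0:
        raise ValueError("panorama width must be a multiple of window_width")

    peak_window_size = 0  # largest input ever passed to the model
    for step in range(num_steps):
        offset = (step * shift) % width
        # Rotate so that this step's first window starts at column `offset`.
        rolled = roll_columns(latent, -offset)
        pieces = []
        for start in range(0, width, window_width):
            window = [row[start:start + window_width] for row in rolled]
            peak_window_size = max(peak_window_size, height * window_width)
            pieces.append(denoise_fn(window, step))
        # Stitch the windows back together, then undo the rotation.
        merged = [sum((piece[r] for piece in pieces), []) for r in range(height)]
        latent = roll_columns(merged, offset)
    return latent, peak_window_size
```

In this sketch, doubling the panorama width doubles the number of windows processed per step but leaves the per-call window size unchanged, which is one plausible reading of how a fixed-resolution model could keep memory use flat as the output grows.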
Taxonomy
Topics: Human Motion and Animation · Advanced Vision and Imaging · Computer Graphics and Visualization Techniques
Methods: Diffusion
