ScrollScape: Unlocking 32K Image Generation With Video Diffusion Priors

Haodong Yu; Yabo Zhang; Donglin Di; Ruyi Zhang; Wangmeng Zuo

arXiv:2603.24270 · cs.CV · April 3, 2026

TL;DR

ScrollScape transforms ultra-high-resolution image synthesis into a video generation problem, leveraging video priors to maintain structural integrity at 32K resolution with extreme aspect ratios.

Contribution

ScrollScape introduces a framework that reformulates extreme-aspect-ratio (EAR) image synthesis as video generation, using a spatial-to-temporal mapping and super-resolution priors to reach 32K resolution.

Findings

01. Outperforms existing baselines by reducing artifacts.

02. Achieves 32K resolution with global coherence.

03. Effectively aligns video priors with high-resolution image synthesis.

Abstract

While diffusion models excel at generating images with conventional dimensions, pushing them to synthesize ultra-high-resolution imagery at extreme aspect ratios (EAR) often triggers catastrophic structural failures, such as object repetition and spatial fragmentation. This limitation fundamentally stems from a lack of robust spatial priors, as static text-to-image models are primarily trained on image distributions with conventional dimensions. To overcome this bottleneck, we present ScrollScape, a novel framework that reformulates EAR image synthesis into a continuous video generation process through two core innovations. By mapping the spatial expansion of a massive canvas to the temporal evolution of video frames, ScrollScape leverages the inherent temporal consistency of video models as a powerful global constraint to ensure long-range structural integrity. Specifically, Scanning…
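
The abstract's central idea, mapping the spatial expansion of a wide canvas to the temporal axis of a video, can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the fixed `window` and `stride` parameters, and the overlap-averaging stitch are all assumptions made for illustration. It only shows how a panoramic array can be viewed as a sequence of overlapping frames scanned left to right, and how such frames could be merged back into one canvas.

```python
import numpy as np

def canvas_to_frames(canvas, window, stride):
    """Slice an (H, W, C) canvas into overlapping (H, window, C) frames.

    Hypothetical helper: frames are ordered left to right, so the frame
    index plays the role of time in a video clip.
    """
    h, w, c = canvas.shape
    if window > w:
        raise ValueError("window is wider than the canvas")
    starts = list(range(0, w - window + 1, stride))
    if starts[-1] != w - window:
        starts.append(w - window)  # last frame sits flush with the right edge
    frames = np.stack([canvas[:, s:s + window] for s in starts])
    return frames, starts

def frames_to_canvas(frames, starts, width):
    """Stitch frames back into a canvas, averaging where they overlap.

    Overlap averaging is an assumed, simple merge rule, not the paper's.
    """
    _, h, window, c = frames.shape
    acc = np.zeros((h, width, c), dtype=np.float64)
    weight = np.zeros((1, width, 1), dtype=np.float64)
    for frame, s in zip(frames, starts):
        acc[:, s:s + window] += frame
        weight[:, s:s + window] += 1.0
    return acc / weight

# Toy usage: an 8 x 100 canvas scanned with a 32-pixel window.
canvas = np.random.rand(8, 100, 3)
frames, starts = canvas_to_frames(canvas, window=32, stride=16)
restored = frames_to_canvas(frames, starts, width=100)
```

In this toy setting the round trip is lossless because every frame is an exact crop of the same canvas. In a generative setting the frames would come from a model, and consistency between neighbouring frames is what the abstract attributes to the video prior.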

Code & Models

Models

Hugging Face model: kkkkkb/ScrollScape
