SURF: Signature-Retained Fast Video Generation

Kaixin Ding; Xi Chen; Sihui Ji; Yuan Gao; Liang Hou; Xin Tao; Hengshuang Zhao

arXiv:2603.21002 · cs.GR · May 19, 2026

TL;DR

SURF is an efficient two-stage framework that accelerates high-resolution video generation while preserving the original model's signatures by combining low-res previews and a specialized Refiner.

Contribution

It introduces a novel, training-free noise reshifting technique and a mapping-based Refiner to significantly speed up high-res video generation without signature loss.

Findings

01 Achieves a 12.5x speedup on Wan2.1

02 Achieves an 8.7x speedup on HunyuanVideo

03 Keeps signatures close to those of the pretrained models

Abstract

The demand for high-resolution video generation is growing rapidly. However, the generation resolution is severely constrained by slow inference speeds. For instance, Wan2.1 requires over 50 minutes to generate a single 720p video. While previous works explore accelerating video generation from various aspects, most of them compromise the distinctive signatures (e.g., layout, semantics, motion) of the original model. In this work, we propose SURF, an efficient framework for generating high-resolution videos while preserving these signatures as much as possible. Specifically, SURF divides video generation into two stages: first, we use the pretrained model to infer at the optimal resolution and downsample the latent to quickly generate low-resolution previews; then we design a Refiner to upscale the preview. In the preview stage, we identify that directly inferring a model (trained with higher…
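The two-stage data flow described in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the `(T, C, H, W)` latent layout, the average-pooling downsample, and the names `downsample_latent` and `placeholder_refiner` are hypothetical and not taken from the paper. In particular, the placeholder only restores the spatial shape with nearest-neighbour upsampling; it is not the paper's Refiner, and the noise reshifting step is not modeled.

```python
import numpy as np


def downsample_latent(latent: np.ndarray, factor: int = 2) -> np.ndarray:
    """Average-pool a (T, C, H, W) latent over H and W by `factor`.

    Assumption: average pooling is used here only as a simple stand-in;
    the paper's actual downsampling scheme is not given in this listing.
    """
    t, c, h, w = latent.shape
    if h % factor or w % factor:
        raise ValueError("H and W must be divisible by factor")
    # Split each spatial axis into (blocks, factor) and average within blocks.
    blocks = latent.reshape(t, c, h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(3, 5))


def placeholder_refiner(preview: np.ndarray, factor: int = 2) -> np.ndarray:
    """Hypothetical stand-in for the Refiner: nearest-neighbour upsampling.

    It only restores the spatial shape; a learned Refiner would also
    recover detail.
    """
    return preview.repeat(factor, axis=2).repeat(factor, axis=3)


# Stage 1: a random array stands in for the latent produced by the
# pretrained model, then it is downsampled to a low-resolution preview.
rng = np.random.default_rng(0)
latent = rng.standard_normal((4, 16, 8, 8))
preview = downsample_latent(latent, factor=2)

# Stage 2: the preview is upscaled back to the original spatial size.
refined = placeholder_refiner(preview, factor=2)

print(preview.shape, refined.shape)
```

The sketch only shows how tensor shapes move through a preview-then-refine pipeline; the speedups reported above depend on the actual models and are not reproduced by it.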

Taxonomy

Topics: Advanced Image Processing Techniques · Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging