ResMaster: Mastering High-Resolution Image Generation via Structural and Fine-Grained Guidance
Shuwei Shi, Wenbo Li, Yuechen Zhang, Jingwen He, Biao Gong, Yinqiang Zheng

TL;DR
ResMaster is a training-free method that enhances high-resolution image generation by guiding diffusion models with low-resolution references, ensuring structural coherence and fine details beyond traditional resolution limits.
Contribution
It introduces a novel patch-wise guidance technique using low-resolution references and prompts, enabling high-quality 4K image synthesis without additional training.
Findings
Sets new benchmarks for high-resolution image quality
Reduces structural distortions and pattern repetitions
Demonstrates efficiency in high-resolution generation
Abstract
Diffusion models excel at producing high-quality images; however, scaling to higher resolutions, such as 4K, often results in over-smoothed content, structural distortions, and repetitive patterns. To this end, we introduce ResMaster, a novel, training-free method that empowers resolution-limited diffusion models to generate high-quality images beyond resolution restrictions. Specifically, ResMaster leverages a low-resolution reference image created by a pre-trained diffusion model to provide structural and fine-grained guidance for crafting high-resolution images on a patch-by-patch basis. To ensure a coherent global structure, ResMaster meticulously aligns the low-frequency components of high-resolution patches with the low-resolution reference at each denoising step. For fine-grained guidance, tailored image prompts based on the low-resolution reference and enriched textual prompts…
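The low-frequency alignment described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say which filter is used or whether the operation runs on pixels or latents. The sketch assumes a single-channel patch stored as nested lists, a block-average low-pass filter, and a hypothetical `weight` parameter for the guidance strength. The function names `box_lowpass`, `upsample_nearest`, and `align_low_frequency` are illustrative only.

```python
def box_lowpass(img, k):
    """Crude low-pass filter: replace every k x k block with its mean."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for by in range(0, h, k):
        for bx in range(0, w, k):
            ys = range(by, min(by + k, h))
            xs = range(bx, min(bx + k, w))
            mean = sum(img[y][x] for y in ys for x in xs) / (len(ys) * len(xs))
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out


def upsample_nearest(img, k):
    """Nearest-neighbour upsampling by an integer factor k."""
    return [[v for v in row for _ in range(k)] for row in img for _ in range(k)]


def align_low_frequency(patch, ref_lowres, k, weight=1.0):
    """Swap the patch's low-frequency band for that of the reference.

    patch      -- high-resolution patch (H x W nested lists of floats)
    ref_lowres -- matching low-resolution reference region (H/k x W/k)
    k          -- integer scale factor between the two (assumed)
    weight     -- hypothetical guidance strength; 1.0 replaces the low band
                  fully, 0.0 leaves the patch untouched
    """
    low_ref = box_lowpass(upsample_nearest(ref_lowres, k), k)
    low_patch = box_lowpass(patch, k)
    # Keep the patch's high-frequency detail (patch - low_patch) and move
    # its low-frequency band toward the reference.
    return [
        [p + weight * (lr - lp) for p, lr, lp in zip(p_row, lr_row, lp_row)]
        for p_row, lr_row, lp_row in zip(patch, low_ref, low_patch)
    ]
```

In a patch-by-patch sampler, a step like `align_low_frequency` would presumably be applied to each patch's intermediate estimate at every denoising step, so that the coarse layout follows the low-resolution reference while the model remains free to synthesize fine detail.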
Taxonomy
Topics: Augmented Reality Applications · Robotics and Sensor-Based Localization · Interactive and Immersive Displays
Methods: Diffusion
