Ultra-High-Resolution Image Synthesis: Data, Method and Evaluation
Jinjin Zhang, Qiuyu Huang, Junjie Liu, Xiefan Guo, Di Huang

TL;DR
This paper introduces a new dataset, methods, and evaluation metrics for ultra-high-resolution image synthesis, enabling more detailed and realistic image generation at 4K resolution.
Contribution
It presents the Aesthetic-4K dataset, the Diffusion-4K framework with its novel SC-VAE and WLF techniques, and new metrics for comprehensive evaluation of ultra-high-resolution images.
Findings
Diffusion-4K outperforms existing methods in 4K image synthesis.
The proposed metrics effectively measure texture richness and detail.
The framework is compatible with large-scale diffusion models like Flux-12B.
Abstract
Ultra-high-resolution image synthesis holds significant potential, yet remains an underexplored challenge due to the absence of standardized benchmarks and computational constraints. In this paper, we establish Aesthetic-4K, a meticulously curated dataset containing dedicated training and evaluation subsets specifically designed for comprehensive research on ultra-high-resolution image synthesis. This dataset consists of high-quality 4K images accompanied by descriptive captions generated by GPT-4o. Furthermore, we propose Diffusion-4K, an innovative framework for the direct generation of ultra-high-resolution images. Our approach incorporates the Scale Consistent Variational Auto-Encoder (SC-VAE) and Wavelet-based Latent Fine-tuning (WLF), which are designed for efficient visual token compression and the capture of intricate details in ultra-high-resolution images, thereby facilitating…
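The abstract names Wavelet-based Latent Fine-tuning (WLF) but does not spell out its objective here. As a rough illustration of the general idea only, the sketch below splits a latent tensor into one low-frequency and three high-frequency subbands with a single-level 2D Haar transform and compares two latents subband by subband. The function names `haar_dwt2` and `wavelet_loss`, the choice of the Haar wavelet, and the per-subband `weights` are assumptions made for this example, not the paper's implementation.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level orthonormal 2D Haar transform over the last two axes.

    Assumes both spatial sizes are even. Returns four subbands, each with
    half the height and width of the input: one low-frequency average (ll)
    and three high-frequency detail bands (lh, hl, hh). Subband naming
    conventions vary between libraries.
    """
    a = x[..., 0::2, 0::2]  # top-left sample of each 2x2 block
    b = x[..., 0::2, 1::2]  # top-right
    c = x[..., 1::2, 0::2]  # bottom-left
    d = x[..., 1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # local average
    lh = (a + b - c - d) / 2.0  # difference between rows
    hl = (a - b + c - d) / 2.0  # difference between columns
    hh = (a - b - c + d) / 2.0  # diagonal difference
    return ll, lh, hl, hh


def wavelet_loss(pred, target, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of per-subband mean squared errors.

    `weights` is a hypothetical knob for this sketch: raising the last
    three values emphasizes high-frequency detail.
    """
    bands_pred = haar_dwt2(pred)
    bands_target = haar_dwt2(target)
    return float(sum(
        w * np.mean((p - t) ** 2)
        for w, p, t in zip(weights, bands_pred, bands_target)
    ))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    latent = rng.standard_normal((16, 64, 64))  # hypothetical (C, H, W) latent
    noisy = latent + 0.1 * rng.standard_normal(latent.shape)
    print(wavelet_loss(noisy, latent, weights=(1.0, 2.0, 2.0, 2.0)))
```

Because this Haar transform is orthonormal, equal weights make the loss a constant multiple of the ordinary mean squared error; only unequal weights change which frequencies the comparison emphasizes.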
Taxonomy
Topics: Computer Graphics and Visualization Techniques · Advanced Vision and Imaging · Advanced Image Processing Techniques
Methods: Diffusion
