PixVerve: Advancing Native UHR Image Generation to 100MP with a Large-Scale High-Quality Dataset

Haojun Chen; Haoyang He; Chengming Xu; Qingdong He; Junwei Zhu; Yabiao Wang; Zhucun Xue; Xianfang Zeng; Zhennan Chen; Xiaobin Hu; Hao Zhao; Yong Liu; Jiangning Zhang; Dacheng Tao

arXiv:2605.20147 · cs.CV · May 20, 2026

TL;DR

PixVerve introduces PixVerve-95K, a dataset of 95K high-quality UHR image-text pairs, and extends T2I models to native 100MP image generation, supported by a new benchmark and three training schemes.

Contribution

The paper presents PixVerve-95K, a high-resolution dataset, and pioneering methods for native 100MP UHR image generation with comprehensive evaluation protocols.

Findings

01 Successfully extended T2I models to 100MP resolution.

02 Established a new benchmark for UHR image quality and semantic alignment.

03 Provided insights into training strategies for ultra-high-resolution image synthesis.

Abstract

Text-to-Image (T2I) models have recently seen notable progress at around 1K and 2K resolution. With the growing desire for a better visual experience and the rapid development of imaging technology, the demand for Ultra-High-Resolution (UHR) image generation has grown significantly. However, UHR image generation poses great challenges due to the scarcity and complexity of high-resolution content. In this paper, we first introduce PixVerve-95K, a high-quality, open-source UHR T2I dataset curated with a carefully designed data pipeline, which contains 95K images across diverse scenarios (each image has a minimum pixel count of 100M) and seven-dimensional annotations. Based on our large-scale image-text dataset, we take a pioneering step to extend various T2I foundation models to native 100MP generation with three training schemes. Finally, leveraging both conventional metrics and multimodal…
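To give a sense of scale for the 100M-pixel minimum mentioned in the abstract, the sketch below works through the arithmetic for a few image sizes. It is an illustration only and not the authors' data pipeline: the function names, the example dimensions, and the assumption of uncompressed 8-bit RGB are all hypothetical.

```python
# Illustrative arithmetic for a 100-megapixel threshold.
# All names here are hypothetical; this is not the PixVerve pipeline.

MIN_PIXELS = 100_000_000  # minimum pixel count stated in the abstract


def megapixels(width: int, height: int) -> float:
    """Return the pixel count of an image in megapixels."""
    return width * height / 1_000_000


def meets_uhr_threshold(width: int, height: int,
                        min_pixels: int = MIN_PIXELS) -> bool:
    """Return True if width x height reaches the minimum pixel count."""
    return width * height >= min_pixels


def raw_rgb_bytes(width: int, height: int,
                  channels: int = 3, bytes_per_channel: int = 1) -> int:
    """Size of the uncompressed pixel buffer (assumes 8-bit RGB by default)."""
    return width * height * channels * bytes_per_channel


if __name__ == "__main__":
    for w, h in [(2048, 2048), (8192, 8192), (10000, 10000)]:
        print(w, h, megapixels(w, h), meets_uhr_threshold(w, h),
              raw_rgb_bytes(w, h))
```

Under these assumptions, a 10000 x 10000 image is exactly 100 megapixels and occupies 300 MB as a raw 8-bit RGB buffer, while an 8192 x 8192 image (about 67 megapixels) falls short of the threshold.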

Code & Models

Repositories

haojunchen663/PixVerve-95K (GitHub)
