Fine-T2I: An Open, Large-Scale, and Diverse Dataset for High-Quality T2I Fine-Tuning

Xu Ma; Yitian Zhang; Qihua Dong; Yun Fu

arXiv:2602.09439 · cs.CV · February 11, 2026

TL;DR

Fine-T2I introduces a large, high-quality, open dataset with over 6 million text-image pairs, combining synthetic and real images, to significantly improve text-to-image model fine-tuning across diverse tasks and styles.

Contribution

The paper presents Fine-T2I, a comprehensive, rigorously filtered dataset that addresses the scarcity of high-quality open datasets for T2I fine-tuning, enabling better model performance.

Findings

01. Fine-T2I improves generation quality across models.

02. Fine-T2I enhances instruction adherence.

03. Dataset covers diverse tasks and styles.

Abstract

High-quality and open datasets remain a major bottleneck for text-to-image (T2I) fine-tuning. Despite rapid progress in model architectures and training pipelines, most publicly available fine-tuning datasets suffer from low resolution, poor text-image alignment, or limited diversity, resulting in a clear performance gap between open research models and enterprise-grade models. In this work, we present Fine-T2I, a large-scale, high-quality, and fully open dataset for T2I fine-tuning. Fine-T2I spans 10 task combinations, 32 prompt categories, 11 visual styles, and 5 prompt templates, and combines synthetic images generated by strong modern models with carefully curated real images from professional photographers. All samples are rigorously filtered for text-image alignment, visual fidelity, and prompt quality, with over 95% of initial candidates removed. The final dataset contains over 6…
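The abstract describes filtering every candidate on three criteria (text-image alignment, visual fidelity, and prompt quality) and discarding most of the pool. A minimal sketch of that kind of multi-criterion threshold filter follows. It is not the authors' pipeline: the `Candidate` fields, the score ranges, and the threshold values are all hypothetical, chosen only to illustrate how requiring every criterion to pass at once can remove a large fraction of candidates.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One text-image pair with hypothetical quality scores in [0, 1]."""
    prompt: str
    alignment: float       # assumed text-image alignment score
    fidelity: float        # assumed visual-fidelity score
    prompt_quality: float  # assumed prompt-quality score


# Hypothetical cutoffs; the source does not state the actual values.
THRESHOLDS = {"alignment": 0.8, "fidelity": 0.8, "prompt_quality": 0.7}


def passes(candidate, thresholds=THRESHOLDS):
    """A candidate is kept only if it meets every threshold."""
    return all(getattr(candidate, name) >= cutoff
               for name, cutoff in thresholds.items())


def filter_candidates(candidates, thresholds=THRESHOLDS):
    """Return the kept candidates and the fraction that was removed."""
    kept = [c for c in candidates if passes(c, thresholds)]
    removed = 1 - len(kept) / len(candidates) if candidates else 0.0
    return kept, removed
```

Because a sample must clear all three cutoffs, the removal rate compounds across criteria, which is one plausible way a pipeline could end up discarding the large share of candidates the abstract reports.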

Code & Models

Datasets

ma-xu/fine-t2i
dataset · 32k downloads

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques