TIQA: Human-Aligned Perceptual Text Quality Assessment in Generated Images
Kirill Koltsov, Aleksandr Gushchin, Anastasia Antsiferova, Dmitriy Vatolin

TL;DR
This paper introduces TIQA, a no-reference task for assessing the perceptual quality of text in generated images. It is supported by two new datasets and a lightweight predictor, ANTIQ, which aligns closely with human judgments and improves image ranking by text quality.
Contribution
The paper presents TIQA, a no-reference perceptual text quality assessment framework with new datasets and a novel predictor, enhancing evaluation of text rendering in generated images.
Findings
ANTIQ achieved PLCC of 0.942 on TIQA-Crops.
ANTIQ improved text quality ranking by 14%.
Datasets support training and evaluation of text quality models.
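
For readers unfamiliar with the metric in the first finding, PLCC (Pearson linear correlation coefficient) measures how linearly a model's predicted scores track human mean-opinion scores. The snippet below is a minimal sketch of that computation, not the authors' evaluation code: the function name `plcc` and the toy scores are made up for illustration, and the paper's protocol may include steps not shown here.

```python
import math


def plcc(predicted, mos):
    """Pearson linear correlation between predicted scores and MOS labels."""
    if len(predicted) != len(mos) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_m = sum(mos) / n
    # Covariance and variance terms (unnormalized; the 1/n factors cancel).
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, mos))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in mos)
    if var_p == 0 or var_m == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_m)


# Hypothetical toy data: four text crops with model scores and human MOS.
toy_predicted = [0.2, 0.5, 0.7, 0.9]
toy_mos = [1.5, 2.8, 3.9, 4.6]
print(f"PLCC on toy data: {plcc(toy_predicted, toy_mos):.3f}")
```

A value near 1 means the predictions rise and fall linearly with human ratings; a value near 0 means no linear relationship.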
Abstract
Recent text-to-image models have improved global realism, but text rendering remains a persistent failure mode: images may look convincing overall, yet local typography often contains malformed glyphs, broken strokes, irregular spacing, and other artifacts that humans heavily penalize. We formulate Text-in-Image Quality Assessment (TIQA), a no-reference task that estimates a human-aligned perceptual quality score for detected text regions while disentangling visual text quality from semantic correctness. To support this setting, we introduce two datasets. TIQA-Crops contains 120k text crops from 36k AI-generated images produced by 12 generators, with 10k mean-opinion-score (MOS) labels and 110k proxy labels for pretraining. TIQA-Images contains 1,500 text-heavy images from 10 recent generators, including proprietary systems, with paired overall-quality and text-quality subjective…
