TASTE: A Designer-Annotated Multi-Dimensional Preference Dataset for AI-Generated Graphic Design
Haonan Zhu, Elad Hirsch, Alexandria Minetti, Allison Nulty, Purvanshi Mehta

TL;DR
TASTE is a dataset of professional designers' rankings of AI-generated graphic designs across nine design criteria, enabling finer-grained evaluation and understanding of model performance than a single overall verdict allows.
Contribution
The paper introduces TASTE, a multi-dimensional preference dataset with expert annotations, and provides analysis tools and benchmark evaluations for AI-generated graphic design quality.
Findings
Designers' preferences agree with one another across different criteria and image types.
Current models and judges do not surpass 0.55 agreement with expert consensus.
A pairwise-difference model trained on TASTE improves agreement to 0.611.
Abstract
Text-to-image models produce graphic design at production scale, but their supervision comes from photo-style preference data with a single overall verdict per comparison. Designers evaluate along several distinct axes, including typography, visual hierarchy, color harmony, layout, and brief fidelity, and a single label collapses them. We release TASTE (Typography, Aesthetics, Spatial, Tone, Etc.): ten professional designers ranked outputs from four current text-to-image models on nine criteria across two disjoint cohorts, yielding 1,600 ratings per criterion plus per-image hallucination flags on the holistic-preference cohorts. We pair the dataset with three contributions. First, a criterion-agnostic signal test framework, using Kendall's tau, majority probability, and Condorcet cycles against exact iid-uniform nulls at p = 4 and R = 5, places designer agreement on graphic design…
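To make the "exact iid-uniform null" idea concrete, the sketch below enumerates the exact distribution of Kendall's tau between two independent, uniformly random rankings of p = 4 items. It is a minimal illustration under stated assumptions, not the authors' framework: the function names `kendall_tau` and `exact_tau_null` are hypothetical, rankings are assumed tie-free, and the excerpt does not say how the statistic is aggregated over R = 5 raters, so only the single-pair building block is shown.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations, permutations


def kendall_tau(rank_a, rank_b):
    """Kendall's tau between two tie-free rankings of the same items.

    rank_a[i] and rank_b[i] are the ranks each rater gave to item i.
    """
    n = len(rank_a)
    score = 0  # concordant pairs minus discordant pairs
    for i, j in combinations(range(n), 2):
        same_order = (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j]) > 0
        score += 1 if same_order else -1
    return Fraction(score, n * (n - 1) // 2)


def exact_tau_null(p):
    """Exact null distribution of tau for two iid-uniform rankings of p items.

    Assumption: by symmetry one ranking can be fixed to the identity,
    so enumerating the p! permutations of the other is sufficient.
    """
    identity = tuple(range(p))
    counts = Counter(
        kendall_tau(identity, perm) for perm in permutations(range(p))
    )
    total = sum(counts.values())
    return {tau: Fraction(c, total) for tau, c in sorted(counts.items())}


null = exact_tau_null(4)
for tau, prob in null.items():
    print(f"tau = {str(tau):>4}  P = {prob}")

# One-sided tail probability of observing tau >= 2/3 by chance alone.
p_value = sum(prob for tau, prob in null.items() if tau >= Fraction(2, 3))
print("P(tau >= 2/3) =", p_value)
```

With only four items the null is coarse: tau takes just seven possible values, and a pair of raters reaches tau >= 2/3 by chance with probability 1/6. That coarseness is one reason an exact enumeration is preferable to a large-sample approximation at this size.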
