DSH-Bench: A Difficulty- and Scenario-Aware Benchmark with Hierarchical Subject Taxonomy for Subject-Driven Text-to-Image Generation

Zhenyu Hu; Qing Wang; Te Cao; Luo Liao; Longfei Lu; Liqun Liu; Shuang Li; Hang Chen; Mengge Xue; Yuan Chen; Chao Deng; Peng Shu; Huan Yu; Jie Jiang

arXiv:2603.08090·cs.CV·April 21, 2026

TL;DR

DSH-Bench is a benchmark for subject-driven text-to-image (T2I) models built on a hierarchical subject taxonomy. It addresses gaps in existing evaluations by covering a more diverse set of subjects, breaking results down by subject difficulty and prompt scenario, and introducing new metrics and diagnostic insights.

Contribution

The paper presents a multi-perspective benchmark with a hierarchical subject taxonomy, new scoring metrics, and diagnostic tools for evaluating and improving subject-driven T2I models.

Findings

01. Demonstrates that the SICS metric achieves a 9.4% higher correlation with human judgment than existing measures.

02. Uncovers limitations in current models through extensive empirical evaluation.

03. Provides actionable insights for future model training and data strategies.

Abstract

Significant progress has been achieved in subject-driven text-to-image (T2I) generation, which aims to synthesize new images depicting target subjects according to user instructions. However, evaluating these models remains a significant challenge. Existing benchmarks exhibit critical limitations: 1) insufficient diversity and comprehensiveness in subject images, 2) inadequate granularity in assessing model performance across different subject difficulty levels and prompt scenarios, and 3) a profound lack of actionable insights and diagnostic guidance for subsequent model refinement. To address these limitations, we propose DSH-Bench, a comprehensive benchmark that enables systematic multi-perspective analysis of subject-driven T2I models through four principal innovations: 1) a hierarchical taxonomy sampling mechanism ensuring comprehensive subject representation across 58 fine-grained…
