Relative Classification Accuracy: A Calibrated Metric for Identity Consistency in Fine-Grained K-pop Face Generation
Sylvey Lin, Eranki Vasistha

TL;DR
This paper introduces a calibrated metric, Relative Classification Accuracy (RCA), for evaluating identity consistency in fine-grained K-pop face generation with diffusion models. It targets a gap left by standard metrics such as FID and Inception Score, which often fail to detect identity misalignment in this setting.
Contribution
It proposes RCA as a new evaluation metric for identity accuracy, specifically tailored for high-similarity, single-domain face generation tasks like K-pop idols.
Findings
High visual quality (FID 8.93) achieved
Severe semantic mode collapse (RCA 0.27) observed
Failure modes linked to resolution and intra-gender ambiguity
Abstract
Denoising Diffusion Probabilistic Models (DDPMs) have achieved remarkable success in high-fidelity image generation. However, evaluating their semantic controllability (specifically for fine-grained, single-domain tasks) remains challenging. Standard metrics like FID and Inception Score (IS) often fail to detect identity misalignment in such specialized contexts. In this work, we investigate Class-Conditional DDPMs for K-pop idol face generation (32x32), a domain characterized by high inter-class similarity. We propose a calibrated metric, Relative Classification Accuracy (RCA), which normalizes generative performance against an oracle classifier's baseline. Our evaluation reveals a critical trade-off: while the model achieves high visual quality (FID 8.93), it suffers from severe semantic mode collapse (RCA 0.27), particularly for visually ambiguous identities. We analyze these failure…
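The abstract describes RCA only as generative performance normalized against an oracle classifier's baseline; the exact formula is not given in this excerpt. The sketch below shows one plausible reading, assuming RCA is the oracle's accuracy on generated images (scored against the conditioning labels) divided by the same oracle's accuracy on real held-out images. The function names and the toy labels are hypothetical, chosen for illustration only.

```python
# Minimal sketch of a ratio-style RCA. ASSUMPTION: RCA = (oracle accuracy on
# generated images) / (oracle accuracy on real held-out images). The paper's
# exact definition is not shown in this excerpt.

def accuracy(predicted, target):
    """Fraction of positions where the predicted label equals the target label."""
    if len(predicted) != len(target) or len(predicted) == 0:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(1 for p, t in zip(predicted, target) if p == t)
    return matches / len(predicted)


def relative_classification_accuracy(gen_pred, gen_cond, real_pred, real_true):
    """Normalize accuracy on generated samples by the oracle's real-data baseline.

    gen_pred:  oracle predictions on generated images
    gen_cond:  identity labels the generator was conditioned on
    real_pred: oracle predictions on real held-out images
    real_true: ground-truth identity labels of those real images
    """
    baseline = accuracy(real_pred, real_true)
    if baseline == 0:
        raise ValueError("oracle baseline accuracy is zero; RCA is undefined")
    return accuracy(gen_pred, gen_cond) / baseline


# Toy labels (hypothetical): the oracle is right on 4 of 5 real images,
# and recognizes the intended identity in 1 of 4 generated images.
rca = relative_classification_accuracy(
    gen_pred=[0, 1, 2, 3],
    gen_cond=[0, 0, 0, 0],
    real_pred=[0, 1, 2, 3, 0],
    real_true=[0, 1, 2, 3, 1],
)
print(rca)
```

Under this assumed definition, dividing by the baseline separates generator failures from classifier failures: where the oracle itself struggles to tell real identities apart, the generator is penalized less for the same raw accuracy. A value well below 1, such as the reported 0.27, would then indicate that generated faces are recognized as the intended identity far less often than real ones.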
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Domain Adaptation and Few-Shot Learning
