BodyMetric: Evaluating the Realism of Human Bodies in Text-to-Image Generation
Nefeli Andreou, Varsha Vivek, Ying Wang, Alex Vorobiov, Tiffany Deng, Raja Bala, Larry Davis, Betty Mohler Tesch

TL;DR
BodyMetric is a learnable, scalable metric for evaluating the realism of human bodies in text-to-image generation, trained on a new dataset and leveraging 3D body priors to improve artifact detection.
Contribution
We introduce BodyMetric, a novel learnable metric for assessing human body realism in generated images, supported by a new annotated dataset and 3D prior integration.
Findings
BodyMetric accurately predicts body realism artifacts.
It outperforms general preference metrics in body-specific evaluation.
It enables large-scale benchmarking and ranking of text-to-image models.
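As a rough illustration of the ranking use case in the last finding, the sketch below orders models by their mean per-image realism score. It does not reproduce BodyMetric itself: the function name `rank_models`, the model names, and the score values are all hypothetical placeholders, and it assumes that a higher score means a more realistic body.

```python
from statistics import mean
from typing import Dict, List, Tuple


def rank_models(scores_by_model: Dict[str, List[float]]) -> List[Tuple[str, float]]:
    """Rank models by mean realism score, highest first.

    `scores_by_model` maps a model name to per-image realism scores that
    some learned metric has already produced (hypothetical values here).
    Models with no scored images are skipped.
    """
    averages = {
        name: mean(scores)
        for name, scores in scores_by_model.items()
        if scores
    }
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Placeholder scores for illustration only; not results from the paper.
example_scores = {
    "model_a": [0.8, 0.6, 0.7],
    "model_b": [0.4, 0.5, 0.3],
}
ranking = rank_models(example_scores)
for name, avg in ranking:
    print(f"{name}: {avg:.2f}")
```

In practice, such a comparison would need every model to be scored on the same set of prompts so that the averages are comparable.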
Abstract
Accurately generating images of human bodies from text remains a challenging problem for state-of-the-art text-to-image models. Commonly observed body-related artifacts include extra or missing limbs, unrealistic poses, and blurred body parts. Currently, evaluation of such artifacts relies heavily on time-consuming human judgments, limiting the ability to benchmark models at scale. We address this by proposing BodyMetric, a learnable metric that predicts body realism in images. BodyMetric is trained on realism labels and multi-modal signals, including 3D body representations inferred from the input image and textual descriptions. To facilitate this approach, we design an annotation pipeline to collect expert ratings on human body realism, yielding a new dataset for this task, BodyRealism. Ablation studies support our architectural choices for BodyMetric and the…
Taxonomy
Topics: Human Motion and Animation · Artificial Intelligence in Games · Video Analysis and Summarization
