HumanScore: Benchmarking Human Motions in Generated Videos
Yusu Fang, Tiange Xiang, Tian Tan, Narayan Schuetz, Scott Delp, Li Fei-Fei, Ehsan Adeli

TL;DR
HumanScore is a comprehensive evaluation framework for assessing the realism and biomechanical accuracy of human motions in AI-generated videos, addressing a gap in current video generation benchmarks.
Contribution
HumanScore introduces six interpretable metrics spanning kinematic plausibility, temporal stability, and biomechanical consistency, enabling systematic diagnosis and ranking of thirteen state-of-the-art video generation models.
Findings
Identifies recurrent failure modes such as jitter and implausible poses.
Reveals gaps between perceptual plausibility and biomechanical fidelity.
Provides robust rankings of models based on quantitative criteria.
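To make the jitter failure mode concrete, here is a minimal sketch of one generic way temporal instability in a tracked joint could be quantified: the mean magnitude of the second finite difference of its 2D position across frames. This is not one of HumanScore's six metrics, which this summary does not define. The function name `mean_jitter`, the per-frame `(x, y)` input format, and the toy trajectories are all assumptions made for illustration.

```python
from math import hypot


def mean_jitter(track):
    """Mean magnitude of the second finite difference of a 2D keypoint track.

    `track` is a list of (x, y) positions, one per frame (hypothetical input
    format). A joint moving at constant velocity scores 0; frame-to-frame
    shaking raises the score. Illustrative proxy only, not HumanScore's metric.
    """
    if len(track) < 3:
        raise ValueError("need at least three frames")
    total = 0.0
    # Slide a three-frame window over the track.
    for (x0, y0), (x1, y1), (x2, y2) in zip(track, track[1:], track[2:]):
        ax = x2 - 2 * x1 + x0  # discrete acceleration along x
        ay = y2 - 2 * y1 + y0  # discrete acceleration along y
        total += hypot(ax, ay)
    return total / (len(track) - 2)


# Toy trajectories: one smooth, one oscillating vertically on every frame.
smooth = [(float(t), 0.0) for t in range(6)]
shaky = [(float(t), 1.0 if t % 2 else 0.0) for t in range(6)]

print(mean_jitter(smooth))
print(mean_jitter(shaky))
```

In this toy case the smooth track scores lower than the shaky one. A real evaluation would also have to handle pose-estimation noise, camera motion, and scale normalization, none of which this sketch addresses.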
Abstract
Recent advances in model architectures, compute, and data scale have driven rapid progress in video generation, producing increasingly realistic content. Yet, no prior method systematically measures how faithfully these systems render human bodies and motion dynamics. In this paper, we present HumanScore, a systematic framework to evaluate the quality of human motions in AI-generated videos. HumanScore defines six interpretable metrics spanning kinematic plausibility, temporal stability, and biomechanical consistency, enabling fine-grained diagnosis beyond visual realism alone. Through carefully designed prompts, we elicit a diverse set of movements at varying intensities and evaluate videos generated by thirteen state-of-the-art models. Our analysis reveals consistent gaps between perceptual plausibility and motion biomechanical fidelity, identifies recurrent failure modes (e.g.,…
