World Consistency Score: A Unified Metric for Video Generation Quality
Akshat Rakheja, Aarsh Ashdhir, Aryan Bhattacharjee, Vanshika Sharma

TL;DR
The paper introduces World Consistency Score (WCS), a comprehensive metric for evaluating the temporal and physical coherence of generated videos, aligning well with human judgments.
Contribution
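The paper proposes a unified evaluation metric that combines four interpretable subcomponents (object permanence, relation stability, causal compliance, and flicker penalty) into a single score for the temporal and physical coherence of a generated video.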
Findings
WCS correlates strongly with human preferences.
WCS outperforms existing metrics like FVD and CLIPScore.
WCS provides an interpretable assessment of video consistency.
Abstract
We introduce World Consistency Score (WCS), a novel unified evaluation metric for generative video models that emphasizes internal world consistency of the generated videos. WCS integrates four interpretable sub-components - object permanence, relation stability, causal compliance, and flicker penalty - each measuring a distinct aspect of temporal and physical coherence in a video. These submetrics are combined via a learned weighted formula to produce a single consistency score that aligns with human judgments. We detail the motivation for WCS in the context of existing video evaluation metrics, formalize each submetric and how it is computed with open-source tools (trackers, action recognizers, CLIP embeddings, optical flow), and describe how the weights of the WCS combination are trained using human preference data. We also outline an experimental validation blueprint: using…
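The abstract describes WCS as a learned weighted combination of four submetrics but does not give the formula in the text shown here. The sketch below is one plausible linear form, offered only as an illustration. The names `SubScores`, `ILLUSTRATIVE_WEIGHTS`, and `world_consistency_score` are hypothetical. It also assumes that each submetric is normalized to [0, 1] and that the flicker term is subtracted. The weight values are placeholders, not the learned weights from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubScores:
    """Per-video submetric values, assumed here to lie in [0, 1]."""
    object_permanence: float
    relation_stability: float
    causal_compliance: float
    flicker_penalty: float  # higher means more flicker (worse)


# Placeholder weights for illustration only; the paper learns its weights
# from human preference data, and those values are not reproduced here.
ILLUSTRATIVE_WEIGHTS = (0.3, 0.3, 0.3, 0.1)


def world_consistency_score(scores: SubScores,
                            weights=ILLUSTRATIVE_WEIGHTS) -> float:
    """Assumed linear combination: reward the three consistency terms
    and subtract the flicker penalty."""
    w_op, w_rs, w_cc, w_fp = weights
    return (w_op * scores.object_permanence
            + w_rs * scores.relation_stability
            + w_cc * scores.causal_compliance
            - w_fp * scores.flicker_penalty)


if __name__ == "__main__":
    steady = SubScores(0.9, 0.8, 0.85, 0.05)
    flickery = SubScores(0.9, 0.8, 0.85, 0.60)
    # Under this assumed form, more flicker can only lower the score.
    print(world_consistency_score(steady) > world_consistency_score(flickery))
```

Keeping the four terms separate until the final sum preserves the interpretability the abstract emphasizes: a low score can be traced back to the submetric that caused it.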
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Human Pose and Action Recognition · Human Motion and Animation
