EntityBench: Towards Entity-Consistent Long-Range Multi-Shot Video Generation

Ruozhen He; Meng Wei; Ziyan Yang; Vicente Ordonez

arXiv:2605.15199·cs.CV·May 15, 2026

TL;DR

EntityBench is a benchmark of 140 episodes (2,491 shots) for evaluating long-range multi-shot video generation, with an emphasis on keeping characters, objects, and locations consistent across shots.

Contribution

The paper introduces EntityBench, a new dataset and evaluation suite, along with EntityMem, a memory-augmented generation system for improved entity consistency.

Findings

1. Cross-shot entity consistency decreases with recurrence distance in existing methods.

2. Explicit per-entity memory significantly improves character fidelity and presence.

3. EntityMem outperforms baseline methods in maintaining entity consistency.

Abstract

Multi-shot video generation extends single-shot generation to coherent visual narratives, yet maintaining consistent characters, objects, and locations across shots remains a challenge over long sequences. Existing evaluations typically use independently generated prompt sets with limited entity coverage and simple consistency metrics, making standardized comparison difficult. We introduce EntityBench, a benchmark of 140 episodes (2,491 shots) derived from real narrative media, with explicit per-shot entity schedules tracking characters, objects, and locations simultaneously across easy / medium / hard tiers of up to 50 shots, 13 cross-shot characters, 8 cross-shot locations, 22 cross-shot objects, and recurrence gaps spanning up to 48 shots. It is paired with a three-pillar evaluation suite that disentangles intra-shot quality, prompt-following alignment, and cross-shot consistency,…
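To make the notions of a per-shot entity schedule and a recurrence gap concrete, the following is a minimal sketch, not the benchmark's actual data format or code. It assumes a schedule is a list with one set of entity names per shot, and it assumes a recurrence gap is the difference in shot index between two consecutive appearances of the same entity. The function name `recurrence_gaps`, the `char:`/`loc:`/`obj:` prefixes, and the toy episode are all hypothetical.

```python
from collections import defaultdict


def recurrence_gaps(schedule):
    """Return, for each recurring entity, the gaps between consecutive appearances.

    `schedule` is assumed to be a list with one set of entity names per shot
    (a hypothetical format, not the one shipped with EntityBench). A gap is
    measured as the difference between the shot indices of two consecutive
    appearances of the same entity.
    """
    last_seen = {}             # entity -> index of the most recent shot it appeared in
    gaps = defaultdict(list)   # entity -> list of gaps, in shots
    for shot_idx, entities in enumerate(schedule):
        for entity in sorted(entities):
            if entity in last_seen:
                gaps[entity].append(shot_idx - last_seen[entity])
            last_seen[entity] = shot_idx
    return dict(gaps)


# Hypothetical five-shot episode mixing characters, locations, and objects.
schedule = [
    {"char:Ava", "loc:kitchen"},
    {"char:Ben", "loc:kitchen", "obj:kettle"},
    {"char:Ava", "loc:garden"},
    {"char:Ben", "loc:garden"},
    {"char:Ava", "loc:kitchen", "obj:kettle"},
]

gaps = recurrence_gaps(schedule)
# Longest stretch any entity stays off screen before it returns.
max_gap = max(g for entity_gaps in gaps.values() for g in entity_gaps)
```

In this toy episode the kitchen and the kettle each return after a gap of 3 shots, which is the longest gap in the schedule. Under this assumed definition, larger gaps mark the recurrences where a generator has to reproduce an entity it has not rendered for many shots.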

Code & Models

Repositories

Catherine-R-He/EntityBench
github

Datasets

aoiandroid/papers
dataset · 28 dl
