Sora Detector: A Unified Hallucination Detection for Large Text-to-Video Models
Zhixuan Chu, Lei Zhang, Yichen Sun, Siqiao Xue, Zhibo Wang, Zhan Qin, Kui Ren

TL;DR
The paper introduces SoraDetector, a unified framework for detecting hallucinations in large text-to-video models, improving reliability by evaluating content consistency and using knowledge graphs, with a new benchmark for evaluation.
Contribution
It presents SoraDetector, a novel unified hallucination detection framework for T2V models, incorporating content analysis, knowledge graphs, and automation, along with a new benchmark dataset.
Findings
Effective detection of hallucinations across multiple T2V models
High accuracy in identifying static and dynamic hallucinations
Demonstrated robustness on Sora and other large T2V models
Abstract
The rapid advancement in text-to-video (T2V) generative models has enabled the synthesis of high-fidelity video content guided by textual descriptions. Despite this significant progress, these models are often susceptible to hallucination, generating contents that contradict the input text, which poses a challenge to their reliability and practical deployment. To address this critical issue, we introduce the SoraDetector, a novel unified framework designed to detect hallucinations across diverse large T2V models, including the cutting-edge Sora model. Our framework is built upon a comprehensive analysis of hallucination phenomena, categorizing them based on their manifestation in the video content. Leveraging the state-of-the-art keyframe extraction techniques and multimodal large language models, SoraDetector first evaluates the consistency between extracted video content summary and…
Taxonomy
Topics: Image Retrieval and Classification Techniques · Digital Media Forensic Detection · Advanced Steganography and Watermarking Techniques
