Sora Detector: A Unified Hallucination Detection for Large Text-to-Video Models

Zhixuan Chu; Lei Zhang; Yichen Sun; Siqiao Xue; Zhibo Wang; Zhan Qin; Kui Ren

arXiv:2405.04180·cs.LG·May 8, 2024·3 cites

TL;DR

The paper introduces SoraDetector, a unified framework for detecting hallucinations in large text-to-video (T2V) models. It aims to improve reliability by checking content consistency and using knowledge graphs, and it comes with a new benchmark for evaluation.

Contribution

The paper presents SoraDetector, a unified hallucination detection framework for T2V models that combines content analysis, knowledge graphs, and automation, together with a new benchmark dataset.

Findings

01 Effective detection of hallucinations across multiple T2V models

02 High accuracy in identifying static and dynamic hallucinations

03 Demonstrated robustness on Sora and other large T2V models

Abstract

The rapid advancement in text-to-video (T2V) generative models has enabled the synthesis of high-fidelity video content guided by textual descriptions. Despite this significant progress, these models are often susceptible to hallucination, generating content that contradicts the input text, which poses a challenge to their reliability and practical deployment. To address this critical issue, we introduce SoraDetector, a novel unified framework designed to detect hallucinations across diverse large T2V models, including the cutting-edge Sora model. Our framework is built upon a comprehensive analysis of hallucination phenomena, categorizing them based on their manifestation in the video content. Leveraging state-of-the-art keyframe extraction techniques and multimodal large language models, SoraDetector first evaluates the consistency between the extracted video content summary and…
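The two stages the abstract names, keyframe extraction followed by a consistency check between a video content summary and the prompt, can be illustrated with a toy sketch. This is not the authors' implementation. Frame differencing stands in for the keyframe extraction technique, token overlap stands in for the multimodal-LLM comparison, and the summary is assumed to be given as a string. All function names, the frame representation, and the thresholds are hypothetical.

```python
from typing import List, Sequence, Set

# Assumption: a frame is a flat sequence of pixel intensities (toy stand-in for an image).
Frame = Sequence[float]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average absolute per-pixel difference between two equal-size frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def select_keyframes(frames: List[Frame], threshold: float) -> List[int]:
    """Keep frame 0, then each frame that differs enough from the last kept frame."""
    if not frames:
        return []
    kept = [0]
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i], frames[kept[-1]]) > threshold:
            kept.append(i)
    return kept


def tokens(text: str) -> Set[str]:
    """Lowercased words with surrounding punctuation stripped."""
    stripped = (w.strip(".,!?").lower() for w in text.split())
    return {w for w in stripped if w}


def consistency_score(prompt: str, summary: str) -> float:
    """Fraction of prompt tokens that also appear in the summary (0.0 to 1.0)."""
    prompt_tokens = tokens(prompt)
    if not prompt_tokens:
        return 1.0
    return len(prompt_tokens & tokens(summary)) / len(prompt_tokens)


def flag_hallucination(prompt: str, summary: str, min_score: float = 0.5) -> bool:
    """Flag the video when its summary covers too little of the prompt."""
    return consistency_score(prompt, summary) < min_score


if __name__ == "__main__":
    frames = [[0, 0, 0, 0], [0, 0, 0, 1], [10, 10, 10, 10], [10, 10, 10, 11]]
    print(select_keyframes(frames, threshold=1.0))
    print(flag_hallucination("a cat sleeping", "A dog runs on a beach."))
```

Token overlap is a deliberately crude proxy: it cannot tell a paraphrase from a contradiction, which is the kind of judgment the abstract assigns to multimodal large language models.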

Code & Models

Repositories

truthai-lab/soradetector
none · Official

Taxonomy

Topics: Image Retrieval and Classification Techniques · Digital Media Forensic Detection · Advanced Steganography and Watermarking Techniques