CelebV-Text: A Large-Scale Facial Text-Video Dataset

Jianhui Yu; Hao Zhu; Liming Jiang; Chen Change Loy; Weidong Cai; Wayne Wu

arXiv:2303.14717 · cs.CV · March 28, 2023 · 1 citation

TL;DR

CelebV-Text is a large, high-quality facial text-video dataset designed to advance research in text-driven face video generation, featuring 70,000 diverse videos with precise descriptive texts and a benchmark for evaluation.

Contribution

The paper introduces CelebV-Text, a novel large-scale facial text-video dataset with high-quality annotations and a benchmark, addressing the lack of suitable datasets for facial text-to-video generation.

Findings

01 Statistical analysis of the videos, texts, and text-video relevance shows CelebV-Text to be more diverse and more relevant than existing datasets.

02 The dataset enables effective training and evaluation of facial text-to-video models.

03 Benchmark results demonstrate the dataset's utility for standardizing evaluation.

Abstract

Text-driven generation models are flourishing in video generation and editing. However, face-centric text-to-video generation remains a challenge due to the lack of a suitable dataset containing high-quality videos and highly relevant texts. This paper presents CelebV-Text, a large-scale, diverse, and high-quality dataset of facial text-video pairs, to facilitate research on facial text-to-video generation tasks. CelebV-Text comprises 70,000 in-the-wild face video clips with diverse visual content, each paired with 20 texts generated using the proposed semi-automatic text generation strategy. The provided texts are of high quality, describing both static and dynamic attributes precisely. The superiority of CelebV-Text over other datasets is demonstrated via comprehensive statistical analysis of the videos, texts, and text-video relevance. The effectiveness and potential of CelebV-Text…
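
As a rough illustration of the scale stated in the abstract (70,000 clips, each paired with 20 texts), the sketch below models one clip with its descriptions and computes the total number of text-video pairs. The record layout, the names `FaceClipRecord`, `clip_id`, and `total_text_video_pairs`, and the sample strings are hypothetical and chosen only for illustration; they do not reflect the dataset's actual file format or the official repository's API.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Figures taken from the abstract.
NUM_CLIPS = 70_000
TEXTS_PER_CLIP = 20


@dataclass
class FaceClipRecord:
    """Hypothetical record: one face video clip and its paired descriptions."""
    clip_id: str
    texts: List[str] = field(default_factory=list)

    def pairs(self) -> List[Tuple[str, str]]:
        # Expand the clip into one (clip_id, text) pair per description.
        return [(self.clip_id, text) for text in self.texts]


def total_text_video_pairs(num_clips: int = NUM_CLIPS,
                           texts_per_clip: int = TEXTS_PER_CLIP) -> int:
    # Every clip contributes the same number of texts, so the total is a product.
    return num_clips * texts_per_clip


if __name__ == "__main__":
    # Placeholder strings, not real annotations from the dataset.
    record = FaceClipRecord("clip_0", ["placeholder text A", "placeholder text B"])
    print(record.pairs())
    print(total_text_video_pairs())
```

Under these assumptions, the dataset would expose 70,000 × 20 = 1,400,000 text-video pairs in total.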

Code & Models

Repositories

CelebV-Text/CelebV-Text
Official

Taxonomy

Topics: Video Analysis and Summarization · Generative Adversarial Networks and Image Synthesis · Human Motion and Animation