ANIM-400K: A Large-Scale Dataset for Automated End-To-End Dubbing of Video

Kevin Cai; Chonghua Liu; David M. Chan

arXiv:2401.05314 · eess.AS · January 11, 2024 · 1 citation

TL;DR

This paper introduces Anim-400K, a large-scale, publicly available dataset of over 425,000 animated video segments in Japanese and English, aimed at advancing automated video dubbing and related tasks.

Contribution

The creation and release of Anim-400K, a comprehensive dataset supporting multiple video-related tasks, addressing data scarcity in automated dubbing research.

Findings

01 Dataset enables improved end-to-end dubbing models

02 Supports various tasks like translation and summarization

03 Facilitates research in automated video processing

Abstract

The Internet's wealth of content, with up to 60% published in English, contrasts starkly with the global population, where only 18.8% are English speakers and just 5.1% consider it their native language, leading to disparities in online information access. Unfortunately, automated dubbing of video (replacing the audio track of a video with a translated alternative) remains a complex and challenging task, because pipeline-based approaches require precise timing, facial movement synchronization, and prosody matching. While end-to-end dubbing offers a solution, data scarcity continues to impede the progress of both end-to-end and pipeline-based methods. In this work, we introduce Anim-400K, a comprehensive dataset of over 425K aligned animated video segments in Japanese and English supporting various video-related tasks, including automated dubbing, simultaneous translation, guided video…

Code & Models

Repositories

davidmchan/anim400k
Official

Datasets

davidchan/anim400k
dataset · 51 downloads

Taxonomy

Topics: Video Analysis and Summarization · Multimodal Machine Learning Applications · Cancer-related molecular mechanisms research