Señorita-2M: A High-Quality Instruction-based Dataset for General Video Editing by Video Specialists
Bojia Zi, Penghui Ruan, Marco Chen, Xianbiao Qi, Shaozhe Hao, Shihao Zhao, Youze Huang, Bin Liang, Rong Xiao, Kam-Fai Wong

TL;DR
Señorita-2M is a large, high-quality dataset of approximately 2 million video editing pairs created by specialized models, designed to improve end-to-end video editing methods and produce higher-quality results.
Contribution
The paper introduces Señorita-2M, a new high-quality, large-scale video editing dataset built with specialized models and filtering, advancing end-to-end video editing techniques.
Findings
The dataset enables high-quality video editing results.
Filtering improves the quality of video pairs.
Effective architecture choices enhance editing performance.
Abstract
Recent advancements in video generation have spurred the development of video editing techniques, which can be divided into inversion-based and end-to-end methods. However, current video editing methods still suffer from several challenges. Inversion-based methods, though training-free and flexible, are time-consuming during inference, struggle with fine-grained editing instructions, and produce artifacts and jitter. On the other hand, end-to-end methods, which rely on edited video pairs for training, offer faster inference speeds but often produce poor editing results due to a lack of high-quality training video pairs. In this paper, to close the gap in end-to-end methods, we introduce Señorita-2M, a high-quality video editing dataset. Señorita-2M consists of approximately 2 million video editing pairs. It is built by crafting four high-quality, specialized video editing…
Taxonomy
Topics: Video Analysis and Summarization
