Towards Data-Driven Automatic Video Editing

Sergey Podlesnyy

arXiv:1907.07345·cs.CV·July 18, 2019

TL;DR

This paper presents a data-driven approach to automatic video editing that leverages neural networks and imitation learning to select and cut footage based on learned cinematography rules, aiming to produce engaging visual stories.

Contribution

It introduces a novel method combining visual feature extraction and imitation learning for automatic video editing, mimicking professional editing principles.

Findings

01 Controller learns basic cinematography editing rules

02 Produces coherent and visually appealing video edits

03 Demonstrates effectiveness on a corpus of motion pictures

Abstract

Automatic video editing involves at least two steps: selecting the most valuable footage in terms of visual quality and the importance of the action filmed, and cutting that footage into a brief, coherent visual story that is interesting to watch. Here both steps are implemented in a purely data-driven manner. Visual semantic and aesthetic features are extracted by an ImageNet-trained convolutional neural network, and the editing controller is trained by an imitation learning algorithm. At test time, the controller shows signs of observing basic cinematography editing rules learned from a corpus of motion picture masterpieces.
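To make the idea of training an editing controller by imitation more concrete, here is a minimal sketch of behavior cloning, the simplest form of imitation learning. It is not the paper's algorithm: the abstract does not say which imitation learning method or controller architecture is used. Everything in the sketch is an assumption made for illustration, including the two-dimensional feature vectors (standing in for CNN features), the keep/cut action space, the toy "expert" rule, and all function names.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def cut_probability(w, b, x):
    # Probability that the (hypothetical) controller cuts at a frame with features x.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_cut_policy(features, expert_actions, epochs=500, lr=1.0):
    # Behavior cloning: fit a logistic-regression policy to expert
    # keep (0) / cut (1) decisions with full-batch gradient descent.
    dim = len(features[0])
    n = len(features)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(features, expert_actions):
            err = cut_probability(w, b, x) - y  # gradient of the log loss
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def make_toy_demonstrations(n=200, seed=0):
    # Assumed toy data: feature 0 plays the role of "visual change" between
    # frames, feature 1 is an irrelevant score. The toy expert cuts whenever
    # the visual change exceeds 0.5. Real demonstrations would come from
    # edited films, not from a hand-written rule.
    rng = random.Random(seed)
    features = [[rng.random(), rng.random()] for _ in range(n)]
    actions = [1 if x[0] > 0.5 else 0 for x in features]
    return features, actions


features, actions = make_toy_demonstrations()
w, b = train_cut_policy(features, actions)
agreement = sum(
    (cut_probability(w, b, x) >= 0.5) == bool(y)
    for x, y in zip(features, actions)
) / len(features)
print(f"agreement with toy expert on training data: {agreement:.2f}")
```

The policy never sees the expert's rule, only its decisions, which is the sense in which editing behavior is learned from data rather than programmed. A realistic controller would consume high-dimensional CNN features and likely a richer action space than keep/cut.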
