EditIQ: Automated Cinematic Editing of Static Wide-Angle Videos via Dialogue Interpretation and Saliency Cues

Rohit Girmaji, Bhav Beri, Ramanathan Subramanian, and Vineet Gandhi

arXiv:2502.02172 · cs.MM · February 5, 2025

TL;DR

EditIQ is an automated system that creates cinematic edits of static wide-angle videos by interpreting dialogue and saliency cues, producing engaging and coherent scene sequences without manual editing.

Contribution

It introduces a novel framework combining dialogue understanding and visual saliency to automate cinematic editing of static camera footage.

Findings

01. Outperforms baseline methods in viewer engagement and coherence

02. Demonstrates effectiveness on diverse datasets including BBC and theatre videos

03. Validated through psychophysical user study with positive results

Abstract

We present EditIQ, a completely automated framework for cinematically editing scenes captured via a stationary, large field-of-view and high-resolution camera. From the static camera feed, EditIQ initially generates multiple virtual feeds, emulating a team of cameramen. These virtual camera shots, termed rushes, are subsequently assembled using an automated editing algorithm, whose objective is to present the viewer with the most vivid scene content. To understand key scene elements and guide the editing process, we employ a two-pronged approach: (1) a large language model (LLM)-based dialogue understanding module to analyze conversational flow, coupled with (2) visual saliency prediction to identify meaningful scene elements and camera shots therefrom. We then formulate cinematic video editing as an energy minimization problem over shot selection, where cinematic constraints determine…
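To make the idea of "energy minimization over shot selection" concrete, here is a minimal sketch, not the authors' formulation. It assumes a hypothetical per-time-step cost `unary[t][s]` for showing rush `s` at step `t` (lower meaning more relevant, for example because dialogue or saliency cues favor that shot) and a single hypothetical `switch_penalty` charged on every cut. The abstract does not list the paper's actual energy terms or cinematic constraints, so everything named below is an illustrative assumption. Under these assumptions the minimum-energy sequence can be found exactly with dynamic programming:

```python
from typing import List, Sequence


def select_shots(unary: Sequence[Sequence[float]], switch_penalty: float) -> List[int]:
    """Pick one shot index per time step so that the total energy is minimal.

    Energy = sum of unary[t][shot_t] + switch_penalty for every change of shot.
    Both terms are hypothetical stand-ins for the paper's energy function.
    """
    if not unary:
        return []
    n_shots = len(unary[0])
    cost = list(unary[0])          # best energy of any path ending in each shot
    back: List[List[int]] = []     # back[t-1][s] = best predecessor of shot s at step t

    for t in range(1, len(unary)):
        new_cost, pointers = [], []
        for s in range(n_shots):
            # Cheapest predecessor, paying the penalty only when the shot changes.
            best_prev = min(
                range(n_shots),
                key=lambda p: cost[p] + (0.0 if p == s else switch_penalty),
            )
            transition = 0.0 if best_prev == s else switch_penalty
            new_cost.append(cost[best_prev] + transition + unary[t][s])
            pointers.append(best_prev)
        cost = new_cost
        back.append(pointers)

    # Backtrack from the cheapest final shot to recover the full sequence.
    shot = min(range(n_shots), key=lambda k: cost[k])
    path = [shot]
    for pointers in reversed(back):
        shot = pointers[shot]
        path.append(shot)
    path.reverse()
    return path


# Toy input: two rushes, five time steps; rush 1 is briefly preferred at step 2.
toy_costs = [[0, 1], [0, 1], [1, 0], [0, 1], [0, 1]]
free_cuts = select_shots(toy_costs, switch_penalty=0.0)
costly_cuts = select_shots(toy_costs, switch_penalty=2.0)
```

With no penalty the selection follows the per-step minimum and cuts to rush 1 for a single step; with a penalty of 2 the brief cut is no longer worth its cost and the selection stays on rush 0. This is the general trade-off such formulations encode between showing the most relevant content and avoiding overly frequent cuts.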
