AudioScenic: Audio-Driven Video Scene Editing

Kaixin Shen; Ruijie Quan; Linchao Zhu; Jun Xiao; Yi Yang

arXiv:2404.16581 · cs.CV · April 26, 2024


Open Access

TL;DR

AudioScenic is a novel framework that uses audio signals to guide background editing in videos while preserving foreground content, improving temporal consistency and visual diversity.

Contribution

We introduce AudioScenic, a new audio-driven video scene editing framework with modules for semantic injection, background masking, and audio-guided temporal control.

Findings

01. Outperforms existing methods on DAVIS and Audioset datasets.
02. Enhances temporal consistency with a new temporal score metric.
03. Effectively controls background editing guided by audio signals.

Abstract

Audio-driven visual scene editing endeavors to manipulate the visual background while leaving the foreground content unchanged, according to the given audio signals. Unlike current efforts focusing primarily on image editing, audio-driven video scene editing has not been extensively addressed. In this paper, we introduce AudioScenic, an audio-driven framework designed for video scene editing. AudioScenic integrates audio semantics into the visual scene through a temporal-aware audio semantic injection process. As our focus is on background editing, we further introduce a SceneMasker module, which maintains the integrity of the foreground content during the editing process. AudioScenic exploits the inherent properties of audio, namely, audio magnitude and frequency, to guide the editing process, aiming to control the temporal dynamics and enhance the temporal consistency. First, we…
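The abstract names two ideas that can be illustrated in a few lines: keeping foreground pixels untouched while the background is edited, and letting audio magnitude modulate the edit over time. The sketch below is not the authors' implementation. It assumes a binary foreground mask is already available, uses simple alpha compositing in place of the SceneMasker module, and uses per-frame RMS energy as a stand-in for "audio magnitude". The function names `composite_background`, `per_frame_rms`, and `edit_strength`, and the `lo`/`hi` strength bounds, are hypothetical.

```python
import numpy as np

def composite_background(original, edited, fg_mask):
    """Keep foreground pixels from `original`, take the rest from `edited`.

    original, edited: float arrays of shape (T, H, W, C).
    fg_mask: array of shape (T, H, W), 1 where the pixel is foreground.
    Assumption: a plain mask blend, not the paper's SceneMasker.
    """
    m = fg_mask[..., None].astype(original.dtype)  # broadcast over channels
    return m * original + (1.0 - m) * edited

def per_frame_rms(waveform, num_frames):
    """Split a mono waveform into `num_frames` equal chunks, return RMS of each."""
    if len(waveform) < num_frames:
        raise ValueError("waveform must have at least one sample per frame")
    usable = (len(waveform) // num_frames) * num_frames  # drop the remainder
    chunks = waveform[:usable].reshape(num_frames, -1)
    return np.sqrt((chunks ** 2).mean(axis=1))

def edit_strength(rms, lo=0.2, hi=1.0):
    """Min-max normalize RMS values into [lo, hi] (hypothetical strength range)."""
    span = rms.max() - rms.min()
    norm = (rms - rms.min()) / span if span > 0 else np.zeros_like(rms)
    return lo + (hi - lo) * norm

def audio_modulated_edit(original, edited, fg_mask, waveform):
    """Blend each frame toward `edited` by its audio-derived strength,
    then restore the foreground from `original`."""
    s = edit_strength(per_frame_rms(waveform, original.shape[0]))
    s = s[:, None, None, None]  # one strength value per frame
    blended = s * edited + (1.0 - s) * original
    return composite_background(original, blended, fg_mask)
```

With this sketch, louder stretches of audio push background pixels closer to the edited frames, while any pixel marked as foreground is returned unchanged regardless of the audio.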

Taxonomy

Topics: Music and Audio Processing · Video Analysis and Summarization · Music Technology and Sound Studies

Methods: Focus