Audio Match Cutting: Finding and Creating Matching Audio Transitions in Movies and Videos
Dennis Fedorishin, Lie Lu, Srirangaraj Setlur, Venu Govindaraju

TL;DR
This paper introduces a method for automatically finding and creating seamless audio transitions ("audio match cuts") in videos, built on a self-supervised audio representation and a pipeline that matches and blends audio segments.
Contribution
It proposes a novel self-supervised audio representation and a coarse-to-fine pipeline for automatic audio match cut detection and creation in videos.
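To make the coarse-to-fine idea concrete, here is a minimal sketch of one way such a ranking could work. It is an illustration under stated assumptions, not the authors' pipeline: it assumes each shot is already encoded as a sequence of per-frame embedding vectors, that the coarse stage compares mean-pooled shot embeddings, and that the fine stage compares the last frame of the outgoing shot with the first frame of each surviving candidate. The function names and the `top_k` parameter are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean_pool(frames):
    """Average per-frame embeddings into one shot-level vector."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def coarse_to_fine(query, candidates, top_k=2):
    """Rank candidate shots for a transition out of `query`.

    query:      list of frame embeddings for the outgoing shot
    candidates: dict mapping shot name -> list of frame embeddings
    Assumption: coarse = shot-level similarity, fine = boundary-frame similarity.
    """
    # Coarse stage: keep the top_k candidates by shot-level similarity.
    query_vec = mean_pool(query)
    shortlist = sorted(
        candidates,
        key=lambda name: cosine(query_vec, mean_pool(candidates[name])),
        reverse=True,
    )[:top_k]

    # Fine stage: score the actual cut point (end of query vs. start of candidate).
    scored = [(name, cosine(query[-1], candidates[name][0])) for name in shortlist]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    query = [[1.0, 0.0], [0.0, 1.0]]
    candidates = {
        "a": [[0.0, 1.0], [1.0, 0.0]],
        "b": [[1.0, 0.0], [0.0, 1.0]],
        "c": [[-1.0, 0.0], [0.0, -1.0]],
    }
    for name, score in coarse_to_fine(query, candidates):
        print(name, round(score, 3))
```

The two-stage split keeps the expensive boundary-level comparison limited to a short list, which is the usual motivation for coarse-to-fine retrieval.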
Findings
The self-supervised audio representation effectively identifies matching audio segments.
The pipeline successfully recommends matching shots with smooth audio transitions.
Multiple blending methods are compared for creating seamless audio transitions between matched candidates.
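The summary does not say which blending methods were evaluated. As a point of reference only, the sketch below shows an equal-power crossfade, a standard baseline for joining two audio segments. Treating it as one of the paper's methods is an assumption; the function name and the `overlap` parameter are hypothetical, and the inputs are assumed to be mono sample lists at the same sample rate.

```python
import math


def equal_power_crossfade(a, b, overlap):
    """Join two mono sample lists with an equal-power crossfade.

    The last `overlap` samples of `a` fade out while the first `overlap`
    samples of `b` fade in. The cosine/sine gains satisfy
    g_out**2 + g_in**2 == 1, which keeps perceived loudness roughly
    constant for uncorrelated signals.
    """
    if not 0 < overlap <= min(len(a), len(b)):
        raise ValueError("overlap must be positive and fit inside both clips")

    start = len(a) - overlap
    mixed = []
    for i in range(overlap):
        t = (i + 0.5) / overlap            # position in the fade, in (0, 1)
        g_out = math.cos(t * math.pi / 2)  # gain for the outgoing clip
        g_in = math.sin(t * math.pi / 2)   # gain for the incoming clip
        mixed.append(a[start + i] * g_out + b[i] * g_in)

    # Untouched head of `a`, the blended region, then the untouched tail of `b`.
    return list(a[:start]) + mixed + list(b[overlap:])


if __name__ == "__main__":
    out = equal_power_crossfade([1.0] * 4, [2.0] * 4, overlap=2)
    print(len(out), out[0], out[-1])
```

A longer `overlap` gives a smoother but more audible blend; a linear fade (gains `1 - t` and `t`) is the usual alternative when the two segments are highly correlated.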
Abstract
A "match cut" is a common video editing technique where a pair of shots that have a similar composition transition fluidly from one to another. Although match cuts are often visual, certain match cuts involve the fluid transition of audio, where sounds from different sources merge into one indistinguishable transition between two shots. In this paper, we explore the ability to automatically find and create "audio match cuts" within videos and movies. We create a self-supervised audio representation for audio match cutting and develop a coarse-to-fine audio match pipeline that recommends matching shots and creates the blended audio. We further annotate a dataset for the proposed audio match cut task and compare the ability of multiple audio representations to find audio match cut candidates. Finally, we evaluate multiple methods to blend two matching audio candidates with the goal of…
