Automatic Contextual Audio Denoising
Diep Luong, Konstantinos Drossos, Mikko Heikkinen, Tuomas Virtanen

TL;DR
This paper introduces a deep learning approach for automatic contextual audio denoising that infers acoustic scene context to selectively remove irrelevant noise components, improving denoising performance across diverse environments.
Contribution
It proposes a novel method that automatically infers audio context to enhance denoising, addressing limitations of fixed target-noise definitions in current systems.
Findings
The proposed method outperforms other approaches on paired clean/noisy data across contexts.
The model effectively infers context and improves noise suppression.
Context-dependent processing enhances denoising quality.
Abstract
Audio context determines which sound components and sources are relevant and which listeners perceive as irrelevant (noise). For example, traffic noise is informative in urban surveillance but is noise during a phone call at the same location. Most current audio denoising systems apply fixed target-noise definitions, often removing useful components in one context while failing to suppress irrelevant components in another. To address this, we introduce the concept of automatic contextual audio denoising (ACAD), which defines target and noise based on the inferred context. In this work, we restrict context to be associated with an acoustic scene class. We label sound events that fall outside the event distribution of a scene class (noise) as out-of-context (OC) and events typical of that scene as in-context (IC). We implement a deep learning method that automatically infers the context of the audio signal…
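The IC/OC definition in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the scene-to-event table `SCENE_EVENTS`, the function names, and the toy waveforms are hypothetical. The scene class is also passed in by hand here, whereas the paper's method infers the context automatically from the audio.

```python
# Hypothetical mapping from scene class to typical (in-context) events.
# Illustrative only; the paper's actual scene and event classes may differ.
SCENE_EVENTS = {
    "street": {"traffic", "footsteps", "speech"},
    "phone_call": {"speech"},
}


def split_by_context(scene, events):
    """Partition event labels into in-context (IC) and out-of-context (OC)."""
    typical = SCENE_EVENTS[scene]
    ic = [e for e in events if e in typical]
    oc = [e for e in events if e not in typical]
    return ic, oc


def contextual_target(scene, sources):
    """Sum the waveforms of IC sources to form the denoising target.

    `sources` maps an event label to a list of samples; all lists are
    assumed to have the same length.
    """
    ic, _ = split_by_context(scene, list(sources))
    length = len(next(iter(sources.values())))
    target = [0.0] * length
    for label in ic:
        for i, sample in enumerate(sources[label]):
            target[i] += sample
    return target


if __name__ == "__main__":
    mix = {"speech": [1.0, 2.0], "traffic": [0.5, 0.5]}
    print(split_by_context("phone_call", list(mix)))
    print(contextual_target("street", mix))
    print(contextual_target("phone_call", mix))
```

The same mixture yields different targets depending on the scene: traffic is kept for the hypothetical "street" context and treated as noise for "phone_call", which mirrors the traffic example in the abstract.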
