Attention-Based Efficient Breath Sound Removal in Studio Audio Recordings
Nidula Elgiriyewithana, N. D. Kodikara

TL;DR
This paper introduces an attention U-Net based model for efficient, accurate removal of breath sounds from vocal recordings, significantly reducing training time and model complexity while improving precision.
Contribution
The study presents a novel, parameter-efficient deep learning model that outperforms existing methods in breath sound removal using a specialized dataset and attention mechanisms.
Findings
Model requires only 1.9M parameters and 3.2 hours of training.
Achieves higher precision than previous models.
Reduces manual effort and enhances audio quality.
Abstract
In this research, we present an innovative, parameter-efficient model that utilizes the attention U-Net architecture for the automatic detection and eradication of non-speech vocal sounds, specifically breath sounds, in vocal recordings. This task is of paramount importance in the field of sound engineering, despite being relatively under-explored. The conventional manual process for detecting and eliminating these sounds requires significant expertise and is extremely time-intensive. Existing automated detection and removal methods often fall short in terms of efficiency and precision. Our proposed model addresses these limitations by offering a streamlined process and superior accuracy, achieved through the application of advanced deep learning techniques. A unique dataset, derived from Device and Produced Speech (DAPS), was employed for this purpose. The training phase of the model…
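The attention U-Net named in the abstract is usually described as a U-Net whose skip connections pass through attention gates before concatenation. The following is a minimal sketch of one such additive gate in plain Python, assuming the common formulation (a sigmoid score computed from a ReLU of linearly projected skip and gating features). It is not the authors' implementation: the function name, the weight names (w_x, w_g, psi), and all sizes are illustrative assumptions, and the weights here are random rather than trained.

```python
import math
import random


def attention_gate(skip, gate, w_x, w_g, psi, b=0.0):
    """Additive attention gate applied independently at each position.

    skip: list of positions, each a list of C_x skip-connection features
    gate: list of positions, each a list of C_g gating features
          (assumed already resampled to the same length as skip)
    w_x:  F x C_x weights, acting like a 1x1 convolution on skip
    w_g:  F x C_g weights, acting like a 1x1 convolution on gate
    psi:  F weights reducing the intermediate features to one score
    Returns the gated skip features and the attention coefficients.
    """
    gated, alphas = [], []
    for x, g in zip(skip, gate):
        hidden = []
        for row_x, row_g in zip(w_x, w_g):
            s = sum(a * v for a, v in zip(row_x, x))
            s += sum(a * v for a, v in zip(row_g, g))
            hidden.append(max(0.0, s))  # ReLU
        score = sum(p * h for p, h in zip(psi, hidden)) + b
        alpha = 1.0 / (1.0 + math.exp(-score))  # sigmoid, in (0, 1)
        alphas.append(alpha)
        gated.append([alpha * v for v in x])  # scale the skip features
    return gated, alphas


# Hypothetical sizes: 5 positions, 4 skip channels, 3 gating channels,
# 2 intermediate channels.
random.seed(0)
T, C_X, C_G, F = 5, 4, 3, 2


def rand_matrix(rows, cols):
    return [[random.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


skip = rand_matrix(T, C_X)
gate = rand_matrix(T, C_G)
w_x = rand_matrix(F, C_X)
w_g = rand_matrix(F, C_G)
psi = [random.uniform(-1.0, 1.0) for _ in range(F)]

gated, alphas = attention_gate(skip, gate, w_x, w_g, psi)
print([round(a, 3) for a in alphas])
```

Each coefficient in alphas lies strictly between 0 and 1 and scales the skip features at its position, so positions the gate scores low contribute less to the decoder. In a trained network the weights would be learned jointly with the rest of the model.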
Taxonomy
Methods: Softmax · Attention Is All You Need · Max Pooling · Convolution · Early Stopping · Concatenated Skip Connection · U-Net
