SeamlessEdit: Background Noise Aware Zero-Shot Speech Editing with in-Context Enhancement
Kuan-Yu Chen, Jeng-Lin Li, De-Yan Lu, Jian-Jiun Ding

TL;DR
SeamlessEdit is a noise-resilient zero-shot speech editing framework that handles background noise in real-world recordings and outperforms state-of-the-art approaches in quantitative evaluations.
Contribution
The paper introduces SeamlessEdit, a new framework with frequency-band-aware noise suppression and in-content refinement for robust zero-shot speech editing in noisy environments.
Findings
Outperforms state-of-the-art approaches in quantitative evaluations.
Effectively handles overlapping voice and noise frequency bands.
Demonstrates robustness in real-world noisy speech editing scenarios.
Abstract
With the fast development of zero-shot text-to-speech technologies, it is possible to generate high-quality speech signals that are indistinguishable from the real ones. Speech editing, including speech insertion and replacement, appeals to researchers due to its potential applications. However, existing studies only considered clean speech scenarios. In real-world applications, the existence of environmental noise could significantly degrade the quality of generation. In this study, we propose a noise-resilient speech editing framework, SeamlessEdit, for noisy speech editing. SeamlessEdit adopts a frequency-band-aware noise suppression module and an in-content refinement strategy. It can well address the scenario where the frequency bands of voice and background noise are not separated. The proposed SeamlessEdit framework outperforms state-of-the-art approaches in multiple quantitative…
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing
