DGSNA: Dynamic Generative Scene-based Noise Addition method

Zihao Chen; Zhentao Lin; Bi Zeng; Linyi Huang; Jia Cai

arXiv:2411.12363·cs.SD·April 21, 2026

TL;DR

DGSNA introduces a prompt-driven, generative approach to create diverse, scene-specific noise for speech systems, improving robustness without relying on pre-existing noise libraries.

Contribution

It combines generative language models with diffusion-based audio synthesis to dynamically generate realistic scene-based noise for speech data augmentation.

Findings

01. Achieves up to 11.32% relative improvement in speech recognition robustness.

02. Effectively simulates diverse acoustic environments without pre-existing noise datasets.

03. Highly compatible with existing noise addition techniques.
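For context, the conventional noise addition that the findings refer to usually means scaling a noise clip so that the mixture reaches a target signal-to-noise ratio (SNR). The sketch below illustrates that generic step only; it is not the paper's SNAS module, and the names `mean_power` and `mix_at_snr` are hypothetical helpers introduced here for illustration. It assumes mono audio held as plain lists of float samples.

```python
import math


def mean_power(signal):
    """Average power (mean of squared samples) of a mono signal."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so the result has the requested SNR in dB.

    Hypothetical helper, not taken from the paper. Returns the mixture
    and the scaled noise that was added.
    """
    if not speech or not noise:
        raise ValueError("speech and noise must be non-empty")

    # Loop the noise clip so it covers the whole utterance, then trim.
    repeats = -(-len(speech) // len(noise))  # ceiling division
    tiled = (noise * repeats)[: len(speech)]

    speech_power = mean_power(speech)
    noise_power = mean_power(tiled)
    if noise_power == 0.0:
        raise ValueError("noise clip is silent")

    # SNR_dB = 10 * log10(P_speech / P_noise), solved for the noise gain.
    target_noise_power = speech_power / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / noise_power)

    scaled = [gain * n for n in tiled]
    mixed = [s + n for s, n in zip(speech, scaled)]
    return mixed, scaled
```

In a generative pipeline of the kind summarized here, the `noise` argument would presumably come from a synthesis model rather than a fixed library, while the SNR scaling step stays the same.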

Abstract

To ensure the reliable operation of speech systems across diverse environments, noise addition methods have emerged as the standard solution. However, existing methods offer limited coverage of real-world scenes and depend on pre-existing noise libraries and scene metadata. This paper presents prompt-based Dynamic Generative Scene-based Noise Addition (DGSNA), a novel approach driven by generative language models that integrates Dynamic Generation of Scene-based Information (DGSI) with Scene-based Noise Addition for Speech (SNAS). The DGSI module, with a BET (Background, Examples, Task) prompt framework, dynamically generates logic-compliant scene-based information, including scene dimensions, sound sources, and microphone positions, thereby addressing the challenges of scene enumeration and detailed description. Complementing this, the SNAS module employs a Time-Frequency Diffusion-based…
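One way to picture the BET (Background, Examples, Task) framework is as a three-part prompt whose reply is then checked for logical consistency. The sketch below is an illustration under stated assumptions, not the authors' prompt or code: the section labels, the JSON reply format, the field names (`scene_dimensions`, `sound_sources`, `microphone_position`), and the "microphone inside the room" check are all hypothetical choices made here. No language model is called; only prompt assembly and reply validation are shown.

```python
import json


def build_bet_prompt(background, examples, task):
    """Assemble a Background / Examples / Task prompt as plain text.

    Hypothetical layout; the paper's actual prompt wording is not shown here.
    """
    example_text = "\n".join(json.dumps(e, sort_keys=True) for e in examples)
    return (
        f"Background:\n{background}\n\n"
        f"Examples:\n{example_text}\n\n"
        f"Task:\n{task}"
    )


def parse_scene(reply):
    """Parse a JSON scene description and apply a basic consistency check."""
    scene = json.loads(reply)

    # Assumed field names covering the three items listed in the abstract.
    required = ("scene_dimensions", "sound_sources", "microphone_position")
    missing = [key for key in required if key not in scene]
    if missing:
        raise ValueError(f"missing fields: {missing}")

    # Assumed logic check: the microphone must lie inside the room,
    # with dimensions and position given as [x, y, z] in meters.
    dims = scene["scene_dimensions"]
    mic = scene["microphone_position"]
    if len(dims) != 3 or len(mic) != 3:
        raise ValueError("expected 3D dimensions and position")
    if not all(0 <= m <= d for m, d in zip(mic, dims)):
        raise ValueError("microphone lies outside the scene")
    return scene
```

A reply that fails `parse_scene` could be rejected and regenerated, which is one plausible reading of "logic-compliant" generation.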

Code & Models

Repositories

https://dgsna.github.io