DG-SED: Domain Generalization for Sound Event Detection with Heterogeneous Training Data
Yang Xiao, Han Yin, Jisheng Bai, Rohan Kumar Das

TL;DR
This paper introduces DG-SED, a domain generalization method for sound event detection that pairs a mean-teacher framework with mixstyle applied along the frequency dimension and adaptive residual normalization to improve cross-domain performance.
Contribution
The paper proposes DG-SED, combining mixstyle, adaptive residual normalization, and a mean-teacher framework to enhance sound event detection across heterogeneous domains.
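As a rough illustration of the mean-teacher component, the sketch below shows the generic recipe: the teacher's weights track an exponential moving average (EMA) of the student's, and a consistency term penalizes disagreement between their predictions. This is not the authors' code. The function names, the dictionary-of-arrays parameter layout, the decay value, and the mean-squared-error consistency term are all assumptions made for illustration.

```python
import numpy as np

def ema_update(teacher, student, decay=0.999):
    """Move each teacher weight toward the matching student weight.

    `teacher` and `student` are dicts mapping parameter names to arrays
    (a hypothetical layout); `decay` is an assumed value.
    """
    for name, weight in student.items():
        teacher[name] = decay * teacher[name] + (1.0 - decay) * weight
    return teacher

def consistency_loss(student_probs, teacher_probs):
    """Mean squared difference between student and teacher predictions."""
    return float(np.mean((student_probs - teacher_probs) ** 2))
```

In this kind of setup only the student is trained by gradient descent; the teacher is updated after each step through `ema_update` and supplies targets for clips that lack labels.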
Findings
DG-SED improves PSDS on the DESED dataset.
DG-SED enhances macro-average pAUC on the MAESTRO dataset.
The approach outperforms baseline methods in cross-domain sound event detection.
Abstract
This work explores domain generalization (DG) for sound event detection (SED), advancing adaptability to real-world scenarios. Our approach employs a mean-teacher framework with domain generalization named DG-SED to integrate heterogeneous training data while preserving the SED model performance across the datasets. Specifically, we first apply mixstyle to the frequency dimension to adapt the mel-spectrograms from different domains. Next, we use the adaptive residual normalization method to generalize features across multiple domains by applying instance normalization in the frequency dimension. Lastly, we use the sound event bounding boxes method for post-processing. We evaluate the proposed approach DG-SED on the DCASE 2024 Challenge Task 4, measuring PSDS on the DESED dataset and macro-average pAUC on the MAESTRO dataset. The results indicate that the proposed DG-SED method improves…
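The first two steps described in the abstract can be sketched in NumPy. This is a minimal sketch under stated assumptions, not the paper's implementation: it assumes inputs shaped (batch, channel, frequency, time), statistics taken per sample and per frequency bin, a Beta(alpha, alpha) mixing coefficient, and a fixed scalar `lam` standing in for whatever adaptive weighting the paper uses. All function names and default values are hypothetical.

```python
import numpy as np

def freq_stats(x, eps=1e-6):
    """Per-sample, per-frequency-bin mean and std over channel and time."""
    mu = x.mean(axis=(1, 3), keepdims=True)
    sigma = np.sqrt(x.var(axis=(1, 3), keepdims=True) + eps)
    return mu, sigma

def freq_mixstyle(x, rng, alpha=0.3, eps=1e-6):
    """Mix frequency-wise statistics between randomly paired samples."""
    mu, sigma = freq_stats(x, eps)
    x_norm = (x - mu) / sigma
    # One mixing coefficient per sample (alpha is an assumed value).
    lam = rng.beta(alpha, alpha, size=(x.shape[0], 1, 1, 1))
    perm = rng.permutation(x.shape[0])
    mu_mix = lam * mu + (1.0 - lam) * mu[perm]
    sigma_mix = lam * sigma + (1.0 - lam) * sigma[perm]
    return x_norm * sigma_mix + mu_mix

def residual_freq_norm(x, lam=0.5, eps=1e-6):
    """Frequency-wise instance normalization plus a scaled identity path.

    A fixed `lam` is an assumption; the adaptive form is not specified here.
    """
    mu, sigma = freq_stats(x, eps)
    return lam * x + (x - mu) / sigma
```

A usage example with random data in place of mel-spectrograms:

```python
rng = np.random.default_rng(0)
mels = rng.normal(size=(4, 1, 128, 156))  # hypothetical batch of mel-spectrograms
augmented = freq_mixstyle(mels, rng)
normalized = residual_freq_norm(augmented)
```

Mixing statistics rather than raw samples leaves the event content of each clip intact while exposing the model to the frequency-wise statistics of other clips in the batch.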
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Speech Recognition and Synthesis
Methods: Instance Normalization
