DHAuDS: A Dynamic and Heterogeneous Audio Benchmark for Test-Time Adaptation

Weichuang Shao, Iman Yi Liao, Tomas Henrique Bode Maul, and Tissa Chandesa

arXiv:2511.18421·cs.SD·November 25, 2025

TL;DR

DHAuDS is a comprehensive benchmark for evaluating test-time adaptation methods in audio classification under realistic, dynamic, and heterogeneous acoustic domain shifts, promoting more robust and generalizable audio models.

Contribution

This paper introduces DHAuDS, a novel benchmark with dynamic and diverse noise conditions for assessing TTA methods in audio classification, addressing limitations of previous fixed-noise evaluations.

Findings

01

DHAuDS enables fair and reproducible comparison of TTA algorithms.

02

It includes four benchmarks with dynamic corruption levels and heterogeneous noise types.

03

The framework defines 14 evaluation criteria for each benchmark (8 for UrbanSound8K-C) for comprehensive assessment.

Abstract

Audio classifiers frequently face domain shift, in which models trained on one dataset lose accuracy on data recorded in acoustically different conditions. Previous Test-Time Adaptation (TTA) research in speech and sound analysis often evaluates models under fixed or mismatched noise settings that fail to mimic real-world variability. To overcome these limitations, this paper presents DHAuDS (Dynamic and Heterogeneous Audio Domain Shift), a benchmark designed to assess TTA approaches under more realistic and diverse acoustic shifts. DHAuDS comprises four standardized benchmarks: UrbanSound8K-C, SpeechCommandsV2-C, VocalSound-C, and ReefSet-C, each constructed with dynamic corruption severity levels and heterogeneous noise types to simulate authentic audio degradation scenarios. The framework defines 14 evaluation criteria for each benchmark (8 for UrbanSound8K-C), resulting in 50…
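To make "dynamic corruption severity" and "heterogeneous noise types" concrete, the following is a minimal sketch of one way such corruption could be simulated: each clip gets a randomly chosen noise type mixed in at a randomly drawn signal-to-noise ratio (SNR). This is an illustration only, not the procedure used to build DHAuDS. The function names, the uniform SNR sampling, and the noise labels are all assumptions introduced here.

```python
import math
import random


def power(signal):
    """Mean power of a signal given as a list of float samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(clean, noise, snr_db):
    """Add noise to a clean clip so the result has the requested SNR in dB.

    Hypothetical helper: the noise is tiled or trimmed to the clip length,
    then scaled so that 10*log10(P_clean / P_noise_scaled) == snr_db.
    """
    reps = -(-len(clean) // len(noise))  # ceiling division
    noise = (noise * reps)[:len(clean)]
    scale = math.sqrt(power(clean) / (power(noise) * 10 ** (snr_db / 10)))
    return [c + scale * n for c, n in zip(clean, noise)]


def corrupt_stream(clips, noise_bank, snr_range_db, seed=0):
    """Corrupt each clip with a random noise type at a random SNR.

    Assumption: severity is drawn uniformly from snr_range_db per clip;
    the real benchmark may schedule severity differently.
    """
    rng = random.Random(seed)
    corrupted = []
    for clip in clips:
        noise_name = rng.choice(sorted(noise_bank))  # heterogeneous noise type
        snr_db = rng.uniform(*snr_range_db)          # dynamic severity
        mixed = mix_at_snr(clip, noise_bank[noise_name], snr_db)
        corrupted.append((noise_name, snr_db, mixed))
    return corrupted
```

A fixed-noise evaluation corresponds to calling `mix_at_snr` with one noise recording and one SNR for every clip. The sketch differs only in that both vary from clip to clip, which is the kind of variability the abstract says earlier TTA evaluations lacked.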

Taxonomy

Topics: Speech and Audio Processing · Music and Audio Processing · Speech Recognition and Synthesis