Should Audio Front-ends be Adaptive? Comparing Learnable and Adaptive Front-ends

Qiquan Zhang; Buddhi Wickramasinghe; Eliathamby Ambikairajah; Vidhyasaharan Sethu; and Haizhou Li

arXiv:2502.03260 · eess.AS · February 6, 2025

Open Access

TL;DR

This paper compares adaptive and learnable audio front-ends, demonstrating that the adaptive Ada-FE outperforms learnable alternatives in accuracy and robustness across multiple audio benchmarks.

Contribution

It introduces and systematically evaluates the adaptive Ada-FE front-end, showing its advantages over existing learnable front-ends in diverse audio tasks.

Findings

01. Ada-FE outperforms learnable front-ends in accuracy.

02. Ada-FE demonstrates greater robustness over training epochs.

03. Comprehensive benchmarks validate Ada-FE's effectiveness.

Abstract

Hand-crafted features, such as Mel-filterbanks, have traditionally been the choice for many audio processing applications. Recently, there has been a growing interest in learnable front-ends that extract representations directly from the raw audio waveform. However, both hand-crafted filterbanks and current learnable front-ends lead to fixed computation graphs at inference time, failing to dynamically adapt to varying acoustic environments, a key feature of human auditory systems. To this end, we explore the question of whether audio front-ends should be adaptive by comparing the Ada-FE front-end (a recently developed adaptive front-end that employs a neural adaptive feedback controller to dynamically adjust the Q-factors of its spectral decomposition filters) to established learnable front-ends. Specifically, we systematically investigate learnable front-ends and…
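To make the idea of feedback-controlled Q-factors concrete, the following is a minimal sketch in Python, not the Ada-FE implementation. It assumes a small bank of resonance-shaped filters applied to each frame's power spectrum, and it replaces the paper's neural controller with a fixed sigmoid rule. The function name `adaptive_q_frontend`, all parameter values, the centre frequencies, and the direction of adaptation (a louder band gets a lower Q, so a broader filter, on the next frame) are illustrative assumptions that the abstract does not specify.

```python
import numpy as np

def adaptive_q_frontend(x, sr=16000, n_filters=8, frame_len=400, hop=160,
                        q_min=2.0, q_max=12.0, ref_db=-20.0):
    """Toy adaptive front-end: band levels whose filter Q is set by feedback.

    Everything here is a hypothetical stand-in: a fixed sigmoid rule plays
    the role of the neural feedback controller described in the abstract.
    """
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)        # FFT bin frequencies in Hz
    centers = np.geomspace(100.0, 0.45 * sr, n_filters)   # assumed centre frequencies
    window = np.hanning(frame_len)
    q = np.full(n_filters, q_max)                         # start with sharp filters
    n_frames = 1 + (len(x) - frame_len) // hop
    feats = np.empty((n_frames, n_filters))
    q_track = np.empty((n_frames, n_filters))

    for t in range(n_frames):
        frame = x[t * hop : t * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2

        # Q = centre frequency / bandwidth, so a lower Q widens the filter.
        half_bw = 0.5 * centers / q
        detune = (freqs[None, :] - centers[:, None]) / half_bw[:, None]
        response = 1.0 / (1.0 + detune ** 2)              # resonance-shaped power response

        level_db = 10.0 * np.log10(response @ power / frame_len + 1e-10)
        feats[t] = level_db
        q_track[t] = q                                    # Q actually used for this frame

        # Feedback step (assumed rule): high level -> gate near 0 -> Q near q_min.
        gate = 1.0 / (1.0 + np.exp((level_db - ref_db) / 6.0))
        q = q_min + (q_max - q_min) * gate

    return feats, q_track

# Usage: one second of a quiet 1 kHz tone followed by one second of a loud one.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 1000.0 * t)
feats, q_track = adaptive_q_frontend(np.concatenate([0.001 * tone, tone]), sr=sr)
```

In this sketch the computation applied to a frame depends on the levels observed in the previous frame, which is the sense in which the front-end is adaptive at inference time. A Mel-filterbank, or a learnable front-end whose filter parameters are frozen after training, would apply the same `response` to every frame.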

Taxonomy

Topics: Music Technology and Sound Studies · Hearing Loss and Rehabilitation