ISAC: An Invertible and Stable Auditory Filter Bank with Customizable Kernels for ML Integration
Daniel Haider, Felix Perfler, Peter Balazs, Clara Hollomey, Nicki Holighaus

TL;DR
ISAC is a novel invertible and stable auditory filter bank designed for seamless integration into machine learning, featuring customizable kernels, perceptually-motivated frequency scaling, and perfect reconstruction capabilities.
Contribution
The paper presents ISAC, a new filter bank with invertibility, stability, and customizability, enabling effective ML integration and perceptually-motivated audio analysis.
Findings
Supports perfect reconstruction between analysis and synthesis
Allows user-defined temporal support and learnable kernels
Aligns with auditory frequency scales for perceptual relevance
Abstract
This paper introduces ISAC, an invertible and stable, perceptually-motivated filter bank that is specifically designed to be integrated into machine learning paradigms. More precisely, the center frequencies and bandwidths of the filters are chosen to follow a non-linear, auditory frequency scale, the filter kernels have user-defined maximum temporal support and may serve as learnable convolutional kernels, and there exists a corresponding filter bank such that both form a perfect reconstruction pair. ISAC provides a powerful and user-friendly audio front-end suitable for any application, including analysis-synthesis schemes.
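The three properties named in the abstract (auditory frequency spacing, kernels with bounded temporal support, and a perfect reconstruction pair) can be illustrated with a small NumPy sketch. This is not the authors' implementation. The mel scale, the Hann-windowed cosine kernels, the stride of 1, the circular convolution, and every function and parameter name below are assumptions chosen for illustration only. The companion filter bank is obtained here by dividing each frequency response by the summed squared magnitudes of all responses, which is one standard way to get a dual when no decimation is used.

```python
import numpy as np

def hz_to_mel(f):
    # Common mel-scale formula; an assumed stand-in for the auditory scale.
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def toy_auditory_kernels(num_channels=40, max_len=128, fs=16000):
    """Real-valued FIR kernels with mel-spaced center frequencies.

    Hypothetical helper: each kernel is a Hann-windowed cosine whose length
    shrinks as the channel spacing grows, capped at max_len samples.
    """
    fc = mel_to_hz(np.linspace(0.0, hz_to_mel(fs / 2), num_channels))
    spacing = np.gradient(fc)  # local channel spacing in Hz
    lengths = np.clip(np.round(2 * fs / spacing), 8, max_len).astype(int)
    kernels = np.zeros((num_channels, max_len))
    for k, (f, length) in enumerate(zip(fc, lengths)):
        n = np.arange(length)
        window = np.hanning(length)
        kernels[k, :length] = window / window.sum() * np.cos(2 * np.pi * f * n / fs)
    return kernels, fc

def analysis(x, G):
    # Circular convolution of x with every kernel (stride 1), done via the FFT.
    return np.fft.ifft(np.fft.fft(x)[None, :] * G, axis=1).real

def synthesis(coeffs, G_dual):
    # Apply the dual filter bank and sum over channels.
    spectrum = np.sum(np.fft.fft(coeffs, axis=1) * np.conj(G_dual), axis=0)
    return np.fft.ifft(spectrum).real

N = 1024
kernels, fc = toy_auditory_kernels()
G = np.fft.fft(kernels, n=N, axis=1)      # frequency responses, shape (K, N)
S = np.sum(np.abs(G) ** 2, axis=0)        # summed squared magnitudes per bin
A, B = S.min(), S.max()                   # frame bounds for the stride-1 case
G_dual = G / S                            # canonical dual filter bank

rng = np.random.default_rng(0)
x = rng.standard_normal(N)
x_hat = synthesis(analysis(x, G), G_dual)
print("frame bound ratio B/A:", B / A)
print("max reconstruction error:", np.max(np.abs(x - x_hat)))
```

In this toy setting the ratio B/A plays the role of a stability measure: the closer it is to 1, the better conditioned the analysis-synthesis pair. The `kernels` array has the shape of a bank of 1-D convolution weights, so it could in principle be used to initialize a learnable convolutional layer, though how ISAC itself handles learning and decimation is not covered by this sketch.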
Taxonomy
Topics: Digital Filter Design and Implementation · Speech and Audio Processing · Music and Audio Processing
