Local Equivariance Error-Based Metrics for Evaluating Sampling-Frequency-Independent Property of Neural Network

Kanami Imamura; Tomohiko Nakamura; Norihiro Takamune; Kohei Yatabe; Hiroshi Saruwatari

arXiv:2506.03550·cs.SD·June 5, 2025

TL;DR

This paper introduces three new metrics based on local equivariance error to evaluate how well neural networks for audio processing maintain their performance across different sampling frequencies, addressing a gap in robustness assessment.

Contribution

The paper proposes novel metrics, based on local equivariance error, that quantify the sampling-frequency-independent (SFI) property of neural networks for audio source separation.

Findings

1. The proposed metrics correlate strongly with performance degradation at untrained sampling frequencies.

2. The proposed metrics effectively evaluate the robustness of network components that predict time-frequency masks.

3. Local equivariance error is extended to measure the robustness of audio neural networks to signal resampling.

Abstract

Audio signal processing methods based on deep neural networks (DNNs) are typically trained only at a single sampling frequency (SF) and therefore require signal resampling to handle untrained SFs. However, recent studies have shown that signal resampling can degrade performance at untrained SFs. This problem has been overlooked because most studies evaluate only the performance at trained SFs. In this paper, to assess the robustness of DNNs to SF changes, which we refer to as the SF-independent (SFI) property, we propose three metrics to quantify the SFI property on the basis of local equivariance error (LEE). LEE measures the robustness of DNNs to input transformations. By using signal resampling as the input transformation, we extend LEE to measure the robustness of audio source separation methods to signal resampling. The proposed metrics are constructed to quantify the SFI property in…
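
The idea of treating signal resampling as the input transformation can be illustrated with a small sketch. The abstract as shown here does not give the definitions of the three proposed metrics, so the code below is not the authors' LEE. It is a simple finite-difference equivariance check, written under the assumption that a model is SF-independent when "resample, then process" agrees with "process, then resample". All names (`resample_linear`, `equivariance_error`, `gain_model`, `fixed_tap_model`) are hypothetical, and the linear-interpolation resampler is a crude stand-in for a proper band-limited resampler.

```python
import numpy as np


def resample_linear(x, sf_in, sf_out):
    """Crude linear-interpolation resampler (assumption: stands in for a
    band-limited resampler, which a real evaluation would use)."""
    n_out = int(round(len(x) * sf_out / sf_in))
    t_in = np.arange(len(x)) / sf_in      # sample times of the input, in seconds
    t_out = np.arange(n_out) / sf_out     # sample times of the output, in seconds
    return np.interp(t_out, t_in, x)


def equivariance_error(model, x, sf_in, sf_out):
    """Relative gap between 'resample then process' and 'process then resample'.

    `model(signal, sf)` is a hypothetical callable that returns a signal of
    the same length as its input.
    """
    y_resample_first = model(resample_linear(x, sf_in, sf_out), sf_out)
    y_process_first = resample_linear(model(x, sf_in), sf_in, sf_out)
    gap = np.linalg.norm(y_resample_first - y_process_first)
    return gap / (np.linalg.norm(y_process_first) + 1e-12)


def gain_model(x, sf):
    """Toy model that commutes with resampling: a constant gain."""
    return 0.5 * x


def fixed_tap_model(x, sf):
    """Toy model that ignores the SF: an 8-sample moving average, whose
    duration in seconds changes with the sampling frequency."""
    return np.convolve(x, np.ones(8) / 8, mode="same")


# One second of a 440 Hz tone at 16 kHz, evaluated at an "untrained" 8 kHz.
t = np.arange(16000) / 16000
x = np.sin(2 * np.pi * 440 * t)
print("gain model error:     ", equivariance_error(gain_model, x, 16000, 8000))
print("fixed-tap model error:", equivariance_error(fixed_tap_model, x, 16000, 8000))
```

In this toy setting the constant gain commutes with resampling, while the fixed-length filter does not, because its window covers a different time span at each sampling frequency. That is the kind of mismatch an equivariance-based metric is meant to expose.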

Taxonomy

Topics: Speech and Audio Processing · Hearing Loss and Rehabilitation · Blind Source Separation Techniques