LipsAM: Lipschitz-Continuous Amplitude Modifier for Audio Signal Processing and its Application to Plug-and-Play Dereverberation
Kazuki Matsumoto, Ren Uchida, Kohei Yatabe

TL;DR
This paper introduces LipsAM, a family of Lipschitz-continuous variants of the amplitude modifier (AM) used in audio DNNs, and shows that they improve the stability of a Plug-and-Play speech dereverberation algorithm.
Contribution
It proposes Lipschitz-continuous variants of the amplitude modifier for audio, proves a sufficient condition for an AM to be Lipschitz continuous, and presents two example architectures evaluated in Plug-and-Play speech dereverberation.
Findings
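The two proposed LipsAM architectures satisfy the sufficient condition for Lipschitz continuity proved in the paper.
Using LipsAM inside a Plug-and-Play algorithm improves the stability of speech dereverberation.
Numerical experiments demonstrate this improved stability.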
Abstract
The robustness of deep neural networks (DNNs) can be certified through their Lipschitz continuity, which has made the construction of Lipschitz-continuous DNNs an active research field. However, DNNs for audio processing have not been a major focus due to their poor compatibility with existing results. In this paper, we consider the amplitude modifier (AM), a popular architecture for handling audio signals, and propose its Lipschitz-continuous variants, which we refer to as LipsAM. We prove a sufficient condition for an AM to be Lipschitz continuous and propose two architectures as examples of LipsAM. The proposed architectures were applied to a Plug-and-Play algorithm for speech dereverberation, and their improved stability is demonstrated through numerical experiments.
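As a rough illustration of why Lipschitz continuity matters for an amplitude modifier, the sketch below assumes that an AM multiplies each complex time-frequency coefficient by a real-valued mask computed from the amplitude alone, leaving the phase unchanged. This is not the authors' architecture: the names `amplitude_modifier`, `soft_mask`, `hard_mask`, and `empirical_lipschitz` are hypothetical, and the two hand-written masks stand in for what would be a DNN in practice. The soft mask gives complex soft-thresholding, which is known to be 1-Lipschitz, while the binary mask is discontinuous and therefore has no finite Lipschitz constant.

```python
import numpy as np

def amplitude_modifier(x, mask_fn):
    """Scale each complex coefficient by a real mask computed from |x| only.

    The phase of x is left untouched; only the amplitude is modified.
    """
    return mask_fn(np.abs(x)) * x

def soft_mask(a, lam=1.0):
    # Hypothetical mask: shrinks each amplitude by lam and clips at zero.
    # The resulting map is complex soft-thresholding, which is 1-Lipschitz.
    return np.maximum(1.0 - lam / np.maximum(a, 1e-12), 0.0)

def hard_mask(a, lam=1.0):
    # Hypothetical binary mask: jumps from 0 to 1 at a = lam, so the
    # resulting map is discontinuous and has no finite Lipschitz constant.
    return (a > lam).astype(float)

def empirical_lipschitz(f, rng, size=64, trials=2000, eps=1e-3):
    """Largest observed ||f(x) - f(y)|| / ||x - y|| over random nearby pairs.

    This is only a lower bound on the true Lipschitz constant.
    """
    worst = 0.0
    for _ in range(trials):
        x = rng.standard_normal(size) + 1j * rng.standard_normal(size)
        d = rng.standard_normal(size) + 1j * rng.standard_normal(size)
        y = x + eps * d  # small perturbation of x
        worst = max(worst, np.linalg.norm(f(x) - f(y)) / np.linalg.norm(x - y))
    return worst

rng = np.random.default_rng(0)
soft = empirical_lipschitz(lambda x: amplitude_modifier(x, soft_mask), rng)
hard = empirical_lipschitz(lambda x: amplitude_modifier(x, hard_mask), rng)
print(f"soft mask: {soft:.3f}, hard mask: {hard:.3f}")
```

On such random pairs the soft mask's ratio should stay at or below 1, whereas the hard mask's ratio tends to grow as `eps` shrinks, because a tiny input change can flip a coefficient on or off. An empirical ratio like this only bounds the Lipschitz constant from below; certifying it requires an analytical condition such as the one the paper proves.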
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Hearing Loss and Rehabilitation
