Differentiable Time-Frequency Scattering on GPU
John Muradeli, Cyrus Vahidi, Changhong Wang, Han Han, Vincent Lostanlen, Mathieu Lagrange, George Fazekas

TL;DR
This paper introduces a GPU-compatible, differentiable implementation of joint time-frequency scattering (JTFS) in Python, enabling efficient analysis and application in audio perception and synthesis tasks.
Contribution
It provides a flexible, portable JTFS implementation supporting multiple backends, addressing previous limitations in differentiability, speed, and flexibility.
Findings
Enables unsupervised manifold learning of spectrotemporal modulations
Improves supervised classification of musical instruments
Facilitates texture resynthesis of bioacoustic sounds
Abstract
Joint time-frequency scattering (JTFS) is a convolutional operator in the time-frequency domain which extracts spectrotemporal modulations at various rates and scales. It offers an idealized model of spectrotemporal receptive fields (STRF) in the primary auditory cortex, and thus may serve as a biologically plausible surrogate for human perceptual judgments at the scale of isolated audio events. Yet, prior implementations of JTFS and STRF have remained outside of the standard toolkit of perceptual similarity measures and evaluation methods for audio generation. We trace this issue down to three limitations: differentiability, speed, and flexibility. In this paper, we present an implementation of time-frequency scattering in Python. Unlike prior implementations, ours accommodates NumPy, PyTorch, and TensorFlow as backends and is thus portable on both CPU and GPU. We demonstrate the…
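To make the phrase "spectrotemporal modulations at various rates and scales" concrete, here is a minimal NumPy sketch of the underlying idea. It is not the paper's implementation or API. It takes a time-frequency representation as given, filters it with 2-D Gabor-like band-pass filters indexed by a temporal rate and a frequential scale, applies a complex modulus, and averages. The function names (`gabor_2d_hat`, `joint_modulation_energy`), the Gaussian filter shape, the `bandwidth` parameter, and the global averaging are all assumptions made for illustration. An actual JTFS uses wavelet filterbanks and local low-pass filtering.

```python
import numpy as np

def gabor_2d_hat(n_freq, n_time, rate, scale, bandwidth=0.02):
    """Hypothetical 2-D band-pass filter, defined directly in the Fourier domain.

    rate  : center modulation frequency along time (cycles per frame)
    scale : center modulation frequency along log-frequency (cycles per bin);
            its sign selects upward versus downward ripples
    """
    w_time = np.fft.fftfreq(n_time)[None, :]
    w_freq = np.fft.fftfreq(n_freq)[:, None]
    dist2 = (w_time - rate) ** 2 + (w_freq - scale) ** 2
    return np.exp(-dist2 / (2.0 * bandwidth ** 2))

def joint_modulation_energy(scalogram, rates, scales, bandwidth=0.02):
    """Average modulus of the 2-D filtered scalogram for each (rate, scale) pair."""
    n_freq, n_time = scalogram.shape
    s_hat = np.fft.fft2(scalogram)
    out = np.empty((len(rates), len(scales)))
    for i, rate in enumerate(rates):
        for j, scale in enumerate(scales):
            psi_hat = gabor_2d_hat(n_freq, n_time, rate, scale, bandwidth)
            # Convolution in the time-frequency domain is a product in Fourier.
            filtered = np.fft.ifft2(s_hat * psi_hat)
            # Complex modulus, then global averaging (a stand-in for low-pass).
            out[i, j] = np.abs(filtered).mean()
    return out

# Synthetic "scalogram": a single spectrotemporal ripple on a constant floor.
n_freq, n_time = 64, 128
f = np.arange(n_freq)[:, None]
t = np.arange(n_time)[None, :]
ripple = 1.0 + np.cos(2 * np.pi * (0.125 * t + 0.0625 * f))

rates = [0.0625, 0.125, 0.25]
scales = [-0.0625, 0.0625]
energy = joint_modulation_energy(ripple, rates, scales)
best_rate_idx, best_scale_idx = np.unravel_index(energy.argmax(), energy.shape)
```

In this toy setting, the largest response falls on the filter whose rate and signed scale match the ripple, which is the sense in which such an operator separates modulations by rate, scale, and direction.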
Taxonomy
Topics: Hearing Loss and Rehabilitation · Music and Audio Processing · Speech and Audio Processing
