Joint Time-Frequency Scattering for Audio Classification

Joakim Andén; Vincent Lostanlen; Stéphane Mallat

arXiv:1512.02125·cs.SD·August 6, 2018

TL;DR

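The paper introduces the joint time-frequency scattering transform, a time-shift-invariant audio descriptor that captures time-frequency structure such as time-varying filters and frequency-modulated excitations, and reports state-of-the-art results for signal reconstruction and phone segment classification on the TIMIT dataset.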

Contribution

It introduces the joint time-frequency scattering transform, obtained by applying a two-dimensional wavelet transform in time and log-frequency to a wavelet scalogram, as a descriptor for audio classification.

Findings

01

Successfully characterizes complex time-frequency phenomena

02

Achieves state-of-the-art results on the TIMIT dataset

03

Effective for signal reconstruction and phone segment classification

Abstract

We introduce the joint time-frequency scattering transform, a time shift invariant descriptor of time-frequency structure for audio classification. It is obtained by applying a two-dimensional wavelet transform in time and log-frequency to a time-frequency wavelet scalogram. We show that this descriptor successfully characterizes complex time-frequency phenomena such as time-varying filters and frequency modulated excitations. State-of-the-art results are achieved for signal reconstruction and phone segment classification on the TIMIT dataset.
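
The construction described in the abstract can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the Gabor filter shape, the helper names (`gabor_hat`, `scalogram`, `joint_scattering`), and all parameter values (`n_octaves`, `q`, `time_rates`, `freq_scales`, `T`) are assumptions chosen for readability. It computes a wavelet scalogram, filters it with two-dimensional wavelets along time and log-frequency, takes the modulus, and averages over time to obtain approximate time-shift invariance.

```python
import numpy as np

def gabor_hat(n, xi, sigma):
    """Frequency response of a Gabor band-pass filter on n FFT bins.
    xi is the center frequency in cycles/sample (may be negative)."""
    omega = np.fft.fftfreq(n)  # cycles per sample, in [-0.5, 0.5)
    return np.exp(-0.5 * ((omega - xi) / sigma) ** 2)

def scalogram(x, n_octaves=6, q=4):
    """Modulus of a wavelet transform with q filters per octave.
    Rows are indexed by log-frequency, columns by time."""
    n = len(x)
    x_hat = np.fft.fft(x)
    # Geometrically spaced center frequencies give a log-frequency axis.
    xis = 0.4 * 2.0 ** (-np.arange(n_octaves * q) / q)
    rows = [np.abs(np.fft.ifft(x_hat * gabor_hat(n, xi, xi / q))) for xi in xis]
    return np.stack(rows)

def joint_scattering(x, n_octaves=6, q=4,
                     time_rates=(0.01, 0.02, 0.04),
                     freq_scales=(0.1, 0.2), T=256):
    """Second-order joint coefficients only (assumed parameter grid)."""
    X = scalogram(x, n_octaves, q)
    n_f, n_t = X.shape
    X_hat = np.fft.fft2(X)
    # Gaussian low-pass along time, with support on the order of T samples.
    phi_hat = np.exp(-0.5 * (np.fft.fftfreq(n_t) * T) ** 2)
    coeffs = []
    for alpha in time_rates:                 # modulation rate along time
        psi_t = gabor_hat(n_t, alpha, alpha / 4)
        for beta in freq_scales:             # scale along log-frequency
            for sign in (+1, -1):            # upward vs. downward orientation
                psi_f = gabor_hat(n_f, sign * beta, beta / 4)
                psi_2d = psi_f[:, None] * psi_t[None, :]
                Y = np.abs(np.fft.ifft2(X_hat * psi_2d))
                # Averaging over time gives approximate shift invariance.
                S = np.real(np.fft.ifft(np.fft.fft(Y, axis=1) * phi_hat, axis=1))
                coeffs.append(S[:, :: T // 2])  # subsample the smooth output
    return np.stack(coeffs)

# Usage on a synthetic chirp, a simple frequency-modulated signal.
t = np.arange(2048)
x = np.cos(2 * np.pi * (0.05 * t + 0.5 * 1e-4 * t ** 2))
S = joint_scattering(x)
print(S.shape)  # (12, 24, 16): 2D wavelets x log-frequency bins x time frames
```

The two signs of the log-frequency wavelet distinguish upward from downward frequency modulation, which a purely separable time-only second-order transform would not. All filtering here is circular (FFT-based), so boundary effects are ignored.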

Code & Models

Repositories

JohnVinyard/matching-pursuit
pytorch
