LEAF: A Learnable Frontend for Audio Classification

Neil Zeghidour; Olivier Teboul; Félix de Chaumont Quitry; Marco Tagliasacchi

arXiv:2101.08596 · cs.SD · January 22, 2021 · 30 cites

TL;DR

This paper introduces a fully learnable audio frontend that outperforms mel-filterbanks and previous learnable frontends across diverse audio classification tasks while using fewer parameters.

Contribution

The authors propose a lightweight, fully learnable frontend architecture that serves as a drop-in replacement for mel-filterbanks and learns every feature extraction operation: filtering, pooling, compression and normalization.

Findings

01 Outperforms mel-filterbanks on multiple audio tasks

02 Achieves state-of-the-art results on AudioSet

03 Uses significantly fewer parameters than previous methods

Abstract

Mel-filterbanks are fixed, engineered audio features which emulate human perception and have been used through the history of audio understanding up to today. However, their undeniable qualities are counterbalanced by the fundamental limitations of handmade representations. In this work we show that we can train a single learnable frontend that outperforms mel-filterbanks on a wide range of audio signals, including speech, music, audio events and animal sounds, providing a general-purpose learned frontend for audio classification. To do so, we introduce a new principled, lightweight, fully learnable architecture that can be used as a drop-in replacement of mel-filterbanks. Our system learns all operations of audio features extraction, from filtering to pooling, compression and normalization, and can be integrated into any neural network at a negligible parameter cost. We perform…
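The four stages named in the abstract (filtering, pooling, compression, normalization) can be sketched as a small pipeline. The excerpt above does not say how each stage is implemented, so every concrete choice here is an assumption made for illustration: Gabor band-pass kernels for filtering, Gaussian low-pass pooling, and a PCEN-style step for compression and normalization. All function names and parameter values are hypothetical, and the values that a learnable frontend would train (center frequencies, bandwidths, pooling widths, PCEN coefficients) are fixed constants here. Plain Python is used for readability only.

```python
import cmath
import math


def gabor_kernel(center_freq, bandwidth, size):
    """Complex band-pass kernel: Gaussian envelope times a complex sinusoid.

    center_freq is in radians per sample; bandwidth is the envelope
    standard deviation in samples. Both would be trainable (assumption).
    """
    half = size // 2
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * bandwidth)
    return [norm * math.exp(-0.5 * (t / bandwidth) ** 2) * cmath.exp(1j * center_freq * t)
            for t in range(-half, half + 1)]


def gaussian_kernel(width, size):
    """Unit-sum Gaussian low-pass kernel; width would be trainable (assumption)."""
    half = size // 2
    kernel = [math.exp(-0.5 * (t / width) ** 2) for t in range(-half, half + 1)]
    total = sum(kernel)
    return [v / total for v in kernel]


def conv_at(signal, kernel, n):
    """Value at index n of the zero-padded, centered convolution of signal with kernel."""
    half = len(kernel) // 2
    acc = 0.0
    for j, k in enumerate(kernel):
        i = n - (j - half)
        if 0 <= i < len(signal):
            acc += signal[i] * k
    return acc


def pooled_energy(wave, center_freqs, bandwidths, pool_widths,
                  filter_size=101, pool_size=101, stride=80):
    """Filtering and pooling: band-pass, squared modulus, strided low-pass."""
    channels = []
    for freq, bw, pw in zip(center_freqs, bandwidths, pool_widths):
        gabor = gabor_kernel(freq, bw, filter_size)
        energy = [abs(conv_at(wave, gabor, n)) ** 2 for n in range(len(wave))]
        lowpass = gaussian_kernel(pw, pool_size)
        channels.append([conv_at(energy, lowpass, n)
                         for n in range(0, len(wave), stride)])
    return channels


def pcen(channel, alpha=0.96, delta=2.0, root=2.0, smooth=0.04, eps=1e-6):
    """PCEN-style step: divide by a moving average, then root-compress."""
    out, ema = [], channel[0]
    for e in channel:
        ema = (1.0 - smooth) * ema + smooth * e
        out.append((e / (eps + ema) ** alpha + delta) ** (1.0 / root)
                   - delta ** (1.0 / root))
    return out


def frontend(wave, center_freqs, bandwidths, pool_widths):
    """Waveform in, one list of compressed, normalized frames per channel out."""
    return [pcen(ch) for ch in pooled_energy(wave, center_freqs, bandwidths, pool_widths)]


# Usage: a 1 kHz tone at 8 kHz sampling, three channels at 500, 1000 and 2000 Hz.
sample_rate = 8000
wave = [math.sin(2.0 * math.pi * 1000.0 * n / sample_rate) for n in range(2000)]
center_freqs = [2.0 * math.pi * f / sample_rate for f in (500.0, 1000.0, 2000.0)]
features = frontend(wave, center_freqs, [15.0] * 3, [30.0] * 3)
```

In this sketch the whole frontend is described by a handful of numbers per channel, which is one way to read the abstract's "negligible parameter cost". In a trainable version those numbers would be optimized jointly with the classifier in a differentiable framework rather than set by hand.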

Code & Models

Videos

LEAF: A Learnable Frontend for Audio Classification · SlidesLive

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Speech Recognition and Synthesis