Kapre: On-GPU Audio Preprocessing Layers for a Quick Implementation of Deep Neural Network Models with Keras

Keunwoo Choi; Deokjin Joo; Juho Kim

arXiv:1706.05781·cs.SD·June 20, 2017·58 cites

TL;DR

Kapre provides GPU-accelerated audio preprocessing layers for Keras. By moving signal processing into the model itself, it simplifies deep neural network workflows at a reasonable computational cost.

Contribution

It introduces a novel set of Keras layers for on-GPU audio preprocessing, streamlining deep learning workflows in music research.

Findings

01

Real-time on-GPU preprocessing is feasible with reasonable computational cost.

02

Kapre simplifies integrating audio preprocessing into the model pipeline.

03

Benchmark results demonstrate efficiency of GPU-based audio processing.

Abstract

We introduce Kapre, Keras layers for audio and music signal preprocessing. Music research using deep neural networks requires a heavy and tedious preprocessing stage, for which audio processing parameters are often ignored in parameter optimisation. To solve this problem, Kapre implements time-frequency conversions, normalisation, and data augmentation as Keras layers. We report simple benchmark results, showing real-time on-GPU preprocessing adds a reasonable amount of computation.
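
As a rough illustration of the kind of computation the abstract describes (a time-frequency conversion followed by normalisation), here is a minimal sketch in plain Python. It is not Kapre's API and it does not run on a GPU: the function names `hann`, `stft_magnitude`, `to_decibel`, and `normalise` are hypothetical, and the naive DFT is chosen only for readability. The point is that `n_fft` and `hop` are ordinary arguments, which is what makes them tunable alongside other model hyperparameters once the conversion lives inside the model.

```python
import cmath
import math


def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def stft_magnitude(signal, n_fft=64, hop=32):
    """Magnitude spectrogram: one row per frame, n_fft // 2 + 1 bins per row."""
    window = hann(n_fft)
    # Precompute the DFT twiddle factors once; looked up as (k * n) % n_fft.
    twiddle = [cmath.exp(-2j * math.pi * m / n_fft) for m in range(n_fft)]
    frames = []
    for start in range(0, len(signal) - n_fft + 1, hop):
        chunk = [signal[start + i] * window[i] for i in range(n_fft)]
        frames.append([
            abs(sum(chunk[n] * twiddle[(k * n) % n_fft] for n in range(n_fft)))
            for k in range(n_fft // 2 + 1)
        ])
    return frames


def to_decibel(spec, floor=1e-10):
    """Convert magnitudes to decibels, clamping at `floor` to avoid log(0)."""
    return [[20.0 * math.log10(max(v, floor)) for v in row] for row in spec]


def normalise(spec):
    """Shift and scale the whole spectrogram to zero mean and unit variance."""
    flat = [v for row in spec for v in row]
    mean = sum(flat) / len(flat)
    std = math.sqrt(sum((v - mean) ** 2 for v in flat) / len(flat)) or 1.0
    return [[(v - mean) / std for v in row] for row in spec]


# Test signal (an assumption for this sketch): a sine that completes
# exactly 8 cycles per 64-sample frame, so its energy falls in bin 8.
signal = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
spec = stft_magnitude(signal, n_fft=64, hop=32)
features = normalise(to_decibel(spec))
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
print(len(spec), len(spec[0]), peak_bin)
```

Changing `n_fft` or `hop` here changes the shape and resolution of `features`. In an offline pipeline that would mean recomputing and storing the whole dataset, which is the tedious step the abstract refers to.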

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Speech Recognition and Synthesis