Kapre: On-GPU Audio Preprocessing Layers for a Quick Implementation of Deep Neural Network Models with Keras
Keunwoo Choi, Deokjin Joo, Juho Kim

TL;DR
Kapre provides GPU-accelerated audio preprocessing layers for Keras. By building signal processing directly into the model architecture, it simplifies deep neural network workflows at a reasonable computational cost.
Contribution
The paper introduces a set of Keras layers for on-GPU audio preprocessing, streamlining deep learning workflows in music research.
Findings
Real-time on-GPU preprocessing is feasible with reasonable computational cost.
Kapre simplifies integrating audio preprocessing into the model pipeline.
Simple benchmark results support the efficiency of GPU-based audio processing.
Abstract
We introduce Kapre, Keras layers for audio and music signal preprocessing. Music research using deep neural networks requires a heavy and tedious preprocessing stage, for which audio processing parameters are often ignored in parameter optimisation. To solve this problem, Kapre implements time-frequency conversions, normalisation, and data augmentation as Keras layers. We report simple benchmark results, showing real-time on-GPU preprocessing adds a reasonable amount of computation.
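To make the idea of preprocessing as a Keras layer concrete, the following is a minimal sketch of a layer that turns raw waveforms into log-magnitude spectrograms inside the model. It is not Kapre's actual API: the class name `LogSpectrogram`, the helper `build_model`, the parameters `n_fft` and `hop_length`, and their default values are all assumptions made for illustration, and the sketch relies on TensorFlow's `tf.signal.stft` rather than on Kapre's own implementation.

```python
import tensorflow as tf


class LogSpectrogram(tf.keras.layers.Layer):
    """Hypothetical layer (not Kapre's API): waveform -> log-magnitude spectrogram."""

    def __init__(self, n_fft=512, hop_length=256, **kwargs):
        super().__init__(**kwargs)
        self.n_fft = n_fft            # assumed FFT size / frame length
        self.hop_length = hop_length  # assumed hop between frames

    def call(self, waveforms):
        # waveforms: float tensor of shape (batch, samples)
        stft = tf.signal.stft(
            waveforms,
            frame_length=self.n_fft,
            frame_step=self.hop_length,
            fft_length=self.n_fft,
        )
        magnitude = tf.abs(stft)  # (batch, frames, n_fft // 2 + 1)
        # A small constant avoids log(0) on silent frames.
        return tf.math.log(magnitude + 1e-6)


def build_model(n_samples=16000, n_classes=10):
    """Toy classifier that takes raw audio; the spectrogram is computed in-model."""
    inputs = tf.keras.Input(shape=(n_samples,))
    x = LogSpectrogram()(inputs)
    x = tf.keras.layers.GlobalAveragePooling1D()(x)  # average over time frames
    outputs = tf.keras.layers.Dense(n_classes, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs)


model = build_model()
predictions = model(tf.zeros((2, 16000)))  # two clips of 16000 samples each
```

Because the conversion lives inside the model, it runs on whatever device the model runs on, and settings such as `n_fft` can be changed like any other hyperparameter without recomputing and storing spectrograms on disk.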
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Speech Recognition and Synthesis
