WaveBeat: End-to-end beat and downbeat tracking in the time domain

Christian J. Steinmetz; Joshua D. Reiss

arXiv:2110.01436 · eess.AS · October 5, 2021 · 6 cites

TL;DR

WaveBeat is an end-to-end, waveform-based model for joint beat and downbeat tracking. It drops hand-crafted spectral features and uses memory-efficient TCNs with a very large receptive field (at least 30 s), outperforming previous state-of-the-art methods on some datasets and performing comparably on others.

Contribution

WaveBeat is presented as the first approach to perform joint beat and downbeat tracking directly from raw waveforms, using temporal convolutional networks (TCNs).

Findings

01. Outperforms previous state-of-the-art methods on some datasets.

02. Achieves comparable results on other datasets.

03. Demonstrates the potential of time domain approaches for beat tracking.

Abstract

Deep learning approaches for beat and downbeat tracking have brought advancements. However, these approaches continue to rely on hand-crafted, subsampled spectral features as input, restricting the information available to the model. In this work, we propose WaveBeat, an end-to-end approach for joint beat and downbeat tracking operating directly on waveforms. This method forgoes engineered spectral features and instead produces beat and downbeat predictions directly from the waveform, the first of its kind for this task. Our model utilizes temporal convolutional networks (TCNs) operating on waveforms that achieve a very large receptive field ($\geq$ 30 s) at audio sample rates in a memory-efficient manner by employing rapidly growing dilation factors with fewer layers. With a straightforward data augmentation strategy, our method outperforms previous state-of-the-art methods on some…
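
The abstract's point about rapidly growing dilation factors can be illustrated with a short receptive-field calculation. The sketch below is not taken from the paper or its repository: the function names, the kernel size, the layer count, the growth factors, and the 22.05 kHz sample rate are all hypothetical values chosen for illustration. It uses the standard receptive-field formula for a stack of dilated 1D convolutions.

```python
from typing import List, Optional, Sequence


def growing_dilations(num_layers: int, growth: int) -> List[int]:
    """Dilation per layer when it is multiplied by `growth` at each layer."""
    return [growth ** i for i in range(num_layers)]


def receptive_field(
    kernel_size: int,
    dilations: Sequence[int],
    strides: Optional[Sequence[int]] = None,
) -> int:
    """Receptive field, in input samples, of a stack of dilated 1D convolutions.

    Each layer widens the field by (kernel_size - 1) * dilation, scaled by the
    product of the strides of all earlier layers.
    """
    if strides is None:
        strides = [1] * len(dilations)
    rf = 1      # a single output sample sees one input sample before any layer
    jump = 1    # spacing, in input samples, between adjacent layer inputs
    for dilation, stride in zip(dilations, strides):
        rf += (kernel_size - 1) * dilation * jump
        jump *= stride
    return rf


# Hypothetical settings for illustration only (not the paper's configuration).
SAMPLE_RATE = 22050   # assumed audio sample rate in Hz
KERNEL_SIZE = 15      # assumed kernel size
NUM_LAYERS = 8        # assumed number of layers

for growth in (2, 8):
    rf = receptive_field(KERNEL_SIZE, growing_dilations(NUM_LAYERS, growth))
    print(f"growth={growth}: {rf} samples = {rf / SAMPLE_RATE:.2f} s")
```

With the layer count held fixed, the receptive field scales roughly with the largest dilation, so a larger growth factor reaches tens of seconds of audio in far fewer layers than doubling the dilation at each layer would.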

Code & Models

Repositories

csteinmetz1/wavebeat (PyTorch, official)

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies