Modeling Beats and Downbeats with a Time-Frequency Transformer
Yun-Ning Hung, Ju-Chiang Wang, Xuchen Song, Wei-Tsung Lu, and Minz Won

TL;DR
This paper introduces a novel Transformer-based model called SpecTNT for beat and downbeat tracking in music, leveraging spectral and temporal modeling to improve accuracy over existing methods.
Contribution
The paper proposes a beat and downbeat tracking approach built on SpecTNT (Spectral-Temporal Transformer in Transformer), combined with a temporal convolutional network (TCN), to enhance beat and downbeat detection in music information retrieval.
Findings
SpecTNT outperforms TCN in downbeat tracking accuracy.
The combined SpecTNT and TCN model achieves state-of-the-art results.
The approach effectively models spectral and temporal features of music.
Abstract
Transformer is a successful deep neural network (DNN) architecture that has shown its versatility not only in natural language processing but also in music information retrieval (MIR). In this paper, we present a novel Transformer-based approach to tackle beat and downbeat tracking. This approach employs SpecTNT (Spectral-Temporal Transformer in Transformer), a variant of Transformer that models both spectral and temporal dimensions of a time-frequency input of music audio. A SpecTNT model uses a stack of blocks, where each consists of two levels of Transformer encoders. The lower-level (or spectral) encoder handles the spectral features and enables the model to pay attention to harmonic components of each frame. Since downbeats indicate bar boundaries and are often accompanied by harmonic changes, this step may help downbeat modeling. The upper-level (or temporal) encoder aggregates…
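The two-level structure described in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation. It assumes single-head attention with identity projections (no learned weights, positional encodings, or feed-forward layers), and because the abstract's description of the upper level is cut off, the mean-over-frequency frame summary used here is purely an assumption for illustration. The names `self_attention` and `spectnt_block_sketch` are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys, and values are the tokens themselves
    (identity projections), so there are no learned parameters.
    tokens: list of equal-length vectors (lists of floats).
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out


def spectnt_block_sketch(spec):
    """One two-level block over a time-frequency input.

    spec: nested list shaped [T][F][D] (frames, frequency bins, embedding dim).
    Returns (spectral, temporal) with shapes [T][F][D] and [T][D].
    """
    # Lower-level (spectral) encoder: within each frame, every frequency
    # bin attends to the other bins of that same frame.
    spectral = [self_attention(frame) for frame in spec]

    # Per-frame summary. Assumption: a plain mean over frequency tokens,
    # standing in for whatever aggregation the real model uses.
    dim = len(spec[0][0])
    summaries = [
        [sum(tok[j] for tok in frame) / len(frame) for j in range(dim)]
        for frame in spectral
    ]

    # Upper-level (temporal) encoder: frame summaries attend across time.
    temporal = self_attention(summaries)
    return spectral, temporal


if __name__ == "__main__":
    # Toy input: 3 frames, 4 frequency bins, 2-dimensional embeddings.
    toy = [[[float(t + f + d) for d in range(2)] for f in range(4)] for t in range(3)]
    spectral, temporal = spectnt_block_sketch(toy)
    print(len(spectral), len(spectral[0]), len(spectral[0][0]))  # 3 4 2
    print(len(temporal), len(temporal[0]))  # 3 2
```

In a real model, each level would be a full Transformer encoder with learned projections, and several such blocks would be stacked, as the abstract states. The sketch only shows which axis each level attends over.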
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies
