Weakly Supervised Tabla Stroke Transcription via TI-SDRM: A Rhythm-Aware Lattice Rescoring Framework
Rahul Bapusaheb Kodag, Vipul Arora

TL;DR
This paper introduces a rhythm-aware lattice rescoring framework for weakly supervised tabla stroke transcription, leveraging a novel rhythmic model to improve accuracy without requiring detailed annotations.
Contribution
It presents a new weakly supervised approach combining a CTC-based model with a rhythmic rescoring framework, and establishes the first benchmark for this task in Hindustani classical music.
Findings
Significant reduction in stroke error rate with rhythmic rescoring
Effective integration of long-term rhythmic structure
First benchmark dataset for weakly supervised tabla stroke transcription (TST)
Abstract
Tabla Stroke Transcription (TST) is central to the analysis of rhythmic structure in Hindustani classical music, yet remains challenging due to complex rhythmic organization and the scarcity of strongly annotated data. Existing approaches largely rely on fully supervised learning with onset-level annotations, which are costly and impractical at scale. This work addresses TST in a weakly supervised setting, using only symbolic stroke sequences without temporal alignment. We propose a framework that combines a CTC-based acoustic model with sequence-level rhythmic rescoring. The acoustic model produces a decoding lattice, which is refined using a Tāla-Independent Static–Dynamic Rhythmic Model (TI-SDRM) that integrates long-term rhythmic structure with short-term adaptive dynamics through an adaptive interpolation mechanism. We curate a new real-world tabla solo dataset…
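The rescoring idea described in the abstract can be illustrated with a small sketch. This is not the paper's TI-SDRM: it rescores an N-best list rather than a full lattice, uses a smoothed bigram model as a stand-in for the static (long-term) component and a unigram cache over the hypothesis so far as a stand-in for the dynamic (short-term) component, and mixes them with an assumed weight `lam = n / (n + k)`. The stroke vocabulary, the toy corpus, the acoustic scores, and every function name below are hypothetical.

```python
import math
from collections import Counter

# Hypothetical stroke vocabulary, for illustration only.
VOCAB = ["dha", "dhin", "ta", "tin", "na"]


def train_static_bigram(corpus, vocab, smoothing=1.0):
    """Return an add-one smoothed bigram probability function P(cur | prev)."""
    pair_counts = Counter()
    prev_counts = Counter()
    for seq in corpus:
        for prev, cur in zip(seq, seq[1:]):
            pair_counts[(prev, cur)] += 1
            prev_counts[prev] += 1

    def prob(prev, cur):
        num = pair_counts[(prev, cur)] + smoothing
        den = prev_counts[prev] + smoothing * len(vocab)
        return num / den

    return prob


def rhythm_logprob(seq, static_prob, vocab, k=4.0):
    """Log-probability of a stroke sequence under the interpolated model.

    The interpolation weight lam grows with the amount of history n,
    so the cache (dynamic) component gains influence as more strokes are seen.
    This adaptive rule is an assumption, not the paper's mechanism.
    """
    cache = Counter()
    total = 0.0
    for i in range(1, len(seq)):
        prev, cur = seq[i - 1], seq[i]
        cache[prev] += 1                      # history now covers seq[:i]
        n = i
        p_static = static_prob(prev, cur)
        p_dynamic = (cache[cur] + 1.0) / (n + len(vocab))
        lam = n / (n + k)
        total += math.log((1.0 - lam) * p_static + lam * p_dynamic)
    return total


def rescore(nbest, static_prob, vocab, weight=0.5):
    """Pick the hypothesis maximizing acoustic score + weight * rhythm score."""
    def combined(item):
        seq, acoustic_logprob = item
        return acoustic_logprob + weight * rhythm_logprob(seq, static_prob, vocab)

    return max(nbest, key=combined)[0]


# Toy symbolic corpus (no timing information), standing in for weak labels.
corpus = [["dha", "dhin", "dhin", "dha"] * 4]
static_prob = train_static_bigram(corpus, VOCAB)

# Two hypotheses with made-up acoustic log-scores; the second is
# slightly better acoustically but rhythmically less plausible.
nbest = [
    (["dha", "dhin", "dhin", "dha"], -4.0),
    (["dha", "tin", "dhin", "dha"], -3.9),
]

best = rescore(nbest, static_prob, VOCAB, weight=0.5)
acoustic_only = rescore(nbest, static_prob, VOCAB, weight=0.0)
print(best)
print(acoustic_only)
```

With the rhythm weight set to zero, the acoustically stronger hypothesis wins; with a nonzero weight, the hypothesis that matches the patterns in the symbolic corpus is preferred. A real lattice rescorer would apply the same kind of combined score along lattice paths instead of enumerating complete hypotheses.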
Taxonomy
Topics: Music Technology and Sound Studies · Music and Audio Processing · Neuroscience and Music Perception
