Keep the beat going: Automatic drum transcription with momentum
Alisha L. Foster, Robert J. Webber

TL;DR
This paper develops two optimization algorithms for automatic drum transcription based on partially fixed nonnegative matrix factorization (NMF). It finds that projected gradient descent with momentum is more accurate than a multiplicative update rule and satisfies stronger local convergence guarantees.
Contribution
It develops and compares two optimization techniques, a multiplicative update rule and projected gradient descent with momentum, both tailored to partially fixed NMF for music transcription.
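As a rough illustration of the first technique, the sketch below applies Lee-Seung-style multiplicative updates to a partially fixed factorization V ≈ [W_fixed, W_free] H, where the columns of W_fixed (for example, pre-learned drum templates) are held constant and only W_free and H are updated. It assumes a squared Frobenius-norm objective, which may differ from the loss used in the paper. The function name `partially_fixed_nmf_mu` and all of its parameters are hypothetical.

```python
import numpy as np

def partially_fixed_nmf_mu(V, W_fixed, n_free, n_iter=200, eps=1e-12, seed=0):
    """Multiplicative updates for V ~= [W_fixed, W_free] @ H (Frobenius loss).

    Assumption: W_fixed holds known nonnegative templates that are never updated.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    k_fixed = W_fixed.shape[1]
    # Strictly positive random initialization keeps the ratios well defined.
    W_free = rng.random((n_freq, n_free)) + eps
    H = rng.random((k_fixed + n_free, n_frames)) + eps
    losses = []
    for _ in range(n_iter):
        W = np.hstack([W_fixed, W_free])
        # Update every activation row, including those of the fixed templates.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        # Update only the free templates; W_fixed is left untouched.
        H_free = H[k_fixed:]
        W_free *= (V @ H_free.T) / (W @ H @ H_free.T + eps)
        W = np.hstack([W_fixed, W_free])
        losses.append(0.5 * np.linalg.norm(V - W @ H) ** 2)
    return W_free, H, losses
```

Because each update multiplies a nonnegative entry by a nonnegative ratio, the factors stay nonnegative without an explicit projection step.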
Findings
Projected gradient descent with momentum outperforms multiplicative updates in accuracy.
Projected gradient descent with momentum also satisfies stronger local convergence guarantees than the multiplicative update rule.
Partially fixed NMF offers a simple, interpretable way to detect and visualize instrument onsets.
Abstract
How can we process a piece of recorded music to detect and visualize the onset of each instrument? A simple, interpretable approach is based on partially fixed nonnegative matrix factorization (NMF). Yet despite the method's simplicity, partially fixed NMF is challenging to apply because the associated optimization problem is high-dimensional and non-convex. This paper explores two optimization approaches that preserve the nonnegative structure: a multiplicative update rule and projected gradient descent with momentum. These techniques are derived from the previous literature, but they have not been fully developed for partially fixed NMF before now. Results indicate that projected gradient descent with momentum achieves the higher accuracy of the two methods, and it satisfies stronger local convergence guarantees.
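To make the second approach more concrete, here is a minimal sketch of projected gradient descent with momentum for a partially fixed factorization V ≈ [W_fixed, W_free] H. Several choices are assumptions rather than details taken from the paper: a squared Frobenius-norm objective, alternating block updates, a fixed extrapolation weight `beta`, and step sizes set from the spectral norm of each block's Hessian. The function name `pgd_momentum_nmf` is hypothetical.

```python
import numpy as np

def pgd_momentum_nmf(V, W_fixed, n_free, n_iter=300, beta=0.5, seed=0):
    """Projected gradient descent with momentum for V ~= [W_fixed, W_free] @ H.

    Assumption: Frobenius loss, with W_fixed held constant throughout.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    k_fixed = W_fixed.shape[1]
    W_free = rng.random((n_freq, n_free))
    H = rng.random((k_fixed + n_free, n_frames))
    W_free_prev, H_prev = W_free.copy(), H.copy()
    losses = []
    for _ in range(n_iter):
        # Activation block: extrapolate, take a gradient step, then project.
        W = np.hstack([W_fixed, W_free])
        step_H = 1.0 / (np.linalg.norm(W.T @ W, 2) + 1e-12)
        Y = H + beta * (H - H_prev)
        H_prev = H
        H = np.maximum(Y - step_H * (W.T @ (W @ Y - V)), 0.0)
        # Free-template block: same scheme; W_fixed receives no update.
        H_free = H[k_fixed:]
        step_W = 1.0 / (np.linalg.norm(H_free @ H_free.T, 2) + 1e-12)
        Z = W_free + beta * (W_free - W_free_prev)
        W_free_prev = W_free
        residual = W_fixed @ H[:k_fixed] + Z @ H_free - V
        W_free = np.maximum(Z - step_W * (residual @ H_free.T), 0.0)
        W = np.hstack([W_fixed, W_free])
        losses.append(0.5 * np.linalg.norm(V - W @ H) ** 2)
    return W_free, H, losses
```

The projection `np.maximum(..., 0.0)` is what preserves the nonnegative structure after each gradient step. In a transcription setting, one would typically plot the first `k_fixed` rows of `H` over time and look for peaks as candidate onsets, though the paper's exact onset-detection procedure is not described here.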
Taxonomy
Topics: Music Technology and Sound Studies · Music and Audio Processing · Computational Physics and Python Applications
