Dynamic HumTrans: Humming Transcription Using CNNs and Dynamic Programming
Shubham Gupta, Isaac Neri Gomez-Sarmiento, Faez Amjed Mezdari, Mirco Ravanelli, and Cem Subakan

TL;DR
This paper introduces a humming transcription method that combines a CNN with dynamic-programming post-processing, proposes heuristics that correct the onset and offset annotations of the HumTrans dataset, and reports state-of-the-art transcription results on that dataset.
Contribution
It presents a novel approach to humming transcription that pairs a CNN with dynamic-programming post-processing, along with heuristics that correct the dataset's onset and offset annotations so that future work can build on more precise ground truth.
Findings
Achieved state-of-the-art transcription accuracy.
Improved dataset annotations with heuristics.
Provided open-source code and corrected dataset.
Abstract
We propose a novel approach for humming transcription that combines a CNN-based architecture with a dynamic programming-based post-processing algorithm, utilizing the recently introduced HumTrans dataset. We identify and address inherent problems with the offset and onset ground truth provided by the dataset, offering heuristics to improve these annotations, resulting in a dataset with precise annotations that will aid future research. Additionally, we compare the transcription accuracy of our method against several others, demonstrating state-of-the-art (SOTA) results. All our code and the corrected dataset are available at https://github.com/shubham-gupta-30/humming_transcription
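The abstract does not spell out the dynamic-programming step, so the following is only a generic illustration of how such post-processing can turn noisy frame-wise network outputs into note segments. It is a minimal sketch, not the authors' algorithm. It assumes a Viterbi-style decoding with a fixed penalty for each label change. The names `viterbi_smooth`, `to_segments`, `switch_penalty`, `hop_seconds`, and `silence_label` are hypothetical, as is the convention that label 0 means silence.

```python
from typing import List, Sequence, Tuple


def viterbi_smooth(frame_scores: Sequence[Sequence[float]],
                   switch_penalty: float) -> List[int]:
    """Choose one label per frame, maximizing the summed frame scores
    minus `switch_penalty` for every change of label between frames.

    Illustrative assumption: frame_scores[t][k] is a per-frame score
    (for example, a CNN output) for label k at frame t.
    """
    n_frames = len(frame_scores)
    if n_frames == 0:
        return []
    n_labels = len(frame_scores[0])

    best = list(frame_scores[0])      # best path score ending in each label
    back: List[List[int]] = []        # back-pointers for frames 1..n-1

    for t in range(1, n_frames):
        # With a uniform penalty, the only switch worth considering
        # comes from the globally best previous label.
        top = max(range(n_labels), key=lambda k: best[k])
        new_best, pointers = [], []
        for k in range(n_labels):
            stay = best[k]
            switch = best[top] - switch_penalty
            if stay >= switch:
                new_best.append(stay + frame_scores[t][k])
                pointers.append(k)
            else:
                new_best.append(switch + frame_scores[t][k])
                pointers.append(top)
        best = new_best
        back.append(pointers)

    # Trace the best path backwards from the best final label.
    label = max(range(n_labels), key=lambda k: best[k])
    path = [label]
    for pointers in reversed(back):
        label = pointers[label]
        path.append(label)
    path.reverse()
    return path


def to_segments(path: Sequence[int], hop_seconds: float,
                silence_label: int = 0) -> List[Tuple[float, float, int]]:
    """Group runs of equal labels into (onset, offset, label) tuples,
    skipping runs of the (assumed) silence label."""
    segments = []
    start = 0
    for t in range(1, len(path) + 1):
        if t == len(path) or path[t] != path[start]:
            if path[start] != silence_label:
                segments.append((start * hop_seconds, t * hop_seconds, path[start]))
            start = t
    return segments
```

In this sketch a larger `switch_penalty` suppresses short spurious label flips, at the cost of possibly merging or dropping very short notes. How the published method balances this trade-off is not stated in the abstract.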
Taxonomy
Topics: Evolutionary Algorithms and Applications
