Student-t Networks for Melody Estimation
Udhav Gupta, Avi, Bhavesh Jain

TL;DR
This paper introduces Student-t neural networks designed to improve melody estimation from complex polyphonic audio signals, addressing the challenge of identifying dominant frequencies amidst overlapping sounds.
Contribution
The paper proposes a novel Student-t network architecture tailored for melody extraction, enhancing robustness to correlated and overlapping audio signals.
Findings
Improved accuracy in polyphonic melody extraction
Robustness to correlated sound overlaps
Effective in complex audio scenarios
Abstract
Melody estimation, or melody extraction, refers to extracting the fundamental frequency of the dominant melodic line from recorded music audio signals. The resulting sequence of frequencies represents the pitch of that melodic line over time. The music signal may be monophonic or polyphonic. Melody extraction becomes considerably harder with polyphonic audio because, in general audio signals, the sounds are highly correlated over both the frequency and time domains. This complex overlap of many sounds makes identifying the predominant frequency challenging.
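To make the task concrete, here is a minimal sketch of a naive baseline: pick the strongest DFT bin in each frame. It is not the Student-t network from the paper, whose architecture the abstract does not describe. The function names (`dominant_frequency_track`, `tone`), the frame size, the sample rate, and the synthetic tones are all assumptions chosen for illustration.

```python
import cmath
import math


def dominant_frequency_track(signal, sample_rate, frame_size=256, hop=256):
    """Return the frequency (Hz) of the strongest DFT bin in each frame.

    Naive illustrative baseline only, not the method proposed in the paper.
    """
    track = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = signal[start:start + frame_size]
        best_bin, best_mag = 0, 0.0
        # Skip bin 0 (DC) and scan positive frequencies below Nyquist.
        for k in range(1, frame_size // 2):
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(frame)
            )
            if abs(coeff) > best_mag:
                best_bin, best_mag = k, abs(coeff)
        track.append(best_bin * sample_rate / frame_size)
    return track


def tone(freq, sample_rate, n_samples):
    """Synthesize a unit-amplitude sine wave (hypothetical test signal)."""
    return [math.sin(2 * math.pi * freq * n / sample_rate)
            for n in range(n_samples)]


SAMPLE_RATE = 8000  # assumed; bin spacing is 8000 / 256 = 31.25 Hz

# Monophonic case: a two-note "melody" (500 Hz, then 1000 Hz).
melody = tone(500.0, SAMPLE_RATE, 1024) + tone(1000.0, SAMPLE_RATE, 1024)
track = dominant_frequency_track(melody, SAMPLE_RATE)

# Polyphonic case: a louder 750 Hz accompaniment overlaps the melody.
accompaniment = tone(750.0, SAMPLE_RATE, 2048)
mixture = [0.6 * m + a for m, a in zip(melody, accompaniment)]
mixed_track = dominant_frequency_track(mixture, SAMPLE_RATE)
```

On the monophonic signal the per-frame peak follows the two notes. On the mixture it locks onto the louder accompaniment instead of the melody. This is a simplified version of the overlap problem the abstract describes.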
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies
