Student-t Networks for Melody Estimation

Udhav Gupta; Avi; Bhavesh Jain

arXiv:2110.07419 · eess.AS · November 30, 2021

TL;DR

This paper introduces Student-t neural networks designed to improve melody estimation from complex polyphonic audio signals, addressing the challenge of identifying dominant frequencies amidst overlapping sounds.

Contribution

The paper proposes a novel Student-t network architecture tailored for melody extraction, enhancing robustness to correlated and overlapping audio signals.

Findings

01. Improved accuracy in polyphonic melody extraction

02. Robustness to correlated sound overlaps

03. Effective in complex audio scenarios

Abstract

Melody estimation, or melody extraction, refers to the extraction of the primary (fundamental) dominant frequency in a melody. The resulting sequence of frequencies represents the pitch of the dominant melodic line in a recorded music audio signal. The music signal may be monophonic or polyphonic. Melody extraction becomes more complicated for polyphonic audio data because, in general audio signals, the sounds are highly correlated over both the frequency and time domains. This complex overlap of many sounds makes identifying the predominant frequency challenging.
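To make the problem statement concrete, the following is a minimal sketch of the simplest possible estimator: pick the strongest spectral peak in each frame of a monophonic signal. It is not the Student-t network proposed in the paper; the function names (`dominant_frequency_track`, `tone`), the frame and hop sizes, and the synthetic two-note test signal are all assumptions chosen for illustration.

```python
import cmath
import math


def dominant_frequency_track(signal, sample_rate, frame_size, hop):
    """Return one dominant frequency (Hz) per frame via a naive DFT peak pick.

    Illustrative baseline only; not the method proposed in the paper.
    """
    track = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = signal[start:start + frame_size]
        best_bin, best_mag = 0, 0.0
        # Scan positive-frequency bins only (skip DC at k = 0).
        for k in range(1, frame_size // 2):
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(frame)
            )
            if abs(coeff) > best_mag:
                best_bin, best_mag = k, abs(coeff)
        # Convert the winning bin index to a frequency in Hz.
        track.append(best_bin * sample_rate / frame_size)
    return track


def tone(freq, sample_rate, num_samples):
    """Hypothetical helper: a pure sine tone as a list of samples."""
    return [math.sin(2 * math.pi * freq * n / sample_rate)
            for n in range(num_samples)]


# Synthetic monophonic "melody": 440 Hz followed by 880 Hz.
sample_rate = 8000
melody = tone(440.0, sample_rate, 400) + tone(880.0, sample_rate, 400)
track = dominant_frequency_track(melody, sample_rate, frame_size=200, hop=200)
print(track)  # → [440.0, 440.0, 880.0, 880.0]
```

This picker simply returns the loudest spectral component in each frame. On a clean monophonic tone that component is the melody pitch, but in polyphonic audio the loudest component may belong to an accompanying instrument or to a harmonic, which is the difficulty the abstract describes.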

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies