Carnatic Raga Identification System using Rigorous Time-Delay Neural Network
Sanjay Natesan, Homayoon Beigi

TL;DR
This paper presents a machine learning system using Time-Delay Neural Networks and LSTM with attention mechanisms for accurate Carnatic raga identification from audio recordings, addressing variations in shruti and background noise.
Contribution
It introduces a novel combination of neural networks and attention mechanisms specifically tailored for Carnatic raga classification, improving robustness and accuracy.
Findings
Achieved effective raga classification on 676 recordings.
Demonstrated improved accuracy with attention-based frequency change analysis.
Enhanced robustness to shruti variations and background noise.
Abstract
Large-scale machine learning-based raga identification remains a nontrivial problem in the computational study of Carnatic music. Each raga consists of many unique and intrinsic melodic patterns that can be used to distinguish it from other ragas. These patterns can also be used to cluster songs within the same raga, as well as to identify songs in other closely related ragas. In this case, the input sound is analyzed in several steps: a Discrete Fourier Transform is applied, and Triangular Filtering is used to create custom bins of possible notes, from which features are extracted based on the presence or absence of particular notes. Using a combination of Neural Networks including 1D Convolutional Neural Networks (conventionally known as Time-Delay Neural Networks) and Long Short-Term Memory (LSTM), which are a form of Recurrent Neural Networks, the backbone of the…
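The feature-extraction step described in the abstract (a Discrete Fourier Transform followed by triangular filters that form note bins) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes equal-tempered semitone centres above a known tonic (shruti) frequency, a Hann window, and two octaves of bins; the names `note_filterbank`, `note_energies`, and `tonic_hz` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def note_filterbank(n_fft, sample_rate, tonic_hz, n_notes=24):
    """Build triangular filters centred on semitones above the tonic.

    Assumption: equal-tempered semitone spacing (a simplification for
    illustration; the paper's exact bin layout is not given here).
    Returns an array of shape (n_notes, n_fft // 2 + 1).
    """
    # Centre frequency of every real-FFT bin.
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    # One edge below the tonic, the n_notes centres, and one edge above.
    steps = np.arange(-1, n_notes + 1)
    edges = tonic_hz * 2.0 ** (steps / 12.0)

    bank = np.zeros((n_notes, fft_freqs.size))
    for i in range(n_notes):
        lo, centre, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (fft_freqs - lo) / (centre - lo)
        falling = (hi - fft_freqs) / (hi - centre)
        # Triangle: rises from lo to centre, falls to hi, zero elsewhere.
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


def note_energies(frame, sample_rate, tonic_hz, n_notes=24):
    """Return the energy in each note bin for one audio frame."""
    windowed = frame * np.hanning(frame.size)
    power = np.abs(np.fft.rfft(windowed)) ** 2
    bank = note_filterbank(frame.size, sample_rate, tonic_hz, n_notes)
    return bank @ power


# Usage: a synthetic tone seven semitones above a hypothetical 220 Hz tonic.
sample_rate = 16000
n_fft = 8192
tonic_hz = 220.0
t = np.arange(n_fft) / sample_rate
tone = np.sin(2 * np.pi * tonic_hz * 2.0 ** (7 / 12) * t)

bank = note_filterbank(n_fft, sample_rate, tonic_hz)
energies = note_energies(tone, sample_rate, tonic_hz)
strongest_note = int(np.argmax(energies))
print(strongest_note)
```

Because the bins are defined relative to `tonic_hz`, the same melodic phrase sung at a different shruti would map to the same bin indices, which is one plausible way to obtain the shruti robustness the summary mentions. A per-frame sequence of such vectors is the kind of input a time-delay or recurrent network could then consume.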
