Vocal Melody Construction for Persian Lyrics Using LSTM Recurrent Neural Networks

Farshad Jafari; Farzad Didehvar; Amin Gheibi

arXiv:2410.18203 · cs.SD · October 29, 2024

Open Access

TL;DR

This paper presents a neural network-based system that automatically generates melodies for Persian lyrics. It relies on an assumed phonological correlation between syllables and melody and is trained on a custom dataset. In a listener evaluation, its melodies averaged 3.005/5 for pleasantness, against 4.078/5 for human melodies.

Contribution

It introduces a seq2seq neural network model that generates melodies from Persian lyrics, trained on more than 100 pieces of Persian music digitized from print to address the lack of an existing digital dataset.

Findings

01 System scored an average of 3.005/5 for pleasantness

02 Human melodies scored an average of 4.078/5

03 Model demonstrates potential but needs improvement

Abstract

This paper investigates automatic melody construction from Persian lyrics. It assumes a phonological correlation between the syllables of a lyric and the melody of a song. To test this assumption, a seq2seq neural network was trained on parallel syllable and note sequences from Persian songs so that it can suggest a pleasant melody for a new sequence of syllables. Because no dataset of digital Persian music was available, more than 100 pieces of Persian music were collected and converted from print to digital format. To evaluate the trained model, 14 new lyrics were given to it as input, and music experts performed and recorded the suggested melodies. The evaluation used an audio questionnaire answered by more than 170 people. According to the answers about the pleasantness of melody, the system…
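To make "parallel syllable and note sequences" concrete, here is a minimal sketch of how one such pair could be encoded as integer ids for a seq2seq model. It is not the authors' pipeline. The helper names (`build_vocab`, `encode_pair`), the special tokens, the one-note-per-syllable alignment, and the toy romanized syllables and pitch names are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Special tokens (an assumption; the paper does not specify its tokenization).
PAD, SOS, EOS = "<pad>", "<sos>", "<eos>"


def build_vocab(sequences: List[List[str]]) -> Dict[str, int]:
    """Map each distinct token to an integer id; ids 0-2 are reserved."""
    vocab = {PAD: 0, SOS: 1, EOS: 2}
    for seq in sequences:
        for tok in seq:
            # len(vocab) is evaluated before insertion, so ids stay dense.
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode_pair(
    syllables: List[str],
    notes: List[str],
    syl_vocab: Dict[str, int],
    note_vocab: Dict[str, int],
) -> Tuple[List[int], List[int], List[int]]:
    """Encode one aligned (syllables, notes) pair for teacher-forced training.

    Returns (encoder_input, decoder_input, decoder_target). The decoder input
    is the note sequence shifted right behind <sos>; the target ends in <eos>.
    """
    if len(syllables) != len(notes):
        # Assumed alignment: exactly one note per syllable.
        raise ValueError("syllable and note sequences must have equal length")
    enc_in = [syl_vocab[s] for s in syllables]
    note_ids = [note_vocab[n] for n in notes]
    dec_in = [note_vocab[SOS]] + note_ids
    dec_target = note_ids + [note_vocab[EOS]]
    return enc_in, dec_in, dec_target


# Toy, made-up example: three romanized syllables and three pitch names.
toy_syllables = [["del", "e", "man"]]
toy_notes = [["A4", "G4", "A4"]]

syl_vocab = build_vocab(toy_syllables)
note_vocab = build_vocab(toy_notes)
enc_in, dec_in, dec_target = encode_pair(
    toy_syllables[0], toy_notes[0], syl_vocab, note_vocab
)
print(enc_in, dec_in, dec_target)
```

In a full system, these id sequences would feed an LSTM encoder (syllables) and an LSTM decoder (notes). A real note representation would probably also carry duration, which this sketch omits.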

Taxonomy

Topics: Music and Audio Processing

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Sequence to Sequence