Bridging Music and Text with Crowdsourced Music Comments: A Sequence-to-Sequence Framework for Thematic Music Comments Generation

Peining Zhang; Junliang Guo; Linli Xu; Mu You; Junming Yin

arXiv:2209.01996·cs.SD·September 7, 2022

TL;DR

This paper introduces a sequence-to-sequence framework that generates text descriptions of music from crowdsourced music comments. It pairs a dilated-convolution encoder with a memory-based recurrent decoder, fine-tunes the model with a discriminator and a topic evaluator to improve authenticity and thematic relevance, and proposes two new evaluation metrics.

Contribution

The paper presents a new dataset built from crowdsourced music comments and a sequence-to-sequence model, fine-tuned with a discriminator and a novel topic evaluator, for generating more authentic and thematic music descriptions.

Findings

01

The model generates fluent, meaningful comments.

02

The two proposed evaluation metrics align more closely with human evaluation than traditional metrics do.

03

The dataset enables better training for music text generation.

Abstract

We consider a novel task of automatically generating text descriptions of music. Compared with other well-established text generation tasks such as image caption, the scarcity of well-paired music and text datasets makes it a much more challenging task. In this paper, we exploit the crowd-sourced music comments to construct a new dataset and propose a sequence-to-sequence model to generate text descriptions of music. More concretely, we use the dilated convolutional layer as the basic component of the encoder and a memory based recurrent neural network as the decoder. To enhance the authenticity and thematicity of generated texts, we further propose to fine-tune the model with a discriminator as well as a novel topic evaluator. To measure the quality of generated texts, we also propose two new evaluation metrics, which are more aligned with human evaluation than traditional metrics such…
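The dilated convolutional encoder mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the kernel size, the dilation schedule (1, 2, 4), and the function names `dilated_conv1d` and `receptive_field` are assumptions chosen only to show why stacking dilated layers lets an encoder cover long audio contexts with few layers.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Valid (unpadded) 1-D dilated convolution in cross-correlation form."""
    # Distance between the first and last kernel tap on the input.
    span = (len(kernel) - 1) * dilation
    return [
        sum(w * signal[t + k * dilation] for k, w in enumerate(kernel))
        for t in range(len(signal) - span)
    ]


def receptive_field(kernel_size, dilations):
    """Input samples seen by one output of the stacked dilated layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Hypothetical toy input standing in for a sequence of audio features.
signal = [float(i) for i in range(16)]
kernel = [1.0, 1.0]       # assumed kernel of size 2, all-ones weights
dilations = (1, 2, 4)     # assumed doubling schedule

out = signal
for d in dilations:
    out = dilated_conv1d(out, kernel, d)

print(receptive_field(len(kernel), dilations))  # input samples per output
print(len(out), out[0])
```

With the assumed doubling schedule, the receptive field grows exponentially with depth while the parameter count grows only linearly, which is the usual motivation for dilated convolutions on long sequences such as audio.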

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies · Topic Modeling