SinTra: Learning an inspiration model from a single multi-track music segment

Qingwei Song; Qiwei Sun; Dongsheng Guo; Haiyong Zheng

arXiv:2204.09917·cs.SD·April 22, 2022·1 cites

TL;DR

SinTra is a novel auto-regressive model that learns to generate coherent, multi-instrument polyphonic music from a single segment using a pyramid Transformer-XL architecture and a new pitch-group representation.

Contribution

The paper introduces SinTra, a single-segment learning framework with a pyramid Transformer-XL and a pitch-group representation for high-quality multi-instrument music generation.

Findings

01. SinTra outperforms Music Transformer in learning from a single music segment.

02. The pyramid structure reduces overly-fragmented notes.

03. The model effectively captures musical structure and inter-track relationships.

Abstract

In this paper, we propose SinTra, an auto-regressive sequential generative model that learns from a single multi-track music segment to generate coherent, aesthetic, and variable polyphonic multi-instrument music with an arbitrary number of bars. To keep the generated samples relevant to the training music, we present a novel pitch-group representation. SinTra, which consists of a pyramid of Transformer-XL models trained with a multi-scale strategy, can learn both the musical structure and the relative positional relationships between notes in the single training segment. Additionally, to maintain inter-track correlation, we process the multi-track music with a convolution operation; during decoding, the tracks are independent of each other to prevent interference. We evaluate SinTra with both a subjective study and objective metrics. The comparison results…
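
The abstract mentions a multi-scale training strategy but does not say how the scales are built. As a rough illustration only, the sketch below assumes each scale is a coarser time resolution of the same multi-track segment, obtained by merging consecutive time steps. The names `downsample_track` and `build_pyramid`, the set-of-pitches step format, and the factors (4, 2, 1) are hypothetical and are not taken from the paper or its repository.

```python
from typing import FrozenSet, List, Sequence

# Assumed format (not from the paper): a track is a list of time steps,
# and each time step is the set of MIDI pitches sounding at that step.
Track = List[FrozenSet[int]]


def downsample_track(track: Track, factor: int) -> Track:
    """Merge every `factor` consecutive steps into one step (union of pitches)."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    return [
        frozenset().union(*track[i:i + factor])
        for i in range(0, len(track), factor)
    ]


def build_pyramid(tracks: Sequence[Track],
                  factors: Sequence[int] = (4, 2, 1)) -> List[List[Track]]:
    """Return one multi-track segment per scale, coarsest scale first."""
    return [[downsample_track(t, f) for t in tracks] for f in factors]


# Toy two-track segment with eight time steps per track.
piano: Track = [frozenset(s) for s in
                ({60}, {60}, {64}, set(), {67}, {67}, {67}, {72})]
bass: Track = [frozenset({36})] * 4 + [frozenset({43})] * 4

pyramid = build_pyramid([piano, bass])
for factor, scale in zip((4, 2, 1), pyramid):
    print(f"factor {factor}: {len(scale[0])} steps per track")
```

In a coarse-to-fine setup of this kind, a model at each level would see the same segment at a finer time grid than the level before it. Whether SinTra builds its scales this way is not stated in the text above.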

Code & Models

Repositories

qingweisong/sintra
PyTorch · Official

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies · Neuroscience and Music Perception

Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Dropout · Label Smoothing · Adaptive Input Representations · Cosine Annealing · Adam · Multi-Head Attention