TL;DR
This paper introduces a harmony-aware learning approach and a hierarchical transformer model to improve the structural quality of automatically generated pop music, focusing on texture and form coherence.
Contribution
It proposes a novel harmony-aware hierarchical transformer that jointly learns texture and form, enhancing pop music generation quality and structural understanding.
Findings
HAT, the proposed harmony-aware hierarchical transformer, outperforms existing methods in structure understanding.
Generated music shows improved form and texture coherence.
Experimental results demonstrate better musical quality and structure.
Abstract
Pop music generation has long been an attractive topic for both musicians and scientists. However, automatically composing pop music with a satisfactory structure is still a challenging issue. In this paper, we propose to leverage harmony-aware learning for structure-enhanced pop music generation. On the one hand, the chord, one participant of harmony, represents the harmonic set of multiple notes and is closely integrated with the spatial structure of music, the texture. On the other hand, the chord progression, the other participant of harmony, usually accompanies the development of the music and promotes its temporal structure, the form. Moreover, when chords evolve into a chord progression, texture and form are naturally bridged by harmony, which contributes to the joint learning of the two structures. Furthermore, we propose the…
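To make the chord versus chord-progression distinction in the abstract concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the pitch-class encoding, the two chord qualities, the one-chord-per-bar layout, and every name in it (`chord_pitch_classes`, `progression_pitch_classes`, `progression`) are hypothetical and are not taken from the paper's representation or model.

```python
# Illustrative toy encoding only; not the paper's data representation.
from typing import List, Tuple

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Semitone offsets above the root for two common triad qualities.
CHORD_QUALITIES = {
    "maj": (0, 4, 7),
    "min": (0, 3, 7),
}


def chord_pitch_classes(root: str, quality: str) -> Tuple[int, ...]:
    """A chord as a set of simultaneous notes (the 'texture' side)."""
    root_pc = NOTE_NAMES.index(root)
    return tuple((root_pc + offset) % 12 for offset in CHORD_QUALITIES[quality])


def progression_pitch_classes(
    prog: List[Tuple[str, str]],
) -> List[Tuple[int, ...]]:
    """A chord progression as an ordered sequence of chords (the 'form' side)."""
    return [chord_pitch_classes(root, quality) for root, quality in prog]


# Assumed layout: one chord per bar, a common four-bar pop progression.
progression = [("C", "maj"), ("G", "maj"), ("A", "min"), ("F", "maj")]

for bar, (symbol, pcs) in enumerate(
    zip(progression, progression_pitch_classes(progression)), start=1
):
    print(f"bar {bar}: {symbol[0]}{symbol[1]} -> pitch classes {pcs}")
```

Each tuple of pitch classes describes what sounds together at one moment, while the order of the tuples describes how the harmony moves over time. The same chord symbols appear in both views, which is the sense in which the abstract says harmony bridges texture and form.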
Taxonomy
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Dropout · Layer Normalization · Softmax · Byte Pair Encoding · Residual Connection
