The Jazz Transformer on the Front Line: Exploring the Shortcomings of AI-composed Music through Quantitative Measures

Shih-Lun Wu; Yi-Hsuan Yang

arXiv:2008.01307 · cs.SD · August 5, 2020 · 38 cites

TL;DR

This paper introduces the Jazz Transformer, a neural model for Jazz music generation, and critically analyzes its shortcomings through quantitative measures, highlighting the gap between AI-generated and human compositions.

Contribution

The paper presents the Jazz Transformer model incorporating structural elements and provides a comprehensive quantitative analysis of its musical output.

Findings

01 The model reaches a low training loss, but listening-test ratings show a clear gap between generated and real compositions.

02 Computational analysis reveals differences from human music in pitch class usage, grooving, chord progression, and structure.

03 The analysis identifies key areas for future improvement in AI music generation.

Abstract

This paper presents the Jazz Transformer, a generative model that uses a neural sequence model, the Transformer-XL, to model lead sheets of Jazz music. The model also incorporates structural events present in the Weimar Jazz Database (WJazzD) to induce structure in the generated music. Although we are able to reduce the training loss to a low value, our listening test suggests a clear gap between the average ratings of the generated and real compositions. We therefore go one step further and conduct a series of computational analyses of the generated compositions from different perspectives. These include analyzing the statistics of the pitch class, grooving, and chord progression, assessing the structureness of the music with the help of the fitness scape plot, and evaluating the model's understanding of Jazz music through a MIREX-like…
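
To make the idea of "statistics of the pitch class" concrete, here is a minimal sketch of one such statistic: the Shannon entropy of a 12-bin pitch-class histogram. The function name `pitch_class_entropy` and the choice of entropy as the summary are assumptions for illustration; the listing above does not state which exact measures the paper computes.

```python
import math
from collections import Counter


def pitch_class_entropy(midi_pitches):
    """Shannon entropy (in bits) of the 12-bin pitch-class histogram.

    Hypothetical helper for illustration only; not taken from the paper.
    `midi_pitches` is a non-empty list of MIDI note numbers.
    """
    if not midi_pitches:
        raise ValueError("need at least one note")
    # Fold every pitch into one of the 12 pitch classes (C = 0, ..., B = 11).
    counts = Counter(p % 12 for p in midi_pitches)
    total = len(midi_pitches)
    # Sum -p * log2(p) over the pitch classes that actually occur.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

Under this sketch, a passage that repeats a single pitch class has entropy 0, while a passage that uses all twelve pitch classes equally often reaches the maximum of log2(12) bits. Comparing such values between generated and real pieces is one way a gap in pitch usage could be quantified.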

Code & Models

Datasets

juancopi81/lmd_clean_8bars_32th_resolution
dataset · 105 dl

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies · Neuroscience and Music Perception

Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Cosine Annealing · Label Smoothing · Dropout · Adam · Variational Dropout · Multi-Head Attention