Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription
Nicolas Boulanger-Lewandowski (Universite de Montreal), Yoshua Bengio (Universite de Montreal), Pascal Vincent (Universite de Montreal)

TL;DR
This paper presents a probabilistic neural network model that captures temporal dependencies in high-dimensional polyphonic music sequences, improving generation and transcription accuracy.
Contribution
It introduces a probabilistic model for symbolic polyphonic music in piano-roll form, built from distribution estimators conditioned on a recurrent neural network, that outperforms many traditional models of polyphonic music.
Findings
Outperforms many traditional models of polyphonic music on a variety of realistic datasets
Improves polyphonic transcription accuracy when used as a symbolic prior
Effectively captures temporal dependencies in high-dimensional data
Abstract
We investigate the problem of modeling symbolic sequences of polyphonic music in a completely general piano-roll representation. We introduce a probabilistic model based on distribution estimators conditioned on a recurrent neural network that is able to discover temporal dependencies in high-dimensional sequences. Our approach outperforms many traditional models of polyphonic music on a variety of realistic datasets. We show how our musical language model can serve as a symbolic prior to improve the accuracy of polyphonic transcription.
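To make the abstract's two central ideas concrete, the following is a minimal sketch of a piano-roll representation and of a sequence likelihood that factorizes over time steps, with each step's distribution conditioned on a recurrent hidden state. Several things here are assumptions for illustration and are not taken from the text above: the 88-key range starting at MIDI note 21, the helper names `to_piano_roll` and `sequence_log_likelihood`, the parameter layout, and above all the use of independent Bernoulli outputs per note as a simplified stand-in for the paper's distribution estimators.

```python
import numpy as np

N_PITCHES = 88   # assumption: standard piano range
LOW_NOTE = 21    # assumption: MIDI number of the lowest key (A0)


def to_piano_roll(events, n_steps):
    """Convert (midi_pitch, start_step, end_step) tuples to a binary
    matrix of shape (n_steps, N_PITCHES); 1 means the note is sounding."""
    roll = np.zeros((n_steps, N_PITCHES))
    for pitch, start, end in events:
        roll[start:end, pitch - LOW_NOTE] = 1.0
    return roll


def sequence_log_likelihood(roll, params):
    """Compute log P(v_1..v_T) = sum_t log P(v_t | h_{t-1}).

    The hidden state h summarizes the past with a tanh RNN. Each time
    step is scored with independent Bernoulli outputs, which is a
    simplification chosen for this sketch.
    """
    W_hv, W_hh, W_vh, b_v, b_h = params
    h = np.zeros(W_hh.shape[0])
    total = 0.0
    for v in roll:
        logits = b_v + W_hv @ h
        # Bernoulli log-probability: v * logit - softplus(logit)
        total += np.sum(v * logits - np.logaddexp(0.0, logits))
        # Update the recurrent state with the frame just observed.
        h = np.tanh(b_h + W_hh @ h + W_vh @ v)
    return total


def init_params(n_hidden, rng, scale=0.01):
    """Small random weights; shapes follow the usage above."""
    return (
        scale * rng.standard_normal((N_PITCHES, n_hidden)),  # W_hv
        scale * rng.standard_normal((n_hidden, n_hidden)),   # W_hh
        scale * rng.standard_normal((n_hidden, N_PITCHES)),  # W_vh
        np.zeros(N_PITCHES),                                 # b_v
        np.zeros(n_hidden),                                  # b_h
    )


# A C major triad held for four steps, then a single C5 for four steps.
events = [(60, 0, 4), (64, 0, 4), (67, 0, 4), (72, 4, 8)]
roll = to_piano_roll(events, n_steps=8)
params = init_params(n_hidden=16, rng=np.random.default_rng(0))
log_lik = sequence_log_likelihood(roll, params)
```

A score of this kind is what lets a sequence model act as a symbolic prior: candidate transcriptions that are musically implausible receive a lower log-likelihood. Independent outputs cannot express correlations between simultaneous notes, so this sketch shows only the temporal conditioning, not the full model.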
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Generative Adversarial Networks and Image Synthesis
