An audio-to-analysis pipeline with certified transcription for information-theoretic profiling of the piano repertoire
Fred Jalbert-Desforges

TL;DR
This paper introduces an audio-to-analysis pipeline that accurately transcribes piano recordings to derive composer-level harmonic profiles, revealing stylistic lineages and differences between historical and contemporary composers.
Contribution
The pipeline combines a certified transcription layer with information-theoretic analysis to classify and compare piano styles across a large corpus, providing new insights into harmonic predictability and stylistic evolution.
Findings
Achieved a transcription F1 score of 97.91% on the MAESTRO v3.0.0 test set.
Ordered composers along an interpretable axis of harmonic predictability.
Separated contemporary neoclassical artists from historical composers based on Zipfian fit.
Abstract
We present an audio-to-analysis pipeline that produces composer-level information-theoretic profiles (reflecting compositional vocabulary as it emerges from aggregated performances) from raw recordings, built on a transcription layer whose accuracy we certify on a standard benchmark (F1 = 0.9791 on the MAESTRO v3.0.0 test set). Applied to 1,238 pieces and 15 MAESTRO composers with at least ten attributed pieces, spanning the Baroque through the early twentieth century, the pipeline derives empirical distributions over harmonic scale degrees and analyzes them through Shannon entropy, asymmetric Kullback-Leibler divergence, and Zipfian rank-frequency modeling. The resulting profiles (i) order composers along an interpretable axis of harmonic predictability, with a narrow entropy range (3.33-3.86 bits) that reveals the marginal-level similarity of tonal vocabularies; (ii) recover known…
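The three measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the additive smoothing parameter, the least-squares Zipf fit, and the toy scale-degree labels are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def empirical_distribution(symbols, alphabet, smoothing=1.0):
    """Relative frequencies over a fixed alphabet.

    Additive smoothing is an assumption made here so that the KL
    divergence below stays finite; the paper may handle zeros differently.
    """
    counts = Counter(symbols)
    total = len(symbols) + smoothing * len(alphabet)
    return {s: (counts[s] + smoothing) / total for s in alphabet}


def shannon_entropy(p):
    """Shannon entropy in bits: H(p) = -sum_x p(x) log2 p(x)."""
    return -sum(v * math.log2(v) for v in p.values() if v > 0)


def kl_divergence(p, q):
    """Asymmetric KL divergence in bits: D(p || q) = sum_x p(x) log2(p(x)/q(x)).

    Requires q(x) > 0 wherever p(x) > 0.
    """
    return sum(v * math.log2(v / q[s]) for s, v in p.items() if v > 0)


def zipf_exponent(p):
    """Estimate the Zipf exponent as the negated least-squares slope of
    log-frequency against log-rank (one simple fitting choice among several)."""
    freqs = sorted((v for v in p.values() if v > 0), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return -cov / var


if __name__ == "__main__":
    # Hypothetical toy data: scale-degree labels for two imaginary composers.
    alphabet = ["I", "ii", "IV", "V", "vi"]
    composer_a = ["I", "V", "I", "IV", "V", "I", "vi", "I"]
    composer_b = ["I", "ii", "V", "vi", "IV", "ii", "V", "vi"]
    p = empirical_distribution(composer_a, alphabet)
    q = empirical_distribution(composer_b, alphabet)
    print("H(a) bits:", shannon_entropy(p))
    print("D(a||b):", kl_divergence(p, q), "D(b||a):", kl_divergence(q, p))
    print("Zipf exponent (a):", zipf_exponent(p))
```

Because the KL divergence is asymmetric, `kl_divergence(p, q)` and `kl_divergence(q, p)` generally differ, which is what lets a directed comparison between composers carry information about lineage rather than only distance.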
