Mathematical Foundations of Polyphonic Music Generation via Structural Inductive Bias
Joonwon Seo

TL;DR
This paper presents a mathematically grounded framework for polyphonic music generation, combining theoretical proofs and empirical results to improve stability, generalization, and parameter efficiency.
Contribution
The paper introduces the Smart Embedding architecture and supports it with rigorous mathematical proofs, addressing the 'Missing Middle' problem in AI music generation.
Findings
Normalized mutual information between pitch and hand attributes is 0.167.
Smart Embedding reduces model parameters by 48.30%.
Validation loss decreases by 9.47% with the proposed approach.
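As a rough illustration of why treating pitch and hand as separate attributes can shrink an embedding layer, the sketch below compares parameter counts for one joint table over (pitch, hand) pairs against two separate tables. This is not the paper's Smart Embedding implementation, whose details are not given here. The function names and the sizes (88 pitches, 2 hands, dimension 256) are hypothetical, and the resulting percentage is not meant to reproduce the reported 48.30%, which refers to the whole model.

```python
def joint_embedding_params(n_pitch: int, n_hand: int, dim: int) -> int:
    """Parameters of one embedding table with a row per (pitch, hand) pair."""
    return n_pitch * n_hand * dim


def factorized_embedding_params(n_pitch: int, n_hand: int, dim: int) -> int:
    """Parameters of two separate tables (one per attribute), e.g. summed at lookup."""
    return (n_pitch + n_hand) * dim


def relative_reduction(before: int, after: int) -> float:
    """Fraction of parameters saved when going from `before` to `after`."""
    return 1.0 - after / before


if __name__ == "__main__":
    # Hypothetical sizes chosen only for illustration (assumption, not from the paper).
    n_pitch, n_hand, dim = 88, 2, 256
    joint = joint_embedding_params(n_pitch, n_hand, dim)
    factorized = factorized_embedding_params(n_pitch, n_hand, dim)
    print(joint, factorized, round(relative_reduction(joint, factorized), 4))
```

The saving grows with the number of attribute combinations, since the joint table scales with the product of the vocabulary sizes while the separate tables scale with their sum.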
Abstract
This monograph introduces a novel approach to polyphonic music generation by addressing the "Missing Middle" problem through structural inductive bias. Focusing on Beethoven's piano sonatas as a case study, we empirically verify the independence of pitch and hand attributes using normalized mutual information (NMI=0.167) and propose the Smart Embedding architecture, achieving a 48.30% reduction in parameters. We provide rigorous mathematical proofs using information theory (negligible loss bounded at 0.153 bits), Rademacher complexity (28.09% tighter generalization bound), and category theory to demonstrate improved stability and generalization. Empirical results show a 9.47% reduction in validation loss, confirmed by SVD analysis and an expert listening study (N=53). This dual theoretical and applied framework bridges gaps in AI music generation, offering verifiable insights for…
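For readers unfamiliar with the NMI statistic cited above, the following minimal sketch computes normalized mutual information between two discrete label sequences. It assumes normalization by the arithmetic mean of the two entropies; the abstract does not say which normalization the paper uses, so values may differ under another convention. The toy pitch and hand sequences are invented for illustration and are not the paper's data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy in bits of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def mutual_information(xs, ys):
    """Mutual information in bits between two equal-length label sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        p_joint = c / n
        mi += p_joint * math.log2(p_joint / ((px[x] / n) * (py[y] / n)))
    return mi


def normalized_mutual_information(xs, ys):
    """NMI with arithmetic-mean normalization (an assumption, see lead-in)."""
    hx, hy = entropy(xs), entropy(ys)
    if hx == 0.0 or hy == 0.0:
        return 0.0  # a constant attribute carries no information
    return mutual_information(xs, ys) / ((hx + hy) / 2)


if __name__ == "__main__":
    # Toy data: every pitch occurs equally often in each hand, so the
    # two attributes are statistically independent in this sample.
    pitches = [60, 62, 60, 62]
    hands = ["L", "L", "R", "R"]
    print(normalized_mutual_information(pitches, hands))
```

Under this convention, NMI ranges from 0 (independent attributes) to 1 (each attribute fully determines the other), which is the scale on which a value such as 0.167 would be read.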
