Musical Training, but not Mere Exposure to Music, Drives the Emergence of Chroma Equivalence in Artificial Neural Networks
Lukas Grasse, Matthew S. Tata

TL;DR
This study investigates how different training regimes in artificial neural networks influence the emergence of chroma equivalence, a perceptual feature of pitch, revealing that specialized music training, not mere exposure, is crucial.
Contribution
It demonstrates that supervised music transcription training in ANNs leads to chroma equivalence, unlike self-supervised learning, highlighting the importance of task-specific training for perceptual features.
Findings
Models trained on music transcription exhibit chroma equivalence.
Self-supervised learning alone does not produce chroma equivalence.
Pitch height representation emerges in all models.
Abstract
Pitch is a fundamental aspect of auditory perception. Pitch perception is commonly described across two perceptual dimensions: pitch height is the sense that tones with varying frequencies seem to be higher or lower, and chroma equivalence is the cyclical similarity of notes separated by octaves, where an octave corresponds to a doubling of fundamental frequency. Existing research is divided on whether chroma equivalence is a learned percept that varies according to musical experience and culture, or is an innate percept that develops automatically. Building on a recent framework that proposes to use artificial neural networks (ANNs) to ask 'why' questions about the brain, we evaluated recent auditory ANNs using representational similarity analysis to test the emergence of pitch height and chroma equivalence in their learned representations. Additionally, we fine-tuned two models, Wav2Vec 2.0 and Data2Vec, on a self-supervised learning task…
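To make the two pitch dimensions and the representational similarity analysis concrete, here is a minimal Python sketch. It is not the authors' pipeline: the function names, the MIDI note range, the Euclidean distance, and the use of Pearson correlation (rank correlation is also common in RSA) are all assumptions chosen for illustration. It builds one model dissimilarity matrix for pitch height and one for chroma, then scores how well a set of feature vectors matches each.

```python
import numpy as np

def midi_to_hz(midi):
    """Equal-tempered frequency for a MIDI note number (A4 = 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((np.asarray(midi, dtype=float) - 69.0) / 12.0)

def height_rdm(midi):
    """Model dissimilarity for pitch height: distance in semitones."""
    m = np.asarray(midi, dtype=float)
    return np.abs(m[:, None] - m[None, :])

def chroma_rdm(midi):
    """Model dissimilarity for chroma: circular distance between pitch classes.

    Notes an octave apart share a pitch class, so their dissimilarity is 0.
    """
    pc = np.asarray(midi) % 12
    d = np.abs(pc[:, None] - pc[None, :])
    return np.minimum(d, 12 - d).astype(float)

def feature_rdm(features):
    """Dissimilarity between feature vectors (Euclidean; an assumption)."""
    f = np.asarray(features, dtype=float)
    return np.linalg.norm(f[:, None, :] - f[None, :, :], axis=-1)

def rdm_similarity(a, b):
    """Pearson correlation between the upper triangles of two matrices."""
    iu = np.triu_indices_from(a, k=1)
    return float(np.corrcoef(a[iu], b[iu])[0, 1])

# Hypothetical stimulus set: every semitone from C3 (48) to C6 (84).
notes = np.arange(48, 85)

# Toy stand-ins for a network layer's activations (not real model outputs):
# one encodes only pitch height, the other only position on the chroma circle.
height_only = notes[:, None].astype(float)
angle = 2.0 * np.pi * (notes % 12) / 12.0
chroma_only = np.stack([np.cos(angle), np.sin(angle)], axis=1)

height_score = rdm_similarity(feature_rdm(height_only), height_rdm(notes))
chroma_score = rdm_similarity(feature_rdm(chroma_only), chroma_rdm(notes))
```

In an actual analysis the toy feature arrays would be replaced by activations extracted from a layer of the network under study, with one row per stimulus note; a layer with a high chroma score would be one whose representation treats octave-related notes as similar.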
Taxonomy
Topics: Neuroscience and Music Perception · Hearing Loss and Rehabilitation · Neural dynamics and brain function
