Sparse Codes for Speech Predict Spectrotemporal Receptive Fields in the Inferior Colliculus
Nicole L. Carlson, Vivienne L. Ming, and Michael R. DeWeese

TL;DR
This paper introduces a sparse coding model of speech whose learned features include both well-known acoustic structures and novel spectrotemporal patterns, several of which resemble neuronal receptive fields reported in the inferior colliculus.
Contribution
To the authors' knowledge, the study presents the first model that predicts receptive fields of midbrain auditory neurons from the statistics of sound and a sparse coding principle.
Findings
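The model learns well-known speech features such as harmonic stacks, formants, onsets, and terminations
It also learns novel spectrotemporal patterns such as localized checkerboards and frequency-modulated excitatory subregions flanked by suppressive sidebands
Several learned features resemble receptive fields reported in the inferior colliculus, and the model neurons show the same tradeoff in spectrotemporal resolution observed there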
Abstract
We have developed a sparse mathematical representation of speech that minimizes the number of active model neurons needed to represent typical speech sounds. The model learns several well-known acoustic features of speech such as harmonic stacks, formants, onsets and terminations, but we also find more exotic structures in the spectrogram representation of sound such as localized checkerboard patterns and frequency-modulated excitatory subregions flanked by suppressive sidebands. Moreover, several of these novel features resemble neuronal receptive fields reported in the Inferior Colliculus (IC), as well as auditory thalamus and cortex, and our model neurons exhibit the same tradeoff in spectrotemporal resolution as has been observed in IC. To our knowledge, this is the first demonstration that receptive fields of neurons in the ascending mammalian auditory pathway beyond the auditory…
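The idea of representing a sound with as few active model neurons as possible can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the authors' method: it uses an L1 penalty as a stand-in for sparsity, solves for the activations with ISTA (iterative soft thresholding), and uses a random dictionary and a synthetic input in place of learned features and real spectrogram patches. The names `ista_sparse_code`, `D`, `lam`, and `n_iter` are all hypothetical.

```python
import numpy as np


def soft_threshold(v, t):
    """Shrink each entry of v toward zero by t (proximal step for the L1 penalty)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def ista_sparse_code(x, D, lam=0.1, n_iter=200):
    """Approximately minimize 0.5 * ||x - D a||^2 + lam * ||a||_1 over a.

    x : (n_features,) input, e.g. a flattened spectrogram patch (assumption)
    D : (n_features, n_units) dictionary; each column is one model neuron's feature
    """
    # Squared spectral norm of D bounds the curvature of the reconstruction
    # term, so 1 / lipschitz is a safe gradient step size.
    lipschitz = np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)  # gradient of the reconstruction error
        a = soft_threshold(a - grad / lipschitz, lam / lipschitz)
    return a


# Synthetic stand-in data: a random unit-norm dictionary and an input
# built from only three of its columns.
rng = np.random.default_rng(0)
n_features, n_units = 64, 128
D = rng.standard_normal((n_features, n_units))
D /= np.linalg.norm(D, axis=0, keepdims=True)

true_a = np.zeros(n_units)
true_a[[3, 40, 99]] = [1.5, -2.0, 1.0]
x = D @ true_a

a = ista_sparse_code(x, D, lam=0.05, n_iter=500)
n_active = int(np.sum(np.abs(a) > 1e-3))  # units with non-negligible activity
rel_err = np.linalg.norm(x - D @ a) / np.linalg.norm(x)
print(f"active units: {n_active} of {n_units}, relative error: {rel_err:.3f}")
```

In a full sparse coding model the dictionary `D` would itself be learned from many speech samples, alternating this kind of inference step with updates to the features; that learning step is omitted here.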
