An extremely coarse feedback signal is sufficient for learning human-aligned visual representations
Yash Mehta, Michael F. Bonner

TL;DR
This study shows that neural networks trained with coarse, broad categories develop visual representations that align closely with human perception and brain data, challenging the need for fine-grained supervision.
Contribution
The paper systematically investigates how the granularity of the supervisory signal affects the learning of brain-like visual representations, and finds that coarse signals are surprisingly effective.
Findings
Networks trained on as few as 8 categories match or exceed the neural alignment of models trained with fine-grained supervision.
Coarse training improves alignment with human perceptual similarity more than fine-grained supervision does.
Coarse feedback signals can produce human-like visual representations in neural networks.
Abstract
Artificial neural networks trained on visual tasks develop internal representations resembling those of the primate visual system, a discovery that has guided a decade of computational neuroscience. Research on building brain-aligned models has progressively embraced finer-grained supervisory signals, from object classification to contrastive self-supervised objectives that maximize distinctions among individual images, yet the role of supervisory signal granularity in brain alignment remains largely unexamined. Here we systematically investigate how the coarseness of a learning signal shapes representational alignment with human vision. We parametrically vary the level of signal granularity using a data-driven approach that partitions a set of training images into varied numbers of categories (2, 4, 8, 16, ..., 64) via PCA-based splits of pretrained embeddings. We train hundreds of…
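The abstract says only that the categories come from "PCA-based splits of pretrained embeddings" and does not spell out the procedure. The sketch below is one plausible reading, not the authors' implementation: it assumes each group of images is split recursively at the median of its projection onto that group's first principal component, so that k rounds of splitting yield 2^k categories. The function names `first_pc_scores` and `coarse_labels` are hypothetical, and the embeddings are assumed to be a NumPy array with one row per image.

```python
import numpy as np


def first_pc_scores(x):
    """Project the rows of x onto the first principal component of x."""
    centered = x - x.mean(axis=0, keepdims=True)
    # SVD of the centered data; the rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[0]


def coarse_labels(embeddings, n_categories):
    """Assign each row to one of n_categories labels (n_categories = 2**k).

    Assumption: every existing group is split in two at the median of its
    first-principal-component scores, and this is repeated k times.
    """
    if n_categories < 1:
        raise ValueError("n_categories must be a positive power of two")
    depth = n_categories.bit_length() - 1
    if (1 << depth) != n_categories:
        raise ValueError("n_categories must be a positive power of two")

    labels = np.zeros(len(embeddings), dtype=int)
    for _ in range(depth):
        new_labels = np.empty_like(labels)
        for c in np.unique(labels):
            idx = np.flatnonzero(labels == c)
            scores = first_pc_scores(embeddings[idx])
            upper = scores > np.median(scores)
            # Child labels 2c and 2c + 1 keep the split hierarchy readable.
            new_labels[idx] = 2 * c + upper.astype(int)
        labels = new_labels
    return labels


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in for pretrained embeddings: 512 images, 16 dimensions.
    fake_embeddings = rng.normal(size=(512, 16))
    for n in (2, 4, 8, 16):
        y = coarse_labels(fake_embeddings, n)
        print(n, np.bincount(y))
```

Under this reading, the resulting labels would simply replace the usual class labels when training a classifier, and the number of categories becomes the single knob that controls how coarse the feedback signal is.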
