Cognitive Psychology for Deep Neural Networks: A Shape Bias Case Study
Samuel Ritter, David G.T. Barrett, Adam Santoro, Matt M. Botvinick

TL;DR
This paper applies methods from cognitive psychology to analyze deep neural networks. It finds that state-of-the-art models exhibit a human-like shape bias in object categorization, and that the strength of this bias varies with random initialization and over the course of training.
Contribution
It demonstrates how tools from cognitive psychology can uncover hidden properties of DNNs, specifically by showing that one-shot learning models trained on ImageNet exhibit a shape bias, and it links that bias to how children learn word labels for objects.
Findings
The DNNs studied exhibit a shape bias similar to that of humans.
Shape bias varies with model seed and training stage.
Tools from psychology reveal hidden properties of neural networks.
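One way to make "shape bias" concrete is as a score over probe triplets. This summary does not give the paper's exact stimuli or scoring, so the following is only a minimal sketch under stated assumptions: each triplet holds embedding vectors for a probe image, a shape-matched image, and a color-matched image; similarity is cosine similarity; and the score is the fraction of triplets in which the shape match is closer to the probe. The function names and the toy vectors are hypothetical and are not taken from the paper.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two 1-D vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def shape_bias(triplets):
    """Fraction of triplets whose shape match is closer to the probe
    than the color match is (an assumed scoring rule, for illustration).

    Each triplet is (probe, shape_match, color_match), all embeddings.
    """
    if not triplets:
        raise ValueError("need at least one triplet")
    shape_wins = 0
    for probe, shape_match, color_match in triplets:
        if cosine_similarity(probe, shape_match) > cosine_similarity(probe, color_match):
            shape_wins += 1
    return shape_wins / len(triplets)


# Toy embeddings, made up purely to exercise the function.
triplets = [
    (np.array([1.0, 0.0]), np.array([0.9, 0.1]), np.array([0.1, 0.9])),  # shape match closer
    (np.array([0.0, 1.0]), np.array([0.2, 0.8]), np.array([0.9, 0.1])),  # shape match closer
    (np.array([1.0, 1.0]), np.array([1.0, 0.0]), np.array([1.0, 0.9])),  # color match closer
    (np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([0.8, 0.2])),  # color match closer
]

print(shape_bias(triplets))
```

Under these assumptions, a score near 1 would indicate a strong shape preference and a score near 0 a strong color preference. Computing the same score for models trained from different seeds, or from checkpoints saved at different training steps, is one way the variability noted above could be examined.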
Abstract
Deep neural networks (DNNs) have achieved unprecedented performance on a wide range of complex tasks, rapidly outpacing our understanding of the nature of their solutions. This has caused a recent surge of interest in methods for rendering modern neural systems more interpretable. In this work, we propose to address the interpretability problem in modern DNNs using the rich history of problem descriptions, theories and experimental methods developed by cognitive psychologists to study the human mind. To explore the potential value of these tools, we chose a well-established analysis from developmental psychology that explains how children learn word labels for objects, and applied that analysis to DNNs. Using datasets of stimuli inspired by the original cognitive psychology experiments, we find that state-of-the-art one shot learning models trained on ImageNet exhibit a similar bias to…
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Neural Networks and Applications · Domain Adaptation and Few-Shot Learning
Methods: Interpretability
