TL;DR
This study introduces controversial stimuli, images on which different models disagree, to compare neural networks as models of human visual recognition. Generative models predict human judgments better than discriminative ones, but no model fully matches human responses, which points to differences in perceptual biases.
Contribution
The paper proposes synthesizing controversial stimuli as a way to efficiently compare how well neural network models predict human perception in visual recognition tasks.
Findings
Generative models better predict human judgments than discriminative models.
Controversial stimuli reveal model limitations and perceptual biases.
None of the models fully match human responses on synthesized stimuli.
Abstract
Distinct scientific theories can make similar predictions. To adjudicate between theories, we must design experiments for which the theories make distinct predictions. Here we consider the problem of comparing deep neural networks as models of human visual recognition. To efficiently compare models' ability to predict human responses, we synthesize controversial stimuli: images for which different models produce distinct responses. We applied this approach to two visual recognition tasks, handwritten digits (MNIST) and objects in small natural images (CIFAR-10). For each task, we synthesized controversial stimuli to maximize the disagreement among models which employed different architectures and recognition algorithms. Human subjects viewed hundreds of these stimuli, as well as natural examples, and judged the probability of presence of each digit/object category in each image. We…
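To make the synthesis step concrete, here is a minimal sketch, not the authors' implementation. It assumes two toy linear softmax classifiers (`W_a`, `W_b`) stand in for the deep networks, and it uses a simple surrogate objective, the sum of the two target log-probabilities, rather than whatever disagreement measure the paper optimizes. All names, sizes, and hyperparameters are hypothetical.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D logit vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def controversial_stimulus(W_a, W_b, class_a, class_b, steps=200, lr=0.1):
    """Projected gradient ascent on an image x in [0, 1]^n so that
    model A favors class_a while model B favors class_b.

    Assumption: the objective log p_A(class_a | x) + log p_B(class_b | x)
    is an illustrative surrogate for model disagreement.
    """
    x = np.full(W_a.shape[1], 0.5)  # start from a uniform gray image
    history = []

    def objective(x):
        p_a, p_b = softmax(W_a @ x), softmax(W_b @ x)
        return np.log(p_a[class_a]) + np.log(p_b[class_b]), p_a, p_b

    for _ in range(steps):
        value, p_a, p_b = objective(x)
        history.append(value)
        # Gradient of log-softmax for a linear model: W[c] - sum_k p_k W[k]
        grad = (W_a[class_a] - p_a @ W_a) + (W_b[class_b] - p_b @ W_b)
        x = np.clip(x + lr * grad, 0.0, 1.0)  # keep pixels in valid range

    history.append(objective(x)[0])
    return x, history

# Hypothetical toy setup: 10 classes, 64-pixel "images", random weights.
rng = np.random.default_rng(0)
W_a = 0.1 * rng.standard_normal((10, 64))
W_b = 0.1 * rng.standard_normal((10, 64))
x, history = controversial_stimulus(W_a, W_b, class_a=3, class_b=7)
```

The resulting `x` is an input that pushes the two models toward different labels. With real networks, the same loop would use automatic differentiation in place of the closed-form gradient.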
