Beyond accuracy: quantifying trial-by-trial behaviour of CNNs and humans by measuring error consistency
Robert Geirhos, Kristof Meding, Felix A. Wichmann

TL;DR
This paper introduces a trial-by-trial error consistency measure to compare decision-making strategies of CNNs and humans, revealing significant differences in their error patterns despite similar accuracy levels.
Contribution
The study presents a novel quantitative method for analyzing and comparing decision strategies of humans and AI systems beyond accuracy metrics.
Findings
CNNs are highly consistent with each other in error patterns.
Error consistency between humans and CNNs is little above what chance alone would produce, indicating different strategies.
Recurrent models like CORnet-S do not capture human error patterns any better than standard feedforward CNNs.
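To make "chance level" concrete, here is a short worked calculation. It assumes the chance correction takes the form of Cohen's kappa, which this summary does not state explicitly; the symbols p_1, p_2, c_obs and c_exp are notation introduced here for illustration.

```latex
% Assumed notation: p_1, p_2 are the accuracies of the two observers,
% c_obs is the observed fraction of trials on which both are right or both are wrong.
\begin{align}
  c_{\mathrm{exp}} &= p_1 p_2 + (1 - p_1)(1 - p_2), \\
  \kappa &= \frac{c_{\mathrm{obs}} - c_{\mathrm{exp}}}{1 - c_{\mathrm{exp}}}.
\end{align}
% Worked example with p_1 = p_2 = 0.95:
%   c_exp = 0.95 * 0.95 + 0.05 * 0.05 = 0.9025 + 0.0025 = 0.905
```

Under this assumption, two independent observers who are each 95% accurate would already agree on about 90.5% of trials, so high raw agreement near ceiling says little by itself. A kappa near 0 corresponds to chance-level consistency, and a kappa of 1 to errors on exactly the same trials.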
Abstract
A central problem in cognitive science and behavioural neuroscience as well as in machine learning and artificial intelligence research is to ascertain whether two or more decision makers (be they brains or algorithms) use the same strategy. Accuracy alone cannot distinguish between strategies: two systems may achieve similar accuracy with very different strategies. The need to differentiate beyond accuracy is particularly pressing if two systems are near ceiling performance, like Convolutional Neural Networks (CNNs) and humans on visual object recognition. Here we introduce trial-by-trial error consistency, a quantitative analysis for measuring whether two decision making systems systematically make errors on the same inputs. Making consistent errors on a trial-by-trial basis is a necessary condition for similar processing strategies between decision makers. Our analysis is applicable…
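The idea of comparing errors trial by trial can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the measure is a chance-corrected agreement score in the style of Cohen's kappa, and the function name `error_consistency` and the toy data are hypothetical.

```python
def error_consistency(correct_a, correct_b):
    """Chance-corrected agreement between two observers' correctness vectors.

    Assumption: Cohen's kappa on per-trial correct/incorrect outcomes.
    Each input is a sequence with one truthy/falsy entry per trial.
    """
    a = [bool(x) for x in correct_a]
    b = [bool(x) for x in correct_b]
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(a)
    # Observed consistency: both correct or both wrong on the same trial.
    c_obs = sum(x == y for x, y in zip(a, b)) / n
    # Expected consistency for independent observers with these accuracies.
    p_a = sum(a) / n
    p_b = sum(b) / n
    c_exp = p_a * p_b + (1 - p_a) * (1 - p_b)
    if c_exp == 1.0:
        # Both observers are always right (or always wrong): kappa is undefined.
        return float("nan")
    return (c_obs - c_exp) / (1 - c_exp)


# Toy data: three observers, each 80% accurate on the same 10 trials.
observer_a = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
same_errors = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]      # errs on the same trials as A
disjoint_errors = [0, 0, 1, 1, 1, 1, 1, 1, 1, 1]  # errs on different trials

kappa_same = error_consistency(observer_a, same_errors)
kappa_disjoint = error_consistency(observer_a, disjoint_errors)
print(kappa_same, kappa_disjoint)
```

All three toy observers have identical accuracy, yet the score separates them: it is about 1 for the pair that errs on the same trials and negative (about -0.25) for the pair whose errors never coincide.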
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning · Neural dynamics and brain function
