Does Progress On Object Recognition Benchmarks Improve Real-World Generalization?
Megan Richards, Polina Kirichenko, Diane Bouchacourt, Mark Ibrahim

TL;DR
This paper investigates whether improvements on standard object recognition benchmarks translate to better real-world generalization across different geographies, revealing significant disparities and limited progress in real-world robustness.
Contribution
The paper shifts the focus to geographical distribution shifts and provides an extensive empirical analysis showing that standard benchmarks do not reflect real-world generalization challenges.
Findings
Progress on ImageNet benchmarks does not translate to improved geographic robustness.
All models exhibit large regional disparities, even state-of-the-art foundation models.
Simple last-layer retraining on more representative, curated data substantially reduces geographic disparities.
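To make the notion of a regional disparity concrete, the following is a minimal sketch of how per-region accuracy and a disparity gap could be computed from a model's predictions. It assumes disparity is the gap between the best and worst regional accuracy, which may differ from the paper's exact definition. The function names, region labels, and toy records are hypothetical and are not taken from the paper or its datasets.

```python
from collections import defaultdict


def per_region_accuracy(records):
    """Compute accuracy per region from (region, is_correct) pairs."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for region, is_correct in records:
        total[region] += 1
        correct[region] += int(is_correct)
    return {region: correct[region] / total[region] for region in total}


def geographic_disparity(region_acc):
    """Assumed definition: best regional accuracy minus worst regional accuracy."""
    return max(region_acc.values()) - min(region_acc.values())


# Toy, made-up evaluation records: (region of the image, prediction correct?)
records = [
    ("Europe", True), ("Europe", True), ("Europe", True), ("Europe", False),
    ("Africa", True), ("Africa", False), ("Africa", False), ("Africa", False),
    ("Asia", True), ("Asia", True), ("Asia", False), ("Asia", False),
]

acc = per_region_accuracy(records)
gap = geographic_disparity(acc)
print(acc, gap)
```

Under this definition, a model can raise its average accuracy while the gap stays the same or grows, which is the kind of effect the findings above describe.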
Abstract
For more than a decade, researchers have measured progress in object recognition on ImageNet-based generalization benchmarks such as ImageNet-A, -C, and -R. Recent foundation models, trained on orders of magnitude more data, have begun to saturate these standard benchmarks, yet they remain brittle in practice. This suggests that standard benchmarks, which tend to focus on predefined or synthetic changes, may not be sufficient for measuring real-world generalization. Consequently, we propose studying generalization across geography as a more realistic measure of progress, using two datasets of objects from households across the globe. We conduct an extensive empirical evaluation of progress across nearly 100 vision models, up to the most recent foundation models. We first identify a progress gap between standard benchmarks and real-world, geographical shifts: progress on ImageNet results in…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Remote-Sensing Image Classification · Advanced Neural Network Applications
Methods: Focus · Contrastive Language-Image Pre-training
