The Unreasonable Effectiveness of Noisy Data for Fine-Grained Recognition
Jonathan Krause, Benjamin Sapp, Andrew Howard, Howard Zhou, Alexander Toshev, Tom Duerig, James Philbin, Li Fei-Fei

TL;DR
This paper demonstrates that simple, generic recognition models trained on large amounts of noisy web data can significantly outperform methods that rely on expert-annotated datasets for fine-grained recognition, and that the approach scales to more than 10,000 categories.
Contribution
It introduces a scalable, annotation-free approach to fine-grained recognition that trains on noisy web data and exceeds the prior state of the art without any manually collected labels.
Findings
Achieved top-1 accuracy of 92.3% on CUB-200-2011
Surpassed existing methods on four fine-grained datasets
Successfully scaled to over 10,000 categories
Abstract
Current approaches for fine-grained recognition do the following: First, recruit experts to annotate a dataset of images, optionally also collecting more structured data in the form of part annotations and bounding boxes. Second, train a model utilizing this data. Toward the goal of solving fine-grained recognition, we introduce an alternative approach, leveraging free, noisy data from the web and simple, generic methods of recognition. This approach has benefits in both performance and scalability. We demonstrate its efficacy on four fine-grained datasets, greatly exceeding existing state of the art without the manual collection of even a single label, and furthermore show first results at scaling to more than 10,000 fine-grained categories. Quantitatively, we achieve top-1 accuracies of 92.3% on CUB-200-2011, 85.4% on Birdsnap, 93.4% on FGVC-Aircraft, and 80.8% on Stanford Dogs…
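The results quoted in the abstract are top-1 accuracies, that is, the fraction of test images whose single highest-scoring class matches the ground-truth label. A minimal sketch of that metric follows; the function name `top1_accuracy` and the toy scores and labels are hypothetical illustrations, not taken from the paper or its code.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return the fraction of examples whose highest-scoring class is the true label."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("need at least one example")
    correct = 0
    for row, label in zip(scores, labels):
        # Index of the largest score in this row (the model's top-1 prediction).
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return correct / len(labels)


# Hypothetical toy data: 4 images, 3 classes (not results from the paper).
toy_scores = [
    [0.1, 0.7, 0.2],
    [0.6, 0.3, 0.1],
    [0.2, 0.3, 0.5],
    [0.4, 0.5, 0.1],
]
toy_labels = [1, 0, 2, 0]
toy_accuracy = top1_accuracy(toy_scores, toy_labels)
```

In this toy case three of the four predictions match their labels, so the metric is 0.75.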
Taxonomy
Topics: Machine Learning and Algorithms · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques
