Scalable Annotation of Fine-Grained Categories Without Experts
Timnit Gebru, Jonathan Krause, Jia Deng, Li Fei-Fei

TL;DR
This paper introduces a scalable crowdsourcing workflow for annotating fine-grained image categories without experts, automatically grouping visually indistinguishable objects in synthetic categories such as cars.
Contribution
A novel graph-based crowdsourcing algorithm for automatic grouping of visually indistinguishable objects, enabling large-scale annotation without experts.
Findings
Labeled 712,430 images using ~1,000 Amazon Mechanical Turk workers.
Produced the largest fine-grained visual dataset reported to date, with 2,657 categories of cars.
Cost 1/20th as much as expert annotation.
Abstract
We present a crowdsourcing workflow to collect image annotations for visually similar synthetic categories without requiring experts. In animals, there is a direct link between taxonomy and visual similarity: e.g. a collie (type of dog) looks more similar to other collies (e.g. smooth collie) than a greyhound (another type of dog). However, in synthetic categories such as cars, objects with similar taxonomy can have very different appearance: e.g. a 2011 Ford F-150 Supercrew-HD looks the same as a 2011 Ford F-150 Supercrew-LL but very different from a 2011 Ford F-150 Supercrew-SVT. We introduce a graph-based crowdsourcing algorithm to automatically group visually indistinguishable objects together. Using our workflow, we label 712,430 images by ~1,000 Amazon Mechanical Turk workers, resulting in the largest fine-grained visual dataset reported to date with 2,657 categories of cars…
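The abstract does not spell out the graph construction, so the following is only a minimal sketch of one way a graph-based grouping step could work, not the authors' algorithm. It assumes workers give pairwise "looks the same" judgments, adds an edge between two categories when a majority of votes agree, and treats each connected component as one visual group. The function name `group_indistinguishable`, the vote format, and the `min_agreement` threshold are all hypothetical.

```python
from collections import defaultdict


def group_indistinguishable(categories, votes, min_agreement=0.5):
    """Group categories that workers judge to look the same.

    Illustrative assumption, not the paper's method:
    votes is an iterable of (cat_a, cat_b, looks_same) tuples, and
    every category named in a vote must also appear in `categories`.
    """
    # Tally votes per unordered pair: pair -> [same_votes, total_votes].
    tally = defaultdict(lambda: [0, 0])
    for a, b, looks_same in votes:
        key = tuple(sorted((a, b)))
        tally[key][0] += int(looks_same)
        tally[key][1] += 1

    # Union-find over categories; each category starts in its own group.
    parent = {c: c for c in categories}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Add an edge (merge groups) when agreement exceeds the threshold.
    for (a, b), (same, total) in tally.items():
        if same / total > min_agreement:
            parent[find(a)] = find(b)

    # Collect connected components as sorted lists.
    groups = defaultdict(set)
    for c in categories:
        groups[find(c)].add(c)
    return sorted(sorted(g) for g in groups.values())


# Toy votes mirroring the F-150 example from the abstract.
categories = ["Supercrew-HD", "Supercrew-LL", "Supercrew-SVT"]
votes = [
    ("Supercrew-HD", "Supercrew-LL", True),
    ("Supercrew-HD", "Supercrew-LL", True),
    ("Supercrew-LL", "Supercrew-HD", False),
    ("Supercrew-HD", "Supercrew-SVT", False),
    ("Supercrew-LL", "Supercrew-SVT", False),
]
groups = group_indistinguishable(categories, votes)
print(groups)
```

With these toy votes, the HD and LL trims end up in one group and the SVT trim stays on its own. Connected components make the grouping transitive, so a single noisy edge can merge two otherwise distinct groups; a real workflow would likely need more votes per pair or a stricter threshold.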
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning
