OOWL500: Overcoming Dataset Collection Bias in the Wild
Brandon Leung, Chih-Hui Ho, Amir Persekian, David Orozco, Yen Chang, Erik Sandstrom, Bo Liu, Nuno Vasconcelos

TL;DR
This paper introduces OOWL500, a large 'in the lab' image dataset collected by drone under controlled conditions, which helps identify and reduce biases in object recognition models and improve their robustness.
Contribution
The paper presents a scalable drone-based data collection method, uses it to create the OOWL500 dataset, and demonstrates the dataset's effectiveness in reducing biases and enhancing object recognition.
Findings
OOWL500 contains 120,000 images of 500 objects, making it the largest "in the lab" image dataset when both the number of classes and the number of objects per class are considered.
Augmenting wild datasets with in-lab data reduces biases and improves generalization.
Camera shake and pose diversity are crucial for robust object recognition.
Abstract
The hypothesis that image datasets gathered online "in the wild" can produce biased object recognizers, e.g. preferring professional photography or certain viewing angles, is studied. A new "in the lab" data collection infrastructure is proposed, consisting of a drone which captures images as it circles around objects. Crucially, the control provided by this setup and the natural camera shake inherent to flight mitigate many biases. Its inexpensive and easily replicable nature may also lead to a scalable data collection effort by the vision community. The procedure's usefulness is demonstrated by creating a dataset of Objects Obtained With fLight (OOWL). Denoted as OOWL500, it contains 120,000 images of 500 objects and is the largest "in the lab" image dataset available when both number of classes and objects per class are considered. Furthermore, it has enabled several of…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning
