Auto-Labeling Data for Object Detection
Brent A. Griffin, Manushree Gangwar, Jacob Sela, Jason J. Corso

TL;DR
This paper introduces a method to train object detection models using auto-generated pseudo labels from vision-language models, reducing labeling costs while maintaining competitive performance.
Contribution
It presents an approach that removes the need for ground truth labels in object detection by using pre-trained vision-language models to auto-label the training data.
Findings
Maintains competitive accuracy across multiple datasets
Reduces labeling time and costs significantly
Establishes best practices and benchmarks for auto-labeling
Abstract
Great labels make great models. However, traditional labeling approaches for tasks like object detection have substantial costs at scale. Furthermore, alternatives to fully-supervised object detection either lose functionality or require larger models with prohibitive computational costs for inference at scale. To that end, this paper addresses the problem of training standard object detection models without any ground truth labels. Instead, we configure previously-trained vision-language foundation models to generate application-specific pseudo "ground truth" labels. These auto-generated labels directly integrate with existing model training frameworks, and we subsequently train lightweight detection models that are computationally efficient. In this way, we avoid the costs of traditional labeling, leverage the knowledge of vision-language models, and keep the efficiency of lightweight…
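As a rough illustration of how auto-generated labels might plug into an existing training framework, the sketch below converts pseudo-label detections into a COCO-style annotation dictionary. It is not the authors' pipeline: the `Detection` record layout, the `detections_to_coco` helper, and the `min_confidence` filter are all assumptions made for this example, and the detections themselves are assumed to come from whatever vision-language model is used upstream.

```python
from typing import Dict, List, Tuple

# Hypothetical record for one auto-generated label:
# (image_id, class_name, (x, y, width, height) in pixels, confidence)
Detection = Tuple[int, str, Tuple[float, float, float, float], float]


def detections_to_coco(
    detections: List[Detection],
    class_names: List[str],
    min_confidence: float = 0.3,  # assumed threshold, not taken from the paper
) -> Dict[str, list]:
    """Turn pseudo-label detections into a minimal COCO-style dictionary."""
    # COCO category ids conventionally start at 1.
    categories = [{"id": i + 1, "name": name} for i, name in enumerate(class_names)]
    name_to_id = {c["name"]: c["id"] for c in categories}

    annotations = []
    for image_id, class_name, (x, y, w, h), score in detections:
        # Drop low-confidence labels and classes outside the application's label set.
        if score < min_confidence or class_name not in name_to_id:
            continue
        # Skip degenerate boxes that a detector trainer would reject.
        if w <= 0 or h <= 0:
            continue
        annotations.append(
            {
                "id": len(annotations) + 1,
                "image_id": image_id,
                "category_id": name_to_id[class_name],
                "bbox": [x, y, w, h],
                "area": w * h,
                "iscrowd": 0,
            }
        )

    # Keep every image that was labeled, even if none of its boxes survived,
    # so it can still serve as a background example. A real dataset would
    # also need file_name, width, and height for each image.
    images = [{"id": i} for i in sorted({d[0] for d in detections})]
    return {"images": images, "annotations": annotations, "categories": categories}
```

A dictionary in this shape can be written out with `json.dump` and handed to any trainer that reads COCO-format annotations, which is one way a lightweight detector could be trained on pseudo labels without changes to the training code.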
Taxonomy
Topics: Brain Tumor Detection and Classification · Image and Object Detection Techniques · Machine Learning and Data Classification
