Combating noisy labels in object detection datasets
Krystian Chachuła, Jakub Łyskawa, Bartłomiej Olber, Piotr Frątczak, Adam Popowicz, Krystian Radlak

TL;DR
This paper introduces CLOD, an algorithm that assesses the quality of each label in an object detection dataset and suggests corrections, so that model accuracy improves through cleaner annotations rather than changes to model architectures.
Contribution
The paper presents the Confident Learning for Object Detection (CLOD) algorithm, which identifies missing, spurious, mislabeled, and mislocated bounding boxes and suggests corrections, improving dataset quality and, in turn, model performance.
Findings
CLOD detects nearly 80% of artificially disturbed bounding boxes.
Dataset cleaning with CLOD improves mAP scores by 16% to 46%.
The method does not require modifications to existing network architectures.
Abstract
The quality of training datasets for deep neural networks is a key factor contributing to the accuracy of resulting models. This effect is amplified in difficult tasks such as object detection. Dealing with errors in datasets is often limited to accepting that some fraction of examples are incorrect, estimating their confidence, and either assigning appropriate weights or ignoring uncertain ones during training. In this work, we propose a different approach. We introduce the Confident Learning for Object Detection (CLOD) algorithm for assessing the quality of each label in object detection datasets, identifying missing, spurious, mislabeled, and mislocated bounding boxes and suggesting corrections. By focusing on finding incorrect examples in the training datasets, we can eliminate them at the root. Suspicious bounding boxes can be reviewed to improve the quality of the dataset, leading…
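To make the four error categories named in the abstract concrete, the following is a minimal sketch of how annotations might be compared against a detector's confident predictions. It is not the CLOD algorithm itself, whose details are not given here: the `Box` type, the `flag_label_issues` function, the greedy IoU matching, and all three thresholds are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box (hypothetical type for this sketch)."""
    x1: float
    y1: float
    x2: float
    y2: float
    label: str
    score: float = 1.0  # detector confidence; annotations keep the default


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    iw = min(a.x2, b.x2) - max(a.x1, b.x1)
    ih = min(a.y2, b.y2) - max(a.y1, b.y1)
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    union = (a.x2 - a.x1) * (a.y2 - a.y1) + (b.x2 - b.x1) * (b.y2 - b.y1) - inter
    return inter / union


def flag_label_issues(annotations, predictions,
                      min_score=0.8, match_iou=0.5, tight_iou=0.8):
    """Flag suspicious annotations for review (thresholds are assumed values)."""
    confident = [p for p in predictions if p.score >= min_score]
    matched = set()
    issues = []
    for ann in annotations:
        # Greedily pick the unmatched confident prediction with the highest IoU.
        best_idx, best_iou = None, 0.0
        for i, pred in enumerate(confident):
            if i in matched:
                continue
            overlap = iou(ann, pred)
            if overlap > best_iou:
                best_idx, best_iou = i, overlap
        if best_idx is None or best_iou < match_iou:
            issues.append(("spurious", ann))    # no confident prediction supports it
            continue
        matched.add(best_idx)
        if confident[best_idx].label != ann.label:
            issues.append(("mislabeled", ann))  # same place, different class
        elif best_iou < tight_iou:
            issues.append(("mislocated", ann))  # same class, loose overlap
    # Confident predictions with no matching annotation suggest a missing box.
    issues += [("missing", p) for i, p in enumerate(confident) if i not in matched]
    return issues


annotations = [
    Box(0, 0, 10, 10, "car"),
    Box(20, 20, 30, 30, "dog"),
    Box(40, 40, 50, 50, "car"),
    Box(60, 60, 70, 70, "car"),
]
predictions = [
    Box(0, 0, 10, 10, "car", 0.95),
    Box(20, 20, 30, 30, "cat", 0.90),
    Box(40, 40, 50, 47, "car", 0.90),
    Box(80, 80, 90, 90, "person", 0.90),
    Box(100, 100, 110, 110, "car", 0.30),  # below min_score, ignored
]
issues = flag_label_issues(annotations, predictions)
kinds = [kind for kind, _ in issues]
print(kinds)  # ['mislabeled', 'mislocated', 'spurious', 'missing']
```

In this sketch the flagged boxes are only candidates for human review, in line with the abstract's point that suspicious bounding boxes can be reviewed; a low `min_score` would flag many more boxes at the cost of more false alarms.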
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification
