TL;DR
This paper introduces a low-data baseline for dense object detection in crowded scenes, showing that a standard detector trained with data augmentation can reach satisfactory accuracy with far less annotated data.
Contribution
It presents a small, densely annotated dataset and benchmark for generic SKU product detection, facilitating research in low-data dense object detection.
Findings
Low-data baseline achieves mAP = 0.56 at an IoU threshold of 0.5
Training dataset is 265 times smaller than the standard dataset, measured by number of annotations
Benchmark provides full annotations for multiple public datasets for SKU detection
Abstract
Object detection in densely packed scenes is a new area where standard object detectors fail to train well. Dense object detectors like RetinaNet trained on large and dense datasets show great performance. We train a standard object detector on a small, normally packed dataset with data augmentation techniques. This dataset is 265 times smaller than the standard dataset, in terms of number of annotations. This low data baseline achieves satisfactory results (mAP=0.56) at standard IoU of 0.5. We also create a varied benchmark for generic SKU product detection by providing full annotations for multiple public datasets. It can be accessed at https://github.com/ParallelDots/generic-sku-detection-benchmark. We hope that this benchmark helps in building robust detectors that perform reliably across different settings in the wild.
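The reported score depends on how a predicted box is matched to a ground-truth box at IoU 0.5. The following is a minimal sketch of that matching criterion, not the paper's evaluation code. The function names `iou` and `is_true_positive` and the `(x1, y1, x2, y2)` box format are assumptions made for illustration. Computing full mAP additionally requires ranking detections by confidence and averaging precision over recall levels, which is not shown here.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2).

    Hypothetical helper for illustration; the box format is an assumption.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, threshold=0.5):
    """A prediction counts as a match when its IoU with the ground truth
    reaches the threshold (0.5 is the value quoted in the abstract)."""
    return iou(pred_box, gt_box) >= threshold


if __name__ == "__main__":
    gt = (0, 0, 10, 10)
    shifted_small = (2, 0, 12, 10)  # overlap 80, union 120
    shifted_large = (5, 0, 15, 10)  # overlap 50, union 150
    print(is_true_positive(shifted_small, gt))
    print(is_true_positive(shifted_large, gt))
```

In densely packed shelves, neighbouring products sit close together, so a box shifted by half a product width already falls below the 0.5 threshold, as the second example shows.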
Taxonomy
Methods: 1x1 Convolution · Convolution · Feature Pyramid Network · Focal Loss · RetinaNet
