Detecting retail products in situ using CNN without human effort labeling
Wei Yi, Yaoran Sun, Tao Ding, Sailing He

TL;DR
This paper presents a CNN-based method for detecting retail products in situ across 324 categories without requiring manual bounding box annotations, utilizing algorithms for bounding box extraction and occlusion simulation.
Contribution
It introduces a novel approach that eliminates the need for human effort in labeling bounding boxes for training CNNs in retail product detection.
Findings
Effective detection of 324 retail product categories in situ
No manual bounding box labeling required for training
Applicable to other scenarios beyond retail detection
Abstract
CNNs are a powerful tool for many computer vision tasks, achieving much better results than traditional methods. Because a CNN has very large capacity, training one often requires a large amount of data, but labeled images are often expensive to obtain in practice, especially for object detection, where collecting a bounding box for every object in the training set requires considerable human effort. This is the case in the detection of retail products, where there can be many different categories. In this paper, we focus on applying a CNN to detect products from 324 categories in situ, while requiring no extra effort to label bounding boxes for any image. Our approach is based on an algorithm that extracts bounding boxes from an in-vitro dataset and an algorithm that simulates occlusion. We have shown the effectiveness and usefulness of our methods by building a Faster RCNN detection model. Similar…
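The abstract names two building blocks, bounding box extraction from in-vitro images and occlusion simulation, without giving their details here. The sketch below is only an illustration of what such steps could look like, not the authors' algorithms. It assumes an in-vitro image shows a single product on a near-uniform background, so a simple threshold can recover the box, and it simulates occlusion by overwriting a random rectangle inside that box. The function names `extract_bbox` and `simulate_occlusion`, and all parameters, are hypothetical.

```python
import random


def extract_bbox(image, background=255, tol=10):
    """Return (x_min, y_min, x_max, y_max) of pixels that differ from the background.

    `image` is a grayscale image given as a list of rows of ints (0-255).
    Assumption (not from the paper): one product on a near-uniform background.
    """
    xs, ys = [], []
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if abs(value - background) > tol:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # no foreground pixels found
    return (min(xs), min(ys), max(xs), max(ys))


def simulate_occlusion(image, bbox, max_fraction=0.5, fill=0, rng=None):
    """Return a copy of `image` with a random rectangle inside `bbox` set to `fill`.

    The rectangle spans between 10% and `max_fraction` of the box along each
    axis (at least one pixel), an arbitrary choice made for this sketch.
    """
    rng = rng or random.Random()
    x0, y0, x1, y1 = bbox
    box_w, box_h = x1 - x0 + 1, y1 - y0 + 1
    occ_w = max(1, int(box_w * rng.uniform(0.1, max_fraction)))
    occ_h = max(1, int(box_h * rng.uniform(0.1, max_fraction)))
    # Pick the top-left corner so the occluder stays inside the box.
    left = rng.randint(x0, x1 - occ_w + 1)
    top = rng.randint(y0, y1 - occ_h + 1)
    occluded = [row[:] for row in image]  # leave the input untouched
    for y in range(top, top + occ_h):
        for x in range(left, left + occ_w):
            occluded[y][x] = fill
    return occluded


# Toy in-vitro image: white 6x6 background with a dark 3x2 "product".
image = [[255] * 6 for _ in range(6)]
for y in (2, 3):
    for x in (1, 2, 3):
        image[y][x] = 50

bbox = extract_bbox(image)
print(bbox)  # (1, 2, 3, 3)
occluded = simulate_occlusion(image, bbox, rng=random.Random(0))
```

In a pipeline of this kind, the extracted box would serve as the training label in place of a manual annotation, and the occluded copies would act as augmented samples that mimic products partly hidden on a shelf.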
Taxonomy
Topics: Visual Attention and Saliency Detection · Industrial Vision Systems and Defect Detection · Advanced Neural Network Applications
