Working with scale: 2nd place solution to Product Detection in Densely Packed Scenes [Technical Report]
Artem Kozlov

TL;DR
This report re-runs and verifies earlier object detection experiments on densely packed scenes, leading to the second-place solution in the detection challenge of the CVPR 2020 Retail-Vision workshop. It emphasizes reproducibility and a small number of simple tricks for Faster-RCNN.
Contribution
It systematically re-evaluates previous findings using MMDetection and introduces simple modifications like anchor scale adjustment and image tiling for improved detection.
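To make the anchor scale adjustment concrete, here is a minimal sketch of how a single scale value determines anchor sizes at each feature pyramid level. It assumes the usual FPN convention that the base anchor side equals `scale * stride` and that a ratio is height over width at constant area. The strides, ratios, and the comparison value `scale=4` are illustrative assumptions, not values taken from the report, and `anchor_shapes` is a hypothetical helper rather than an MMDetection function.

```python
import math


def anchor_shapes(scale, strides=(4, 8, 16, 32, 64), ratios=(0.5, 1.0, 2.0)):
    """Return {stride: [(width, height), ...]} for one anchor scale.

    Assumption: base side = scale * stride, and ratio = height / width
    with the anchor area held constant across ratios.
    """
    shapes = {}
    for stride in strides:
        side = scale * stride
        level = []
        for ratio in ratios:
            height = side * math.sqrt(ratio)
            width = side / math.sqrt(ratio)
            level.append((round(width, 1), round(height, 1)))
        shapes[stride] = level
    return shapes


if __name__ == "__main__":
    # Compare a larger scale with a smaller, purely illustrative one.
    for scale in (8, 4):
        print(f"scale={scale}")
        for stride, level in anchor_shapes(scale).items():
            print(f"  stride {stride:>2}: {level}")
```

Halving the scale halves every anchor side, which is the kind of change that matters when most objects in an image are small and tightly packed.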
Findings
Faster-RCNN and RetinaNet baseline results confirmed
Advanced models outperform initial baselines
Simple tricks improve detection accuracy
Abstract
This report describes the 2nd place solution to the detection challenge held within the CVPR 2020 Retail-Vision workshop. Rather than building further on previous results, this work mainly aims to verify previously observed takeaways by re-running the experiments. The reliability and reproducibility of the results are achieved by using a popular object detection toolbox, MMDetection. In this report, I first present the results obtained for the Faster-RCNN and RetinaNet models, which were used for comparison in the original work. Then I describe the experimental results with more advanced models. The final section reviews two simple tricks for the Faster-RCNN model that were used in my final submission: changing the default anchor scale parameter and train-time image tiling. The source code is available at https://github.com/tyomj/product_detection.
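Train-time image tiling can be sketched as follows: cut each training image into overlapping windows and remap the ground-truth boxes into each window's coordinates. This is only an illustration of the general idea. The tile size, overlap, and the `min_visible` threshold are assumptions, the helper names are hypothetical, and the author's actual implementation in the linked repository may differ.

```python
def tile_windows(width, height, tile, overlap):
    """Return (x0, y0, x1, y1) windows that cover the whole image."""
    step = tile - overlap
    xs = list(range(0, max(width - tile, 0) + 1, step))
    ys = list(range(0, max(height - tile, 0) + 1, step))
    # Add a final window flush with the border if the grid stops short.
    if xs[-1] + tile < width:
        xs.append(width - tile)
    if ys[-1] + tile < height:
        ys.append(height - tile)
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in ys
        for x in xs
    ]


def boxes_in_tile(boxes, window, min_visible=0.5):
    """Clip boxes to a window and shift them to tile-local coordinates.

    A box is kept only if at least `min_visible` of its area lies inside
    the window (an assumed heuristic for dropping heavily cut objects).
    """
    x0, y0, x1, y1 = window
    kept = []
    for bx0, by0, bx1, by1 in boxes:
        area = (bx1 - bx0) * (by1 - by0)
        if area <= 0:
            continue  # skip degenerate boxes
        ix0, iy0 = max(bx0, x0), max(by0, y0)
        ix1, iy1 = min(bx1, x1), min(by1, y1)
        if ix1 <= ix0 or iy1 <= iy0:
            continue  # no overlap with this window
        visible = (ix1 - ix0) * (iy1 - iy0) / area
        if visible >= min_visible:
            kept.append((ix0 - x0, iy0 - y0, ix1 - x0, iy1 - y0))
    return kept


if __name__ == "__main__":
    boxes = [(10, 10, 50, 50), (500, 300, 580, 420)]
    for window in tile_windows(1000, 800, tile=600, overlap=100):
        print(window, boxes_in_tile(boxes, window))
```

Each tile then serves as a separate training sample, so objects appear larger relative to the network input than they would in the downscaled full image.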
Taxonomy
Topics: Video Surveillance and Tracking Methods · Advanced Image and Video Retrieval Techniques · Industrial Vision Systems and Defect Detection
Methods: 1x1 Convolution · Region Proposal Network · Softmax · Convolution · RoIPool · Faster R-CNN · Focal Loss · Feature Pyramid Network · RetinaNet
