How to Train an Accurate and Efficient Object Detection Model on Any Dataset
Galina Zalesskaya, Bogna Bylicka, Eugene Liu

TL;DR
This paper introduces a dataset-agnostic training pipeline for object detection that yields strong out-of-the-box models across a variety of datasets, reducing the need for extensive per-dataset fine-tuning while targeting both accuracy and efficiency.
Contribution
The authors propose a universal training template with pre-trained models and a robust pipeline, optimized through parallel training on multiple datasets, enabling effective deployment across diverse use cases.
Findings
Identified three top-performing architectures: VFNet, ATSS, and SSD.
Achieved strong baseline results across multiple datasets.
Models can be deployed on CPU using the OpenVINO toolkit.
Abstract
The rapidly evolving industry demands highly accurate models without the time-consuming and computationally expensive experiments that fine-tuning requires. Moreover, a model and training pipeline that was carefully optimized for one dataset rarely generalizes well to training on a different dataset. This makes it unrealistic to have carefully fine-tuned models for each use case. To solve this, we propose an alternative approach that also forms the backbone of the Intel Geti platform: a dataset-agnostic template for object detection training, consisting of carefully chosen, pre-trained models together with a robust training pipeline for further training. Our solution works out-of-the-box and provides a strong baseline on a wide range of datasets. It can be used on its own or as a starting point for further fine-tuning for specific use cases when needed. We…
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Machine Learning and Data Classification
Methods: Convolution · Varifocal Loss · 1x1 Convolution · Non Maximum Suppression · VarifocalNet · SSD · Adaptive Training Sample Selection
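
Of the methods listed above, Non Maximum Suppression is the easiest to illustrate in a few lines. The sketch below is a generic greedy version in plain Python, not the implementation used in the paper or in any particular library. The function names `iou` and `nms`, the `(x1, y1, x2, y2)` box layout, and the default threshold of 0.5 are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

# Assumed box layout for this sketch: (x1, y1, x2, y2) with x2 >= x1 and y2 >= y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    demo_scores = [0.9, 0.8, 0.7]
    print(nms(demo_boxes, demo_scores))
```

This version compares each candidate against every kept box, which is quadratic in the number of detections. That is fine for illustration, but production detectors typically rely on a vectorized or class-aware variant.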
