CerberusDet: Unified Multi-Dataset Object Detection
Irina Tolstykh, Mikhail Chernyshov, Maksim Kuprashevich

TL;DR
CerberusDet is a multi-task object detection framework based on YOLO that efficiently handles multiple datasets and categories, achieving state-of-the-art results with reduced inference time and improved scalability.
Contribution
It introduces a multi-headed YOLO-based model capable of multi-dataset object detection, addressing dataset merging and class extension challenges.
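As a rough illustration of the multi-headed idea, the sketch below shares one feature extractor across several task-specific heads, each with its own label space, so datasets with conflicting class definitions never need to be merged. It is a toy in plain Python, not the authors' implementation: `MultiHeadDetector`, `toy_backbone`, `voc_head`, `o365_head`, and the thresholds and labels are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Features = List[float]


@dataclass
class MultiHeadDetector:
    """Toy model: one shared backbone, one head per dataset/task (hypothetical)."""
    backbone: Callable[[List[float]], Features]
    heads: Dict[str, Callable[[Features], List[str]]] = field(default_factory=dict)
    backbone_calls: int = 0  # counts how often shared features are computed

    def add_head(self, task: str, head: Callable[[Features], List[str]]) -> None:
        # Adding a task only adds a head; the shared backbone is untouched.
        self.heads[task] = head

    def predict(self, image: List[float]) -> Dict[str, List[str]]:
        # Shared features are computed once and reused by every head.
        self.backbone_calls += 1
        feats = self.backbone(image)
        return {task: head(feats) for task, head in self.heads.items()}


def toy_backbone(image: List[float]) -> Features:
    # Stand-in for a real feature extractor: mean and max of the input.
    return [sum(image) / len(image), max(image)]


def voc_head(feats: Features) -> List[str]:
    # Hypothetical head with its own label space.
    return ["person"] if feats[0] > 0.5 else []


def o365_head(feats: Features) -> List[str]:
    # A second hypothetical head whose labels need not match the first.
    return ["Person", "Hat"] if feats[1] > 0.8 else []


model = MultiHeadDetector(backbone=toy_backbone)
model.add_head("voc", voc_head)
model.add_head("objects365", o365_head)
predictions = model.predict([0.2, 0.9, 0.7])
```

In this toy setup, a single call to `predict` returns one result list per task while the backbone runs only once, which is the intuition behind the inference-time saving over running separate models one after another.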
Findings
Achieved state-of-the-art results on PASCAL VOC and Objects365 datasets.
Reduced inference time by 36% compared to traditional models.
The more tasks are trained together, the more efficient the model becomes compared with running individual models sequentially.
Abstract
Conventional object detection models are usually limited by the data on which they were trained and by the category logic they define. With the recent rise of Language-Visual Models, new methods have emerged that are not restricted to these fixed categories. Despite their flexibility, such Open Vocabulary detection models still fall short in accuracy compared to traditional models with fixed classes. At the same time, more accurate data-specific models face challenges when there is a need to extend classes or merge different datasets for training. The latter often cannot be combined due to different logics or conflicting class definitions, making it difficult to improve a model without compromising its performance. In this paper, we introduce CerberusDet, a framework with a multi-headed model designed for handling multiple object detection tasks. The proposed model is built on the YOLO…
Taxonomy
Topics: Handwritten Text Recognition Techniques
