Open-Det: An Efficient Learning Framework for Open-Ended Detection
Guiping Cao, Tao Wang, Wenjian Huang, Xiangyuan Lan, Jianguo Zhang, Dongmei Jiang

TL;DR
Open-Det is an efficient framework for open-ended object detection that reduces training data and resource requirements while improving detection performance through vision-language alignment and specialized loss functions.
Contribution
The paper presents a novel Open-Det framework that accelerates training and enhances open-ended detection performance by integrating vision-language alignment and specialized loss functions.
Findings
Uses only 1.5% of the training data required by previous models.
Reduces training epochs from 149 to 31.
Achieves 1.0% higher average precision.
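As a quick check on the epoch figures above, the following Python sketch computes the share of baseline epochs retained and the relative reduction. The function name `epoch_reduction` is hypothetical and used only for illustration; the two epoch counts are the ones listed in the findings.

```python
def epoch_reduction(baseline_epochs: int, new_epochs: int) -> tuple[float, float]:
    """Return (fraction of baseline epochs retained, relative reduction)."""
    if baseline_epochs <= 0:
        raise ValueError("baseline_epochs must be positive")
    retained = new_epochs / baseline_epochs
    return retained, 1.0 - retained


# Epoch counts from the findings: 149 (previous) vs. 31 (Open-Det).
retained, reduction = epoch_reduction(149, 31)
print(f"retained: {retained:.1%}, reduction: {reduction:.1%}")
# prints "retained: 20.8%, reduction: 79.2%"
```

In other words, going from 149 to 31 epochs means training for roughly one fifth of the original schedule.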
Abstract
Open-Ended object Detection (OED) is a novel and challenging task that detects objects and generates their category names in a free-form manner, without requiring additional vocabularies during inference. However, existing OED models, such as GenerateU, require large-scale datasets for training, suffer from slow convergence, and exhibit limited performance. To address these issues, we present a novel and efficient Open-Det framework, consisting of four collaborative parts. Specifically, Open-Det accelerates model training in both the bounding box and object name generation processes by reconstructing the Object Detector and the Object Name Generator. To bridge the semantic gap between the Vision and Language modalities, we propose a Vision-Language Aligner with V-to-L and L-to-V alignment mechanisms, combined with the Prompts Distiller to transfer knowledge from the VLM into…
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Machine Learning and Data Classification · Machine Learning and Algorithms
