Detection Hub: Unifying Object Detection Datasets via Query Adaptation on Language Embedding
Lingchen Meng, Xiyang Dai, Yinpeng Chen, Pengchuan Zhang, Dongdong Chen, Mengchen Liu, Jianfeng Wang, Zuxuan Wu, Lu Yuan, Yu-Gang Jiang

TL;DR
Detection Hub is a dataset-aware, category-aligned design that unifies multiple object detection datasets through language embeddings, improving detection performance and reaching state-of-the-art results on the UODB benchmark.
Contribution
The paper proposes a novel Detection Hub framework that addresses dataset inconsistency in object detection by learning dataset embeddings and aligning categories via language embeddings.
Findings
Joint training on multiple datasets improves detection accuracy.
Detection Hub achieves state-of-the-art results on the UODB benchmark.
Semantic category alignment enhances cross-dataset learning.
Abstract
Combining multiple datasets enables a performance boost on many computer vision tasks. But a similar trend has not been witnessed in object detection when combining multiple datasets, due to two inconsistencies among detection datasets: taxonomy difference and domain gap. In this paper, we address these challenges with a new design (named Detection Hub) that is dataset-aware and category-aligned. It not only mitigates the dataset inconsistency but also provides coherent guidance for the detector to learn across multiple datasets. In particular, the dataset-aware design is achieved by learning a dataset embedding that is used to adapt object queries as well as convolutional kernels in detection heads. The categories across datasets are semantically aligned into a unified space by replacing one-hot category representations with word embeddings and leveraging the semantic coherence of language…
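The two ideas named in the abstract (a dataset embedding that adapts object queries, and word embeddings in place of one-hot category labels) can be illustrated with a toy sketch. This is not the authors' implementation: the 3-dimensional vectors, the dataset names, the additive form of the query adaptation, and the helper names `adapt_query` and `classify` are all assumptions chosen for illustration. A real system would presumably take word embeddings from a pretrained language model and learn the dataset embeddings during training.

```python
import math

# Hypothetical 3-d "word embeddings", hand-picked for illustration only.
WORD_EMB = {
    "person":     [0.9, 0.1, 0.0],
    "pedestrian": [0.8, 0.2, 0.1],
    "car":        [0.0, 0.9, 0.1],
    "vehicle":    [0.1, 0.8, 0.2],
    "dog":        [0.1, 0.1, 0.9],
}

# Two toy datasets whose taxonomies differ (assumed names).
TAXONOMY = {
    "dataset_a": ["person", "car"],
    "dataset_b": ["pedestrian", "vehicle", "dog"],
}

# Hypothetical dataset embeddings; these would be learned in practice.
DATASET_EMB = {
    "dataset_a": [0.05, 0.00, -0.05],
    "dataset_b": [-0.05, 0.05, 0.00],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def adapt_query(query, dataset):
    """Shift an object query by the dataset embedding.

    Additive adaptation is an assumption made for this sketch.
    """
    return [q + d for q, d in zip(query, DATASET_EMB[dataset])]


def classify(feature, dataset):
    """Score a region feature against the word embeddings of the
    dataset's own category names and return the best match."""
    scores = {name: cosine(feature, WORD_EMB[name]) for name in TAXONOMY[dataset]}
    return max(scores, key=scores.get), scores


# The same region feature is scored in one shared embedding space,
# whichever dataset's label set is active.
feature = [0.85, 0.15, 0.05]
label_a, scores_a = classify(feature, "dataset_a")
label_b, scores_b = classify(feature, "dataset_b")
print(label_a, label_b)
```

Because both label sets live in the same embedding space, the semantically close names "person" and "pedestrian" end up with similar scores for the same feature, which is the kind of cross-dataset alignment that one-hot labels cannot express.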
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Multimodal Machine Learning Applications
