DAMEX: Dataset-aware Mixture-of-Experts for visual understanding of mixture-of-datasets
Yash Jain, Harkirat Behl, Zsolt Kira, Vibhav Vineet

TL;DR
DAMEX is a dataset-aware mixture-of-experts model for universal object detection. It learns dataset-specific features within a single model and outperforms previous methods on the Universal Object-Detection Benchmark.
Contribution
The paper proposes DAMEX, a dataset-aware mixture-of-experts approach that learns to route each dataset's tokens to its mapped expert, improving universal detection performance without the parameter growth that comes from attaching a separate detection head per dataset.
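The routing idea above can be illustrated with a minimal sketch. It is not the authors' implementation: the dataset names, the dataset-to-expert mapping, the use of a cross-entropy auxiliary loss on the router, and top-1 dispatch are all assumptions made here for illustration, since this summary does not specify them.

```python
import math

# Hypothetical mapping from dataset name to expert index (illustrative only).
DATASET_TO_EXPERT = {"coco": 0, "voc": 1, "kitti": 2}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def router_loss(router_logits, dataset):
    """Mean cross-entropy between each token's router distribution and the
    expert mapped to the token's source dataset (assumed form of the loss)."""
    target = DATASET_TO_EXPERT[dataset]
    losses = [-math.log(softmax(tok)[target]) for tok in router_logits]
    return sum(losses) / len(losses)


def route(router_logits):
    """Top-1 routing: send each token to its highest-scoring expert."""
    return [max(range(len(tok)), key=tok.__getitem__) for tok in router_logits]


# Two tokens from a "voc" image, scored against three experts.
logits = [[0.1, 2.0, -1.0], [1.5, 0.2, 0.0]]
assignments = route(logits)       # expert chosen per token before training
loss = router_loss(logits, "voc")  # penalizes tokens not favoring expert 1
```

Under these assumptions, minimizing `router_loss` during training pushes the router to send every token from a given dataset to that dataset's expert, while inference uses plain `route` and needs no dataset label.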
Findings
Outperforms the existing state of the art by an average of +10.2 AP on the Universal Object-Detection Benchmark.
Improves over the non-MoE baseline by an average of +2.0 AP.
Robust against expert collapse and effective with limited, diverse, and divergent datasets.
Abstract
Construction of a universal detector poses a crucial question: How can we most effectively train a model on a large mixture of datasets? The answer lies in learning dataset-specific features and ensembling their knowledge, all within a single model. Previous methods achieve this by having separate detection heads on a common backbone, but that results in a significant increase in parameters. In this work, we present Mixture-of-Experts as a solution, highlighting that MoEs are much more than a scalability tool. We propose Dataset-Aware Mixture-of-Experts (DAMEX), where we train each expert to become an 'expert' of a dataset by learning to route each dataset's tokens to its mapped expert. Experiments on the Universal Object-Detection Benchmark show that we outperform the existing state of the art by an average of +10.2 AP and improve over our non-MoE baseline by an average of +2.0 AP. We…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Neural Network Applications
