TL;DR
This paper introduces a model-level Mixture-of-Experts architecture for object detection that combines YOLO-based detectors trained on different data subsets, improving performance and interpretability over standard ensembles.
Contribution
The paper proposes an MoE framework for object detection in which a learned gating network dynamically weights the outputs of YOLO-based expert detectors, and it evaluates the framework on the BDD100K dataset.
Findings
The proposed MoE outperforms standard ensemble methods on object detection.
The approach provides insights into expert specialization across different domains.
The proposed method enhances interpretability of object detection models.
Abstract
Mixture-of-Experts (MoE) models provide a structured approach to combining specialized neural networks and offer greater interpretability than conventional ensembles. While MoEs have been successfully applied to image classification and semantic segmentation, their use in object detection remains limited due to challenges in merging dense and structured predictions. In this work, we investigate model-level mixtures of object detectors and analyze their suitability for improving performance and interpretability in object detection. We propose an MoE architecture that combines YOLO-based detectors trained on semantically disjoint data subsets, with a learned gating network that dynamically weights expert contributions. We study different strategies for fusing detection outputs and for training the gating mechanism, including balancing losses to prevent expert collapse. Experiments on the…
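The gating and balancing ideas described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the detection tuple layout `(x1, y1, x2, y2, score, class_id)`, the score-scaling fusion rule, and the squared coefficient-of-variation balancing term are all assumptions chosen for illustration. The sketch shows one plausible way a gate could weight per-expert detections and how a balancing term could penalize collapse onto a single expert.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_detections(expert_detections, gate_logits):
    """Scale each expert's box confidences by its gate weight and pool the boxes.

    expert_detections: one list per expert of (x1, y1, x2, y2, score, class_id).
    gate_logits: one raw gate score per expert for this image.
    Both the tuple layout and the fusion rule are hypothetical.
    """
    weights = softmax(gate_logits)
    fused = []
    for w, dets in zip(weights, expert_detections):
        for (x1, y1, x2, y2, score, cls) in dets:
            fused.append((x1, y1, x2, y2, w * score, cls))
    # Highest weighted confidence first; a full pipeline would typically
    # apply non-maximum suppression to the pooled boxes afterwards.
    fused.sort(key=lambda d: d[4], reverse=True)
    return fused


def balance_loss(batch_gate_weights):
    """Squared coefficient of variation of the per-expert mean gate weight.

    Returns 0 when every expert receives the same average weight over the
    batch and grows as the gate concentrates on fewer experts.
    This is an assumed balancing term, not necessarily the one in the paper.
    """
    n = len(batch_gate_weights)
    n_experts = len(batch_gate_weights[0])
    importance = [sum(w[k] for w in batch_gate_weights) / n for k in range(n_experts)]
    mean = sum(importance) / n_experts
    var = sum((v - mean) ** 2 for v in importance) / n_experts
    return var / (mean ** 2)
```

In this sketch, a gate that always selects the same expert yields a nonzero `balance_loss`, so adding the term to the training objective would push the gate toward using all experts.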
