DoReMi: Bridging 3D Domains via Topology-Aware Domain-Representation Mixture of Experts
Mingwei Xing, Xinliang Wang, Yifeng Shi

TL;DR
DoReMi introduces a topology-aware MoE framework for 3D scene understanding, effectively handling topological heterogeneity across sensor modalities through novel routing and pre-training strategies.
Contribution
It proposes a self-supervised pre-training method and domain-aware expert mechanisms to improve cross-domain 3D understanding in MoE architectures.
Findings
Achieves 80.1% mIoU on the ScanNet validation set.
Achieves 77.2% mIoU on S3DIS.
Reported to outperform existing state-of-the-art methods.
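For readers unfamiliar with the metric behind these figures, the following is a minimal sketch of how mean intersection-over-union (mIoU) can be computed from per-point labels. The function name `mean_iou`, the `ignore_index` convention, and the toy labels are illustrative assumptions, not taken from the paper; benchmark protocols such as ScanNet and S3DIS fix their own class lists and accumulate counts over the whole evaluation split.

```python
from collections import Counter


def mean_iou(pred, target, num_classes, ignore_index=-1):
    """Average per-class IoU over classes that appear in pred or target.

    pred, target: equal-length sequences of integer class labels.
    Points whose target equals ignore_index are skipped (an assumed
    convention for unlabeled points).
    """
    inter = Counter()         # points where prediction and label agree, per class
    pred_count = Counter()    # points predicted as each class
    target_count = Counter()  # points labeled as each class
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            inter[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - inter[c]
        if union > 0:  # skip classes absent from both pred and target
            ious.append(inter[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six points.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.3f}")
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is approximately 0.5.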
Abstract
Constructing a unified 3D scene understanding model has long been hindered by the significant topological discrepancies across different sensor modalities. While applying the Mixture-of-Experts (MoE) architecture is an effective approach to achieving universal understanding, we observe that existing 3D MoE networks often suffer from semantics-driven routing bias. This makes it challenging to address cross-domain data characterized by "semantic consistency yet topological heterogeneity." To overcome this challenge, we propose DoReMi (Topology-Aware Domain-Representation Mixture of Experts). Specifically, we introduce a self-supervised pre-training branch based on multiple attributes, such as topological and texture variations, to anchor cross-domain structural priors. Building upon this, we design a domain-aware expert branch comprising two core mechanisms: Domain Spatial-Guided Routing…
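To make the routing problem in the abstract concrete, here is a minimal sketch of generic top-k MoE gating. It is not DoReMi's routing mechanism, whose details are not given in the text above. All names (`gate_logits`, `top_k_route`), the gate weights, and the one-dimensional "topology" descriptor are hypothetical. The sketch only illustrates why a gate that sees semantic features alone cannot separate two points that are semantically identical but come from structurally different domains.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def gate_logits(feature, gate_weights):
    """Linear gate: one logit per expert (one weight row per expert)."""
    return [sum(w * x for w, x in zip(row, feature)) for row in gate_weights]


def top_k_route(logits, k):
    """Pick the k highest-scoring experts and renormalize their weights."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    weights = softmax([logits[i] for i in top])
    return list(zip(top, weights))


# Two points with the same semantic feature but different (hypothetical)
# topology descriptors, e.g. a normalized local point density.
semantic = [1.0, 0.5]
dense_topology = [1.0]    # assumed: densely sampled indoor scan
sparse_topology = [-1.0]  # assumed: sparsely sampled outdoor scan

# Hypothetical gate weights for 3 experts.
semantic_gate = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
full_gate = [[1.0, 0.0, 2.0], [0.0, 1.0, -2.0], [0.5, 0.5, 0.0]]

# Semantic-only gate: both points get the same routing decision.
sem_route = top_k_route(gate_logits(semantic, semantic_gate), k=1)

# Gate that also sees the topology descriptor: the points are separated.
dense_route = top_k_route(gate_logits(semantic + dense_topology, full_gate), k=1)
sparse_route = top_k_route(gate_logits(semantic + sparse_topology, full_gate), k=1)

print("semantic-only expert:", sem_route[0][0])
print("dense-domain expert:", dense_route[0][0])
print("sparse-domain expert:", sparse_route[0][0])
```

With these assumed weights, the semantic-only gate sends both points to the same expert, while the gate that also receives the topology descriptor sends them to different experts.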
