UniD-Shift: Towards Unified Semantic Segmentation via Interpretable Share-Private Multimodal Decomposition

Shuai Zhang; Zhecheng Shi; Zhuxiao Li; Jing Ou; Tengxi Wang; Yuan Liu; Wufan Zhao

arXiv:2605.07356·cs.CV·May 11, 2026

TL;DR

UniD-Shift introduces a unified framework for 2D-3D semantic segmentation that decomposes features into shared and private components, enhancing accuracy and robustness across benchmarks.

Contribution

The paper proposes a share-private multimodal decomposition approach that explicitly separates features into shared and private components and combines them with a fusion module, improving cross-modal segmentation performance.

Findings

01. Achieves consistent segmentation accuracy improvements on SemanticKITTI and nuScenes.

02. Demonstrates stable generalization under distribution shifts in nuScenes USA-Singapore.

03. Offers competitive computational efficiency compared to baseline methods.

Abstract

Semantic segmentation of large-scale 3D point clouds is crucial for applications such as autonomous driving and urban digital twins. However, the sparse sampling pattern of LiDAR and the view-dependent geometric distortion in image observations complicate cross-modal alignment and hinder stable fusion. Because 2D camera images are projections of the 3D world, the features learned for 2D and 3D segmentation share some common semantics, while other aspects remain modality-specific. This insight motivates a unified multimodal framework for joint 2D-3D semantic segmentation. We combine a SAM-based vision encoder with an SPTNet-based geometric encoder to extract complementary semantic and geometric representations. The resulting features from both modalities are explicitly decomposed into shared and private subspaces, where the shared…
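The abstract is cut off before it says how the shared and private subspaces are trained or fused, so the following is only a minimal sketch of one common way to realize a share-private split, not the authors' method. It assumes linear projections per modality, an orthogonality penalty between each modality's shared and private parts, a mean-squared alignment term between the two shared parts, and fusion by concatenation. All function names, dimensions, and the random stand-in features are hypothetical.

```python
import numpy as np


def split_shared_private(features, w_shared, w_private):
    """Project (N, D) features into a shared part and a private part."""
    return features @ w_shared, features @ w_private


def orthogonality_penalty(shared, private):
    """Squared Frobenius norm of shared^T @ private.

    Zero when every shared channel is orthogonal to every private channel
    over the batch (an assumed regularizer, not taken from the paper).
    """
    return float(np.sum((shared.T @ private) ** 2))


def alignment_penalty(shared_a, shared_b):
    """Mean squared distance between paired shared features."""
    return float(np.mean((shared_a - shared_b) ** 2))


rng = np.random.default_rng(0)
n_pairs, d_img, d_pts, d_sub = 8, 16, 12, 4

# Hypothetical stand-ins for encoder outputs at N paired pixel/point sites.
feat_2d = rng.normal(size=(n_pairs, d_img))
feat_3d = rng.normal(size=(n_pairs, d_pts))

# Hypothetical projection weights; random here, learned in a real model.
w2_shared = rng.normal(size=(d_img, d_sub))
w2_private = rng.normal(size=(d_img, d_sub))
w3_shared = rng.normal(size=(d_pts, d_sub))
w3_private = rng.normal(size=(d_pts, d_sub))

shared_2d, private_2d = split_shared_private(feat_2d, w2_shared, w2_private)
shared_3d, private_3d = split_shared_private(feat_3d, w3_shared, w3_private)

# Assumed objective: keep shared/private apart, pull the shared parts together.
loss = (
    orthogonality_penalty(shared_2d, private_2d)
    + orthogonality_penalty(shared_3d, private_3d)
    + alignment_penalty(shared_2d, shared_3d)
)

# Assumed fusion: average the shared parts, then concatenate the private ones.
fused = np.concatenate(
    [(shared_2d + shared_3d) / 2.0, private_2d, private_3d], axis=1
)
print(fused.shape, loss >= 0.0)
```

In this sketch the fused feature has `3 * d_sub` channels per paired site and would feed a segmentation head. The orthogonality term follows the general idea used in domain-separation style models; whether UniD-Shift uses the same constraint cannot be determined from the truncated abstract.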

Code & Models

Repositories

shuaizhang69/UniD-Shift (GitHub)
