TL;DR
LangTail introduces a language-guided hierarchical learning framework that leverages language model priors to improve unsupervised 3D point cloud segmentation, especially for minority classes.
Contribution
It proposes a novel method combining language-derived semantic priors with hierarchical clustering to address long-tail ambiguity in 3D segmentation.
Findings
Outperforms existing methods by +13.5, +12.9, and +8.9 mIoU on ScanNet-v2, S3DIS, and nuScenes, respectively.
Effectively mitigates bias towards dominant classes in unsupervised 3D segmentation.
Demonstrates the benefit of language priors in enhancing minority class representation.
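To make the mIoU figures above concrete, here is a minimal sketch (not taken from the paper) of why mean IoU is sensitive to the dominant-class bias described in these findings. The function name `mean_iou`, the two-class toy scene, and the class labels are all hypothetical. The sketch also assumes predicted clusters have already been matched to ground-truth classes, a step that unsupervised evaluation normally needs and that is skipped here.

```python
def mean_iou(y_true, y_pred, num_classes):
    """Mean intersection-over-union over classes present in truth or prediction."""
    ious = []
    for c in range(num_classes):
        # Points where both truth and prediction are class c.
        inter = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        # Points where either truth or prediction is class c.
        union = sum(1 for t, p in zip(y_true, y_pred) if t == c or p == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious)


# Hypothetical toy scene: 90 points of a dominant class (0), 10 of a minor class (1).
y_true = [0] * 90 + [1] * 10

# A clustering that absorbs the minor class into the dominant cluster.
absorbed = [0] * 100

accuracy = sum(t == p for t, p in zip(y_true, absorbed)) / len(y_true)
miou_absorbed = mean_iou(y_true, absorbed, num_classes=2)

print(f"accuracy = {accuracy:.2f}, mIoU = {miou_absorbed:.2f}")
```

In this toy case the per-point accuracy stays high while the mean IoU is halved, because the minor class contributes an IoU of zero. Per-class averaging is what makes mIoU a reasonable lens for long-tail behaviour.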
Abstract
Existing approaches for unsupervised 3D point cloud segmentation predominantly rely on a purely visual similarity-based learning-by-clustering paradigm, which suffers from a fundamental limitation: long-tail ambiguity. In such a paradigm, features of minor classes are consistently absorbed by dominant clusters, leading to severely imbalanced predictions. To address this issue, we propose LangTail, a language-guided hierarchical learning framework that leverages the balanced world knowledge encoded in language models to mitigate long-tail ambiguity in unsupervised 3D segmentation. The key idea is to establish multi-level associations between language-derived semantic priors and visually underrepresented minor classes, thereby compensating for the biased attention of purely visual clustering toward dominant classes. Specifically, LangTail first constructs an entity-level semantic prior…
