Any3DIS: Class-Agnostic 3D Instance Segmentation by 2D Mask Tracking
Phuc Nguyen, Minh Luu, Anh Tran, Cuong Pham, Khoi Nguyen

TL;DR
Any3DIS introduces a novel 3D-aware 2D mask tracking approach that leverages robust priors and dynamic view selection to improve class-agnostic 3D instance segmentation accuracy and reduce redundant proposals.
Contribution
The paper presents a new 3D Mask Optimization module and a 3D-Aware 2D Mask Tracking method that enhance segmentation consistency and proposal quality over prior unsupervised merging techniques.
Findings
Improves 3D instance segmentation accuracy on ScanNet datasets.
Reduces redundant and over-segmented proposals.
Enhances performance in open-vocabulary segmentation tasks.
Abstract
Existing 3D instance segmentation methods frequently encounter issues with over-segmentation, leading to redundant and inaccurate 3D proposals that complicate downstream tasks. This challenge arises from their unsupervised merging approach, where dense 2D instance masks are lifted across frames into point clouds to form 3D candidate proposals without direct supervision. These candidates are then hierarchically merged based on heuristic criteria, often resulting in numerous redundant segments that fail to combine into precise 3D proposals. To overcome these limitations, we propose a 3D-Aware 2D Mask Tracking module that uses robust 3D priors from a 2D mask segmentation and tracking foundation model (SAM-2) to ensure consistent object masks across video frames. Rather than merging all visible superpoints across views to create a 3D mask, our 3D Mask Optimization module leverages a dynamic…
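The lifting step the abstract mentions, where a 2D instance mask is projected onto the point cloud and gathered per superpoint, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a pinhole camera, points already expressed in camera coordinates, and a per-pixel depth map for the occlusion test. The function name and the parameters depth_tol and min_ratio are hypothetical, chosen only for this example. It does not cover the paper's view selection or mask optimization.

```python
def lift_mask_to_superpoints(points_cam, superpoint_ids, mask, depth,
                             fx, fy, cx, cy, depth_tol=0.05, min_ratio=0.5):
    """Return the ids of superpoints covered by one 2D mask in one view.

    points_cam     : list of (x, y, z) points in camera coordinates (assumed)
    superpoint_ids : superpoint id for each point, same order as points_cam
    mask, depth    : 2D lists indexed as [row][col] (binary mask, depth in metres)
    fx, fy, cx, cy : pinhole intrinsics
    depth_tol      : hypothetical tolerance for the occlusion check
    min_ratio      : hypothetical fraction of visible points that must lie in the mask
    """
    height, width = len(mask), len(mask[0])
    visible = {}  # superpoint id -> points visible in this view
    inside = {}   # superpoint id -> visible points that fall inside the mask

    for (x, y, z), sp in zip(points_cam, superpoint_ids):
        if z <= 0:
            continue  # behind the camera
        # Pinhole projection to pixel coordinates.
        u = int(round(fx * x / z + cx))
        v = int(round(fy * y / z + cy))
        if not (0 <= u < width and 0 <= v < height):
            continue  # outside the image
        if abs(depth[v][u] - z) > depth_tol:
            continue  # occluded: measured depth disagrees with the point's depth
        visible[sp] = visible.get(sp, 0) + 1
        if mask[v][u]:
            inside[sp] = inside.get(sp, 0) + 1

    # Keep a superpoint when enough of its visible points land inside the mask.
    return {sp for sp, n in visible.items() if inside.get(sp, 0) / n >= min_ratio}
```

Voting per superpoint rather than per point keeps the lifted mask aligned with geometric segments, which is why superpoints are a common unit for merging masks across views.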
Taxonomy
Topics: Industrial Vision Systems and Defect Detection
Methods: Sparse Evolutionary Training
