TL;DR
Mix3D introduces a novel data augmentation method that combines scenes to create out-of-context training samples, enhancing model generalization by balancing global scene context and local geometry in 3D scene segmentation.
Contribution
The paper proposes Mix3D, a scene-mixing augmentation technique that improves 3D scene segmentation by reducing over-reliance on scene context, and outperforms the prior state of the art on the ScanNet test benchmark.
Findings
Significant performance boost on ScanNet and S3DIS datasets.
Outperforms the prior state of the art on the ScanNet test benchmark, reaching 78.1% mIoU.
Can be used with any existing 3D segmentation method.
Abstract
We present Mix3D, a data augmentation technique for segmenting large-scale 3D scenes. Since scene context helps reasoning about object semantics, current works focus on models with large capacity and receptive fields that can fully capture the global context of an input 3D scene. However, strong contextual priors can have detrimental implications like mistaking a pedestrian crossing the street for a car. In this work, we focus on the importance of balancing global scene context and local geometry, with the goal of generalizing beyond the contextual priors in the training set. In particular, we propose a "mixing" technique which creates new training samples by combining two augmented scenes. By doing so, object instances are implicitly placed into novel out-of-context environments, thereby making it harder for models to rely on scene context alone, and instead infer semantics from…
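The "mixing" step described in the abstract, combining two augmented scenes into one training sample, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `mix_scenes`, the NumPy array layout (N x 3 coordinates plus one integer label per point), and the choice to center each scene at the origin before taking the union are all assumptions made for the example. Per-scene augmentations such as rotation or scaling are assumed to have been applied already.

```python
import numpy as np


def mix_scenes(points_a, labels_a, points_b, labels_b):
    """Hypothetical helper: merge two augmented scenes into one sample.

    points_*: float arrays of shape (N, 3) with xyz coordinates.
    labels_*: integer arrays of shape (N,) with per-point semantic labels.
    """
    # Assumption: center each scene at the origin so the two point clouds
    # overlap in space instead of sitting side by side.
    centered_a = points_a - points_a.mean(axis=0, keepdims=True)
    centered_b = points_b - points_b.mean(axis=0, keepdims=True)

    # The mixed sample is the union of both point sets; every point keeps
    # its original label, so objects now appear in an unfamiliar context.
    mixed_points = np.concatenate([centered_a, centered_b], axis=0)
    mixed_labels = np.concatenate([labels_a, labels_b], axis=0)
    return mixed_points, mixed_labels


# Toy usage with random data standing in for two scenes.
rng = np.random.default_rng(0)
scene_a = rng.normal(size=(5, 3))
scene_b = rng.normal(size=(7, 3)) + 10.0
labels_a = np.zeros(5, dtype=np.int64)
labels_b = np.ones(7, dtype=np.int64)

mixed_points, mixed_labels = mix_scenes(scene_a, labels_a, scene_b, labels_b)
print(mixed_points.shape, mixed_labels.shape)
```

Because the mixed sample is just another labeled point cloud, a sketch like this sits in the data-loading stage and leaves the segmentation network itself untouched, which is consistent with the claim that the technique can be used with any existing method.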
