Invaria: Learning Scale and Density Invariance in Point Clouds via Next-Resolution Prediction
Chun-Peng Chang, Shaoxiang Wang, Alain Pagani, Dariu Gavrila, Holger Caesar

TL;DR
Invaria is a point cloud encoder that learns invariance to scale and density changes by predicting next-resolution structure, improving robustness and efficiency in 3D perception tasks.
Contribution
The paper introduces Invaria, a point cloud encoder trained with next-resolution prediction and receptive field calibration to achieve scale and density invariance, improving generalization in 3D vision.
Findings
Invaria improves mIoU by 56% at lower resolution on ScanNet.
The model is 45% smaller and uses 40% fewer tokens.
It maintains high performance under object scale variations.
Abstract
Modern image encoders achieve high generalization by decoupling semantic meaning from resolution, an ability yet to be fully realized in the 3D domain. We investigate the failure of 3D point cloud encoders to achieve similar generalization and find that existing models are highly sensitive to sampling resolution and scale changes, leading to significant performance degradation. This sensitivity is a major bottleneck for real-world deployment in robotics, as it suggests models overfit to specific quantization densities and object scales rather than learning invariant semantic features. To mitigate this dependency, we propose Invaria, a point cloud encoder that achieves scale and density invariance through next-resolution prediction and receptive field calibration. While our objective is not the explicit generation of high-resolution point clouds, we find that this training objective…
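To make "sampling resolution" and "next-resolution" concrete, the sketch below voxelizes one point cloud at a coarse grid and at a grid twice as fine, producing the kind of (input, target) pair a next-resolution objective could be trained on. This is an illustration only, not the authors' pipeline: the function names `voxel_downsample` and `resolution_pair`, the centroid pooling, and the factor-of-two step between resolutions are all assumptions made for the example.

```python
import numpy as np


def voxel_downsample(points: np.ndarray, voxel_size: float) -> np.ndarray:
    """Return one centroid per occupied voxel of edge length voxel_size.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Integer grid coordinates of the voxel each point falls into.
    keys = np.floor(points / voxel_size).astype(np.int64)
    # inverse[i] is the index of the unique voxel containing point i.
    _, inverse = np.unique(keys, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)  # shape differs across NumPy versions
    n_voxels = int(inverse.max()) + 1
    # Accumulate point coordinates per voxel, then divide by the counts.
    sums = np.zeros((n_voxels, 3))
    np.add.at(sums, inverse, points)
    counts = np.bincount(inverse, minlength=n_voxels)
    return sums / counts[:, None]


def resolution_pair(points: np.ndarray, coarse_size: float):
    """Build a (coarse, fine) pair; the 2x step is an assumed choice."""
    coarse = voxel_downsample(points, coarse_size)
    fine = voxel_downsample(points, coarse_size / 2)
    return coarse, fine


# Synthetic cloud in the unit cube, standing in for a real scan.
rng = np.random.default_rng(0)
cloud = rng.uniform(0.0, 1.0, size=(2000, 3))
coarse, fine = resolution_pair(cloud, coarse_size=0.25)
print(coarse.shape, fine.shape)
```

An encoder that overfits to one quantization density would see `coarse` and `fine` as very different inputs even though both describe the same geometry, which is the sensitivity the abstract describes.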
