On Geometry-Enhanced Parameter-Efficient Fine-Tuning for 3D Scene Segmentation
Liyao Tang, Zhe Chen, Dacheng Tao

TL;DR
This paper introduces GEM, a geometry-aware parameter-efficient fine-tuning method for 3D point cloud transformers that matches or exceeds full fine-tuning while updating only 1.6% of model parameters.
Contribution
The paper proposes GEM, a novel geometry-aware PEFT module that effectively captures local and global 3D structures, outperforming existing PEFT methods in 3D scene segmentation tasks.
Findings
GEM achieves comparable or better performance than full fine-tuning.
GEM updates only 1.6% of model parameters, reducing training costs.
GEM sets new benchmarks for efficient 3D point cloud model fine-tuning.
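As a rough illustration of what a figure like "1.6% of model parameters" measures, the sketch below computes the trainable fraction of a model from per-component parameter counts. The component names and counts are toy values chosen for illustration, not figures from the paper, and the denominator here includes the added module; some works report the ratio against the frozen backbone alone.

```python
def trainable_fraction(param_counts, trainable_names):
    """Return the share of parameters that receive gradient updates.

    param_counts: dict mapping a component name to its parameter count.
    trainable_names: set of component names that are not frozen.
    """
    total = sum(param_counts.values())
    trainable = sum(
        count for name, count in param_counts.items() if name in trainable_names
    )
    return trainable / total


# Toy counts (hypothetical, NOT taken from the paper): a frozen backbone
# plus one small added PEFT module.
toy_counts = {
    "backbone.attention": 600,
    "backbone.mlp": 384,
    "peft_module": 16,
}

fraction = trainable_fraction(toy_counts, {"peft_module"})
print(f"{fraction:.1%}")  # 16 of 1000 parameters are trainable
```

Only the components listed as trainable contribute to the numerator, which is why a small added module can keep the updated share low even when the backbone is large.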
Abstract
The emergence of large-scale pre-trained point cloud models has significantly advanced 3D scene understanding, but adapting these models to specific downstream tasks typically demands full fine-tuning, incurring high computational and storage costs. Parameter-efficient fine-tuning (PEFT) techniques, successful in natural language processing and 2D vision tasks, underperform when naively applied to 3D point cloud models due to significant geometric and spatial distribution shifts. Existing PEFT methods commonly treat points as orderless tokens, neglecting important local spatial structures and global geometric contexts in 3D modeling. To bridge this gap, we introduce the Geometric Encoding Mixer (GEM), a novel geometry-aware PEFT module specifically designed for 3D point cloud transformers. GEM explicitly integrates fine-grained local positional encodings with a lightweight latent…
Taxonomy
Topics: 3D Shape Modeling and Analysis · Advanced Neural Network Applications · 3D Surveying and Cultural Heritage
