P2P: Tuning Pre-trained Image Models for Point Cloud Analysis with Point-to-Pixel Prompting
Ziyi Wang, Xumin Yu, Yongming Rao, Jie Zhou, Jiwen Lu

TL;DR
This paper introduces Point-to-Pixel (P2P) prompting, which projects point clouds into colored images so that pre-trained 2D image models can be tuned for 3D point cloud analysis at a minor parameter cost.
Contribution
The paper proposes Point-to-Pixel prompting, a technique that transfers pre-trained image models to 3D point cloud tasks while tuning only a small number of parameters.
Findings
Achieves 89.3% accuracy on ScanObjectNN's hardest setting.
Outperforms conventional point cloud models with fewer trainable parameters.
Demonstrates competitive performance on ModelNet classification and ShapeNet part segmentation.
Abstract
Nowadays, pre-training big models on large-scale datasets has become a crucial topic in deep learning. Pre-trained models with high representation ability and transferability have achieved great success and dominate many downstream tasks in natural language processing and 2D vision. However, it is non-trivial to extend such a pretraining-tuning paradigm to 3D vision, given the limited training data that are relatively inconvenient to collect. In this paper, we provide a new perspective of leveraging pre-trained 2D knowledge in the 3D domain to tackle this problem, tuning pre-trained image models with the novel Point-to-Pixel prompting for point cloud analysis at a minor parameter cost. Following the principle of prompt engineering, we transform point clouds into colorful images with geometry-preserved projection and geometry-aware coloring to adapt to pre-trained image models, whose…
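To make the projection-and-coloring idea concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The function name `project_points_to_image`, the orthographic camera, the nearest-point-wins rule, and the fixed depth-to-color mapping are all assumptions chosen for illustration. The paper describes geometry-aware coloring that is tuned together with the image model, which this sketch replaces with a hand-written rule.

```python
import math


def project_points_to_image(points, size=32, azimuth=0.0):
    """Orthographically project (x, y, z) points into a size x size RGB image.

    Hypothetical helper for illustration only: the color is a fixed
    function of depth, standing in for a tuned coloring step.
    """
    cos_a, sin_a = math.cos(azimuth), math.sin(azimuth)
    # Rotate around the vertical (y) axis to choose a viewpoint.
    rotated = [
        (x * cos_a + z * sin_a, y, -x * sin_a + z * cos_a)
        for x, y, z in points
    ]

    # Normalize with one shared scale so the object's proportions are kept.
    mins = [min(p[i] for p in rotated) for i in range(3)]
    maxs = [max(p[i] for p in rotated) for i in range(3)]
    scale = max(maxs[i] - mins[i] for i in range(3)) or 1.0

    image = [[(0.0, 0.0, 0.0) for _ in range(size)] for _ in range(size)]
    depth_buffer = [[math.inf] * size for _ in range(size)]

    for p in rotated:
        u = (p[0] - mins[0]) / scale  # horizontal position in [0, 1]
        v = (p[1] - mins[1]) / scale  # vertical position in [0, 1]
        d = (p[2] - mins[2]) / scale  # depth in [0, 1]; smaller means nearer
        col = min(int(u * (size - 1) + 0.5), size - 1)
        row = min(int((1.0 - v) * (size - 1) + 0.5), size - 1)
        # Assumed convention: the nearest point owns the pixel.
        if d < depth_buffer[row][col]:
            depth_buffer[row][col] = d
            # Assumed coloring: near points are red, far points are blue.
            image[row][col] = (1.0 - d, 0.5, d)
    return image


if __name__ == "__main__":
    cloud = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (1.0, 0.0, 0.5)]
    img = project_points_to_image(cloud, size=4)
    for row in img:
        print(row)
```

The resulting grid of RGB triples has the shape an image model expects, which is the point of the prompting step: the 3D input is rewritten into the 2D model's input format instead of changing the model itself.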
Taxonomy
Topics: 3D Surveying and Cultural Heritage · 3D Shape Modeling and Analysis · Advanced Neural Network Applications
