PointRFT: Explicit Reinforcement Fine-tuning for Point Cloud Few-shot Learning
Yankai Wang, Yiding Sun, Qirui Wang, Pengbo Li, Chaoyi Lu, Dongxu Zhang

TL;DR
PointRFT introduces a reinforcement fine-tuning approach for point cloud models. It improves few-shot classification over vanilla supervised fine-tuning by using accuracy and dispersion rewards and by combining reinforcement fine-tuning with pretraining.
Contribution
This paper presents the first reinforcement fine-tuning method for point cloud models, enhancing few-shot learning and integrating with pretraining for state-of-the-art results.
Findings
Outperforms vanilla supervised fine-tuning in few-shot classification
Effectively stabilizes training with accuracy and dispersion rewards
Achieves state-of-the-art results in data-scarce scenarios
Abstract
Understanding spatial dynamics and semantics in point clouds is fundamental to 3D comprehension. While reinforcement learning algorithms such as Group Relative Policy Optimization (GRPO) have recently achieved remarkable breakthroughs in large language models by incentivizing reasoning capabilities through strategic reward design, their potential remains largely unexplored in the 3D perception domain. This naturally raises a pivotal question: Can RL-based methods effectively empower 3D point cloud fine-tuning? In this paper, we propose PointRFT, the first reinforcement fine-tuning paradigm tailored specifically for point cloud representation learning. We select three prevalent 3D foundation models and devise specialized accuracy reward and dispersion reward functions to stabilize training and mitigate distribution shifts. Through comprehensive few-shot classification…
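The group-relative idea behind GRPO, which the abstract names, can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the `accuracy_reward` function, the class labels, and the use of the population standard deviation are all assumptions made for the example. The dispersion reward is left out because this excerpt does not define it.

```python
import statistics

def accuracy_reward(predicted_label, true_label):
    """Hypothetical accuracy reward: 1.0 for a correct class, else 0.0."""
    return 1.0 if predicted_label == true_label else 0.0

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one sampled group (the GRPO-style baseline).

    Each sample's advantage is its reward minus the group mean, divided by
    the group standard deviation. eps avoids division by zero when every
    sample in the group receives the same reward.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)  # population std: an assumption here
    return [(r - mean) / (std + eps) for r in rewards]

# One point cloud, four sampled class predictions, ground truth "chair".
samples = ["chair", "table", "chair", "lamp"]
rewards = [accuracy_reward(s, "chair") for s in samples]
advantages = group_relative_advantages(rewards)
```

Correct predictions get positive advantages and incorrect ones negative, so the policy update favors samples that beat their own group's average without needing a separate value network.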
Taxonomy
Topics: 3D Shape Modeling and Analysis · Reinforcement Learning in Robotics · Multimodal Machine Learning Applications
