CLAMP: Contrastive Learning for 3D Multi-View Action-Conditioned Robotic Manipulation Pretraining

I-Chun Arthur Liu; Krzysztof Choromanski; Sandy Huang; Connor Schenck

arXiv:2602.00937 · cs.RO · May 7, 2026

TL;DR

CLAMP is a 3D pre-training framework that uses point clouds, robot actions, and contrastive learning to improve the learning efficiency and performance of robotic manipulation policies, with reported gains on six simulated and five real-world tasks.

Contribution

The paper presents a novel 3D pre-training approach that combines contrastive learning with diffusion policy initialization for robotic manipulation.
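To make the contrastive part concrete, here is a minimal sketch of an InfoNCE-style loss, a common choice for contrastive pre-training. This page does not state which loss CLAMP uses or how pairs are formed, so the function name `info_nce`, the `temperature` value, and the generic `anchors`/`positives` pairing are all assumptions for illustration only.

```python
import math
from typing import List, Sequence


def normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def info_nce(
    anchors: Sequence[Sequence[float]],
    positives: Sequence[Sequence[float]],
    temperature: float = 0.1,  # hypothetical value, not taken from the paper
) -> float:
    """Mean InfoNCE loss where anchors[i] is paired with positives[i].

    Every other row of `positives` acts as a negative for anchors[i].
    """
    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]
    total = 0.0
    for i, anchor in enumerate(a):
        # Cosine similarity to every candidate, scaled by the temperature.
        logits = [sum(x * y for x, y in zip(anchor, cand)) / temperature for cand in p]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Cross-entropy with the matching index as the target class.
        total += log_denominator - logits[i]
    return total / len(a)
```

The loss is low when each anchor is most similar to its own positive and high when the pairing is scrambled, which is the property a contrastive objective relies on.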

Findings

01. Outperforms state-of-the-art baselines on six simulated tasks.

02. Achieves superior results on five real-world tasks.

03. Enhances learning efficiency and policy performance.

Abstract

Leveraging pre-trained 2D image representations in behavior cloning policies has achieved great success and has become a standard approach for robotic manipulation. However, such representations fail to capture the 3D spatial information about objects and scenes that is essential for precise manipulation. In this work, we introduce Contrastive Learning for 3D Multi-View Action-Conditioned Robotic Manipulation Pretraining (CLAMP), a novel 3D pre-training framework that utilizes point clouds and robot actions. From the merged point cloud computed from RGB-D images and camera extrinsics, we re-render multi-view four-channel image observations with depth and 3D coordinates, including dynamic wrist views, to provide clearer views of target objects for high-precision manipulation tasks. The pre-trained encoders learn to associate the 3D geometric and positional information of objects with…
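The point-cloud merging and re-rendering step described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes a pinhole camera model, camera-to-world extrinsics for the input views, and that the four channels are depth plus the world-frame x, y, z coordinates. All function names (`backproject`, `transform`, `merge_views`, `render_four_channel`) are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]
Matrix = Sequence[Sequence[float]]


def backproject(depth: Sequence[Sequence[float]],
                fx: float, fy: float, cx: float, cy: float) -> List[Point]:
    """Lift a depth image to 3D points in the camera frame (pinhole model)."""
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0.0:  # treat non-positive depth as invalid
                continue
            points.append(((u - cx) * z / fx, (v - cy) * z / fy, z))
    return points


def transform(points: Sequence[Point], rot: Matrix, trans: Sequence[float]) -> List[Point]:
    """Apply a rigid transform p' = R p + t to every point."""
    return [
        tuple(sum(rot[i][j] * p[j] for j in range(3)) + trans[i] for i in range(3))
        for p in points
    ]


def merge_views(views) -> List[Point]:
    """Merge per-camera clouds; each view is (depth, (fx, fy, cx, cy), R, t)."""
    merged: List[Point] = []
    for depth, intrinsics, rot, trans in views:
        merged.extend(transform(backproject(depth, *intrinsics), rot, trans))
    return merged


def render_four_channel(points: Sequence[Point], rot_wc: Matrix, trans_wc: Sequence[float],
                        fx: float, fy: float, cx: float, cy: float,
                        height: int, width: int) -> List[List[Optional[tuple]]]:
    """Z-buffer world points into a virtual view of (depth, x, y, z) pixels.

    rot_wc and trans_wc are world-to-camera extrinsics of the virtual camera.
    Pixels that no point projects to stay None.
    """
    image: List[List[Optional[tuple]]] = [[None] * width for _ in range(height)]
    for p, cam in zip(points, transform(points, rot_wc, trans_wc)):
        z = cam[2]
        if z <= 0.0:  # point is behind the virtual camera
            continue
        u = int(round(fx * cam[0] / z + cx))
        v = int(round(fy * cam[1] / z + cy))
        if 0 <= u < width and 0 <= v < height:
            # Keep the nearest point per pixel.
            if image[v][u] is None or z < image[v][u][0]:
                image[v][u] = (z, p[0], p[1], p[2])
    return image
```

Because the virtual camera pose is an input, the same merged cloud can be rendered from any viewpoint, including one that follows the wrist, which is presumably what makes the dynamic wrist views mentioned in the abstract possible.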

Code & Models

Repositories

https://clamp3d.github.io/CLAMP