Contrastive Learning of Features between Images and LiDAR
Peng Jiang, Srikanth Saripalli

TL;DR
This paper treats cross-modal feature learning between images and LiDAR point clouds as a dense contrastive learning problem and introduces a new loss function for it, targeting robotic tasks such as localization, mapping, and navigation.
Contribution
It proposes a Tuple-Circle loss function and a two-branch network (a PointNet++ variant for point clouds and a U-Net CNN for images) for cross-modal feature learning from image and LiDAR data.
Findings
The proposed loss improves cross-modal feature alignment.
The network effectively captures information from both modalities.
Visualizations confirm the learned features encode meaningful cross-modal correspondences.
Abstract
Images and point clouds provide different information for robots. Finding the correspondences between data from different sensors is crucial for various tasks such as localization, mapping, and navigation. While learning-based descriptors have been developed for single sensors, there is little work on cross-modal features. This work treats learning cross-modal features as a dense contrastive learning problem. We propose a Tuple-Circle loss function for cross-modality feature learning. Furthermore, to learn good features without losing generality, we developed a variant of the widely used PointNet++ architecture for point clouds and a U-Net CNN architecture for images. Moreover, we conduct experiments on a real-world dataset to show the effectiveness of our loss function and network structure. By visualizing the features, we show that our models indeed learn information from both images and LiDAR.
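To make the dense contrastive setup more concrete, here is a minimal NumPy sketch of a circle-style loss over matched pixel and point descriptors. The abstract does not give the Tuple-Circle loss formula, so this is not the authors' loss: it uses the general Circle loss form as a stand-in. The function names, the assumption that row i of each array is a corresponding pixel/point pair, and the `margin` and `gamma` values are all illustrative.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products are cosine similarities.
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def logsumexp(x):
    # Numerically stable log(sum(exp(x))).
    m = np.max(x)
    return m + np.log(np.sum(np.exp(x - m)))

def circle_loss(sim_pos, sim_neg, margin=0.25, gamma=64.0):
    # Generic Circle loss for one anchor (assumed stand-in, not Tuple-Circle).
    # Each similarity is weighted by its distance from its optimum.
    alpha_p = np.maximum(1.0 + margin - sim_pos, 0.0)
    alpha_n = np.maximum(sim_neg + margin, 0.0)
    logit_p = -gamma * alpha_p * (sim_pos - (1.0 - margin))
    logit_n = gamma * alpha_n * (sim_neg - margin)
    # log(1 + sum_n exp(logit_n) * sum_p exp(logit_p)), computed stably.
    return np.logaddexp(0.0, logsumexp(logit_p) + logsumexp(logit_n))

def cross_modal_circle_loss(pixel_feats, point_feats, margin=0.25, gamma=64.0):
    # Hypothetical helper: pixel_feats[i] and point_feats[i] are assumed to
    # describe the same 3D location; all other pairings are negatives.
    img = l2_normalize(np.asarray(pixel_feats, dtype=float))
    pts = l2_normalize(np.asarray(point_feats, dtype=float))
    sim = img @ pts.T                      # (N, N) cosine similarities
    positive = np.eye(sim.shape[0], dtype=bool)
    losses = [
        circle_loss(sim[i, positive[i]], sim[i, ~positive[i]], margin, gamma)
        for i in range(sim.shape[0])
    ]
    return float(np.mean(losses))
```

In this sketch, descriptors that agree across modalities for matched pairs and disagree for unmatched pairs give a loss near zero, while shuffling the correspondences makes the loss large. In a real pipeline the pixel/point correspondences would presumably come from projecting LiDAR points into the camera image, which is outside the scope of this sketch.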
Taxonomy
Topics: Remote Sensing and LiDAR Applications · 3D Surveying and Cultural Heritage · 3D Shape Modeling and Analysis
Methods: Concatenated Skip Connection · Convolution · Max Pooling · U-Net · Contrastive Learning · Dense Contrastive Learning
