Instance Segmentation with Cross-Modal Consistency

Alex Zihao Zhu; Vincent Casser; Reza Mahjourian; Henrik Kretzschmar; Sören Pirk

arXiv:2210.08113·cs.CV·October 18, 2022

TL;DR

This paper presents an instance segmentation approach that applies contrastive learning across sensor modalities, such as cameras and LiDAR, and across time. The resulting per-pixel and per-point embeddings are viewpoint-invariant and temporally stable, which supports scene understanding in robotics and autonomous driving.

Contribution

The paper introduces a contrastive learning framework for instance segmentation that operates across sensor modalities and across time, encouraging embeddings that are stable and viewpoint-invariant for perception in autonomous systems.

Findings

01 Embeddings are invariant to viewpoint changes.

02 Embeddings are consistent across sensor modalities.

03 Method improves stability of instance masks over time.

Abstract

Segmenting object instances is a key task in machine perception, with safety-critical applications in robotics and autonomous driving. We introduce a novel approach to instance segmentation that jointly leverages measurements from multiple sensor modalities, such as cameras and LiDAR. Our method learns to predict embeddings for each pixel or point that give rise to a dense segmentation of the scene. Specifically, our technique applies contrastive learning to points in the scene both across sensor modalities and the temporal domain. We demonstrate that this formulation encourages the models to learn embeddings that are invariant to viewpoint variations and consistent across sensor modalities. We further demonstrate that the embeddings are stable over time as objects move around the scene. This not only provides stable instance masks, but can also provide valuable signals to downstream…
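The cross-modal contrastive idea described in the abstract can be illustrated with a minimal sketch. It assumes a supervised InfoNCE-style loss, which may differ from the exact loss used in the paper: each camera-pixel embedding is pulled toward LiDAR-point embeddings that share its instance id and pushed away from all others. The function name `cross_modal_contrastive_loss`, the `temperature` value, and the toy embeddings are hypothetical and chosen only for illustration.

```python
import math


def normalize(vec):
    """Scale a vector to unit L2 norm so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0  # guard against zero vectors
    return [x / norm for x in vec]


def cross_modal_contrastive_loss(cam_emb, cam_ids, lidar_emb, lidar_ids,
                                 temperature=0.1):
    """Illustrative InfoNCE-style loss from camera anchors to LiDAR candidates.

    Assumption (not taken from the paper): LiDAR embeddings with the same
    instance id as the camera anchor are positives, all others are negatives.
    Returns the mean loss over anchors that have at least one positive,
    or 0.0 if no anchor has a positive.
    """
    cam = [normalize(e) for e in cam_emb]
    lidar = [normalize(e) for e in lidar_emb]
    losses = []
    for anchor, anchor_id in zip(cam, cam_ids):
        # Temperature-scaled cosine similarity to every LiDAR embedding.
        logits = [sum(a * b for a, b in zip(anchor, cand)) / temperature
                  for cand in lidar]
        positives = [l for l, cid in zip(logits, lidar_ids) if cid == anchor_id]
        if not positives:
            continue  # this instance is not visible in the other modality
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        # Average negative log-likelihood over this anchor's positives.
        losses.append(-sum(p - log_denom for p in positives) / len(positives))
    return sum(losses) / len(losses) if losses else 0.0


# Toy example: two instances, one embedding per modality for each.
cam_emb = [[1.0, 0.0], [0.0, 1.0]]
ids = [0, 1]
aligned = cross_modal_contrastive_loss(cam_emb, ids, [[0.9, 0.1], [0.1, 0.9]], ids)
swapped = cross_modal_contrastive_loss(cam_emb, ids, [[0.1, 0.9], [0.9, 0.1]], ids)
print(f"aligned loss: {aligned:.4f}, swapped loss: {swapped:.4f}")
```

In this toy setup the loss is lower when embeddings of the same instance agree across modalities than when they are swapped. Under the same assumption, the temporal term could be sketched by passing embeddings from two different frames instead of two different sensors.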

Taxonomy

Topics: Advanced Neural Network Applications · Video Surveillance and Tracking Methods · Domain Adaptation and Few-Shot Learning

Methods: Contrastive Learning