Foundation Models for Amodal Video Instance Segmentation in Automated Driving

Jasmin Breitenstein; Franz Jünger; Andreas Bär; Tim Fingscheidt

arXiv:2409.14095 · cs.CV · September 24, 2024

TL;DR

This paper introduces S-AModal, a novel approach leveraging foundation models and point memory to perform amodal video instance segmentation in automated driving, achieving state-of-the-art results without requiring amodal video labels.

Contribution

The paper fine-tunes the Segment Anything Model (SAM) for amodal instance segmentation, prompting it with points sampled from visible masks and using a point memory to track instances across frames.

Findings

01 Achieves state-of-the-art amodal video instance segmentation results.

02 Reduces dependency on expensive amodal video labels.

03 Demonstrates effective point-based tracking with foundation models.

Abstract

In this work, we study amodal video instance segmentation for automated driving. Previous works perform amodal video instance segmentation relying on methods trained on entirely labeled video data with techniques borrowed from standard video instance segmentation. Such amodally labeled video data is difficult and expensive to obtain and the resulting methods suffer from a trade-off between instance segmentation and tracking performance. To largely solve this issue, we propose to study the application of foundation models for this task. More precisely, we exploit the extensive knowledge of the Segment Anything Model (SAM), while fine-tuning it to the amodal instance segmentation task. Given an initial video instance segmentation, we sample points from the visible masks to prompt our amodal SAM. We use a point memory to store those points. If a previously observed instance is not…
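The prompting step described in the abstract (sample points from visible masks, keep them in a point memory) can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: `sample_points`, `PointMemory`, and `prompts_for_frame` are hypothetical names, masks are toy nested lists, and no SAM call is made. The abstract is cut off before it says what happens when a previously observed instance goes missing, so the fallback used here (reusing the most recently stored points as prompts) is an assumption made purely for illustration.

```python
import random
from typing import Dict, List, Tuple

Point = Tuple[int, int]   # (row, col) pixel coordinate
Mask = List[List[int]]    # toy binary mask, 1 = visible instance pixel


def sample_points(mask: Mask, k: int, rng: random.Random) -> List[Point]:
    """Sample up to k foreground pixels from a visible mask."""
    foreground = [(r, c) for r, row in enumerate(mask)
                  for c, value in enumerate(row) if value]
    return rng.sample(foreground, min(k, len(foreground)))


class PointMemory:
    """Hypothetical store mapping instance id -> most recent prompt points."""

    def __init__(self) -> None:
        self._points: Dict[int, List[Point]] = {}

    def update(self, instance_id: int, points: List[Point]) -> None:
        if points:  # keep the old entry if nothing could be sampled
            self._points[instance_id] = points

    def get(self, instance_id: int) -> List[Point]:
        return self._points.get(instance_id, [])

    def known_ids(self) -> set:
        return set(self._points)


def prompts_for_frame(visible_masks: Dict[int, Mask], memory: PointMemory,
                      k: int = 3, seed: int = 0) -> Dict[int, List[Point]]:
    """Build point prompts per instance for one frame and refresh the memory."""
    rng = random.Random(seed)
    prompts: Dict[int, List[Point]] = {}
    for instance_id, mask in visible_masks.items():
        points = sample_points(mask, k, rng)
        memory.update(instance_id, points)
        prompts[instance_id] = points
    # Assumption (not from the source): instances seen earlier but absent
    # from this frame are prompted with their last stored points.
    for instance_id in memory.known_ids() - set(visible_masks):
        prompts[instance_id] = memory.get(instance_id)
    return prompts


# Toy usage: instance 2 is visible in frame 0 and missing in frame 1.
memory = PointMemory()
frame0 = {1: [[0, 1], [1, 1]], 2: [[1, 0], [0, 0]]}
frame1 = {1: [[1, 1], [0, 0]]}
p0 = prompts_for_frame(frame0, memory)
p1 = prompts_for_frame(frame1, memory)
```

In this toy run, `p1` still contains an entry for instance 2 even though frame 1 has no visible mask for it, because its points were retained in the memory. In a real pipeline the returned points would be passed as point prompts to the fine-tuned amodal model.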

Code & Models

Repositories

ifnspaml/s-amodal
jax · Official

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis

Methods: Segment Anything Model