Open-world Instance Segmentation: Top-down Learning with Bottom-up Supervision

Tarun Kalluri; Weiyao Wang; Heng Wang; Manmohan Chandraker; Lorenzo Torresani; Du Tran

arXiv:2303.05503 · cs.CV · May 15, 2024 · 1 citation

TL;DR

This paper introduces UDOS, a novel open-world instance segmentation method that combines top-down learning with bottom-up supervision, improving generalization to unseen categories across multiple datasets.

Contribution

The paper proposes a new open-world segmentation approach that integrates bottom-up class-agnostic segmentation with top-down learning, enhancing generalization and efficiency.

Findings

01. Significant performance improvements over state-of-the-art methods.
02. Effective cross-category and cross-dataset transfer capabilities.
03. Robust instance segmentation on diverse challenging datasets.

Abstract

Many top-down architectures for instance segmentation achieve significant success when trained and tested on a pre-defined closed-world taxonomy. However, when deployed in the open world, they exhibit a notable bias towards seen classes and suffer from a significant performance drop. In this work, we propose a novel approach for open-world instance segmentation called bottom-Up and top-Down Open-world Segmentation (UDOS) that combines classical bottom-up segmentation algorithms within a top-down learning framework. UDOS first predicts parts of objects using a top-down network trained with weak supervision from bottom-up segmentations. The bottom-up segmentations are class-agnostic and do not overfit to specific taxonomies. The part-masks are then fed into affinity-based grouping and refinement modules to predict robust instance-level segmentations. UDOS enjoys both the speed and efficiency…
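The grouping step named in the abstract can be illustrated with a small sketch. This page does not say how UDOS defines affinity or how it merges parts, so everything below is an assumption made for illustration: affinity is taken to be the IoU between part bounding boxes, parts are merged by single-link grouping with a union-find structure, and the names `box_iou`, `group_parts`, and `threshold` are hypothetical. It is not the authors' implementation.

```python
from typing import Dict, List, Tuple

# A part is represented only by its bounding box (x1, y1, x2, y2).
# ASSUMPTION: the real method operates on part-masks; boxes keep the sketch short.
Box = Tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def group_parts(boxes: List[Box], threshold: float = 0.1) -> List[List[int]]:
    """Group part indices whose pairwise affinity reaches `threshold`.

    ASSUMPTION: affinity is box IoU and grouping is single-link
    (connected components), chosen here only as a simple stand-in.
    """
    parent = list(range(len(boxes)))

    def find(i: int) -> int:
        # Path-halving find for the union-find structure.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if box_iou(boxes[i], boxes[j]) >= threshold:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri  # merge the two groups

    groups: Dict[int, List[int]] = {}
    for i in range(len(boxes)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Two overlapping parts and one distant part.
parts = [(0, 0, 10, 10), (5, 0, 15, 10), (100, 100, 110, 110)]
instances = group_parts(parts)
```

In this toy input the first two parts overlap and fall into one group, while the distant third part stays on its own. A refinement stage, which the abstract mentions but this sketch does not model, would then turn each group into a final instance mask.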

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Medical Image Segmentation Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings