Open-world Instance Segmentation: Top-down Learning with Bottom-up Supervision
Tarun Kalluri, Weiyao Wang, Heng Wang, Manmohan Chandraker, Lorenzo Torresani, Du Tran

TL;DR
This paper introduces UDOS, a novel open-world instance segmentation method that combines top-down learning with bottom-up supervision, improving generalization to unseen categories across multiple datasets.
Contribution
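The paper proposes an open-world segmentation approach that uses class-agnostic bottom-up segmentations as weak supervision for a top-down network, then groups and refines the predicted part-masks into instance-level segmentations, with the aim of improving generalization to unseen categories while remaining efficient.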
Findings
Significant performance improvements over state-of-the-art methods.
Effective cross-category and cross-dataset transfer capabilities.
Robust instance segmentation on diverse challenging datasets.
Abstract
Many top-down architectures for instance segmentation achieve significant success when trained and tested on a pre-defined closed-world taxonomy. However, when deployed in the open world, they exhibit a notable bias towards seen classes and suffer from a significant performance drop. In this work, we propose a novel approach for open-world instance segmentation called bottom-Up and top-Down Open-world Segmentation (UDOS) that combines classical bottom-up segmentation algorithms within a top-down learning framework. UDOS first predicts parts of objects using a top-down network trained with weak supervision from bottom-up segmentations. The bottom-up segmentations are class-agnostic and do not overfit to specific taxonomies. The part-masks are then fed into affinity-based grouping and refinement modules to predict robust instance-level segmentations. UDOS enjoys both the speed and efficiency…
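The grouping step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how affinity is computed or how parts are clustered, so everything below is an assumption made for illustration: affinity is taken to be the IoU between part bounding boxes, grouping is single-link merging via union-find with a hypothetical threshold, and the names box_iou, group_parts, and merge_boxes are not from the paper. The refinement module is not modeled.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2). Assumed stand-in for affinity."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def group_parts(boxes, threshold=0.2):
    """Group part boxes whose pairwise affinity reaches the (hypothetical) threshold.

    Uses single-link merging with union-find; returns sorted lists of part indices.
    """
    parent = list(range(len(boxes)))

    def find(i):
        # Follow parent links to the root, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if box_iou(boxes[i], boxes[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(boxes)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def merge_boxes(boxes, groups):
    """Return one enclosing box per group of parts."""
    merged = []
    for g in groups:
        merged.append((
            min(boxes[i][0] for i in g),
            min(boxes[i][1] for i in g),
            max(boxes[i][2] for i in g),
            max(boxes[i][3] for i in g),
        ))
    return merged


# Two overlapping parts and one distant part.
parts = [(0, 0, 10, 10), (5, 0, 15, 10), (100, 100, 110, 110)]
groups = group_parts(parts)
instances = merge_boxes(parts, groups)
```

Here the first two parts overlap with an IoU of 1/3 and are merged into one instance box, while the third stays on its own. A learned affinity and a different clustering rule could be substituted without changing the overall shape of the step.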
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Medical Image Segmentation Techniques
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
