TL;DR
This paper introduces INSTR, a transformer-based method for object instance segmentation from stereo images that does not rely on depth data, enabling effective segmentation in dynamic, cluttered environments.
Contribution
The paper presents a transformer-based instance segmentation approach that operates on stereo image pairs without computing an explicit depth map, and reports better performance than depth-based methods.
Findings
Outperforms state-of-the-art depth-based segmentation methods
Effective in various application domains
Does not require prior object semantic or geometric information
Abstract
Although instance-aware perception is a key prerequisite for many autonomous robotic applications, most of the methods only partially solve the problem by focusing solely on known object categories. However, for robots interacting in dynamic and cluttered environments, this is not realistic and severely limits the range of potential applications. Therefore, we propose a novel object instance segmentation approach that does not require any semantic or geometric information of the objects beforehand. In contrast to existing works, we do not explicitly use depth data as input, but rely on the insight that slight viewpoint changes, which for example are provided by stereo image pairs, are often sufficient to determine object boundaries and thus to segment objects. Focusing on the versatility of stereo sensors, we employ a transformer-based architecture that maps directly from the pair of…
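The insight that a slight viewpoint change can be enough to reveal object boundaries can be illustrated with a toy example. The sketch below is not the paper's method: the paper avoids computing depth or disparity explicitly, whereas this sketch makes the cue explicit to show why it carries boundary information. The functions `make_stereo_pair` and `per_pixel_shift`, the image sizes, and the random textures are all assumptions introduced for illustration.

```python
import numpy as np

def make_stereo_pair(height=32, width=32, box=(12, 14, 8, 8), shift=3, seed=0):
    """Build a synthetic left/right pair (hypothetical toy scene).

    The background is identical in both views (far away, no parallax).
    A textured box is displaced horizontally by `shift` pixels in the
    right view (close to the camera, visible parallax).
    """
    rng = np.random.default_rng(seed)
    background = rng.random((height, width))
    r, c, h, w = box
    # Offset the foreground texture so it never matches background values.
    foreground = rng.random((h, w)) + 2.0
    left = background.copy()
    left[r:r + h, c:c + w] = foreground
    right = background.copy()
    right[r:r + h, c - shift:c - shift + w] = foreground
    return left, right

def per_pixel_shift(left, right, max_shift=4):
    """For each left pixel, return the horizontal shift that best matches the right view.

    Single-pixel matching only works here because the toy textures are
    noise-free; real images would need windowed or learned matching.
    """
    costs = np.stack([
        np.abs(left - np.roll(right, d, axis=1))  # right[y, x - d] aligned to left[y, x]
        for d in range(max_shift + 1)
    ])
    return costs.argmin(axis=0)

left, right = make_stereo_pair()
shifts = per_pixel_shift(left, right)
# Pixels whose apparent shift differs from the background's belong to the box.
object_mask = shifts == 3
```

In this toy scene the box separates from the background purely because it moves differently between the two views; no category label or shape prior is used. A narrow strip next to the box is occluded in the right view and gets an arbitrary shift, which is one reason hand-crafted matching is brittle near boundaries.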
Taxonomy
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Dropout · Attention Is All You Need · Adam · Residual Connection · Byte Pair Encoding · Label Smoothing · Multi-Head Attention
