SipMask: Spatial Information Preservation for Fast Image and Video Instance Segmentation
Jiale Cao, Rao Muhammad Anwer, Hisham Cholakkal, Fahad Shahbaz Khan, Yanwei Pang, Ling Shao

TL;DR
SipMask is a fast, single-stage instance segmentation method that preserves spatial information for improved accuracy and real-time performance, outperforming existing methods on COCO and YouTube-VIS datasets.
Contribution
The paper introduces a lightweight spatial preservation (SP) module, together with a mask alignment weighting loss and a feature alignment scheme, to improve accuracy while keeping single-stage instance segmentation fast.
Findings
Outperforms existing single-stage methods on COCO test-dev
Achieves 1.0% higher mask AP than TensorMask while running four times faster
Provides promising results for real-time video instance segmentation on YouTube-VIS
Abstract
Single-stage instance segmentation approaches have recently gained popularity due to their speed and simplicity, but are still lagging behind in accuracy, compared to two-stage methods. We propose a fast single-stage instance segmentation method, called SipMask, that preserves instance-specific spatial information by separating mask prediction of an instance to different sub-regions of a detected bounding-box. Our main contribution is a novel light-weight spatial preservation (SP) module that generates a separate set of spatial coefficients for each sub-region within a bounding-box, leading to improved mask predictions. It also enables accurate delineation of spatially adjacent instances. Further, we introduce a mask alignment weighting loss and a feature alignment scheme to better correlate mask prediction with object detection. On COCO test-dev, our SipMask outperforms the existing…
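The sub-region idea in the abstract can be illustrated with a minimal NumPy sketch. It assumes a set of shared basis masks that are linearly combined per instance, a 2x2 split of the bounding box, and a sigmoid on the combined map; none of these details are stated in the abstract. The function name `assemble_mask` and all parameter names are hypothetical, and this is not the authors' implementation. It only shows how a separate coefficient vector per sub-region lets each part of the box produce its own mask response.

```python
import numpy as np

def assemble_mask(basis, coeffs, box, grid=2):
    """Combine basis masks with one coefficient vector per box sub-region.

    basis  : (K, H, W) array of shared basis masks (assumed, hypothetical).
    coeffs : (grid*grid, K) array, one coefficient vector per sub-region.
    box    : (x0, y0, x1, y1) integer box, end coordinates exclusive.
    """
    _, height, width = basis.shape
    x0, y0, x1, y1 = box
    mask = np.zeros((height, width), dtype=np.float64)

    # Split the box into grid x grid sub-regions along each axis.
    xs = np.linspace(x0, x1, grid + 1).round().astype(int)
    ys = np.linspace(y0, y1, grid + 1).round().astype(int)

    for r in range(grid):
        for c in range(grid):
            ya, yb = ys[r], ys[r + 1]
            xa, xb = xs[c], xs[c + 1]
            w = coeffs[r * grid + c]  # coefficients for this sub-region, shape (K,)
            # Weighted sum of the basis masks, cropped to the sub-region.
            logits = np.tensordot(w, basis[:, ya:yb, xa:xb], axes=1)
            mask[ya:yb, xa:xb] = 1.0 / (1.0 + np.exp(-logits))
    return mask

# Toy usage: one constant basis mask, opposite-sign coefficients per quadrant.
basis = np.ones((1, 8, 8))
coeffs = np.array([[10.0], [-10.0], [-10.0], [10.0]])
mask = assemble_mask(basis, coeffs, box=(2, 2, 6, 6))
```

In this toy case the top-left and bottom-right quadrants of the box respond strongly while the other two stay near zero. A single coefficient vector for the whole box could not produce that pattern from one constant basis mask.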
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
