Two-Stream Networks for Object Segmentation in Videos
Hannan Lu, Zhi Tian, Lirong Yang, Haibing Ren, Wangmeng Zuo

TL;DR
This paper introduces a Two-Stream Network for video object segmentation that combines pixel-level memory with a holistic instance understanding, significantly improving segmentation accuracy for both seen and unseen pixels.
Contribution
The paper proposes a Two-Stream Network architecture that segments unseen pixels (those without a correspondence in the pixel-level memory) by complementing the pixel-level memory with a dynamic instance stream.
Findings
Achieves state-of-the-art performance on YouTube-VOS 2018 with 86.1%.
Achieves state-of-the-art performance on DAVIS-2017 with 87.5%.
Demonstrates that fusing the pixel and instance streams improves overall segmentation.
Abstract
Existing matching-based approaches perform video object segmentation (VOS) by retrieving support features from a pixel-level memory. However, some pixels lack correspondence in the memory (i.e., they are unseen), which inevitably limits segmentation performance. In this paper, we present a Two-Stream Network (TSN). Our TSN includes (i) a pixel stream with a conventional pixel-level memory, which segments the seen pixels based on their pixel-level memory retrieval; (ii) an instance stream for the unseen pixels, where a holistic understanding of the instance is obtained with dynamic segmentation heads conditioned on the features of the target instance; and (iii) a pixel division module that generates a routing map, with which the output embeddings of the two streams are fused together. The compact instance stream effectively improves the segmentation accuracy of the unseen pixels, while…
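The abstract names two mechanisms: retrieval from a pixel-level memory, and fusion of the two streams' embeddings through a routing map. It does not give formulas for either. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the memory read is assumed to be a softmax-weighted average over memory entries, and the fusion is assumed to be a per-pixel convex combination weighted by a routing value in [0, 1]. The function names `memory_readout` and `fuse_pixel` are hypothetical.

```python
import math


def memory_readout(query_key, memory_keys, memory_values):
    """Softmax-weighted read of memory values for a single query pixel.

    Assumption: scaled dot-product similarity between the query key and
    each memory key. The paper's actual similarity may differ.
    """
    scale = math.sqrt(len(query_key))
    scores = [
        sum(q * k for q, k in zip(query_key, key)) / scale
        for key in memory_keys
    ]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(memory_values[0])
    return [
        sum(w * value[d] for w, value in zip(weights, memory_values))
        for d in range(dim)
    ]


def fuse_pixel(pixel_emb, instance_emb, route):
    """Fuse the two stream embeddings for a single pixel.

    Assumption: route is a scalar in [0, 1]; route = 1 trusts the pixel
    stream (a "seen" pixel), route = 0 trusts the instance stream (an
    "unseen" pixel).
    """
    return [route * p + (1.0 - route) * i
            for p, i in zip(pixel_emb, instance_emb)]


# Usage: a pixel judged mostly "unseen" leans on the instance stream.
pixel_out = memory_readout([1.0, 0.0],
                           [[1.0, 0.0], [0.0, 1.0]],
                           [[0.2, 0.8], [0.6, 0.4]])
fused = fuse_pixel(pixel_out, [0.9, 0.1], route=0.25)
```

A soft routing value is used here so that the blend stays differentiable; whether the paper's routing map is soft or hard is not stated in the abstract.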
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques · Video Surveillance and Tracking Methods
