A Fully Convolutional Two-Stream Fusion Network for Interactive Image Segmentation
Yang Hu, Andrea Soltoggio, Russell Lock, Steve Carter

TL;DR
This paper introduces a novel fully convolutional two-stream fusion network for interactive image segmentation, combining a low-resolution foreground prediction with multi-scale refinement to improve segmentation accuracy.
Contribution
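The paper presents a two-stream fusion network architecture that processes the image and the user interactions in separate streams, fuses them to predict the foreground at reduced resolution, and then refines that prediction at full resolution using multi-scale features.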
Findings
Achieves competitive performance on four benchmark datasets.
Reduces the number of layers between the user-interaction features and the network output, so that interactions influence the segmentation result more directly.
Combines low-resolution prediction with multi-scale refinement for better accuracy.
Abstract
In this paper, we propose a novel fully convolutional two-stream fusion network (FCTSFN) for interactive image segmentation. The proposed network includes two sub-networks: a two-stream late fusion network (TSLFN) that predicts the foreground at a reduced resolution, and a multi-scale refining network (MSRN) that refines the foreground at full resolution. The TSLFN includes two distinct deep streams followed by a fusion network. The intuition is that, since user interactions are more direct information on foreground/background than the image itself, the two-stream structure of the TSLFN reduces the number of layers between the pure user interaction features and the network output, allowing the user interactions to have a more direct impact on the segmentation result. The MSRN fuses the features from different layers of TSLFN with different scales, in order to seek the local to global…
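The data flow described in the abstract can be illustrated with a toy sketch. This is not the authors' network: every "layer" below is a parameter-free stand-in (average pooling, nearest-neighbour upsampling, element-wise averaging), and the function names, the depth of 2, and the choice to draw refinement features from both streams are assumptions made only for illustration. The sketch shows only the shape of the pipeline: two separate streams, a late fusion that yields a reduced-resolution prediction (the TSLFN role), and a coarse-to-fine refinement back to full resolution (the MSRN role).

```python
# Toy data-flow sketch of a two-stream late-fusion layout.
# All operations are hypothetical stand-ins, not the paper's layers.

def avg_pool2(x):
    """Halve the resolution of a 2-D map (list of rows) with 2x2 average pooling."""
    h, w = len(x), len(x[0])
    return [[(x[i][j] + x[i][j + 1] + x[i + 1][j] + x[i + 1][j + 1]) / 4.0
             for j in range(0, w, 2)]
            for i in range(0, h, 2)]

def upsample2(x):
    """Double the resolution by nearest-neighbour repetition."""
    out = []
    for row in x:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def stream(x, depth):
    """One stream: repeatedly downsample, keeping the feature map at each scale."""
    feats = []
    for _ in range(depth):
        x = avg_pool2(x)
        feats.append(x)
    return feats

def fuse(a, b):
    """Fusion stand-in: element-wise mean of two maps of equal size."""
    return [[(p + q) / 2.0 for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def two_stream_forward(image, interaction, depth=2):
    """Return (coarse, refined); height and width must be divisible by 2**depth."""
    img_feats = stream(image, depth)        # image stream
    int_feats = stream(interaction, depth)  # user-interaction stream
    # Late fusion: the streams only meet here, at reduced resolution.
    coarse = fuse(img_feats[-1], int_feats[-1])
    # Refinement: upsample step by step, blending in finer-scale features.
    refined = coarse
    for level in range(depth - 2, -1, -1):
        refined = upsample2(refined)
        refined = fuse(refined, fuse(img_feats[level], int_feats[level]))
    refined = upsample2(refined)            # back to input resolution
    return coarse, refined

image = [[0.0] * 8 for _ in range(8)]        # placeholder single-channel image
interaction = [[1.0] * 8 for _ in range(8)]  # placeholder interaction map
coarse, refined = two_stream_forward(image, interaction)
print(len(coarse), len(coarse[0]), len(refined), len(refined[0]))
```

In a real network each stand-in would be replaced by learned convolutional layers. The point of the layout is that the interaction map reaches the fusion step through its own short stream instead of being mixed with the image at the first layer.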
Taxonomy
Topics: Advanced Neural Network Applications · Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques
