Multi-view Aggregation Network for Dichotomous Image Segmentation

Qian Yu; Xiaoqi Zhao; Youwei Pang; Lihe Zhang; Huchuan Lu

arXiv:2404.07445 · cs.CV · April 12, 2024 · 1 cites

TL;DR

This paper introduces MVANet, a multi-view aggregation network inspired by human vision, which effectively balances global localization and local detail in high-resolution dichotomous image segmentation, outperforming existing methods.

Contribution

The paper proposes a novel multi-view aggregation network that unifies feature fusion from distant and close-up views into a single encoder-decoder structure for improved DIS.

Findings

1. MVANet outperforms state-of-the-art methods in accuracy.
2. MVANet also runs faster than those methods.
3. The approach captures slender structures in high-resolution images effectively.

Abstract

Dichotomous Image Segmentation (DIS) has recently emerged as a task aimed at high-precision object segmentation from high-resolution natural images. When designing an effective DIS model, the main challenge is balancing two failure modes: the semantic dispersion of high-resolution targets under a small receptive field, and the loss of high-precision details under a large receptive field. Existing methods rely on tedious multiple encoder-decoder streams and stages to gradually complete global localization and local refinement. The human visual system captures regions of interest by observing them from multiple views. Inspired by this, we model DIS as a multi-view object perception problem and provide a parsimonious multi-view aggregation network (MVANet), which unifies the feature fusion of the distant view and close-up view into a single stream with one encoder-decoder structure. With the help of the proposed…
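
To make the distant-view and close-up-view idea concrete, here is a minimal sketch of how such inputs could be prepared from one high-resolution image. It is an illustration under stated assumptions, not the authors' implementation: the function name build_views, the 2x2 patch grid, and the use of average pooling for the distant view are all hypothetical choices made for this example.

```python
import numpy as np


def build_views(image: np.ndarray, grid: int = 2) -> np.ndarray:
    """Stack one distant view and grid*grid close-up views on a new leading axis.

    image: array of shape (H, W, C), with H and W divisible by `grid`.
    Returns an array of shape (1 + grid*grid, H // grid, W // grid, C).
    NOTE: `grid` and the pooling scheme are assumptions made for illustration.
    """
    h, w, c = image.shape
    if h % grid or w % grid:
        raise ValueError("H and W must be divisible by grid")
    ph, pw = h // grid, w // grid

    # Distant view: average-pool the whole image down to the patch size,
    # so the full object stays visible at reduced detail.
    distant = image.reshape(ph, grid, pw, grid, c).mean(axis=(1, 3))

    # Close-up views: non-overlapping crops kept at full resolution,
    # ordered row by row from the top-left corner.
    close_ups = [
        image[r * ph:(r + 1) * ph, col * pw:(col + 1) * pw]
        for r in range(grid)
        for col in range(grid)
    ]
    return np.stack([distant, *close_ups]).astype(image.dtype)


if __name__ == "__main__":
    img = np.random.rand(1024, 1024, 3)
    views = build_views(img)
    print(views.shape)
```

Because all views share one shape, they can be stacked as a single batch and passed through one shared encoder. That is one plausible reading of the "single stream" described in the abstract; how MVANet actually fuses the views is not specified in the text shown here.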

Code & Models

Repositories

qianyu-dlut/mvanet (PyTorch, official)

Taxonomy

Topics: Advanced Neural Network Applications · Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques

Methods: Focus