Diff-VPS: Video Polyp Segmentation via a Multi-task Diffusion Network with Adversarial Temporal Reasoning

Yingling Lu; Yijun Yang; Zhaohu Xing; Qiong Wang; Lei Zhu

arXiv:2409.07238·cs.CV·September 12, 2024

TL;DR

This paper introduces Diff-VPS, a diffusion-based multi-task network with adversarial temporal reasoning for improved video polyp segmentation, achieving state-of-the-art results by integrating high-level contextual information and temporal dependencies.

Contribution

The paper presents a diffusion model for video polyp segmentation that adds two new components: multi-task supervision and a temporal reasoning module trained adversarially.

Findings

01 Achieves state-of-the-art performance on the SUN-SEG dataset.

02 Effectively captures temporal dependencies and dynamic cues.

03 Enhances pixel-wise segmentation accuracy.

Abstract

Diffusion Probabilistic Models have recently attracted significant attention in the computer vision community due to their outstanding performance. However, while a substantial amount of diffusion-based research has focused on generative tasks, no work has introduced diffusion models to advance the results of polyp segmentation in videos, which is frequently challenged by polyps' high camouflage and redundant temporal cues. In this paper, we present a novel diffusion-based network for the video polyp segmentation task, dubbed Diff-VPS. We incorporate multi-task supervision into diffusion models to promote the discrimination of diffusion models on pixel-by-pixel segmentation. This integrates the contextual high-level information achieved by the joint classification and detection tasks. To explore the temporal dependency, a Temporal Reasoning Module (TRM) is devised via reasoning and…
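As a rough illustration of what multi-task supervision can look like, the sketch below combines a pixel-wise segmentation loss with auxiliary classification and detection losses in a single weighted objective. It is not taken from the paper or its repository: the function names, the choice of binary cross-entropy and L1 terms, and the weights `w_seg`, `w_cls`, and `w_det` are all assumptions made for illustration.

```python
import math


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and 0/1 labels."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(pred)


def l1(pred_box, target_box):
    """Mean absolute error between two boxes given as (x1, y1, x2, y2)."""
    return sum(abs(p - t) for p, t in zip(pred_box, target_box)) / len(pred_box)


def multi_task_loss(seg_pred, seg_gt, cls_pred, cls_gt, box_pred, box_gt,
                    w_seg=1.0, w_cls=0.5, w_det=0.5):
    """Weighted sum of segmentation, classification, and detection losses.

    The weights are hypothetical; the paper's actual losses and weighting
    are not given in the text above.
    """
    seg_loss = bce(seg_pred, seg_gt)        # pixel-wise mask supervision
    cls_loss = bce([cls_pred], [cls_gt])    # frame-level "polyp present?" label
    det_loss = l1(box_pred, box_gt)         # bounding-box regression
    return w_seg * seg_loss + w_cls * cls_loss + w_det * det_loss


# A flattened 4-pixel mask, a frame-level label, and a normalized box.
good = multi_task_loss([0.9, 0.1, 0.8, 0.2], [1, 0, 1, 0],
                       0.95, 1, [0.1, 0.1, 0.5, 0.5], [0.1, 0.1, 0.5, 0.5])
bad = multi_task_loss([0.4, 0.6, 0.5, 0.5], [1, 0, 1, 0],
                      0.30, 1, [0.3, 0.3, 0.9, 0.9], [0.1, 0.1, 0.5, 0.5])
print(good < bad)
```

The idea behind such an objective is that the auxiliary terms push shared features toward frame-level and object-level context, which the segmentation branch can then use.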

Code & Models

Repositories

lydia-yllu/diff-vps (PyTorch, official)

Taxonomy

Topics: Artificial Intelligence Applications · Handwritten Text Recognition Techniques · Generative Adversarial Networks and Image Synthesis

Methods: Softmax · Attention Is All You Need · Diffusion