TL;DR
This paper introduces a multi-modal, weakly supervised segmentation method that combines cross-modality equivariant constraints with self-supervision to refine class activation maps (CAMs), which in turn yields more accurate segmentations.
Contribution
The paper proposes a learning strategy whose loss function enforces equivariance and cross-modality consistency, improving the CAMs used as proxy labels in weakly supervised segmentation.
Findings
Outperforms recent weakly supervised methods on the BraTS brain tumor segmentation dataset
Leverages multi-modal data effectively to improve segmentation
Enhances CAMs with equivariance regularization and a KL-divergence term
Abstract
Weakly supervised learning has emerged as an appealing alternative to alleviate the need for large labeled datasets in semantic segmentation. Most current approaches exploit class activation maps (CAMs), which can be generated from image-level annotations. Nevertheless, resulting maps have been demonstrated to be highly discriminant, failing to serve as optimal proxy pixel-level labels. We present a novel learning strategy that leverages self-supervision in a multi-modal image scenario to significantly enhance original CAMs. In particular, the proposed method is based on two observations. First, the learning of fully-supervised segmentation networks implicitly imposes equivariance by means of data augmentation, whereas this implicit constraint disappears on CAMs generated with image tags. And second, the commonalities between image modalities can be employed as an efficient…
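The two observations in the abstract can be illustrated with a small numerical sketch. This is not the authors' implementation: the function names, the toy "CAM" functions, and the choice of a horizontal flip as the transformation are all assumptions made for illustration. The first part measures how far a CAM generator is from being equivariant, that is, how much CAM(T(x)) differs from T(CAM(x)). The second part computes a KL divergence between class probabilities predicted from two modalities, which is one plausible way to express cross-modality consistency.

```python
import numpy as np

def sigmoid_cam(image):
    """Hypothetical pixel-wise 'CAM'; commutes with flips by construction."""
    return 1.0 / (1.0 + np.exp(-image))

def biased_cam(image):
    """Hypothetical CAM with a position-dependent bias, so it is not flip-equivariant."""
    ramp = np.linspace(0.0, 1.0, image.shape[1])[None, :]
    return sigmoid_cam(image) * ramp

def equivariance_loss(cam_fn, image, transform):
    """Mean absolute gap between CAM(T(x)) and T(CAM(x))."""
    cam_of_transformed = cam_fn(transform(image))
    transformed_cam = transform(cam_fn(image))
    return float(np.mean(np.abs(cam_of_transformed - transformed_cam)))

def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    shifted = logits - np.max(logits)
    exps = np.exp(shifted)
    return exps / exps.sum()

def kl_divergence(p, q, eps=1e-8):
    """KL(p || q) for two discrete distributions over the same classes."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.sum(p * (np.log(p) - np.log(q))))

# Toy single-channel "image"; the seed and size are arbitrary.
rng = np.random.default_rng(0)
image = rng.normal(size=(8, 8))

# Equivariance gap under a horizontal flip (assumed transformation).
loss_equivariant = equivariance_loss(sigmoid_cam, image, np.fliplr)
loss_biased = equivariance_loss(biased_cam, image, np.fliplr)

# Made-up class logits for the same image seen through two modalities.
probs_modality_a = softmax(np.array([2.0, 0.5]))
probs_modality_b = softmax(np.array([1.0, 1.0]))
consistency = kl_divergence(probs_modality_a, probs_modality_b)
```

In a training setup, terms like `loss_biased` and `consistency` would typically be added to the image-level classification loss with weighting coefficients. The weights and the exact form of each term in the paper are not recoverable from the truncated abstract, so they are left out here.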
