# Foundation Model Drives Weakly Incremental Learning for Semantic Segmentation

**Authors:** Chaohui Yu, Qiang Zhou, Jingliang Li, Jianlong Yuan, Zhibin Wang, Fan Wang

arXiv: 2302.14250 · 2023-04-21

## TL;DR

This paper introduces FMWISS, a data-efficient framework for weakly incremental semantic segmentation that leverages foundation models, contrastive learning, and augmentation to improve performance and mitigate forgetting.

## Contribution

The paper proposes a novel framework combining foundation model distillation, contrastive learning, and memory-based augmentation for weakly incremental semantic segmentation.

## Key findings

- FMWISS reaches 70.7% and 73.3% in the 15-5 Pascal VOC setting, outperforming the previous state-of-the-art method by 3.4% and 6.1%, respectively.
- The framework effectively utilizes image-level labels to generate dense pseudo labels.
- Extensive experiments validate the superiority of FMWISS over state-of-the-art approaches.
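
The idea of turning image-level labels into dense pseudo labels can be illustrated with a small sketch. This is not the paper's co-segmentation pipeline, whose details are not given in this summary; it only shows one generic way to use image-level tags: restrict a per-pixel class score map to the classes tagged as present and mark low-confidence pixels as unlabeled. The function name `pseudo_labels_from_image_tags`, the `min_conf` threshold, and the `IGNORE_INDEX` value are assumptions made for illustration.

```python
import numpy as np

IGNORE_INDEX = 255  # assumed "unlabeled" id, a common segmentation convention


def pseudo_labels_from_image_tags(scores, present_classes, min_conf=0.5):
    """Build a dense pseudo label map from per-pixel scores and image-level tags.

    scores: float array of shape (C, H, W); class 0 is treated as background.
    present_classes: class ids listed in the image-level label.
    min_conf: hypothetical threshold below which a pixel is left unlabeled.
    """
    # Only background and the tagged classes may appear in the pseudo label.
    allowed = np.zeros(scores.shape[0], dtype=bool)
    allowed[0] = True
    allowed[list(present_classes)] = True

    # Suppress scores of classes that the image-level label rules out.
    masked = np.where(allowed[:, None, None], scores, -np.inf)

    labels = masked.argmax(axis=0).astype(np.uint8)
    confidence = masked.max(axis=0)

    # Low-confidence pixels are marked as ignored instead of guessed.
    labels[confidence < min_conf] = IGNORE_INDEX
    return labels
```

Marking uncertain pixels as ignored rather than forcing a class keeps noisy regions out of the training signal, which is one simple way to cope with the lack of localization detail in image-level labels.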

## Abstract

Modern methods for incremental learning in semantic segmentation usually learn new categories from dense annotations. Although they achieve promising results, pixel-by-pixel labeling is costly and time-consuming. Weakly incremental learning for semantic segmentation (WILSS) is a novel and attractive task that aims to learn to segment new classes from cheap and widely available image-level labels. Despite comparable results, image-level labels cannot provide the details needed to locate each segment, which limits the performance of WILSS. This motivates us to consider how to improve and effectively utilize the supervision of new classes given image-level labels while avoiding forgetting old ones. In this work, we propose a novel and data-efficient framework for WILSS, named FMWISS. Specifically, we propose pre-training based co-segmentation to distill the knowledge of complementary foundation models for generating dense pseudo labels. We further optimize the noisy pseudo masks with a teacher-student architecture, where a plug-in teacher is optimized with a proposed dense contrastive loss. Moreover, we introduce memory-based copy-paste augmentation to alleviate catastrophic forgetting of old classes. Extensive experiments on the Pascal VOC and COCO datasets demonstrate the superior performance of our framework, e.g., FMWISS achieves 70.7% and 73.3% in the 15-5 VOC setting, outperforming the state-of-the-art method by 3.4% and 6.1%, respectively.
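
To make the memory-based copy-paste augmentation more concrete, here is a minimal sketch under simplifying assumptions. It is not the authors' implementation: the memory layout (a list of `(image, mask, class_id)` tuples), the same-size paste without rescaling or repositioning, and the name `copy_paste_from_memory` are all hypothetical choices for illustration.

```python
import numpy as np


def copy_paste_from_memory(image, label, memory, rng):
    """Paste one stored old-class object onto a new-step training sample.

    image:  uint8 array of shape (H, W, 3).
    label:  integer array of shape (H, W) holding per-pixel class ids.
    memory: list of (mem_image, mem_mask, class_id) tuples, where mem_image
            is (H, W, 3), mem_mask is a boolean (H, W) object mask, and
            class_id is the old class of the stored object (assumed layout).
    rng:    a numpy.random.Generator used to pick the memory entry.
    """
    mem_image, mem_mask, class_id = memory[rng.integers(len(memory))]

    # Work on copies so the original sample stays untouched.
    out_image = image.copy()
    out_label = label.copy()

    # Overwrite pixels and labels where the stored object is located.
    out_image[mem_mask] = mem_image[mem_mask]
    out_label[mem_mask] = class_id
    return out_image, out_label
```

A call might look like `copy_paste_from_memory(img, lbl, memory, np.random.default_rng(0))`. The augmented sample then contains pixels and labels of an old class alongside the new-class content, which is the intuition behind using such pasting to counter forgetting.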

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2302.14250/full.md

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/2302.14250/full.md

## References

54 references — full list in the complete paper: https://tomesphere.com/paper/2302.14250/full.md

---
Source: https://tomesphere.com/paper/2302.14250