Stronger, Fewer, & Superior: Harnessing Vision Foundation Models for Domain Generalized Semantic Segmentation

Zhixiang Wei; Lin Chen; Yi Jin; Xiaoxiao Ma; Tianle Liu; Pengyang Ling; Ben Wang; Huaian Chen; Jinjin Zheng

arXiv:2312.04265·cs.CV·April 19, 2024·1 cites

Open Access · 2 Repos

TL;DR

This paper introduces Rein, a parameter-efficient fine-tuning method for Vision Foundation Models that enhances domain generalization in semantic segmentation, outperforming existing methods with fewer trainable parameters.

Contribution

Rein is a novel fine-tuning approach that uses trainable tokens to refine feature maps, achieving superior generalization with minimal additional parameters.

Findings

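01 Rein surpasses state-of-the-art methods on Domain Generalized Semantic Segmentation (DGSS) tasks.

02 With only 1% extra trainable parameters within the frozen backbone, Rein achieves 78.4% mIoU on Cityscapes without accessing any real urban-scene datasets.

03 Rein efficiently leverages Vision Foundation Models (VFMs) for domain generalization in semantic segmentation.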

Abstract

In this paper, we first assess and harness various Vision Foundation Models (VFMs) in the context of Domain Generalized Semantic Segmentation (DGSS). Driven by the motivation that Leveraging Stronger pre-trained models and Fewer trainable parameters for Superior generalizability, we introduce a robust fine-tuning approach, namely Rein, to parameter-efficiently harness VFMs for DGSS. Built upon a set of trainable tokens, each linked to distinct instances, Rein precisely refines and forwards the feature maps from each layer to the next layer within the backbone. This process produces diverse refinements for different categories within a single image. With fewer trainable parameters, Rein efficiently fine-tunes VFMs for DGSS tasks, surprisingly surpassing full parameter fine-tuning. Extensive experiments across various settings demonstrate that Rein significantly outperforms…
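The layer-by-layer refinement described in the abstract can be pictured with a minimal sketch. It assumes, beyond what the abstract states, that each patch feature is compared with the tokens through a scaled dot product and softmax, and that the token-weighted sum is added back to the feature before it reaches the next layer. The names `refine`, `forward`, and `scale` are hypothetical, and the frozen layers are stand-in identity functions; this is an illustration, not the authors' implementation.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def refine(features, tokens, scale=1.0):
    """Refine frozen-layer features with a small set of trainable tokens.

    features: n patch vectors of length c produced by a frozen layer.
    tokens:   m token vectors of length c (the only trainable part here).
    Returns features of the same shape, computed as f + scale * delta.
    Assumption: similarity is a scaled dot product followed by softmax.
    """
    c = len(tokens[0])
    refined = []
    for f in features:
        # Similarity between this patch and every token.
        scores = [sum(a * b for a, b in zip(f, t)) / math.sqrt(c) for t in tokens]
        weights = softmax(scores)
        # Token-weighted update, so patches that match different tokens
        # receive different refinements.
        delta = [sum(w * t[j] for w, t in zip(weights, tokens)) for j in range(c)]
        refined.append([a + scale * d for a, d in zip(f, delta)])
    return refined


def forward(x, frozen_layers, tokens_per_layer):
    """Run the frozen layers, refining each output before passing it on."""
    for layer, tokens in zip(frozen_layers, tokens_per_layer):
        x = layer(x)           # frozen backbone layer, never updated
        x = refine(x, tokens)  # refined features go to the next layer
    return x


# Toy usage: 4 patches, 8 channels, 3 tokens per layer, 2 layers.
random.seed(0)
n, c, m, depth = 4, 8, 3, 2
x = [[random.gauss(0, 1) for _ in range(c)] for _ in range(n)]


def identity(feats):
    """Hypothetical stand-in for a frozen backbone layer."""
    return feats


zero_tokens = [[[0.0] * c for _ in range(m)] for _ in range(depth)]
rand_tokens = [
    [[random.gauss(0, 1) for _ in range(c)] for _ in range(m)]
    for _ in range(depth)
]
out_zero = forward(x, [identity] * depth, zero_tokens)
out_rand = forward(x, [identity] * depth, rand_tokens)
```

With all-zero tokens the refinement leaves the features unchanged, so in this sketch the frozen backbone's behavior is the starting point and only the `depth * m * c` token values would need training.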

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications

Methods: Sparse Evolutionary Training