Bridging Vision and Language Encoders: Parameter-Efficient Tuning for Referring Image Segmentation

Zunnan Xu; Zhihong Chen; Yong Zhang; Yibing Song; Xiang Wan; Guanbin Li

arXiv:2307.11545·cs.CV·July 24, 2023·2 cites

TL;DR

This paper introduces a parameter-efficient tuning method for referring image segmentation, using a novel adapter and lightweight decoder to achieve high performance with minimal parameter updates.

Contribution

It proposes Bridger, a new adapter for cross-modal interaction, and a lightweight decoder, enabling effective dense prediction with minimal parameter tuning.

Findings

1. Achieves comparable or better performance while updating only 1.61% to 3.38% of the backbone parameters.

2. Demonstrates effectiveness on challenging referring image segmentation benchmarks.

3. Provides a practical approach for resource-efficient dense prediction tasks.

Abstract

Parameter Efficient Tuning (PET) has gained attention for reducing the number of tuned parameters while maintaining performance and saving hardware resources, but few studies investigate dense prediction tasks and the interaction between modalities. In this paper, we investigate efficient tuning for referring image segmentation. We propose a novel adapter called Bridger to facilitate cross-modal information exchange and inject task-specific information into the pre-trained model. We also design a lightweight decoder for image segmentation. Our approach achieves comparable or superior performance while updating only 1.61% to 3.38% of the backbone parameters, evaluated on challenging benchmarks. The code is available at https://github.com/kkakkkka/ETRIS.
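To make the 1.61% to 3.38% figure concrete, the arithmetic can be sketched as below. This is only an illustration, assuming the percentage means updated parameters divided by backbone parameters; the 100-million-parameter backbone size and the helper names `updated_fraction` and `updated_budget` are hypothetical and do not come from the paper or its repository.

```python
def updated_fraction(num_updated: int, num_backbone: int) -> float:
    """Return updated parameters as a percentage of backbone parameters."""
    if num_backbone <= 0:
        raise ValueError("num_backbone must be positive")
    return 100.0 * num_updated / num_backbone


def updated_budget(percent: float, num_backbone: int) -> int:
    """Return how many parameters a given percentage of the backbone allows."""
    return round(percent / 100.0 * num_backbone)


# Hypothetical backbone size, chosen only for illustration (not from the paper).
HYPOTHETICAL_BACKBONE_PARAMS = 100_000_000

# Parameter budgets implied by the two ends of the reported range.
low = updated_budget(1.61, HYPOTHETICAL_BACKBONE_PARAMS)
high = updated_budget(3.38, HYPOTHETICAL_BACKBONE_PARAMS)

print(f"1.61% of the backbone: {low:,} parameters")
print(f"3.38% of the backbone: {high:,} parameters")
```

Under this assumed backbone size, the reported range corresponds to a few million updated parameters, with the rest of the pre-trained weights left frozen.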

Code & Models

Repositories

kkakkkka/etris (PyTorch, official)

Models

thuteam/ETRIS (model)

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · COVID-19 diagnosis using AI

Methods: Adapter