Swiss Army Knife: Synergizing Biases in Knowledge from Vision Foundation Models for Multi-Task Learning

Yuxiang Lu; Shengcao Cao; Yu-Xiong Wang

arXiv:2410.14633·cs.CV·March 18, 2025

TL;DR

This paper introduces the Swiss Army Knife (SAK), a novel multi-task learning framework that adaptively combines multiple Vision Foundation Models by preserving their biases, leading to significant performance improvements across vision tasks.

Contribution

The paper proposes a versatile knowledge distillation method that preserves individual model biases and dynamically combines their representations for enhanced multi-task learning.

Findings

01. Outperforms prior methods by 10% on the NYUD-v2 benchmark.

02. Effectively synergizes multiple VFMs for diverse vision tasks.

03. Provides a flexible framework adaptable to advanced model designs.

Abstract

Vision Foundation Models (VFMs) have demonstrated outstanding performance on numerous downstream tasks. However, due to their inherent representation biases originating from different training paradigms, VFMs exhibit advantages and disadvantages across distinct vision tasks. Although amalgamating the strengths of multiple VFMs for downstream tasks is an intuitive strategy, effectively exploiting these biases remains a significant challenge. In this paper, we propose a novel and versatile "Swiss Army Knife" (SAK) solution, which adaptively distills knowledge from a committee of VFMs to enhance multi-task learning. Unlike existing methods that use a single backbone for knowledge transfer, our approach preserves the unique representation bias of each teacher by collaborating the lightweight Teacher-Specific Adapter Path modules with the Teacher-Agnostic Stem. Through dynamic selection and…
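The structure named in the abstract, a Teacher-Agnostic Stem shared by all teachers plus one lightweight Teacher-Specific Adapter Path per teacher, can be illustrated with a toy student model. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the class `ToyStudent`, the low-rank residual form of each adapter, the teacher and task names, and the per-task softmax weighting are all hypothetical choices made for illustration. In particular, the softmax weighting stands in for the selection step that the truncated abstract only begins to describe, and the distillation losses against the teachers are omitted entirely.

```python
import math
import random


def linear(x, w):
    """Multiply vector x (length n) by matrix w (n rows, m columns)."""
    return [sum(xi * w[i][j] for i, xi in enumerate(x)) for j in range(len(w[0]))]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rows, cols, rng):
    """Small random matrix used as placeholder (untrained) weights."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class ToyStudent:
    """Hypothetical student: one shared stem, one adapter path per teacher."""

    def __init__(self, dim, teachers, tasks, rank=2, seed=0):
        rng = random.Random(seed)
        self.teachers = list(teachers)
        # Shared ("teacher-agnostic") weights used for every teacher.
        self.stem = rand_matrix(dim, dim, rng)
        # Assumption: each teacher-specific path is a low-rank residual adapter.
        self.adapters = {
            t: (rand_matrix(dim, rank, rng), rand_matrix(rank, dim, rng))
            for t in self.teachers
        }
        # Assumption: one learnable logit per (task, teacher) pair.
        self.router_logits = {task: [0.0] * len(self.teachers) for task in tasks}

    def teacher_features(self, x):
        """Return one representation per teacher, kept separate from the others."""
        shared = linear(x, self.stem)
        feats = {}
        for t, (down, up) in self.adapters.items():
            delta = linear(linear(shared, down), up)
            feats[t] = [s + d for s, d in zip(shared, delta)]
        return feats

    def forward(self, x, task):
        """Mix the per-teacher representations with task-specific weights."""
        feats = self.teacher_features(x)
        weights = softmax(self.router_logits[task])
        mixed = [
            sum(w * feats[t][i] for w, t in zip(weights, self.teachers))
            for i in range(len(x))
        ]
        return mixed, weights
```

Calling `ToyStudent(dim=4, teachers=["teacher_a", "teacher_b"], tasks=["depth"]).forward([1.0, 0.5, -0.5, 2.0], "depth")` returns the mixed feature vector together with the mixing weights. Keeping `teacher_features` separate from `forward` reflects the idea that each teacher's representation stays distinct until a task decides how to combine them; in a trained system each entry of `teacher_features` would additionally be matched against the corresponding teacher's output.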

Code & Models

Models

Hugging Face model: yxlu0/SAK

Taxonomy

Topics: Semantic Web and Ontologies

Methods: Adapter