Learning Semantic Proxies from Visual Prompts for Parameter-Efficient Fine-Tuning in Deep Metric Learning

Li Ren; Chen Chen; Liqiang Wang; Kien Hua

arXiv:2402.02340 · cs.CV · March 18, 2024 · 1 citation

TL;DR

This paper introduces a parameter-efficient fine-tuning method for Deep Metric Learning using visual prompts in Vision Transformers, enhancing performance while tuning fewer parameters.

Contribution

It proposes a novel framework that learns semantic visual prompts for each class, improving DML performance with fewer tunable parameters.

Findings

1. Achieves comparable or better results than full fine-tuning methods.
2. Tuning only a small percentage of parameters yields high performance.
3. Demonstrates effectiveness across popular DML benchmarks.

Abstract

Deep Metric Learning (DML) has long attracted the attention of the machine learning community as a key objective. Existing solutions concentrate on fine-tuning the pre-trained models on conventional image datasets. As a result of the success of recent pre-trained models trained from larger-scale datasets, it is challenging to adapt the model to the DML tasks in the local data domain while retaining the previously gained knowledge. In this paper, we investigate parameter-efficient methods for fine-tuning the pre-trained model for DML tasks. In particular, we propose a novel and effective framework based on learning Visual Prompts (VPT) in the pre-trained Vision Transformers (ViT). Based on the conventional proxy-based DML paradigm, we augment the proxy by incorporating the semantic information from the input image and the ViT, in which we optimize the visual prompts for each class. We…
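
For readers unfamiliar with the "conventional proxy-based DML paradigm" the abstract builds on, the following is a minimal sketch of a generic proxy-based objective: each class has one proxy vector, and an embedding is pulled toward its own class proxy through a softmax over scaled cosine similarities. It is not the loss from the paper. The function names, the `scale` value, and the toy data are assumptions made for illustration, and the proxies here are fixed vectors rather than being derived from visual prompts as the abstract describes.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def proxy_softmax_loss(embedding, label, proxies, scale=10.0):
    """Cross-entropy over scaled cosine similarities to one proxy per class.

    Generic proxy-based objective (normalized-softmax style), shown only as
    an illustration. `scale` is an assumed temperature-like hyperparameter.
    """
    logits = [scale * cosine(embedding, p) for p in proxies]
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum - logits[label]


# Hypothetical toy data: two classes with 2-D proxies and one sample.
proxies = [[1.0, 0.0], [0.0, 1.0]]
sample = [0.9, 0.1]

loss_correct = proxy_softmax_loss(sample, 0, proxies)  # label matches nearest proxy
loss_wrong = proxy_softmax_loss(sample, 1, proxies)    # label is the far proxy
print(loss_correct < loss_wrong)
```

In a parameter-efficient setting, a loss of this kind would be minimized with respect to only a small set of tunable parameters (such as prompts and proxies) while the pre-trained backbone stays frozen. How the paper constructs its semantic proxies is not recoverable from the truncated abstract.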

Code & Models

Repositories

noahsark/parameterefficient-dml (PyTorch, official)

Videos

Learning Semantic Proxies from Visual Prompts for Parameter-Efficient Fine-Tuning in Deep Metric Learning (SlidesLive)

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Generative Adversarial Networks and Image Synthesis