Multi-Class Textual-Inversion Secretly Yields a Semantic-Agnostic Classifier

Kai Wang; Fei Yang; Bogdan Raducanu; Joost van de Weijer

arXiv:2410.22317·cs.CV·October 30, 2024

TL;DR

This paper introduces Multi-Class Textual Inversion (MC-TI), which adds discriminative regularization to Textual Inversion so that tokens learned from a few samples can serve as a semantic-agnostic classifier while keeping their generative ability. It outperforms previous approaches in experiments across 12 datasets.

Contribution

The paper proposes MC-TI, a new approach that incorporates discriminative regularization into textual inversion to improve classification without losing generation quality in few-sample scenarios.
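The idea of adding a discriminative term to a generative objective can be illustrated with a minimal sketch. This page does not state the form of the regularizer, so everything below is an assumption: a CLIP-style softmax cross-entropy over cosine similarities between an image feature and one feature per learned token, added to a placeholder generative loss. The names `class_logits`, `combined_loss`, `scale`, and `reg_weight` are hypothetical and are not taken from the paper or its repository.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def class_logits(image_feat, token_feats, scale=100.0):
    """One logit per learned token: scaled cosine similarity (assumed, CLIP-style)."""
    return [scale * cosine(image_feat, t) for t in token_feats]

def cross_entropy(logits, label):
    """Numerically stable softmax cross-entropy for a single example."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[label]

def combined_loss(generative_loss, logits, label, reg_weight=0.1):
    """Generative loss plus a weighted discriminative term (assumed form)."""
    return generative_loss + reg_weight * cross_entropy(logits, label)

def predict(image_feat, token_feats):
    """Pick the learned token most similar to the image; no class names needed."""
    logits = class_logits(image_feat, token_feats)
    return max(range(len(logits)), key=logits.__getitem__)

# Toy 2-D features standing in for encoder outputs (illustrative values only).
token_feats = [[1.0, 0.0], [0.0, 1.0]]
image_feat = [0.9, 0.1]
print(predict(image_feat, token_feats))  # prints 0
```

Note that classification here depends only on the index of the learned token, not on any class name, which is what makes the setting semantic-agnostic.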

Findings

1. MC-TI outperforms existing methods in classification accuracy.
2. MC-TI maintains high-quality concept generation.
3. Extensive experiments validate the effectiveness of MC-TI across 12 datasets.

Abstract

With the advent of large pre-trained vision-language models such as CLIP, prompt learning methods aim to enhance the transferability of the CLIP model. They learn the prompt from a few samples of the downstream task, using the specific class names as prior knowledge, which we term semantic-aware classification. However, in many realistic scenarios, we only have access to a few samples and no knowledge of the class names (e.g., when considering instances of classes). This challenging scenario represents the semantic-agnostic discriminative case. Text-to-Image (T2I) personalization methods aim to adapt T2I models to unseen concepts by learning new tokens and endowing these tokens with the capability of generating the learned concepts. These methods do not require knowledge of class names as a semantic-aware prior. Therefore, in this paper, we first explore Textual Inversion and reveal that…

Code & Models

Repositories

wangkai930418/mc_ti
PyTorch · Official

Taxonomy

Topics: Authorship Attribution and Profiling

Methods: Contrastive Language-Image Pre-training · Discriminative Regularization