Online Zero-Shot Classification with CLIP

Qi Qian; Juhua Hu

arXiv:2408.13320 · cs.CV · August 27, 2024

TL;DR

This paper introduces OnZeta, an online zero-shot classification framework built on CLIP. It adapts to the target data distribution during inference without storing image representations, reaches 78.94% accuracy on ImageNet, and suits real-time applications.

Contribution

The paper proposes an online zero-shot transfer method that models the target data distribution and optimizes class proxies as images arrive, with theoretical convergence guarantees.

Findings

01 Achieves 78.94% accuracy on ImageNet without access to the full dataset.

02 Improves accuracy by more than 3% on average across 13 downstream tasks.

03 Demonstrates effective online adaptation for zero-shot classification.

Abstract

Vision-language pre-training such as CLIP enables zero-shot transfer that can classify images according to the candidate class names. While CLIP demonstrates an impressive zero-shot performance on diverse downstream tasks, the distribution from the target data has not been leveraged sufficiently. In this work, we study a novel online zero-shot transfer scenario, where each image arrives in a random order for classification and is visited only once to obtain prediction immediately without storing its representation. Compared with the vanilla zero-shot classification, the proposed framework preserves its flexibility for online service while considering the statistics of the arrived images as the side information to capture the distribution of target data, which can help improve the performance of real-world applications. To tackle the challenge of effective online optimization, we first…
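
The online scenario described in the abstract can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the OnZeta algorithm: the class name `OnlineZeroShot`, the temperature value, and the choice of a running mean of predicted probabilities as the "statistics of the arrived images" are all hypothetical, and synthetic vectors stand in for CLIP image and text embeddings. It shows the protocol itself: each image is seen once, a prediction is returned immediately, and only a fixed-size summary is kept.

```python
import numpy as np


def l2_normalize(x, axis=-1):
    """Scale vectors to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=axis, keepdims=True)


def softmax(z):
    """Numerically stable softmax over a 1-D array of logits."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


class OnlineZeroShot:
    """Hypothetical single-pass classifier with O(num_classes) state."""

    def __init__(self, text_proxies, temperature=0.01):
        # One proxy per class, e.g. the text embedding of the class name.
        self.w = l2_normalize(np.asarray(text_proxies, dtype=float))
        self.tau = temperature  # assumed value, not taken from the paper
        self.count = 0
        # Running mean of predicted class probabilities (illustrative statistic).
        self.mean_probs = np.full(len(self.w), 1.0 / len(self.w))

    def predict(self, image_embedding):
        x = l2_normalize(np.asarray(image_embedding, dtype=float))
        probs = softmax(self.w @ x / self.tau)  # vanilla zero-shot prediction
        # Update the summary incrementally; the embedding itself is not stored.
        self.count += 1
        self.mean_probs += (probs - self.mean_probs) / self.count
        return int(np.argmax(probs)), probs


# Synthetic stream: orthogonal class proxies plus small Gaussian noise.
rng = np.random.default_rng(0)
proxies = np.eye(3, 8)
labels = rng.integers(0, 3, size=30)
clf = OnlineZeroShot(proxies)
correct = 0
for y in labels:
    x = proxies[y] + 0.05 * rng.normal(size=8)
    pred, _ = clf.predict(x)
    correct += int(pred == y)
print(correct / len(labels), clf.mean_probs.round(2))
```

How the accumulated statistics should feed back into the prediction is the part the paper addresses; this sketch deliberately leaves the prediction rule as plain cosine-similarity zero-shot classification.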

Code & Models

Repositories

idstcv/onzeta (PyTorch, official)

Taxonomy

Topics: COVID-19 diagnosis using AI · Anomaly Detection Techniques and Applications · Domain Adaptation and Few-Shot Learning

Methods: Contrastive Language-Image Pre-training