Retrieval-enriched zero-shot image classification in low-resource domains

Nicola Dall'Asen; Yiming Wang; Enrico Fini; Elisa Ricci

arXiv:2411.00988·cs.CV·November 5, 2024

TL;DR

This paper introduces CoRE, a retrieval-based, training-free method for zero-shot image classification in low-resource domains, leveraging web data to enhance representations and outperform existing methods.

Contribution

The paper presents a novel retrieval-enrichment approach for zero-shot low-resource image classification that does not require training or synthetic data generation.

Findings

1. CoRE outperforms state-of-the-art methods on diverse low-resource benchmarks.

2. Retrieval-based enrichment improves classification accuracy significantly.

3. The method is effective across medical, botanical, and circuit domains.

Abstract

Low-resource domains, characterized by scarce data and annotations, present significant challenges for language and visual understanding tasks, the latter being far less explored in the literature. Recent advancements in Vision-Language Models (VLMs) have shown promising results in high-resource domains but fall short on low-resource concepts that are under-represented (e.g., only a handful of images per category) in the pre-training set. We tackle the challenging task of zero-shot low-resource image classification from a novel perspective. By leveraging a retrieval-based strategy, we achieve this in a training-free fashion. Specifically, our method, named CoRE (Combination of Retrieval Enrichment), enriches the representation of both query images and class prototypes by retrieving relevant textual information from large web-crawled databases. This retrieval-based enrichment…
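
The idea of enriching both query images and class prototypes with retrieved text can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how retrieved text is combined with the original representations, so the mean-pooling of the top-k neighbours, the mixing weight `alpha`, the value of `k`, and the function names `enrich` and `classify` are all assumptions made for illustration. In practice the embeddings would come from a VLM; here they are hand-written toy vectors.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along `axis`."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def enrich(embeddings, db_text_embeddings, k=2, alpha=0.5):
    """Blend each embedding with the mean of its k nearest database text embeddings.

    ASSUMPTION: mean-pooling and the convex mix with weight `alpha` are
    illustrative choices, not details taken from the paper.
    """
    emb = l2_normalize(embeddings)
    db = l2_normalize(db_text_embeddings)
    sims = emb @ db.T                              # (n, m) cosine similarities
    k = min(k, db.shape[0])
    top_idx = np.argsort(-sims, axis=1)[:, :k]     # indices of the k most similar texts
    retrieved = l2_normalize(db[top_idx].mean(axis=1))  # (n, d) pooled retrieved text
    return l2_normalize(alpha * emb + (1.0 - alpha) * retrieved)


def classify(image_embeddings, class_prototypes, db_text_embeddings, k=2, alpha=0.5):
    """Training-free classification: enrich both sides, then pick the nearest prototype."""
    queries = enrich(image_embeddings, db_text_embeddings, k, alpha)
    prototypes = enrich(class_prototypes, db_text_embeddings, k, alpha)
    return np.argmax(queries @ prototypes.T, axis=1)


# Toy data standing in for VLM embeddings (hypothetical values).
class_prototypes = np.array([[1.0, 0.0, 0.0, 0.0],
                             [0.0, 1.0, 0.0, 0.0]])
db_text_embeddings = np.array([[1.0, 0.0, 0.1, 0.0],   # captions related to class 0
                               [0.9, 0.0, 0.0, 0.1],
                               [0.0, 1.0, 0.1, 0.0],   # captions related to class 1
                               [0.0, 0.9, 0.0, 0.1]])
image_embeddings = np.array([[0.9, 0.1, 0.0, 0.0],
                             [0.2, 0.8, 0.0, 0.0]])

predictions = classify(image_embeddings, class_prototypes, db_text_embeddings)
print(predictions)
```

Because no parameters are learned, swapping in a different retrieval database or a different `k` only changes the inputs to `enrich`, which is what makes a scheme of this kind training-free.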

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning