Rethinking Generalization in Few-Shot Classification

Markus Hiller; Rongkai Ma; Mehrtash Harandi; Tom Drummond

arXiv:2206.07267 · cs.CV · October 18, 2022 · 29 cites

TL;DR

This paper introduces a novel few-shot classification method using Vision Transformers to identify and optimize the most informative image patches, achieving state-of-the-art results without relying on detailed annotations.

Contribution

It proposes a patch-based approach with online optimization for interpretability and leverages masked image modeling to improve generalization in few-shot learning.

Findings

01 Achieves new state-of-the-art on four few-shot benchmarks.

02 Effectively identifies key image regions for classification.

03 Avoids supervision collapse through unsupervised training.

Abstract

Single image-level annotations only correctly describe an often small subset of an image's content, particularly when complex real-world scenes are depicted. While this might be acceptable in many classification scenarios, it poses a significant challenge for applications where the set of classes differs significantly between training and test time. In this paper, we take a closer look at the implications in the context of few-shot learning. Splitting the input samples into patches and encoding these via the help of Vision Transformers allows us to establish semantic correspondences between local regions across images and independent of their respective class. The most informative patch embeddings for the task at hand are then determined as a function of the support set via online optimization at inference time, additionally providing visual interpretability of 'what…
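
The mechanism outlined in the abstract (patch embeddings compared across images, with per-patch importance tuned on the support set at inference time) can be illustrated with a small toy sketch. This is not the authors' implementation; the official code is in the mrkshllr/FewTURE repository. Everything here is an assumption made for illustration: random vectors stand in for Vision Transformer patch embeddings, and the function names, the temperature value, the leave-one-image-out support loss, and the finite-difference update are hypothetical choices. The sketch also assumes at least two support images per class.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two patch embeddings."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / (norm + 1e-12)


def class_probs(query, support, v, skip=None, temp=0.1):
    """Classify one image (a list of patch embeddings) against the support set.

    support: list of (label, patches); v: per-image lists holding one
    log-importance value per support patch (illustrative parameterisation).
    """
    classes = sorted({label for label, _ in support})
    log_score = {c: 0.0 for c in classes}
    for q in query:
        logits, labels = [], []
        for i, (label, patches) in enumerate(support):
            if i == skip:  # leave this support image out
                continue
            for j, s in enumerate(patches):
                logits.append(cosine(q, s) / temp + v[i][j])
                labels.append(label)
        top = max(logits)
        exps = [math.exp(x - top) for x in logits]
        total = sum(exps)
        for c in classes:
            # Share of this query patch's attention that lands on class c.
            mass = sum(e for e, l in zip(exps, labels) if l == c) / total
            log_score[c] += math.log(mass + 1e-12)
    top = max(log_score.values())
    exps = {c: math.exp(s - top) for c, s in log_score.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}


def support_loss(support, v):
    """Mean cross-entropy when each support image is classified by the others."""
    losses = []
    for i, (label, patches) in enumerate(support):
        p = class_probs(patches, support, v, skip=i)
        losses.append(-math.log(p[label] + 1e-12))
    return sum(losses) / len(losses)


def optimise_importance(support, steps=5, lr=1.0, eps=1e-4):
    """Tune patch importance on the support set with finite-difference steps."""
    v = [[0.0] * len(patches) for _, patches in support]
    loss = support_loss(support, v)
    for _ in range(steps):
        grad = [[0.0] * len(row) for row in v]
        for i, row in enumerate(v):
            for j in range(len(row)):
                row[j] += eps
                grad[i][j] = (support_loss(support, v) - loss) / eps
                row[j] -= eps
        trial = [[x - lr * g for x, g in zip(r, gr)] for r, gr in zip(v, grad)]
        trial_loss = support_loss(support, trial)
        if trial_loss < loss:  # accept only steps that reduce the loss
            v, loss = trial, trial_loss
        else:
            lr *= 0.5
    return v, loss


def toy_image(centre, n_patches=3):
    """Noisy copies of a class centre, standing in for ViT patch embeddings."""
    return [[c + random.gauss(0, 0.3) for c in centre] for _ in range(n_patches)]


random.seed(0)
centres = {"a": [1, 0, 0, 0], "b": [0, 1, 0, 0]}
support = [(c, toy_image(centres[c])) for c in ("a", "a", "b", "b")]
query = toy_image(centres["a"])

loss_before = support_loss(support, [[0.0] * len(p) for _, p in support])
v, loss_after = optimise_importance(support)
probs = class_probs(query, support, v)
print(loss_before, loss_after, probs)
```

The learned values in `v` play the role of a per-patch importance map over the support images. The finite-difference update is used only to keep the sketch dependency-free; a practical version would use automatic differentiation.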

Code & Models

Repositories

mrkshllr/FewTURE
PyTorch · Official

Videos

Rethinking Generalization in Few-Shot Classification · SlidesLive

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI · Multimodal Machine Learning Applications

Methods: Test · Residual Connection · Layer Normalization · Swin Transformer · Linear Layer · Softmax · Multi-Head Attention · Attention Is All You Need · Vision Transformer