Harnessing the Power of Local Representations for Few-Shot Classification

Shi Tang; Guiming Luo; Xinchen Ye; Zhiyi Xia

arXiv:2407.01967·cs.CV·April 2, 2026

TL;DR

This paper introduces a novel approach for few-shot classification that leverages local representations through a specialized pretraining paradigm and an adaptable metric, achieving state-of-the-art results.

Contribution

It proposes a new pretraining method with soft labels and a flexible metric based on optimal transport, enhancing generalization to novel classes.
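To make the idea of an optimal-transport metric over local feature sets concrete, here is a minimal sketch using generic entropy-regularised optimal transport (Sinkhorn iterations). It is not the paper's metric: the cosine cost, the uniform patch weights, and the names `cosine_cost`, `sinkhorn`, and `ot_distance` are all assumptions made for illustration.

```python
import math

def cosine_cost(xs, ys):
    """Cost matrix C[i][j] = 1 - cosine similarity between two local features."""
    def norm(v):
        return math.sqrt(sum(a * a for a in v)) or 1.0  # guard against zero vectors
    return [[1.0 - sum(a * b for a, b in zip(x, y)) / (norm(x) * norm(y))
             for y in ys] for x in xs]

def sinkhorn(cost, eps=0.1, n_iters=200):
    """Entropy-regularised transport plan between two uniformly weighted sets.

    Assumption: every local feature carries equal mass (1/n and 1/m).
    """
    n, m = len(cost), len(cost[0])
    a = [1.0 / n] * n
    b = [1.0 / m] * m
    kernel = [[math.exp(-c / eps) for c in row] for row in cost]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals a and b.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]

def ot_distance(xs, ys, eps=0.1, n_iters=200):
    """Transport cost between two sets of local feature vectors."""
    cost = cosine_cost(xs, ys)
    plan = sinkhorn(cost, eps, n_iters)
    return sum(plan[i][j] * cost[i][j]
               for i in range(len(xs)) for j in range(len(ys)))

# Toy local feature sets (hypothetical 2-D "patch" embeddings).
query = [[1.0, 0.0], [0.0, 1.0]]
support_same = [[0.0, 1.0], [1.0, 0.0]]      # same patches, different order
support_other = [[-1.0, 0.0], [0.0, -1.0]]   # dissimilar patches

d_same = ot_distance(query, support_same)
d_other = ot_distance(query, support_other)
```

In this toy example the transport plan matches each query patch to its closest support patch regardless of order, so `d_same` comes out much smaller than `d_other`. In a few-shot setting, a query image would be assigned to the class whose support features yield the smallest distance.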

Findings

01. Achieves new state-of-the-art performance on three benchmarks.

02. Outperforms existing methods in fine-grained scenarios.

03. Effectively handles homogeneous local feature sets with the proposed metric.

Abstract

Generalizing to novel classes unseen during training is a key challenge in few-shot classification. Recent metric-based methods try to address this with local representations. However, they cannot take full advantage of them because of (i) improper supervision when pretraining the feature extractor and (ii) a lack of adaptability in the metric for handling the various possible compositions of local feature sets. In this work, we harness the power of local representations to improve novel-class generalization. For the feature extractor, we design a novel pretraining paradigm that learns from randomly cropped patches using soft labels. It exploits the class-level diversity of patches while diminishing the impact of their semantic misalignment with hard labels. To align the network output with the soft labels, we also propose a UniCon KL-Divergence that emphasizes the equal contribution of each base class in…
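As a rough illustration of supervising a cropped patch with a soft label, the sketch below computes the standard KL divergence between a soft target distribution and the softmax of the network's logits. This is the textbook loss, not the UniCon KL-Divergence, whose definition is cut off in the abstract above. The helper names and the example numbers are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(target, pred, tiny=1e-12):
    """Standard KL(target || pred); terms with zero target mass contribute 0."""
    return sum(t * math.log(t / (p + tiny))
               for t, p in zip(target, pred) if t > 0)

# Hypothetical example: a patch that mostly shows class 0 but also
# contains content resembling class 2, over three base classes.
soft_label = [0.7, 0.1, 0.2]
hard_label = [1.0, 0.0, 0.0]
patch_logits = [2.0, 0.5, 1.0]

pred = softmax(patch_logits)
loss_soft = kl_divergence(soft_label, pred)
loss_hard = kl_divergence(hard_label, pred)  # equals cross-entropy for one-hot targets
```

With a one-hot target the loss rewards only the labelled class, whereas a soft target also gives credit for probability placed on classes the patch partly resembles.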
