DeepEMD: Differentiable Earth Mover's Distance for Few-Shot Learning

Chi Zhang; Yujun Cai; Guosheng Lin; Chunhua Shen

arXiv:2003.06777 · cs.CV · March 31, 2023 · 39 cites

TL;DR

DeepEMD introduces a differentiable Earth Mover's Distance layer for few-shot image classification, leveraging optimal matching of image regions to improve accuracy and robustness against background clutter and intra-class variations.

Contribution

The paper proposes a novel differentiable EMD-based method for few-shot learning, including a cross-reference mechanism and a structured fully connected layer for end-to-end training.

Findings

01. Outperforms state-of-the-art on five few-shot benchmarks.

02. Effective in image retrieval tasks.

03. Robust to background clutter and intra-class variations.

Abstract

In this work, we develop methods for few-shot image classification from a new perspective of optimal matching between image regions. We employ the Earth Mover's Distance (EMD) as a metric to compute a structural distance between dense image representations to determine image relevance. The EMD generates the optimal matching flows between structural elements that have the minimum matching cost, which is used to calculate the image distance for classification. To generate the importance weights of elements in the EMD formulation, we design a cross-reference mechanism, which can effectively alleviate the adverse impact caused by the cluttered background and large intra-class appearance variations. To implement k-shot classification, we propose to learn a structured fully connected layer that can directly classify dense image representations with the EMD. Based on the implicit function…

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Human Pose and Action Recognition