DeepEMD: Differentiable Earth Mover's Distance for Few-Shot Learning
Chi Zhang, Yujun Cai, Guosheng Lin, Chunhua Shen

TL;DR
DeepEMD introduces a differentiable Earth Mover's Distance layer for few-shot image classification, leveraging optimal matching of image regions to improve accuracy and robustness against background clutter and intra-class variations.
Contribution
The paper proposes a novel differentiable EMD-based method for few-shot learning, including a cross-reference mechanism and a structured fully connected layer for end-to-end training.
Findings
Outperforms prior state-of-the-art methods on five few-shot classification benchmarks.
Effective in image retrieval tasks.
Robust to background clutter and intra-class variations.
Abstract
In this work, we develop methods for few-shot image classification from a new perspective of optimal matching between image regions. We employ the Earth Mover's Distance (EMD) as a metric to compute a structural distance between dense image representations to determine image relevance. The EMD generates the optimal matching flows between structural elements that have the minimum matching cost, which is used to calculate the image distance for classification. To generate the important weights of elements in the EMD formulation, we design a cross-reference mechanism, which can effectively alleviate the adverse impact caused by the cluttered background and large intra-class appearance variations. To implement k-shot classification, we propose to learn a structured fully connected layer that can directly classify dense image representations with the EMD. Based on the implicit function…
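The matching step described in the abstract can be illustrated with a small forward-only sketch. It treats each image as a set of local feature vectors, uses one minus cosine similarity as the matching cost, and solves the transport problem as a linear program with SciPy. This is not the authors' implementation: the function names `cross_reference_weights` and `emd_similarity` are hypothetical, the weighting rule (non-negative dot product with the other image's mean feature) is an assumed reading of the cross-reference mechanism, and `scipy.optimize.linprog` is swapped in for whatever differentiable solver the paper uses, so this sketch provides no gradients.

```python
import numpy as np
from scipy.optimize import linprog


def cross_reference_weights(u, v):
    """Assumed weighting rule: score each row of u against the mean of v."""
    # Clamp negative responses to zero; the epsilon avoids an all-zero vector.
    w = np.maximum(u @ v.mean(axis=0), 0.0) + 1e-6
    return w / w.sum()  # normalise so both images carry total mass 1


def emd_similarity(u, v):
    """u: (m, d) and v: (n, d) local features. Returns (score, flow)."""
    u_n = u / np.linalg.norm(u, axis=1, keepdims=True)
    v_n = v / np.linalg.norm(v, axis=1, keepdims=True)
    cos = u_n @ v_n.T          # pairwise cosine similarity, shape (m, n)
    cost = 1.0 - cos           # matching cost per unit of flow

    s = cross_reference_weights(u, v)  # supply of each region in u
    d = cross_reference_weights(v, u)  # demand of each region in v
    m, n = cost.shape

    # Flow x is flattened row-major: x[i * n + j] is the flow from u_i to v_j.
    A_eq = np.zeros((m + n, m * n))
    for i in range(m):
        A_eq[i, i * n:(i + 1) * n] = 1.0   # row i sums to s[i]
    for j in range(n):
        A_eq[m + j, j::n] = 1.0            # column j sums to d[j]
    b_eq = np.concatenate([s, d])

    # Total supply equals total demand, so the last constraint is implied
    # by the others; dropping it keeps the system non-redundant.
    res = linprog(cost.ravel(), A_eq=A_eq[:-1], b_eq=b_eq[:-1],
                  bounds=(0, None), method="highs")
    flow = res.x.reshape(m, n)

    # Similarity is the flow-weighted cosine similarity of matched regions.
    return float((cos * flow).sum()), flow


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    query = rng.normal(size=(4, 8))    # 4 regions, 8-dim features (toy data)
    support = rng.normal(size=(5, 8))  # 5 regions
    score, flow = emd_similarity(query, support)
    print(score, flow.shape)
```

For classification, a query image would be assigned to the support class with the highest score. Regions whose weight is close to zero carry little mass in the flow, which is how a weighting of this kind can downplay cluttered background regions.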
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Human Pose and Action Recognition
