Shallow Deep Learning Can Still Excel in Fine-Grained Few-Shot Learning

Chaofei Qi; Chao Ye; Zhitai Liu; Weiyang Lin; Jianbin Qiu

arXiv:2507.22041·cs.CV·May 19, 2026

TL;DR

This paper demonstrates that shallow deep networks like ConvNet-4, enhanced with a location-aware constellation module, can outperform or match deeper models in fine-grained few-shot learning tasks.

Contribution

Introduces a location-aware constellation network (LCN-4) that uses novel spatial- and frequency-domain encoding techniques to improve few-shot learning performance.

Findings

01

LCN-4 outperforms state-of-the-art methods built on ConvNet-4.

02

LCN-4 achieves comparable or superior results to ResNet12-based approaches.

03

Validation on three fine-grained benchmarks confirms effectiveness.

Abstract

Deep learning has been applied extensively across a wide spectrum of domains, including fine-grained few-shot learning (FGFSL), which depends heavily on deep backbones. Nonetheless, shallower deep backbones such as ConvNet-4 are not commonly preferred because they are prone to extracting a larger quantity of non-abstract visual attributes. In this paper, we first re-evaluate the relationship between network depth and the ability to fully encode few-shot instances, and examine whether a shallow deep architecture can achieve performance comparable or superior to mainstream deep backbones. Inspired by the vanilla ConvNet-4, we introduce a location-aware constellation network (LCN-4), equipped with a cutting-edge location-aware feature clustering module. This module can proficiently encode and integrate spatial feature fusion, feature clustering, and…
