# An efficient low-shot class-agnostic counting framework with hybrid encoder and iterative exemplar feature learning

**Authors:** Qinghua Yang, Bin Liu, Yan Tian, Yangming Shi, Xinxin Du, Fangyuan He, Jikun Guo

PMC · DOI: 10.1371/journal.pone.0322360 · PLOS One · 2025-06-06

## TL;DR

This paper introduces a real-time framework for class-agnostic object counting from few or no annotated exemplars, built on a multi-scale hybrid encoder and iterative exemplar feature learning.

## Contribution

The framework combines a multi-scale hybrid encoder with iterative exemplar feature learning to perform real-time, class-agnostic object counting from few or zero annotated samples.

## Key findings

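- The model achieves real-time inference without compromising counting performance; it is evaluated on FSC147, Val-COCO, Test-COCO, CARPK, and ShanghaiTech.
- The hybrid encoder improves feature representation while reducing training costs by applying self-attention only to high-level features and fusing adjacent scales.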
- Iterative exemplar feature learning enhances matching performance by leveraging class-level characteristics.
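The idea behind the last finding can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the refinement is a plain residual cross-attention step in which exemplar queries attend to image tokens, and the names `refine_exemplars`, `similarity_map`, `shape_embed`, and `num_iters` are hypothetical. Learned projections, normalization, and the density-map head are all omitted.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def refine_exemplars(exemplar_feats, image_feats, shape_embed, num_iters=3):
    """Hypothetical sketch of iterative exemplar refinement.

    exemplar_feats: (K, D) one feature vector per exemplar box
    image_feats:    (N, D) flattened image feature map (N = H * W tokens)
    shape_embed:    (K, D) stand-in for a learnable shape embedding
    """
    d = exemplar_feats.shape[1]
    queries = exemplar_feats + shape_embed
    for _ in range(num_iters):
        # Each exemplar query looks for similar locations in the image...
        attn = softmax(queries @ image_feats.T / np.sqrt(d), axis=-1)  # (K, N)
        # ...and absorbs their features through a residual update.
        queries = queries + attn @ image_feats
    return queries


def similarity_map(queries, image_feats, height, width):
    # Match every image token against every refined exemplar and keep the
    # best score per location (an assumed, simplified matching step).
    sim = image_feats @ queries.T  # (N, K)
    return sim.max(axis=1).reshape(height, width)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    exemplars = rng.normal(size=(3, 16))   # K = 3 exemplars, D = 16
    image = rng.normal(size=(8 * 8, 16))   # 8 x 8 feature map
    shape = np.zeros((3, 16))              # untrained placeholder embedding
    refined = refine_exemplars(exemplars, image, shape)
    print(similarity_map(refined, image, 8, 8).shape)
```

Each pass pulls the exemplar queries toward features of image locations that resemble them, which is one simple way to read "enriching exemplar features with class-level characteristics".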

## Abstract

Few-shot learning techniques have enabled the rapid adaptation of a general AI model to various tasks using limited data. In this study, we focus on class-agnostic low-shot object counting, a challenging problem that aims to achieve accurate object counting with only a few annotated samples (few-shot) or even in the absence of any annotated data (zero-shot). In existing methods, the primary focus is often on enhancing performance, while relatively little attention is given to inference time, an equally critical factor in many practical applications. We propose a model that achieves real-time inference without compromising performance. Specifically, we design a multi-scale hybrid encoder to enhance feature representation and optimize computational efficiency. This encoder applies self-attention exclusively to high-level features and uses cross-scale fusion modules to integrate adjacent features, reducing training costs. Additionally, we introduce a learnable shape embedding and an iterative exemplar feature learning module that progressively enriches exemplar features with class-level characteristics by learning from similar objects within the image; these characteristics are essential for improving subsequent matching performance. Extensive experiments on the FSC147, Val-COCO, Test-COCO, CARPK, and ShanghaiTech datasets demonstrate our model’s effectiveness and generalizability compared to state-of-the-art methods.
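The encoder design described in the abstract can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the paper's architecture: the attention has a single head and no learned projections, fusion is a plain top-down add after nearest-neighbour upsampling, and each pyramid level is assumed to be exactly half the resolution of the previous one. The function names (`hybrid_encode`, `attention_pairs`) are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(tokens):
    # Single-head self-attention with a residual connection; learned
    # query/key/value projections are omitted for brevity (assumption).
    d = tokens.shape[1]
    attn = softmax(tokens @ tokens.T / np.sqrt(d), axis=-1)
    return tokens + attn @ tokens


def upsample2x(fmap):
    # Nearest-neighbour upsampling of an (H, W, D) map to (2H, 2W, D).
    return fmap.repeat(2, axis=0).repeat(2, axis=1)


def hybrid_encode(feats):
    """feats: list of (H, W, D) maps ordered fine -> coarse."""
    top = feats[-1]
    h, w, d = top.shape
    # Self-attention runs only on the coarsest (high-level) map.
    top = self_attention(top.reshape(h * w, d)).reshape(h, w, d)
    fused = [top]
    # Fuse adjacent scales top-down: each finer map receives the
    # upsampled result of the level directly above it.
    for finer in reversed(feats[:-1]):
        fused.insert(0, finer + upsample2x(fused[0]))
    return fused


def attention_pairs(feats):
    # Number of token pairs self-attention would compare at each level.
    return [(f.shape[0] * f.shape[1]) ** 2 for f in feats]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pyramid = [rng.normal(size=(s, s, 8)) for s in (16, 8, 4)]
    print([f.shape for f in hybrid_encode(pyramid)])
    print(attention_pairs(pyramid))
```

Because self-attention cost grows with the square of the token count, restricting it to the coarsest level is much cheaper than applying it at every scale: in this toy pyramid the 4×4 level involves 256 token pairs, against 65,536 for the 16×16 level.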

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12143539/full.md

## Figures

14 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12143539/full.md

## References

44 references — full list in the complete paper: https://tomesphere.com/paper/PMC12143539/full.md

---
Source: https://tomesphere.com/paper/PMC12143539