Beyond Task-Driven Features for Object Detection

Meilun Zhou; Alina Zare

arXiv:2604.03839 · cs.CV · April 7, 2026

TL;DR

This paper proposes an annotation-guided feature augmentation method for object detection. By aligning backbone features with annotation structure, it improves object focus, reduces background sensitivity, and strengthens generalization.

Contribution

The paper introduces a framework that injects annotation-guided embeddings into the detection backbone, with the aim of improving transferability and interpretability relative to purely task-driven features.

Findings

01

Improves object focus and reduces background sensitivity.

02

Enhances generalization to unseen or weakly supervised tasks.

03

Yields consistent gains across wildlife and remote sensing datasets.

Abstract

Task-driven features learned by modern object detectors optimize end task loss yet often capture shortcut correlations that fail to reflect underlying annotation structure. Such representations limit transfer, interpretability, and robustness when task definitions change or supervision becomes sparse. This paper introduces an annotation-guided feature augmentation framework that injects embeddings into an object detection backbone. The method constructs dense spatial feature grids from annotation-guided latent spaces and fuses them with feature pyramid representations to influence region proposal and detection heads. Experiments across wildlife and remote sensing datasets evaluate classification, localization, and data efficiency under multiple supervision regimes. Results show consistent improvements in object focus, reduced background sensitivity, and stronger generalization to unseen…
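The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the embedding grid is resized or which operator combines it with the pyramid features, so everything here is an assumption made for illustration: nearest-neighbour resizing, channel concatenation as the fusion operator, and the hypothetical names `resize_nearest` and `fuse_level`. Grids are plain nested lists shaped `[height][width][channels]`.

```python
# Illustrative sketch only: the paper's actual fusion operator is not given
# in the abstract. Nearest-neighbour resizing and channel concatenation are
# assumptions, and both function names are hypothetical.

def resize_nearest(grid, out_h, out_w):
    """Resize an [H][W][C] grid to [out_h][out_w][C] by nearest-neighbour sampling."""
    in_h, in_w = len(grid), len(grid[0])
    return [
        [list(grid[(y * in_h) // out_h][(x * in_w) // out_w]) for x in range(out_w)]
        for y in range(out_h)
    ]


def fuse_level(pyramid_level, embedding_grid):
    """Append the resized embedding channels to every cell of one pyramid level."""
    h, w = len(pyramid_level), len(pyramid_level[0])
    resized = resize_nearest(embedding_grid, h, w)
    return [
        [pyramid_level[y][x] + resized[y][x] for x in range(w)]
        for y in range(h)
    ]


# Toy pyramid level: 4x4 cells with 2 channels each.
pyramid_level = [[[float(y * 4 + x)] * 2 for x in range(4)] for y in range(4)]

# Toy annotation-guided embedding grid: 2x2 cells with 3 channels each.
embedding_grid = [
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    [[0.0, 0.0, 1.0], [1.0, 1.0, 1.0]],
]

fused = fuse_level(pyramid_level, embedding_grid)
print(len(fused), len(fused[0]), len(fused[0][0]))  # 4 4 5
```

In a real detector the same step would be repeated at every pyramid level, so that the region proposal and detection heads receive the augmented features. Concatenation is used here only because it keeps both sources visible in the output; an additive or learned fusion would be an equally plausible reading of the abstract.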
