Adversarial Reconstruction Feedback for Robust Fine-grained Generalization
Shijie Wang, Jian Shi, Haojie Li

TL;DR
This paper introduces AdvRF, an adversarial framework that enhances fine-grained image retrieval by learning category-agnostic discrepancy representations, improving generalization to unseen categories.
Contribution
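AdvRF reformulates fine-grained image retrieval (FGIR) as a visual discrepancy reconstruction task, combining category-aware discrepancy localization from a retrieval model with category-agnostic feature learning from a reconstruction model. It then uses knowledge distillation for efficient deployment.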
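To make the distillation step concrete, here is a minimal sketch of a generic embedding-matching distillation loss. It is not taken from the paper: the function names, the choice of an L2-normalized squared-distance objective, and the toy embeddings are all assumptions for illustration, and the authors' actual objective may differ.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so only embedding direction matters."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def embedding_distillation_loss(student_emb, teacher_emb):
    """Mean squared distance between normalized student and teacher embeddings.

    Hypothetical objective: a common generic choice for distilling a
    retrieval model, not necessarily the loss used by AdvRF.
    """
    s = l2_normalize(student_emb)
    t = l2_normalize(teacher_emb)
    return float(np.mean(np.sum((s - t) ** 2, axis=1)))

# Toy data (assumed): 4 images, 8-dimensional retrieval embeddings.
rng = np.random.default_rng(0)
teacher = rng.normal(size=(4, 8))
student_close = teacher + 0.1 * rng.normal(size=(4, 8))  # nearly matches the teacher
student_far = rng.normal(size=(4, 8))                    # unrelated embeddings

loss_close = embedding_distillation_loss(student_close, teacher)
loss_far = embedding_distillation_loss(student_far, teacher)
```

In this sketch a student that tracks the teacher's embedding directions gets a lower loss than an unrelated one, which is the property a lightweight deployed retrieval model would be trained toward.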
Findings
Achieves superior performance on fine-grained datasets.
Effectively generalizes to unseen categories.
Improves localization accuracy of visual differences.
Abstract
Existing fine-grained image retrieval (FGIR) methods predominantly rely on supervision from predefined categories to learn discriminative representations for retrieving fine-grained objects. However, they inadvertently introduce category-specific semantics into the retrieval representation, creating semantic dependencies on predefined classes that critically hinder generalization to unseen categories. To tackle this, we propose AdvRF, a novel adversarial reconstruction feedback framework aimed at learning category-agnostic discrepancy representations. Specifically, AdvRF reformulates FGIR as a visual discrepancy reconstruction task by synergizing category-aware discrepancy localization from retrieval models with category-agnostic feature learning from reconstruction models. The reconstruction model exposes residual discrepancies overlooked by the retrieval model, forcing it to improve…
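The feedback idea in the abstract can be illustrated with a deliberately simplified toy. It is not the authors' architecture: the per-location discrepancy vector, the soft localization mask, the squared-residual objective, and the gradient update are all assumptions made for this sketch. The "retrieval" side holds a mask over spatial locations, the "reconstruction" side measures how much discrepancy lies outside that mask, and the residual is fed back to update the mask.

```python
import numpy as np

def residual_discrepancy(discrepancy, mask):
    """Part of the discrepancy signal that the localization mask fails to cover."""
    return (1.0 - mask) * discrepancy

def feedback_step(discrepancy, mask, lr=0.5):
    """One gradient step on the mask that lowers the mean squared residual.

    Hypothetical update rule, used only to illustrate residual feedback.
    """
    residual = residual_discrepancy(discrepancy, mask)
    # Gradient of mean(((1 - mask) * d) ** 2) with respect to mask.
    grad = -2.0 * residual * discrepancy / discrepancy.size
    return np.clip(mask - lr * grad, 0.0, 1.0)

# Assumed toy inputs: discrepancy magnitude at 4 spatial locations, and an
# initial mask that covers location 2 but overlooks location 3.
discrepancy = np.array([0.0, 0.2, 1.0, 0.8])
mask = np.array([0.0, 0.0, 1.0, 0.0])

before = float(np.mean(residual_discrepancy(discrepancy, mask) ** 2))
for _ in range(20):
    mask = feedback_step(discrepancy, mask)
after = float(np.mean(residual_discrepancy(discrepancy, mask) ** 2))
```

After a few feedback steps the mask grows over the overlooked location, and the residual shrinks. Locations with no discrepancy are left untouched.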
