Patch-wise Retrieval: A Bag of Practical Techniques for Instance-level Matching
Wonseok Choi, Sohwi Lim, Nam Hyeon-Woo, Moon Ye-Bin, Dong-Ju Jeong, Jinyoung Hwang, Tae-Hyun Oh

TL;DR
This paper introduces Patchify, a patch-wise image retrieval framework that improves instance-level matching accuracy and interpretability without fine-tuning, supported by a new localization-aware metric and extensive experimental validation.
Contribution
Patchify offers a practical patch-based retrieval method that enhances accuracy, scalability, and interpretability, and introduces LocScore for spatial correctness evaluation.
Findings
Patchify outperforms global methods across multiple benchmarks.
LocScore effectively measures spatial localization accuracy.
Using informative features during compression boosts retrieval performance.
Abstract
Instance-level image retrieval aims to find images containing the same object as a given query, despite variations in size, position, or appearance. To address this challenging task, we propose Patchify, a simple yet effective patch-wise retrieval framework that offers high performance, scalability, and interpretability without requiring fine-tuning. Patchify divides each database image into a small number of structured patches and performs retrieval by comparing these local features with a global query descriptor, enabling accurate and spatially grounded matching. To assess not just retrieval accuracy but also spatial correctness, we introduce LocScore, a localization-aware metric that quantifies whether the retrieved region aligns with the target object. This makes LocScore a valuable diagnostic tool for understanding and improving retrieval behavior. We conduct extensive experiments…
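The matching step described in the abstract (structured patches on the database side, one global descriptor on the query side) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration and not the authors' implementation: `grid_patches`, `toy_embed`, and `score_image` are hypothetical helpers, the multi-level grid `(1, 2, 3)` is an arbitrary choice, and `toy_embed` (a normalised mean colour) is only a stand-in for whatever pretrained encoder a real system would use.

```python
import numpy as np

def grid_patches(height, width, levels=(1, 2, 3)):
    """Return (top, left, bottom, right) boxes for an n x n grid at each level.

    The level choice is an assumption; the source only says "a small number
    of structured patches".
    """
    boxes = []
    for n in levels:
        ys = np.linspace(0, height, n + 1).astype(int)
        xs = np.linspace(0, width, n + 1).astype(int)
        for i in range(n):
            for j in range(n):
                boxes.append((int(ys[i]), int(xs[j]), int(ys[i + 1]), int(xs[j + 1])))
    return boxes

def toy_embed(image):
    """Stand-in encoder (hypothetical): L2-normalised mean colour of the region."""
    v = image.reshape(-1, image.shape[-1]).mean(axis=0)
    return v / (np.linalg.norm(v) + 1e-12)

def score_image(query_vec, image, embed=toy_embed, levels=(1, 2, 3)):
    """Score a database image by its best-matching patch.

    Returns (similarity, box), where box is the patch most similar to the
    global query descriptor, which is what makes the match spatially grounded.
    """
    boxes = grid_patches(image.shape[0], image.shape[1], levels)
    feats = np.stack([embed(image[t:b, l:r]) for t, l, b, r in boxes])
    sims = feats @ query_vec  # cosine similarity, since all vectors are unit norm
    best = int(np.argmax(sims))
    return float(sims[best]), boxes[best]

# Toy data: a blue 6x6 image with a small red object in the top-left corner.
database_image = np.zeros((6, 6, 3))
database_image[..., 2] = 1.0
database_image[0:2, 0:2] = [1.0, 0.0, 0.0]

# The query shows only the red object, so its global descriptor is "red".
query_image = np.zeros((4, 4, 3))
query_image[..., 0] = 1.0

score, box = score_image(toy_embed(query_image), database_image)
```

In this toy case the whole-image descriptor is dominated by the blue background, while the finest grid cell covering the red object matches the query almost exactly, so the returned box is the top-left cell `(0, 0, 2, 2)`. Ranking a database would then amount to sorting images by the returned similarity.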
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Multimodal Machine Learning Applications
