Are These the Same Apple? Comparing Images Based on Object Intrinsics
Klemen Kotar, Stephen Tian, Hong-Xing Yu, Daniel L.K. Yamins, Jiajun Wu

TL;DR
This paper introduces a new image similarity metric based on intrinsic object properties, addressing the challenge of recognizing objects under varying extrinsic conditions across diverse categories.
Contribution
It extends re-identification techniques to general objects, proposing a benchmark dataset and demonstrating that combining deep features with foreground filtering effectively measures intrinsic similarity.
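As a rough illustration of what "deep features with foreground filtering" could look like in practice, the sketch below pools per-patch features over foreground patches only and compares two images by cosine similarity. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names `masked_embedding` and `intrinsic_similarity`, the use of average pooling, and the cosine metric are all hypothetical choices made for illustration, and the patch features and foreground masks are assumed to come from some pretrained backbone and segmentation model that are not shown here.

```python
import numpy as np


def masked_embedding(patch_features, foreground_mask):
    """Average-pool patch features over foreground patches, then L2-normalize.

    patch_features: array of shape (N, D), one D-dim feature per image patch
                    (assumed to come from a pretrained backbone, not shown).
    foreground_mask: array of shape (N,), 1 for object patches, 0 for background
                     (assumed to come from a segmentation model, not shown).
    """
    feats = np.asarray(patch_features, dtype=np.float64)
    weights = np.asarray(foreground_mask, dtype=np.float64)
    total = weights.sum()
    if total == 0:
        raise ValueError("foreground mask selects no patches")
    # Weighted mean: background patches get weight 0 and drop out.
    pooled = (feats * weights[:, None]).sum(axis=0) / total
    norm = np.linalg.norm(pooled)
    return pooled / norm if norm > 0 else pooled


def intrinsic_similarity(feats_a, mask_a, feats_b, mask_b):
    """Cosine similarity between the foreground-pooled embeddings of two images."""
    emb_a = masked_embedding(feats_a, mask_a)
    emb_b = masked_embedding(feats_b, mask_b)
    return float(np.dot(emb_a, emb_b))


if __name__ == "__main__":
    # Toy data: patch 0 is the object, patch 1 is background.
    feats_a = np.array([[1.0, 0.0], [0.0, 5.0]])
    feats_b = np.array([[2.0, 0.0], [-7.0, 1.0]])
    object_only = np.array([1, 0])
    everything = np.array([1, 1])
    print("with foreground filtering:", intrinsic_similarity(feats_a, object_only, feats_b, object_only))
    print("without filtering:", intrinsic_similarity(feats_a, everything, feats_b, everything))
```

In the toy data the object patches point in the same direction while the background patches differ, so filtering the background changes the score substantially. The intent is only to show why discarding background features can make a similarity score depend more on the object itself than on its surroundings.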
Findings
Deep features with foreground filtering outperform existing methods
The CUTE dataset enables benchmarking of intrinsic object similarity
Proposed approach improves re-identification and object recognition tasks
Abstract
The human visual system can effortlessly recognize an object under different extrinsic factors such as lighting, object poses, and background, yet current computer vision systems often struggle with these variations. An important step to understanding and improving artificial vision systems is to measure image similarity purely based on intrinsic object properties that define object identity. This problem has been studied in the computer vision literature as re-identification, though mostly restricted to specific object categories such as people and cars. We propose to extend it to general object categories, exploring an image similarity metric based on object intrinsics. To benchmark such measurements, we collect the Common paired objects Under differenT Extrinsics (CUTE) dataset of images of objects under different extrinsic factors such as lighting, poses, and imaging…
Taxonomy
Topics: Visual Attention and Saliency Detection · Video Surveillance and Tracking Methods · Advanced Image and Video Retrieval Techniques
