Do Convnets Learn Correspondence?
Jonathan Long, Ning Zhang, Trevor Darrell

TL;DR
This paper investigates whether convolutional neural networks learn fine-grained correspondence. It shows that convnet features localize more precisely than their receptive field sizes suggest and that they outperform conventional hand-engineered features in keypoint prediction.
Contribution
The study provides evidence that convnet activation features can be used for fine-scale correspondence tasks, challenging the assumption that large pooling regions and training from whole-image labels limit them to coarse localization.
Findings
Convnet features localize at a much finer scale than their receptive field sizes.
Convnet features outperform conventional hand-engineered features in keypoint prediction on objects from PASCAL VOC.
Convnet features perform intraclass alignment as well as conventional hand-engineered features.
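To make the idea of correspondence from activation features concrete, here is a minimal sketch of nearest-neighbor matching between two feature maps. It is an illustration under stated assumptions, not the authors' pipeline: the function name `match_features`, the `(H, W, C)` array layout, and the choice of cosine similarity are all hypothetical, and the feature maps are assumed to have been extracted from some convnet layer beforehand.

```python
import numpy as np


def match_features(feats_a, feats_b):
    """Match every location of feats_a to its most similar location in feats_b.

    Both inputs are feature maps of shape (H, W, C), e.g. activations of one
    convnet layer for two images (an assumed layout, not one from the paper).
    Returns an integer array of shape (H_a, W_a, 2) holding the (row, col)
    of the best match in feats_b for each location of feats_a.
    """
    ha, wa, c = feats_a.shape
    hb, wb, cb = feats_b.shape
    if c != cb:
        raise ValueError("feature maps must have the same channel count")

    # Flatten the spatial grid so each row is one feature vector.
    a = feats_a.reshape(-1, c)
    b = feats_b.reshape(-1, c)

    # L2-normalize so the dot product below is cosine similarity.
    a = a / (np.linalg.norm(a, axis=1, keepdims=True) + 1e-8)
    b = b / (np.linalg.norm(b, axis=1, keepdims=True) + 1e-8)

    # Similarity between every location in A and every location in B.
    sim = a @ b.T  # shape (ha * wa, hb * wb)

    # Pick the best-matching location in B and convert it back to (row, col).
    best = sim.argmax(axis=1)
    rows, cols = np.unravel_index(best, (hb, wb))
    return np.stack([rows, cols], axis=1).reshape(ha, wa, 2)
```

The returned coordinates are on the feature grid. Mapping them back to image pixels depends on the stride and padding of the layer that produced the features, which this sketch leaves to the caller.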
Abstract
Convolutional neural nets (convnets) trained from massive labeled datasets have substantially improved the state-of-the-art in image classification and object detection. However, visual understanding requires establishing correspondence on a finer level than object category. Given their large pooling regions and training from whole-image labels, it is not clear that convnets derive their success from an accurate correspondence model which could be used for precise localization. In this paper, we study the effectiveness of convnet activation features for tasks requiring correspondence. We present evidence that convnet features localize at a much finer scale than their receptive field sizes, that they can be used to perform intraclass alignment as well as conventional hand-engineered features, and that they outperform conventional features in keypoint prediction on objects from PASCAL VOC…
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications
