Combining Texture and Shape Cues for Object Recognition With Minimal Supervision
Xingchao Peng, Kate Saenko

TL;DR
This paper introduces a two-stream deep learning framework for object recognition that combines texture cues learned from web image search results with shape information learned from 3D CAD models, outperforming previous web-image and CAD-model based methods while requiring minimal supervision.
Contribution
The paper presents a method that integrates texture cues from web search data with shape cues from CAD models, fusing the predictions of the two streams to improve object recognition performance.
Findings
Outperforms previous web image and CAD model based methods
Demonstrates complementary nature of texture and shape cues
Achieves state-of-the-art results on PASCAL VOC 2007
Abstract
We present a novel approach to object classification and detection which requires minimal supervision and which combines visual texture cues and shape information learned from freely available unlabeled web search results. The explosion of visual data on the web can potentially make visual examples of almost any object easily accessible via web search. Previous unsupervised methods have utilized either large scale sources of texture cues from the web, or shape information from data such as crowdsourced CAD models. We propose a two-stream deep learning framework that combines these cues, with one stream learning visual texture cues from image search data, and the other stream learning rich shape information from 3D CAD models. To perform classification or detection for a novel image, the predictions of the two streams are combined using a late fusion scheme. We present experiments and…
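The late fusion step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the fusion rule, so everything here is an assumption made for illustration: each stream is assumed to emit one raw score per class, the scores are turned into probabilities with a softmax, and the two distributions are mixed with a hypothetical weight `alpha`. The function names `late_fusion` and `predict` are likewise illustrative and are not taken from the paper.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def late_fusion(texture_scores, shape_scores, alpha=0.5):
    """Mix per-class probabilities from the two streams.

    alpha is an assumed mixing weight: 1.0 trusts only the texture
    stream, 0.0 trusts only the shape stream.
    """
    if len(texture_scores) != len(shape_scores):
        raise ValueError("both streams must score the same set of classes")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    p_texture = softmax(texture_scores)
    p_shape = softmax(shape_scores)
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(p_texture, p_shape)]


def predict(texture_scores, shape_scores, alpha=0.5):
    """Return the index of the class with the highest fused probability."""
    fused = late_fusion(texture_scores, shape_scores, alpha)
    return max(range(len(fused)), key=fused.__getitem__)
```

With `alpha=0.5` the two streams contribute equally; in practice such a weight would typically be tuned on held-out data, which is again an assumption rather than something the abstract states.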
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications
