Learning Semantic Part-Based Models from Google Images

Davide Modolo; Vittorio Ferrari

arXiv:1609.03140 · cs.CV · May 15, 2018

TL;DR

This paper introduces a method to automatically learn detailed semantic part-based object models from Google Images without manual annotations, improving object detection performance by integrating part information.

Contribution

It presents a novel incremental learning framework for semantic parts from web images, eliminating manual part annotations and enhancing detection accuracy.

Findings

01

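On PASCAL-Part, the final models more than double the performance of models trained directly on images retrieved by querying for part names (from 12.9 to 27.2 AP).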

02

The models learn both the appearance of parts and their spatial arrangement on the object, specific to each viewpoint.

03

Part models improve object detection when integrated with R-CNN.

Abstract

We propose a technique to train semantic part-based models of object classes from Google Images. Our models encompass the appearance of parts and their spatial arrangement on the object, specific to each viewpoint. We learn these rich models by collecting training instances for both parts and objects, and automatically connecting the two levels. Our framework works incrementally, by learning from easy examples first, and then gradually adapting to harder ones. A key benefit of this approach is that it requires no manual part location annotations. We evaluate our models on the challenging PASCAL-Part dataset [1] and show how their performance increases at every step of the learning, with the final models more than doubling the performance of directly training from images retrieved by querying for part names (from 12.9 to 27.2 AP). Moreover, we show that our part models can help object…
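
The easy-to-hard schedule described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: a toy centroid stands in for a part detector, distance to that centroid stands in for example difficulty, and the names `fit_centroid`, `incremental_learning`, `pool`, `seed_idx`, `rounds`, and `keep_fraction` are all hypothetical.

```python
import numpy as np


def fit_centroid(features):
    """Toy stand-in for a part model: the mean feature vector."""
    return features.mean(axis=0)


def incremental_learning(pool, seed_idx, rounds=3, keep_fraction=0.5):
    """Grow a training set from easy examples toward harder ones.

    pool: (n, d) array of candidate feature vectors (hypothetical input).
    seed_idx: indices of the initial easy examples.
    Returns the final toy model and the sorted selected indices.
    """
    selected = {int(i) for i in seed_idx}
    model = fit_centroid(pool[sorted(selected)])
    for _ in range(rounds):
        remaining = [i for i in range(len(pool)) if i not in selected]
        if not remaining:
            break
        # Assumption: candidates closer to the current model are "easier".
        dists = np.linalg.norm(pool[remaining] - model, axis=1)
        n_add = max(1, int(len(remaining) * keep_fraction))
        easiest = np.argsort(dists)[:n_add]
        selected.update(remaining[j] for j in easiest)
        # Re-fit on the enlarged set so the model adapts gradually.
        model = fit_centroid(pool[sorted(selected)])
    return model, sorted(selected)
```

Each round re-scores the remaining candidates with the current model, so examples that looked hard at the start can become admissible once the model has adapted. A real system would replace the centroid with a trained detector and the distance with a detection score.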

Taxonomy

Methods: Support Vector Machine · Max Pooling · Convolution · R-CNN