Object Recognition with Multi-Scale Pyramidal Pooling Networks

Jonathan Masci; Ueli Meier; Gabriel Fricout; Jürgen Schmidhuber

arXiv:1207.1765 · cs.CV · July 10, 2012 · 1 citation

Open Access

TL;DR

This paper introduces a Multi-Scale Pyramidal Pooling Network that handles variable image sizes and improves generalization, especially with limited data, outperforming existing methods on benchmarks and industrial defect classification.

Contribution

It proposes a novel multi-scale pyramidal pooling layer, which removes the requirement that all input images be of equal size, and a novel encoding layer, which improves generalisation when training data is scarce.

Findings

01. Outperforms CNNs on benchmark datasets

02. Handles images of varying sizes without resizing

03. Effective in industrial defect classification

Abstract

We present a Multi-Scale Pyramidal Pooling Network, featuring a novel pyramidal pooling layer at multiple scales and a novel encoding layer. Thanks to the former the network does not require all images of a given classification task to be of equal size. The encoding layer improves generalisation performance in comparison to similar neural network architectures, especially when training data is scarce. We evaluate and compare our system to convolutional neural networks and state-of-the-art computer vision methods on various benchmark datasets. We also present results on industrial steel defect classification, where existing architectures are not applicable because of the constraint on equally sized input images. The proposed architecture can be seen as a fully supervised hierarchical bag-of-features extension that is trained online and can be fine-tuned for any given task.
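The abstract does not spell out how the pyramidal pooling layer is computed, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes max pooling over an l x l grid at each pyramid level, with levels 1, 2 and 4; the function name `pyramidal_max_pool` and the `levels` parameter are hypothetical. The point it illustrates is that the output length depends only on the number of feature maps and the chosen levels, never on the input height or width.

```python
import numpy as np


def pyramidal_max_pool(feature_maps, levels=(1, 2, 4)):
    """Pool a (channels, height, width) array into a fixed-length vector.

    Assumption for illustration: at each level l the map is split into an
    l x l grid of roughly equal cells and the maximum of each cell is kept.
    """
    channels, height, width = feature_maps.shape
    pooled = []
    for l in levels:
        # Cell boundaries; linspace copes with sizes not divisible by l.
        rows = np.linspace(0, height, l + 1).astype(int)
        cols = np.linspace(0, width, l + 1).astype(int)
        for i in range(l):
            for j in range(l):
                # Force every cell to cover at least one pixel, so maps
                # smaller than the grid still produce a value per cell.
                r0, r1 = rows[i], max(rows[i + 1], rows[i] + 1)
                c0, c1 = cols[j], max(cols[j + 1], cols[j] + 1)
                cell = feature_maps[:, r0:r1, c0:c1]
                pooled.append(cell.max(axis=(1, 2)))
    # Length is channels * sum(l * l for l in levels).
    return np.concatenate(pooled)


# Two inputs of different spatial size yield vectors of the same length.
rng = np.random.default_rng(0)
small = rng.standard_normal((3, 17, 23))
large = rng.standard_normal((3, 40, 9))
print(pyramidal_max_pool(small).shape, pyramidal_max_pool(large).shape)
```

Because the pooled vector has a fixed length, the layers that follow it can have a fixed number of inputs even when the images in a dataset differ in size, which is the property the abstract attributes to the pyramidal pooling layer.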

Taxonomy

Topics: Advanced Neural Network Applications · Industrial Vision Systems and Defect Detection · Advanced Image and Video Retrieval Techniques