TL;DR
This paper introduces a CNN-based approach for multi-view object recognition that decomposes image sequences into pairs, enabling recognition over arbitrary camera paths and integrating active view planning for improved accuracy.
Contribution
It proposes a novel pairwise decomposition method with CNNs for flexible multi-view recognition and an active view planning framework that optimizes camera trajectories.
Findings
Achieves state-of-the-art results on the ModelNet dataset.
Effective with depth, greyscale, or combined images.
Supports recognition over arbitrary camera trajectories.
Abstract
A multi-view image sequence provides a much richer capacity for object recognition than a single image. However, most existing solutions to multi-view recognition typically adopt hand-crafted, model-based geometric methods, which do not readily embrace recent trends in deep learning. We propose to bring Convolutional Neural Networks to generic multi-view recognition, by decomposing an image sequence into a set of image pairs, classifying each pair independently, and then learning an object classifier by weighting the contribution of each pair. This allows for recognition over arbitrary camera trajectories, without requiring explicit training over the potentially infinite number of camera paths and lengths. Building these pairwise relationships then naturally extends to the next-best-view problem in an active recognition framework. To achieve this, we train a second Convolutional…
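The pairwise decomposition described in the abstract can be illustrated with a minimal sketch. It assumes that every unordered pair of views in the sequence is used and that per-pair class probabilities are fused by a weighted average; the function names `decompose_into_pairs` and `aggregate_pair_predictions` are hypothetical, and the hard-coded probabilities and weights stand in for the outputs of the CNN and the learned weighting, neither of which is reproduced here.

```python
from itertools import combinations


def decompose_into_pairs(views):
    """Return index pairs (i, j), i < j, for every unordered pair of views.

    Assumption: all pairs in the sequence are used, so a sequence of
    any length reduces to the same kind of two-view input.
    """
    return list(combinations(range(len(views)), 2))


def aggregate_pair_predictions(pair_probs, pair_weights):
    """Fuse per-pair class distributions with a weighted average.

    pair_probs:   one class-probability list per pair
    pair_weights: one non-negative weight per pair (a stand-in for the
                  learned contribution of each pair)
    """
    total = sum(pair_weights)
    n_classes = len(pair_probs[0])
    return [
        sum(w * p[c] for p, w in zip(pair_probs, pair_weights)) / total
        for c in range(n_classes)
    ]


# Three views along an arbitrary camera path give three pairs.
views = ["view_0", "view_1", "view_2"]
pairs = decompose_into_pairs(views)

# Placeholder pair-classifier outputs over two classes (not real results).
pair_probs = [[0.7, 0.3], [0.6, 0.4], [0.2, 0.8]]
pair_weights = [1.0, 1.0, 2.0]

fused = aggregate_pair_predictions(pair_probs, pair_weights)
predicted_class = max(range(len(fused)), key=fused.__getitem__)
```

Because the classifier only ever sees pairs, nothing in this sketch depends on the sequence length or on the order in which views were captured, which is the property the abstract highlights.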
Videos
Pairwise Decomposition of Image Sequences for Active Multi-View Recognition (YouTube)
