Straight to Shapes: Real-time Detection of Encoded Shapes

Saumya Jetley; Michael Sapienza; Stuart Golodetz; Philip H.S. Torr

arXiv:1611.07932 · cs.CV · July 6, 2017 · 5 cites

TL;DR

This paper introduces a real-time object detector that regresses each object's shape, alongside its bounding box and category, as a vector in a learned shape embedding space. The encoded shapes carry more instance-specific information than boxes alone and generalize to unseen categories.

Contribution

It presents what is, to the best of the authors' knowledge, the first real-time shape prediction network. The network integrates shape encoding with object detection, enabling higher-order shape reasoning in a fast, end-to-end manner.

Findings

01 Runs at ~35 FPS on a high-end desktop

02 Generalizes to unseen categories effectively

03 Provides richer object instance information beyond bounding boxes

Abstract

Current object detection approaches predict bounding boxes, but these provide little instance-specific information beyond location, scale and aspect ratio. In this work, we propose to directly regress to objects' shapes in addition to their bounding boxes and categories. It is crucial to find an appropriate shape representation that is compact and decodable, and in which objects can be compared for higher-order concepts such as view similarity, pose variation and occlusion. To achieve this, we use a denoising convolutional auto-encoder to establish an embedding space, and place the decoder after a fast end-to-end network trained to regress directly to the encoded shape vectors. This yields what to the best of our knowledge is the first real-time shape prediction network, running at ~35 FPS on a high-end desktop. With higher-order shape reasoning well-integrated into the network…
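The pipeline the abstract describes (regress a compact shape code per detection, then decode it into a mask) can be illustrated with a minimal sketch. Everything named here is hypothetical and chosen for illustration: `toy_encode` and `toy_decode` are simple pooling and upsampling stand-ins, not the paper's denoising convolutional auto-encoder, and the 16-d code size, the 8x8 mask resolution, and the `Detection` fields are assumptions rather than values from the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple

Mask = List[List[int]]  # binary mask stored as rows of 0/1

CODE_SIDE = 4   # toy code is a 4x4 grid, i.e. a 16-d vector (assumption)
MASK_SIDE = 8   # resolution of the decoded mask (assumption)


def toy_encode(mask: Mask) -> List[float]:
    """Stand-in encoder: average-pool an 8x8 mask down to a 16-d code."""
    step = MASK_SIDE // CODE_SIDE
    code = []
    for r in range(CODE_SIDE):
        for c in range(CODE_SIDE):
            block = [mask[r * step + i][c * step + j]
                     for i in range(step) for j in range(step)]
            code.append(sum(block) / len(block))
    return code


def toy_decode(code: List[float]) -> Mask:
    """Stand-in decoder: nearest-neighbour upsample the code, threshold at 0.5."""
    step = MASK_SIDE // CODE_SIDE
    return [[1 if code[(r // step) * CODE_SIDE + (c // step)] >= 0.5 else 0
             for c in range(MASK_SIDE)]
            for r in range(MASK_SIDE)]


@dataclass
class Detection:
    box: Tuple[int, int, int, int]  # (x, y, width, height) in pixels
    label: str
    shape_code: List[float]         # embedding vector a detector would regress


def paste_shape(det: Detection, img_w: int, img_h: int) -> Mask:
    """Decode the shape code and stretch the mask to fill the detection's box."""
    small = toy_decode(det.shape_code)
    x0, y0, w, h = det.box
    canvas = [[0] * img_w for _ in range(img_h)]
    for y in range(h):
        for x in range(w):
            yy, xx = y0 + y, x0 + x
            if 0 <= yy < img_h and 0 <= xx < img_w:
                # Nearest-neighbour lookup into the decoded mask.
                canvas[yy][xx] = small[y * MASK_SIDE // h][x * MASK_SIDE // w]
    return canvas


# A square occupying the centre of the 8x8 mask plays the role of a ground-truth shape.
square = [[1 if 2 <= r < 6 and 2 <= c < 6 else 0 for c in range(MASK_SIDE)]
          for r in range(MASK_SIDE)]

# In the real system the code would come from the detection network;
# here it is produced by the toy encoder so the example is self-contained.
det = Detection(box=(4, 2, 16, 8), label="object", shape_code=toy_encode(square))
instance_mask = paste_shape(det, img_w=24, img_h=12)
```

The point of the sketch is the division of labour: the detector only has to output a short vector per object, and a separate decoder turns that vector into a mask that is then placed inside the predicted box. Because shapes live in a vector space, two detections could also be compared by a distance between their `shape_code` values, which is the kind of comparison the abstract has in mind for view similarity or pose variation.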

Code & Models

Repositories

torrvision/straighttoshapes (Official)

Taxonomy

Topics: Advanced Neural Network Applications · Human Pose and Action Recognition · Advanced Image and Video Retrieval Techniques