Equivariant Single View Pose Prediction Via Induced and Restricted Representations

Owen Howell; David Klee; Ondrej Biza; Linfeng Zhao; and Robin Walters

arXiv:2307.03704 · cs.CV · July 10, 2023 · 1 citation

Open Access

TL;DR

This paper introduces a neural network architecture that uses SO(2)-equivariance constraints to learn 3D object pose from 2D images, achieving state-of-the-art results on the PASCAL3D+ and SYMSOL pose estimation benchmarks.

Contribution

The paper formulates geometric consistency constraints for 3D pose prediction from 2D images and constructs architectures based on induced and restricted SO(2) representations, unifying previous methods.
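For readers unfamiliar with the terminology, the block below recalls the standard textbook definitions of restricted and induced representations for a subgroup $H = SO(2)$ of $G = SO(3)$. The symbols $\rho$, $\sigma$, $V$, $W$, and $\Phi$ are notation chosen here for illustration and are not necessarily the paper's own.

```latex
% Restriction: a representation (\rho, V) of G, viewed only on the subgroup H.
\[
  \bigl(\operatorname{Res}^{G}_{H}\rho\bigr)(h) = \rho(h), \qquad h \in H .
\]

% Induction: a representation (\sigma, W) of H, lifted to G.
% G acts on W-valued functions on G that transform by \sigma along H.
\[
  \operatorname{Ind}^{G}_{H}\sigma
  = \bigl\{\, f : G \to W \;\bigm|\; f(gh) = \sigma(h)^{-1} f(g)
      \ \text{for all } g \in G,\ h \in H \,\bigr\},
  \qquad (g' \cdot f)(g) = f\bigl(g'^{-1} g\bigr) .
\]

% An H-equivariant linear map \Phi : W \to V satisfies
\[
  \Phi\bigl(\sigma(h)\, w\bigr) = \rho(h)\, \Phi(w), \qquad h \in H,\ w \in W ,
\]
% and Frobenius reciprocity identifies such maps with G-equivariant maps
% out of the induced representation:
\[
  \operatorname{Hom}_{G}\bigl(\operatorname{Ind}^{G}_{H}\sigma,\ \rho\bigr)
  \;\cong\;
  \operatorname{Hom}_{H}\bigl(\sigma,\ \operatorname{Res}^{G}_{H}\rho\bigr) .
\]
```

Read this way, restriction and induction are two complementary routes for passing between features that transform under in-plane rotations and features that transform under full 3D rotations.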

Findings

01. Achieves state-of-the-art results on the PASCAL3D+ and SYMSOL datasets.

02. Unifies and generalizes previous 3D pose prediction architectures.

03. Proposes a learnable algorithm respecting geometric consistency constraints.

Abstract

Learning about the three-dimensional world from two-dimensional images is a fundamental problem in computer vision. An ideal neural network architecture for such tasks would leverage the fact that objects can be rotated and translated in three dimensions to make predictions about novel images. However, imposing SO(3)-equivariance on two-dimensional inputs is difficult because the group of three-dimensional rotations does not have a natural action on the two-dimensional plane. Specifically, it is possible that an element of SO(3) will rotate an image out of plane. We show that an algorithm that learns a three-dimensional representation of the world from two-dimensional images must satisfy certain geometric consistency properties, which we formulate as SO(2)-equivariance constraints. We use the induced and restricted representations of SO(2) on SO(3) to construct and classify architectures…
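The geometric point in the abstract can be illustrated with a small numerical sketch. It is not the paper's architecture: it assumes a toy orthographic camera looking down the z-axis, represents an "image" as projected 2D points, and uses hypothetical helper names (`render`, `rotate_image`, `in_plane_gap`, `out_of_plane_distortion`). Under these assumptions, rotating the image in-plane matches rotating the object pose about the camera axis, while an out-of-plane rotation generally cannot be reproduced by any 2D rotation of the image.

```python
import numpy as np


def rot_z(theta):
    """Rotation about the camera (z) axis: the SO(2) subgroup inside SO(3)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


def rot_x(theta):
    """Rotation about the x-axis: an out-of-plane element of SO(3)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])


def render(pose, points):
    """Toy 'image': orthographic projection of the posed points onto the xy-plane.

    The orthographic camera is an assumption made for this sketch.
    """
    return (points @ pose.T)[:, :2]


def rotate_image(image_xy, theta):
    """In-plane rotation of the 2D image by angle theta."""
    return image_xy @ rot_z(theta)[:2, :2].T


def in_plane_gap(theta, pose, points):
    """Max deviation between rotating the image and rotating the pose about z."""
    lhs = rotate_image(render(pose, points), theta)
    rhs = render(rot_z(theta) @ pose, points)
    return float(np.max(np.abs(lhs - rhs)))


def pairwise_distances(image_xy):
    """Matrix of Euclidean distances between all pairs of image points."""
    diff = image_xy[:, None, :] - image_xy[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def out_of_plane_distortion(theta, pose, points):
    """Change in image pairwise distances after an out-of-plane rotation.

    2D rotations preserve pairwise distances, so a nonzero value means
    no 2D rotation of the image reproduces the new view.
    """
    before = pairwise_distances(render(pose, points))
    after = pairwise_distances(render(rot_x(theta) @ pose, points))
    return float(np.max(np.abs(before - after)))


rng = np.random.default_rng(0)
points = rng.normal(size=(6, 3))   # arbitrary 3D object points
pose = rot_x(0.4) @ rot_z(1.1)     # arbitrary object pose

print("in-plane gap:", in_plane_gap(0.7, pose, points))
print("out-of-plane distortion:", out_of_plane_distortion(0.7, pose, points))
```

The first printed value should be zero up to floating-point error, which is the SO(2) consistency property in this toy setting; the second should be clearly nonzero for generic points.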

Taxonomy

Topics: Advanced Vision and Imaging · Human Pose and Action Recognition · Robotics and Sensor-Based Localization