CubeNet: Equivariance to 3D Rotation and Translation

Daniel Worrall; Gabriel Brostow

arXiv:1804.04458·cs.CV·April 13, 2018

TL;DR

CubeNet is a group convolutional neural network that is equivariant to translations and right-angle rotations in 3D. It preserves shape representations across layers, improving 3D object classification and segmentation performance.

Contribution

Introduces CubeNet, presented as the first 3D rotation-equivariant CNN for voxel data, enhancing shape preservation and inference accuracy.

Findings

01. Achieves state-of-the-art on ModelNet10 classification

02. Performs comparably on ISBI 2012 segmentation

03. First to demonstrate 3D rotation equivariance in voxel CNNs

Abstract

3D Convolutional Neural Networks are sensitive to transformations applied to their input. This is a problem because a voxelized version of a 3D object, and its rotated clone, will look unrelated to each other after passing through to the last layer of a network. Instead, an idealized model would preserve a meaningful representation of the voxelized object, while explaining the pose-difference between the two inputs. An equivariant representation vector has two components: the invariant identity part, and a discernable encoding of the transformation. Models that can't explain pose-differences risk "diluting" the representation, in pursuit of optimizing a classification or regression loss function. We introduce a Group Convolutional Neural Network with linear equivariance to translations and right angle rotations in three dimensions. We call this network CubeNet, reflecting its…
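The equivariance property described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes a single lifting layer, circular (wrap-around) boundaries, a cubic volume with an odd cubic kernel, and only the four right-angle rotations about one axis rather than a full 3D rotation group. The helper names `correlate3d`, `rot`, and `lift` are hypothetical. The sketch checks that rotating the input rotates each feature map and cyclically shifts the rotation channel, so the pose change stays readable in the representation, whereas a plain convolution with an unrotated kernel does not have this property.

```python
import numpy as np

def correlate3d(x, k):
    """Circular 3D cross-correlation of a cubic volume x with an odd cubic kernel k."""
    r = k.shape[0] // 2
    out = np.zeros_like(x, dtype=float)
    for a in range(k.shape[0]):
        for b in range(k.shape[1]):
            for c in range(k.shape[2]):
                # np.roll(x, s)[i] == x[i - s], so this adds k[a,b,c] * x[i + (a,b,c) - r].
                out += k[a, b, c] * np.roll(x, shift=(r - a, r - b, r - c), axis=(0, 1, 2))
    return out

def rot(v, n=1):
    """Rotate a volume by n right angles in the plane of axes 0 and 1."""
    return np.rot90(v, k=n, axes=(0, 1))

def lift(x, k):
    """Correlate x with all four rotated copies of k (assumed 4-element rotation group)."""
    return np.stack([correlate3d(x, rot(k, g)) for g in range(4)])

rng = np.random.default_rng(0)
x = rng.standard_normal((6, 6, 6))   # toy voxel grid
k = rng.standard_normal((3, 3, 3))   # toy filter

feats = lift(x, k)
feats_rot = lift(rot(x), k)

# Rotating the input rotates each map and shifts the rotation channel by one.
expected = np.stack([rot(feats[(g - 1) % 4]) for g in range(4)])
equivariant = np.allclose(feats_rot, expected)

# A single unrotated kernel is generally not equivariant to the same rotation.
plain_equivariant = np.allclose(correlate3d(rot(x), k), rot(correlate3d(x, k)))

# Pooling over rotation channels and space discards pose, giving an invariant.
invariant = np.isclose(feats.max(), feats_rot.max())

print(equivariant, plain_equivariant, invariant)
```

For a generic random kernel the three flags should come out as `True`, `False`, `True`. The channel shift in `expected` plays the role of the "discernable encoding of the transformation", while the pooled maximum plays the role of the invariant part.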
