CubeNet: Equivariance to 3D Rotation and Translation
Daniel Worrall, Gabriel Brostow

TL;DR
CubeNet is a group convolutional neural network that is equivariant to translations and right-angle rotations in 3D. It preserves shape representations through its layers, reaching state-of-the-art results on ModelNet10 classification and comparable performance on ISBI 2012 segmentation.
Contribution
Introduces CubeNet, presented as the first 3D rotation-equivariant CNN for voxel data, with the aim of preserving shape information through the network and improving inference accuracy.
Findings
Achieves state-of-the-art on ModelNet10 classification
Performs comparably on ISBI 2012 segmentation
First to demonstrate 3D rotation equivariance in voxel CNNs
Abstract
3D Convolutional Neural Networks are sensitive to transformations applied to their input. This is a problem because a voxelized version of a 3D object, and its rotated clone, will look unrelated to each other after passing through to the last layer of a network. Instead, an idealized model would preserve a meaningful representation of the voxelized object, while explaining the pose-difference between the two inputs. An equivariant representation vector has two components: the invariant identity part, and a discernable encoding of the transformation. Models that can't explain pose-differences risk "diluting" the representation, in pursuit of optimizing a classification or regression loss function. We introduce a Group Convolutional Neural Network with linear equivariance to translations and right angle rotations in three dimensions. We call this network CubeNet, reflecting its…
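The split between an invariant "identity" part and an encoding of the transformation can be illustrated with a small NumPy sketch. This is not the authors' implementation and not a full group convolution: it only takes a global inner product between a cubic voxel grid and all 24 right-angle rotations of one filter, so it says nothing about translation. The helper names (`cube_rotations`, `rotate`, `responses`) and the 5x5x5 random grids are assumptions made for illustration.

```python
import itertools
import numpy as np

def cube_rotations():
    """Enumerate the 24 right-angle rotations of a cube as (perm, signs) pairs."""
    rots = []
    for perm in itertools.permutations(range(3)):
        for signs in itertools.product((1, -1), repeat=3):
            m = np.zeros((3, 3))
            for row, (col, s) in enumerate(zip(perm, signs)):
                m[row, col] = s
            # Keep proper rotations only (determinant +1, no reflections).
            if np.isclose(np.linalg.det(m), 1.0):
                rots.append((perm, signs))
    return rots

def rotate(x, rot):
    """Apply one rotation to a cubic voxel grid: permute axes, then flip some."""
    perm, signs = rot
    y = np.transpose(x, perm)
    flip_axes = tuple(i for i, s in enumerate(signs) if s == -1)
    return np.flip(y, axis=flip_axes) if flip_axes else y

def responses(x, w, rots):
    """Inner product of x with every rotated copy of the filter w."""
    return np.array([np.sum(x * rotate(w, g)) for g in rots])

rng = np.random.default_rng(0)
rots = cube_rotations()
x = rng.normal(size=(5, 5, 5))   # stand-in for a voxelized object
w = rng.normal(size=(5, 5, 5))   # stand-in for a learned filter

r_orig = responses(x, w, rots)
r_rot = responses(rotate(x, rots[7]), w, rots)  # same object, rotated pose

# The set of response values is unchanged (invariant part) ...
same_values = np.allclose(np.sort(r_orig), np.sort(r_rot))
# ... but their order over the 24 rotations differs (pose encoding).
same_order = np.allclose(r_orig, r_rot)
print(len(rots), same_values, same_order)
```

Rotating the input only permutes the 24 responses: the sorted values serve as a pose-independent description, while the permutation itself records which rotation was applied. A pooled statistic such as the maximum over the 24 responses would therefore be rotation invariant in this toy setting.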
