TL;DR
RotEqNet is a CNN architecture that encodes rotation equivariance, invariance, and covariance, leading to more compact models and competitive results across various vision tasks.
Contribution
The paper introduces RotEqNet, a CNN design that explicitly encodes rotation equivariance, invariance, and covariance instead of treating rotation as just another variation to learn, which reduces the size of the model required.
Findings
RotEqNet achieves comparable accuracy with significantly fewer parameters.
The architecture performs well on classification, segmentation, and orientation estimation tasks.
Because the rotation behavior is built into the architecture, compact RotEqNet models remain competitive with larger networks.
Abstract
In many computer vision tasks, we expect a particular behavior of the output with respect to rotations of the input image. If this relationship is explicitly encoded, instead of treated as any other variation, the complexity of the problem is decreased, leading to a reduction in the size of the required model. In this paper, we propose the Rotation Equivariant Vector Field Networks (RotEqNet), a Convolutional Neural Network (CNN) architecture encoding rotation equivariance, invariance and covariance. Each convolutional filter is applied at multiple orientations and returns a vector field representing magnitude and angle of the highest scoring orientation at every spatial location. We develop a modified convolution operator relying on this representation to obtain deep architectures. We test RotEqNet on several problems requiring different responses with respect to the inputs' rotation:…
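The orientation step described in the abstract (apply each filter at multiple orientations, keep the magnitude and angle of the highest scoring one at every location) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it uses only four orientations (multiples of 90 degrees, so that np.rot90 is exact and no interpolation is needed), a single filter on a single-channel image, and a clamp of negative scores to zero. The function names correlate2d_valid, orientation_pool, and to_vector_field are hypothetical.

```python
import numpy as np

def correlate2d_valid(image, kernel):
    """'Valid' 2-D cross-correlation written with NumPy only."""
    windows = np.lib.stride_tricks.sliding_window_view(image, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)

def orientation_pool(image, kernel):
    """Apply `kernel` at 4 orientations; keep the best score and its angle."""
    # Assumption: only 0/90/180/270 degrees, so the rotated kernels are exact.
    angles = np.array([0.0, 90.0, 180.0, 270.0])
    responses = np.stack(
        [correlate2d_valid(image, np.rot90(kernel, k)) for k in range(4)]
    )
    best = responses.argmax(axis=0)          # index of the winning orientation
    # Assumption: clamp negative scores so the value can act as a magnitude.
    magnitude = np.maximum(responses.max(axis=0), 0.0)
    angle = angles[best]                     # angle in degrees at each location
    return magnitude, angle

def to_vector_field(magnitude, angle):
    """Convert (magnitude, angle in degrees) to Cartesian components (u, v)."""
    theta = np.deg2rad(angle)
    return magnitude * np.cos(theta), magnitude * np.sin(theta)

rng = np.random.default_rng(0)
image = rng.normal(size=(8, 8))
kernel = rng.normal(size=(3, 3))
magnitude, angle = orientation_pool(image, kernel)
u, v = to_vector_field(magnitude, angle)
```

In this simplified setting, rotating the input by 90 degrees rotates the magnitude map by 90 degrees and shifts every angle by 90 degrees, which is the equivariant behavior the architecture is built around. Handling arbitrary angles would additionally require interpolating the rotated filters, which this sketch leaves out.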
Taxonomy
Methods: Convolution
