Learning an Effective Equivariant 3D Descriptor Without Supervision
Riccardo Spezialetti, Samuele Salti, Luigi Di Stefano

TL;DR
This paper introduces an end-to-end method for learning rotation-equivariant 3D shape descriptors with spherical CNNs and plane folding decoders. The method requires no supervision and outperforms hand-crafted descriptors on benchmark datasets.
Contribution
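The paper learns 3D descriptors end-to-end from unoriented input data, without supervision, using spherical CNNs and plane folding decoders. Its key idea is to disentangle two problems: creating a robust and distinctive rotation-equivariant representation, and defining a good canonical orientation.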
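To make the distinction between equivariance and invariance concrete, here is a minimal NumPy sketch. It is not the paper's method: a PCA-based reference frame stands in for the learned equivariant representation and for the canonical orientation, and the names `canonical_frame`, `canonical_coords`, and `random_rotation` are hypothetical. The sketch assumes distinct covariance eigenvalues and non-zero third moments along each axis, which holds for the skewed toy data used here.

```python
import numpy as np


def random_rotation(rng):
    """Return a random 3x3 rotation matrix (orthogonal, determinant +1)."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]  # flip one axis to turn a reflection into a rotation
    return q


def canonical_frame(points):
    """Toy equivariant frame: PCA axes with signs fixed by the third moment.

    If the input is rotated by R, the returned frame is rotated by R as well.
    """
    centered = points - points.mean(axis=0)
    cov = centered.T @ centered / len(points)
    _, axes = np.linalg.eigh(cov)  # columns are eigenvectors, ascending eigenvalues
    # Eigenvectors are defined up to sign; pick the sign that makes the
    # third moment of the projected points positive.
    signs = np.sign(((centered @ axes) ** 3).sum(axis=0))
    signs[signs == 0] = 1.0
    return axes * signs


def canonical_coords(points):
    """Express the points in their own canonical frame (rotation invariant)."""
    centered = points - points.mean(axis=0)
    return centered @ canonical_frame(points)


rng = np.random.default_rng(0)
# Skewed, anisotropic toy cloud so that axes and signs are well defined.
cloud = rng.exponential(size=(200, 3)) * np.array([3.0, 2.0, 1.0])
rotation = random_rotation(rng)
rotated_cloud = cloud @ rotation.T  # apply the rotation to every point

# Equivariance: the frame rotates together with the input.
print(np.allclose(canonical_frame(rotated_cloud), rotation @ canonical_frame(cloud)))
# Invariance: coordinates in the canonical frame do not change.
print(np.allclose(canonical_coords(rotated_cloud), canonical_coords(cloud)))
```

The design point this illustrates is the split itself: the frame only needs to rotate consistently with the input, and invariance is obtained afterwards by expressing the data in that frame.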
Findings
Outperforms hand-crafted descriptors on benchmark datasets.
Achieves rotation invariance without supervision.
Demonstrates effectiveness of equivariant representations in 3D shape matching.
Abstract
Establishing correspondences between 3D shapes is a fundamental task in 3D Computer Vision, typically addressed by matching local descriptors. Recently, a few attempts at applying the deep learning paradigm to the task have shown promising results. Yet, the only explored way to learn rotation invariant descriptors has been to feed neural networks with highly engineered and invariant representations provided by existing hand-crafted descriptors, a path that goes in the opposite direction of end-to-end learning from raw data so successfully deployed for 2D images. In this paper, we explore the benefits of taking a step back in the direction of end-to-end learning of 3D descriptors by disentangling the creation of a robust and distinctive rotation equivariant representation, which can be learned from unoriented input data, and the definition of a good canonical orientation, required only…
