Attention-based Transformation from Latent Features to Point Clouds
Kaiyi Zhang, Ximing Yang, Yuan Wu, Cheng Jin

TL;DR
This paper introduces AXform, an attention-based method for transforming latent features into point clouds. Compared with previous methods, it improves output quality, converges faster, and handles non-smooth surfaces better, with applications in point cloud completion and segmentation.
Contribution
AXform is a novel attention-based method that reduces outliers, network parameters, and convergence time, while better handling non-smooth surfaces in point cloud generation.
Findings
Achieves state-of-the-art results on multiple datasets.
Fewer outliers and faster convergence compared to prior methods.
Enables unsupervised semantic segmentation through multi-branch extension.
Abstract
In point cloud generation and completion, previous methods for transforming latent features to point clouds are generally based on fully connected layers (FC-based) or folding operations (Folding-based). However, point clouds generated by FC-based methods usually suffer from outliers and rough surfaces. Folding-based methods have a large data flow and slow convergence, and they also struggle to generate non-smooth surfaces. In this work, we propose AXform, an attention-based method to transform latent features to point clouds. AXform first generates points in an interim space, using a fully connected layer. These interim points are then aggregated to generate the target point cloud. AXform takes both parameter sharing and data flow into account, which gives it fewer outliers, fewer network parameters, and a faster convergence speed. The points…
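The two-stage transformation described in the abstract (a fully connected layer producing interim points, followed by attention-based aggregation into the target point cloud) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `latent_to_points`, the weight shapes, the 3-D interim space, and the use of a single linear map followed by a softmax to obtain the attention weights are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def latent_to_points(z, w_fc, b_fc, w_attn, n_interim, dim=3):
    """Hypothetical sketch: latent vector -> interim points -> target points.

    z      : (latent_dim,) latent feature
    w_fc   : (latent_dim, n_interim * dim) fully connected weights (assumed shape)
    b_fc   : (n_interim * dim,) fully connected bias
    w_attn : (dim, n_target) linear map producing attention scores (assumed form)
    """
    # Stage 1: a fully connected layer generates points in an interim space.
    interim = (z @ w_fc + b_fc).reshape(n_interim, dim)
    # Stage 2: attention scores, one column per target point.
    scores = interim @ w_attn                # (n_interim, n_target)
    weights = softmax(scores, axis=0)        # each column sums to 1
    # Each target point is a weighted average of the interim points.
    points = weights.T @ interim             # (n_target, dim)
    return points, interim, weights

# Example with random (untrained) weights; all sizes are arbitrary.
rng = np.random.default_rng(0)
latent_dim, n_interim, n_target = 16, 32, 128
z = rng.normal(size=latent_dim)
w_fc = rng.normal(size=(latent_dim, n_interim * 3))
b_fc = np.zeros(n_interim * 3)
w_attn = rng.normal(size=(3, n_target))
points, interim, weights = latent_to_points(z, w_fc, b_fc, w_attn, n_interim)
print(points.shape)
```

In this sketch every target point is a convex combination of the interim points, so it cannot fall outside their bounding box. That may offer some intuition for the reduced-outlier claim, although the abstract itself does not state this reasoning.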
Taxonomy
Topics: 3D Shape Modeling and Analysis · Computer Graphics and Visualization Techniques · 3D Surveying and Cultural Heritage
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
