PointCAT: Cross-Attention Transformer for point cloud
Xincheng Yang, Mingze Jin, Weiji He, Qian Chen

TL;DR
PointCAT introduces a novel cross-attention transformer architecture for 3D point cloud data, effectively combining multi-scale features and improving performance in shape classification and segmentation tasks.
Contribution
The paper proposes PointCAT, an end-to-end cross-attention transformer model specifically designed for point cloud data, with an efficient variant for shape classification.
Findings
Outperforms existing methods in shape classification.
Achieves results comparable to existing methods in part segmentation and semantic segmentation.
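The efficient variant reduces computational cost by using only a single class token from one branch as the query for attention against the other branch.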
Abstract
Transformer-based models have significantly advanced natural language processing and computer vision in recent years. However, due to the irregular and disordered structure of point cloud data, transformer-based models for 3D deep learning are still in their infancy compared to other methods. In this paper we present Point Cross-Attention Transformer (PointCAT), a novel end-to-end network architecture that uses a cross-attention mechanism for point cloud representation. Our approach combines multi-scale features via two separate cross-attention transformer branches. To reduce the computational increase brought by the multi-branch structure, we further introduce an efficient model for shape classification, which processes only a single class token of one branch as a query to calculate the attention map with the other. Extensive experiments demonstrate that our method outperforms or achieves comparable…
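The efficient variant described in the abstract (one branch's class token acting as the only query over the other branch's tokens) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes single-head scaled dot-product attention, omits the learned query/key/value projections, and the names `class_token_cross_attention`, `cls_a`, and `tokens_b` are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def class_token_cross_attention(cls_token, other_tokens):
    """Attend from one class token (the query) over another branch's tokens.

    Assumption: single head, no learned projections, so the other branch's
    tokens serve as both keys and values.
    Returns (attended_vector, attention_weights).
    """
    d = len(cls_token)
    # One scaled dot-product score per token of the other branch.
    scores = [
        sum(q * k for q, k in zip(cls_token, tok)) / math.sqrt(d)
        for tok in other_tokens
    ]
    weights = softmax(scores)
    # Weighted sum of the other branch's tokens.
    attended = [
        sum(w * tok[i] for w, tok in zip(weights, other_tokens))
        for i in range(d)
    ]
    return attended, weights


# Hypothetical sizes: 8-dimensional features, 16 tokens in the other branch.
rng = random.Random(0)
cls_a = [rng.gauss(0, 1) for _ in range(8)]
tokens_b = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(16)]
attended, weights = class_token_cross_attention(cls_a, tokens_b)
```

With a single query, the attention map has one score per token of the other branch, so the cost grows linearly with that branch's token count. Letting all M tokens of one branch attend to all N tokens of the other would instead require M × N scores.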
Taxonomy
Topics: 3D Shape Modeling and Analysis · 3D Surveying and Cultural Heritage · Remote Sensing and LiDAR Applications
Methods: Multi-Head Attention · Attention Is All You Need · Dense Connections · Label Smoothing · Adam · Softmax · Linear Layer · Absolute Position Encodings · Byte Pair Encoding · Residual Connection
