GTNet: Graph Transformer Network for 3D Point Cloud Classification and Semantic Segmentation
Wei Zhou, Qian Wang, Weiwei Jin, Xinzhe Shi, Ying He

TL;DR
GTNet combines dynamic graph and Transformer techniques to improve local and global feature learning in 3D point cloud classification and segmentation tasks.
Contribution
The paper introduces Graph Transformer blocks that integrate dynamic graph updates with Transformer attention, enhancing local and global feature extraction in point clouds.
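To make the "dynamic graph" idea concrete, the sketch below contrasts a static neighbor graph, built once from the input coordinates, with a graph rebuilt from the current features. It is illustrative only: `knn_graph`, the toy `points`, and the `features` values are assumptions made for this example, not the authors' implementation, and the feature values simply stand in for what a learned layer might output.

```python
import math

def knn_graph(vectors, k):
    """For each vector, return the indices of its k nearest neighbors (self excluded)."""
    graph = []
    for i, vi in enumerate(vectors):
        # Euclidean distance to every other vector, sorted ascending.
        dists = sorted(
            (math.dist(vi, vj), j) for j, vj in enumerate(vectors) if j != i
        )
        graph.append([j for _, j in dists[:k]])
    return graph

# Static graph: neighbors are fixed by the input xyz coordinates.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 5.0, 0.0), (1.0, 5.0, 0.0)]
static_graph = knn_graph(points, k=1)

# Dynamic graph: neighbors are recomputed from the current features.
# These feature values are made up to stand in for a layer's output.
features = [(0.0, 1.0), (3.0, 1.0), (0.1, 1.0), (3.1, 1.0)]
dynamic_graph = knn_graph(features, k=1)

print(static_graph)   # neighbors in coordinate space
print(dynamic_graph)  # neighbors in feature space
```

In this toy case the two graphs differ: points that are far apart in space become neighbors once their features are similar, which is the behavior a dynamically updated graph is meant to capture.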
Findings
Improved accuracy in shape classification and segmentation tasks.
Effective local feature learning through dynamic graph-based attention.
Enhanced global context understanding with Transformer modules.
Abstract
Recently, graph-based and Transformer-based deep learning networks have demonstrated excellent performance on various point cloud tasks. Most existing graph methods are based on static graphs, which take a fixed input to establish graph relations. Moreover, many graph methods aggregate neighboring features by maximization or averaging, so that either only a single neighboring point affects the centroid's feature or all neighboring points have the same influence on it, which ignores the correlations and differences between points. Most Transformer-based methods extract point cloud features based on global attention and lack feature learning on local neighbors. To solve the problems of these two types of models, we propose a new feature extraction block named Graph Transformer and construct a 3D point cloud learning network called GTNet to learn…
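The aggregation issue the abstract describes can be sketched in a few lines. Max pooling lets a single neighbor decide each feature channel, mean pooling weights all neighbors equally, and an attention-style aggregation weights each neighbor by its similarity to the center point. The code below is a minimal sketch under stated assumptions: it uses plain scaled dot-product scores with no learned projections, and the function names and toy vectors are hypothetical. The excerpt does not specify the exact attention form used in GTNet.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def max_aggregate(neighbors):
    # Per channel, only the largest neighbor value survives.
    return [max(col) for col in zip(*neighbors)]

def mean_aggregate(neighbors):
    # Every neighbor contributes with the same weight 1/k.
    return [sum(col) / len(neighbors) for col in zip(*neighbors)]

def attention_aggregate(center, neighbors):
    # Assumption: the center acts as the query, neighbors as keys and values,
    # with no learned projection matrices.
    scale = math.sqrt(len(center))
    scores = [sum(c * n for c, n in zip(center, nb)) / scale for nb in neighbors]
    weights = softmax(scores)
    out = [
        sum(w * nb[d] for w, nb in zip(weights, neighbors))
        for d in range(len(center))
    ]
    return out, weights

center = [1.0, 0.0]
neighbors = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

pooled_max = max_aggregate(neighbors)
pooled_mean = mean_aggregate(neighbors)
pooled_attn, attn_weights = attention_aggregate(center, neighbors)
print(pooled_max, pooled_mean, pooled_attn, attn_weights)
```

Here the neighbor most similar to the center receives the largest weight and the least similar one the smallest, so the aggregated feature reflects differences between neighbors that max and mean pooling discard.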
Taxonomy
Topics: 3D Shape Modeling and Analysis · Remote Sensing and LiDAR Applications · 3D Surveying and Cultural Heritage
Methods: Attention Is All You Need · Absolute Position Encodings · Softmax · Layer Normalization · Laplacian EigenMap · Byte Pair Encoding · Dropout · Linear Layer · Label Smoothing · Multi-Head Attention
