GTNet: Graph Transformer Network for 3D Point Cloud Classification and Semantic Segmentation

Wei Zhou; Qian Wang; Weiwei Jin; Xinzhe Shi; Ying He

arXiv:2305.15213 · cs.CV · August 6, 2024 · 2 cites

Open Access

TL;DR

GTNet combines dynamic graph and Transformer techniques to improve local and global feature learning in 3D point cloud classification and segmentation tasks.

Contribution

The paper introduces Graph Transformer blocks that integrate dynamic graph updates with Transformer attention, enhancing local and global feature extraction in point clouds.
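To make the idea of a dynamic graph update concrete, here is a minimal sketch in plain Python. It assumes that "dynamic" means the k-nearest-neighbor graph is rebuilt from the current features rather than fixed once from the input coordinates; that reading, the helper name `knn_graph`, and the toy values are illustrative assumptions, not details taken from the paper.

```python
import math

def knn_graph(vectors, k):
    """For each vector, return the indices of its k nearest other vectors
    under Euclidean distance (hypothetical helper for illustration)."""
    graph = []
    for i, v in enumerate(vectors):
        dists = sorted(
            (math.dist(v, w), j) for j, w in enumerate(vectors) if j != i
        )
        graph.append([j for _, j in dists[:k]])
    return graph

# Input coordinates: a static graph is built from these once and reused.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.5, 0.0, 0.0), (10.0, 0.0, 0.0)]

# Made-up per-point features after some layer (not real network outputs).
features = [(0.0, 1.0), (5.0, 5.0), (0.1, 1.1), (0.3, 0.9)]

static_graph = knn_graph(coords, k=1)     # neighbors in coordinate space
dynamic_graph = knn_graph(features, k=1)  # neighbors in feature space

print("static :", static_graph)
print("dynamic:", dynamic_graph)
```

In this toy case the two graphs differ: points that are far apart in space can become neighbors once their features are similar, which is the kind of relation a graph fixed at the input cannot express.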

Findings

01. Improved accuracy in shape classification and segmentation tasks.

02. Effective local feature learning through dynamic graph-based attention.

03. Enhanced global context understanding with Transformer modules.

Abstract

Recently, graph-based and Transformer-based deep learning networks have demonstrated excellent performance on various point cloud tasks. Most existing graph methods are based on static graphs, which take a fixed input to establish graph relations. Moreover, many graph methods aggregate neighboring features by maximization or averaging, so that either only a single neighboring point affects the centroid's feature or all neighboring points have the same influence on it, which ignores the correlations and differences between points. Most Transformer-based methods extract point cloud features based on global attention and lack feature learning on local neighbors. To solve the problems of these two types of models, we propose a new feature extraction block named Graph Transformer and construct a 3D point cloud learning network called GTNet to learn…
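The aggregation issue described in the abstract can be illustrated with a small plain-Python sketch. It compares max and mean aggregation over a centroid's neighbors with a softmax-weighted sum. The attention here uses scaled dot products of raw features with no learned projections, which is a simplifying assumption and not the paper's formulation; all function names and values are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def max_aggregate(neighbors):
    # Per channel, only the single largest neighbor value survives.
    return [max(col) for col in zip(*neighbors)]

def mean_aggregate(neighbors):
    # Every neighbor contributes with the same weight 1/len(neighbors).
    return [sum(col) / len(neighbors) for col in zip(*neighbors)]

def attention_aggregate(center, neighbors):
    # Weight each neighbor by its scaled dot-product similarity to the center.
    d = len(center)
    scores = [
        sum(c * n for c, n in zip(center, nb)) / math.sqrt(d)
        for nb in neighbors
    ]
    weights = softmax(scores)
    out = [sum(w * nb[i] for w, nb in zip(weights, neighbors)) for i in range(d)]
    return out, weights

# Toy centroid feature and three neighbor features (made-up values).
center = [1.0, 0.0]
neighbors = [[2.0, 0.0], [0.0, 2.0], [-1.0, 1.0]]

max_out = max_aggregate(neighbors)
mean_out = mean_aggregate(neighbors)
attn_out, attn_weights = attention_aggregate(center, neighbors)

print("max      :", max_out)
print("mean     :", mean_out)
print("attention:", attn_out, "weights:", attn_weights)
```

The attention weights sum to one but differ per neighbor, so neighbors more similar to the centroid contribute more. Neither max nor mean aggregation can express that.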

Taxonomy

Topics: 3D Shape Modeling and Analysis · Remote Sensing and LiDAR Applications · 3D Surveying and Cultural Heritage

Methods: Attention Is All You Need · Absolute Position Encodings · Softmax · Layer Normalization · Laplacian EigenMap · Byte Pair Encoding · Dropout · Linear Layer · Label Smoothing · Multi-Head Attention