Dual Transformer for Point Cloud Analysis

Xian-Feng Han; Yi-Fei Jin; Hui-Xian Cheng; Guo-Qiang Xiao

arXiv:2104.13044 · cs.CV · April 28, 2021

TL;DR

This paper introduces DTNet, a transformer-based architecture for point cloud analysis that captures contextual dependencies through point-wise and channel-wise self-attention, and reports competitive results on 3D point cloud classification and segmentation benchmarks.

Contribution

The paper proposes the Dual Transformer Network (DTNet), built around a Dual Point Cloud Transformer (DPCT) module that combines point-wise and channel-wise multi-head self-attention to model position and channel dependencies in point clouds.

Findings

01. Achieves competitive performance on benchmark datasets.

02. Effectively captures semantic dependencies in point clouds.

03. Outperforms existing methods in classification and segmentation tasks.

Abstract

Following the tremendous success of transformers in natural language processing and image understanding tasks, we present a novel point cloud representation learning architecture, named Dual Transformer Network (DTNet), which mainly consists of the Dual Point Cloud Transformer (DPCT) module. Specifically, by aggregating the well-designed point-wise and channel-wise multi-head self-attention models simultaneously, the DPCT module can capture much richer contextual dependencies semantically from the perspective of position and channel. With the DPCT module as a fundamental component, we construct the DTNet for performing point cloud analysis in an end-to-end manner. Extensive quantitative and qualitative experiments on publicly available benchmarks demonstrate the effectiveness of our proposed transformer framework for the tasks of 3D point cloud classification and segmentation,…
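To make the two attention directions in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes a single head, identity query/key/value projections (no learned weights), scaled dot-product scores, and an element-wise sum with a residual as the aggregation step; the abstract specifies none of these details, and all function names are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]


def softmax_rows(m):
    """Apply a numerically stable softmax to each row."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def scale(m, s):
    """Multiply every entry by a scalar."""
    return [[v * s for v in row] for row in m]


def pointwise_attention(feats):
    """Attention across points: an N x N map reweights the N point features.

    Assumption: identity projections, so Q = K = V = feats (N x C).
    """
    channels = len(feats[0])
    scores = scale(matmul(feats, transpose(feats)), 1.0 / math.sqrt(channels))
    weights = softmax_rows(scores)          # N x N, each row sums to 1
    return matmul(weights, feats)           # N x C


def channelwise_attention(feats):
    """Attention across channels: a C x C map reweights the C feature channels.

    Assumption: identity projections applied to the transposed features.
    """
    points = len(feats)
    feats_t = transpose(feats)              # C x N
    scores = scale(matmul(feats_t, feats), 1.0 / math.sqrt(points))
    weights = softmax_rows(scores)          # C x C, each row sums to 1
    return transpose(matmul(weights, feats_t))  # back to N x C


def dual_attention(feats):
    """Hypothetical aggregation: input + point-wise branch + channel-wise branch."""
    by_point = pointwise_attention(feats)
    by_channel = channelwise_attention(feats)
    return [
        [x + p + c for x, p, c in zip(row_x, row_p, row_c)]
        for row_x, row_p, row_c in zip(feats, by_point, by_channel)
    ]


# Three points with two feature channels each.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = dual_attention(features)
```

Both branches return a matrix with the same shape as the input (N points by C channels), which is what allows them to be summed. A fuller version would add learned projections, multiple heads, and normalization layers.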

Taxonomy

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Softmax · Layer Normalization · Byte Pair Encoding · Adam