PSFormer: Point Transformer for 3D Salient Object Detection

Baian Chen; Lipeng Gu; Xin Zhuang; Yiyang Shen; Weiming Wang; Mingqiang Wei

arXiv:2210.15933·cs.CV·October 31, 2022·1 cites

TL;DR

PSFormer is a novel point transformer model designed for 3D salient object detection, effectively capturing multi-scale contextual information at point and scene levels to improve detection accuracy.

Contribution

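The paper introduces PSFormer, a new encoder-decoder architecture for 3D salient object detection. Its encoder uses a Point Context Transformer (PCT) module for point-level context modeling, and its decoder uses a Scene Context Transformer (SCT) module for scene-level context modeling.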

Findings

01 Outperforms existing methods in 3D salient object detection

02 More robust to small, multiple, and complex objects

03 Effective multi-scale context modeling

Abstract

We propose PSFormer, an effective point transformer model for 3D salient object detection. PSFormer is an encoder-decoder network that takes full advantage of transformers to model the contextual information in both multi-scale point- and scene-wise manners. In the encoder, we develop a Point Context Transformer (PCT) module to capture region contextual features at the point level; PCT contains two different transformers to excavate the relationship among points. In the decoder, we develop a Scene Context Transformer (SCT) module to learn context representations at the scene level; SCT contains both Upsampling-and-Transformer blocks and Multi-context Aggregation units to integrate the global semantic and multi-level features from the encoder into the global scene context. Experiments show clear improvements of PSFormer over its competitors and validate that PSFormer is more robust to…
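
The abstract says the encoder's transformers model the relationships among points. As a rough illustration of that general idea only, the following is a minimal sketch of generic single-head scaled dot-product self-attention over per-point features. It is not the authors' PCT module: the function names (`matmul`, `softmax`, `point_self_attention`), the random projection weights, and the toy sizes (6 points, 4 input channels, 3 output channels) are all assumptions made for this example.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def point_self_attention(feats, w_q, w_k, w_v):
    """Generic self-attention: every point attends to every other point.

    feats: N x C per-point features; w_q, w_k, w_v: C x D projections.
    Returns the N x D context features and the N x N attention weights.
    """
    q, k, v = matmul(feats, w_q), matmul(feats, w_k), matmul(feats, w_v)
    scale = math.sqrt(len(k[0]))
    k_t = [list(col) for col in zip(*k)]  # transpose of k, D x N
    scores = matmul(q, k_t)               # N x N pairwise similarities
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


def rand_matrix(rows, cols):
    """Hypothetical stand-in for learned weights or extracted features."""
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
feats = rand_matrix(6, 4)  # 6 points, 4 feature channels (toy sizes)
w_q, w_k, w_v = rand_matrix(4, 3), rand_matrix(4, 3), rand_matrix(4, 3)
context, attn = point_self_attention(feats, w_q, w_k, w_v)
print(len(context), len(context[0]))
```

Each row of `attn` sums to 1, so every output feature is a weighted average of the value vectors of all points. A practical point transformer would typically restrict or subsample the attended points and learn the projections; the abstract does not give those details, so they are left out here.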

Taxonomy

Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Virtual Reality Applications and Impacts

Methods: Multi-Head Attention · Attention Is All You Need · Perceptual control theoretic architecture · Linear Layer · Softmax · Adam · Position-Wise Feed-Forward Layer · Dense Connections · Label Smoothing · Absolute Position Encodings