Hyperbolic Cosine Transformer for LiDAR 3D Object Detection
Jigang Tong, Fanhang Yang, Sen Yang, Enzeng Dong, Shengzhi Du, Xing Wang, Xianlin Yi

TL;DR
This paper introduces ChTR3D, a hyperbolic cosine transformer for LiDAR 3D object detection that reduces computational complexity and improves inference speed while maintaining competitive accuracy.
Contribution
The paper proposes a novel cosh-attention mechanism with linear complexity, enhancing efficiency in 3D object detection from LiDAR data.
Findings
Cosh-attention significantly speeds up inference.
ChTR3D outperforms vanilla attention in speed on KITTI.
It maintains competitive detection accuracy.
Abstract
Transformers have recently achieved great success in computer vision. However, their space and time complexity grows quadratically with the number of points, which is large in 3D object detection applications. Previous point-wise methods suffer from high time consumption and from limited receptive fields for capturing information among points. In this paper, we propose a two-stage hyperbolic cosine transformer (ChTR3D) for 3D object detection from LiDAR point clouds. ChTR3D refines proposals by applying cosh-attention, which has linear computational complexity, to encode rich contextual relationships among points. The cosh-attention module reduces the space and time complexity of the attention operation by replacing the traditional softmax operation with a non-negative ReLU activation and a hyperbolic-cosine-based operator with a re-weighting mechanism. Extensive experiments on…
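To make the linear-complexity claim concrete, here is a minimal NumPy sketch of how a ReLU feature map combined with a hyperbolic-cosine re-weighting can avoid building the N x N attention matrix. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact operator, so the re-weighting term cosh((i - j) / scale), the `scale` parameter, the `eps` stabilizer, and both function names are hypothetical. The only fact relied on is the identity cosh(a - b) = cosh(a)cosh(b) - sinh(a)sinh(b), which lets the re-weighted scores split into two terms that each factor into a query part and a key part.

```python
import numpy as np

EPS = 1e-6  # assumed stabilizer for rows whose ReLU'd query is all zeros


def quadratic_attention(q, k, v, scale):
    """Reference version: materializes the full N x N weight matrix."""
    n = q.shape[0]
    q, k = np.maximum(q, 0.0), np.maximum(k, 0.0)  # ReLU keeps scores non-negative
    pos = np.arange(n) / scale
    # Assumed re-weighting: cosh of the scaled index difference.
    weights = (q @ k.T) * np.cosh(pos[:, None] - pos[None, :])
    return (weights @ v) / (weights.sum(axis=1, keepdims=True) + EPS)


def linear_attention(q, k, v, scale):
    """Same result without the N x N matrix, via the cosh addition identity."""
    n = q.shape[0]
    q, k = np.maximum(q, 0.0), np.maximum(k, 0.0)
    pos = (np.arange(n) / scale)[:, None]
    q_c, q_s = q * np.cosh(pos), q * np.sinh(pos)
    k_c, k_s = k * np.cosh(pos), k * np.sinh(pos)
    # k_c.T @ v and k_s.T @ v are d x d_v summaries, computed once in O(N d d_v).
    num = q_c @ (k_c.T @ v) - q_s @ (k_s.T @ v)
    den = q_c @ k_c.sum(axis=0) - q_s @ k_s.sum(axis=0)
    return num / (den[:, None] + EPS)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d, d_v = 64, 8, 4
    q, k, v = (rng.standard_normal(s) for s in [(n, d), (n, d), (n, d_v)])
    out_quadratic = quadratic_attention(q, k, v, scale=float(n))
    out_linear = linear_attention(q, k, v, scale=float(n))
    print(np.abs(out_quadratic - out_linear).max())
```

The quadratic version costs on the order of N^2 d operations and N^2 memory, while the reordered version costs on the order of N d d_v, which is the kind of saving the abstract describes for large point counts.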
Taxonomy
Topics: Human Pose and Action Recognition · Hand Gesture Recognition Systems · Advanced Neural Network Applications
Methods: Multi-Head Attention · Attention Is All You Need · Dense Connections · Position-Wise Feed-Forward Layer · Linear Layer · Label Smoothing · Adam · Absolute Position Encodings · Layer Normalization · Byte Pair Encoding
