Weakly Supervised Point Clouds Transformer for 3D Object Detection
Zuojin Tang, Bo Sun, Tongwei Ma, Daosheng Li, Zhenhui Xu

TL;DR
This paper introduces a weakly supervised 3D object detection framework built on a point clouds transformer, reducing annotation costs while achieving high accuracy on the KITTI dataset.
Contribution
It proposes an Unsupervised Voting Proposal Module and combines transformer-based global features with ResNet local features for effective weakly supervised learning.
Findings
Achieves state-of-the-art average precision on the KITTI dataset.
Effectively reduces supervision requirements for 3D object detection.
Combines transformer and ResNet for comprehensive feature extraction.
Abstract
The annotation of 3D datasets is required for semantic segmentation and object detection in scene understanding. In this paper, we present a framework for weak supervision of a point clouds transformer used for 3D object detection. The aim is to decrease the amount of supervision needed for training, given the high cost of annotating 3D datasets. We propose an Unsupervised Voting Proposal Module, which learns randomly preset anchor points and uses a voting network to select high-quality prepared anchor points. It then distills information into a student and a teacher network. In the student network, we apply a ResNet to efficiently extract local characteristics. However, it can also lose much global information. To provide input that incorporates both global and local information to the student network, we adopt the self-attention…
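The idea of feeding the student network both local and global information can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract is truncated before it describes the actual design, so everything here is an assumption. The per-point vectors stand in for local (e.g. ResNet) features, the attention uses identity projections instead of learned query/key/value weights, and fusing by concatenation is a hypothetical choice. The function names `self_attention` and `fuse_global_local` are illustrative only.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Single-head scaled dot-product self-attention.

    Assumption: identity projections (queries = keys = values = features),
    so there are no learned weights in this sketch.
    """
    d = len(features[0])
    scale = math.sqrt(d)
    out = []
    for q in features:
        # Similarity of this point to every point in the cloud.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in features]
        weights = softmax(scores)
        # Weighted average of all point features: a global summary per point.
        mixed = [sum(w * v[j] for w, v in zip(weights, features)) for j in range(d)]
        out.append(mixed)
    return out


def fuse_global_local(local_feats):
    """Concatenate each point's local feature with its attention-based
    global feature (hypothetical fusion; the source does not specify one)."""
    global_feats = self_attention(local_feats)
    return [loc + glob for loc, glob in zip(local_feats, global_feats)]


# Toy stand-in for local features: 5 points, 4 channels each.
random.seed(0)
points = [[random.gauss(0, 1) for _ in range(4)] for _ in range(5)]
fused = fuse_global_local(points)
```

Each fused vector has twice as many channels as the input: the first half is the untouched local feature, and the second half is a convex combination of all points' features, which is what gives every point a view of the whole cloud.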
Taxonomy
Methods: Average Pooling · Global Average Pooling · Convolution · Residual Connection · Batch Normalization · Max Pooling · Kaiming Initialization · 1x1 Convolution · Bottleneck Residual Block
