Reflective Teacher: Semi-Supervised Multimodal 3D Object Detection in Bird's-Eye-View via Uncertainty Measure

Saheli Hazra; Sudip Das; Rohit Choudhary; Arindam Das; Ganesh Sistu; Ciaran Eising; Ujjwal Bhattacharya

arXiv:2412.04337·cs.CV·December 6, 2024

Open Access

TL;DR

This paper introduces a Reflective Teacher framework for semi-supervised 3D object detection in Bird's-Eye-View, combining a knowledge retention mechanism with a geometry-aware multimodal fusion technique, leading to improved performance with limited labeled data.

Contribution

The paper proposes a Reflective Teacher approach that mitigates catastrophic forgetting in the teacher network, together with a Geometry Aware BEV Fusion (GA-BEVFusion) method that better aligns multi-modal BEV features.

Findings

01. Achieves state-of-the-art performance in semi-supervised 3D detection

02. Uses only 22-25% of labeled data to match fully supervised results

03. Demonstrates effectiveness on nuScenes and Waymo datasets

Abstract

Pseudo-labeling techniques have been found to be advantageous in semi-supervised 3D object detection (SSOD) in Bird's-Eye-View (BEV) for autonomous driving, particularly where labeled data is limited. In the literature, an Exponential Moving Average (EMA) has been used to update the weights of the teacher network from those of the student network. However, this induces catastrophic forgetting in the teacher network. In this work, we address this issue by introducing a novel concept of Reflective Teacher, where the student is trained on both labeled and pseudo-labeled data while its knowledge is progressively passed to the teacher through a regularizer that ensures retention of previous knowledge. Additionally, we propose Geometry Aware BEV Fusion (GA-BEVFusion) for efficient alignment of multi-modal BEV features, thus reducing the disparity between the camera and LiDAR modalities. This…
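To make the EMA baseline mentioned in the abstract concrete, here is a minimal sketch of the generic teacher update used in mean-teacher style training. It is not the paper's Reflective Teacher regularizer, whose form is not given on this page. The function name `ema_update`, the dictionary-of-lists weight layout, and the `decay` values are assumptions chosen for illustration.

```python
# Generic EMA teacher update (illustrative only; NOT the paper's regularizer).
# Weights are modeled as a dict mapping a parameter name to a list of floats.

def ema_update(teacher, student, decay=0.99):
    """Return new teacher weights: decay * teacher + (1 - decay) * student."""
    if teacher.keys() != student.keys():
        raise ValueError("teacher and student must share parameter names")
    return {
        name: [
            decay * t + (1.0 - decay) * s
            for t, s in zip(teacher[name], student[name])
        ]
        for name in teacher
    }


# Toy example: the student stays fixed while the teacher is updated 3 times.
teacher = {"w": [1.0, 0.0]}
student = {"w": [0.0, 1.0]}
for _ in range(3):
    teacher = ema_update(teacher, student, decay=0.5)

print(teacher["w"])  # [0.125, 0.875]
```

In this toy run the weight of the original teacher values shrinks by a factor of `decay` at every step (1, 0.5, 0.25, 0.125), so the teacher steadily drifts toward the student. The abstract identifies this kind of update as a source of catastrophic forgetting in the teacher.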

Taxonomy

Topics: Image and Object Detection Techniques

Methods: Attentive Walk-Aggregating Graph Neural Network