BridgeTA: Bridging the Representation Gap in Knowledge Distillation via Teacher Assistant for Bird's Eye View Map Segmentation

Beomjun Kim; Suhan Woo; Sejong Heo; Euntai Kim

arXiv:2508.09599 · cs.CV · August 14, 2025

TL;DR

BridgeTA introduces a lightweight Teacher Assistant network to effectively bridge the representation gap in knowledge distillation for Bird's Eye View map segmentation, improving accuracy without increasing inference costs.

Contribution

The paper proposes a cost-effective distillation framework built on a Teacher Assistant network, together with a distillation loss derived from Young's Inequality, to strengthen knowledge transfer.
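As a generic illustration (not taken from the paper), Young's inequality in the form $2\langle u, v\rangle \le \epsilon\|u\|^2 + \epsilon^{-1}\|v\|^2$, valid for any $\epsilon > 0$, bounds a direct teacher-student feature distance by a teacher-TA term plus a TA-student term. The symbols $F_T$, $F_S$, $F_A$ (teacher, student, and TA features) and the constant $\epsilon$ are assumptions made here for the sketch; the paper's actual loss may be weighted or normalized differently.

```latex
\begin{aligned}
\|F_T - F_S\|^2
  &= \|(F_T - F_A) + (F_A - F_S)\|^2 \\
  &= \|F_T - F_A\|^2 + 2\langle F_T - F_A,\; F_A - F_S\rangle + \|F_A - F_S\|^2 \\
  &\le (1 + \epsilon)\,\|F_T - F_A\|^2 + (1 + \epsilon^{-1})\,\|F_A - F_S\|^2 .
\end{aligned}
```

Minimizing the two terms on the right therefore also controls the direct distance on the left, which is one way an intermediate representation can stand in for a single teacher-student objective.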

Findings

01. Achieves a 4.2% mIoU improvement over the Camera-only baseline

02. Improvement is up to 45% higher than that of other KD methods

03. Demonstrated on the nuScenes dataset

Abstract

Bird's-Eye-View (BEV) map segmentation is one of the most important and challenging tasks in autonomous driving. Camera-only approaches have drawn attention as cost-effective alternatives to LiDAR, but they still fall behind LiDAR-Camera (LC) fusion-based methods. Knowledge Distillation (KD) has been explored to narrow this gap, but existing methods mainly enlarge the student model by mimicking the teacher's architecture, leading to higher inference cost. To address this issue, we introduce BridgeTA, a cost-effective distillation framework to bridge the representation gap between LC fusion and Camera-only models through a Teacher Assistant (TA) network while keeping the student's architecture and inference cost unchanged. A lightweight TA network combines the BEV representations of the teacher and student, creating a shared latent space that serves as an intermediate representation. To…
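The idea of routing distillation through an intermediate representation can be checked numerically with a toy sketch. Everything below is an assumption for illustration: `toy_ta` is a fixed convex mix standing in for the learned TA network, the features are random flat vectors rather than BEV maps, and `dual_path_bound` is the generic Young's-inequality bound with a hypothetical constant `eps`, not the paper's loss.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def toy_ta(f_t, f_s, w=0.5):
    """Hypothetical stand-in for the TA: a fixed convex mix of teacher and
    student features. The real TA is a learned lightweight network."""
    return [w * t + (1.0 - w) * s for t, s in zip(f_t, f_s)]


def dual_path_bound(f_t, f_a, f_s, eps=1.0):
    """Upper bound on ||f_t - f_s||^2 from Young's inequality:
    (1 + eps) * ||f_t - f_a||^2 + (1 + 1/eps) * ||f_a - f_s||^2."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    teacher_ta = sq_dist(f_t, f_a)   # teacher -> TA path
    ta_student = sq_dist(f_a, f_s)   # TA -> student path
    return (1.0 + eps) * teacher_ta + (1.0 + 1.0 / eps) * ta_student


# Toy features: random vectors in place of flattened BEV feature maps.
rng = random.Random(0)
f_t = [rng.gauss(0.0, 1.0) for _ in range(64)]
f_s = [rng.gauss(0.0, 1.0) for _ in range(64)]
f_a = toy_ta(f_t, f_s)

direct = sq_dist(f_t, f_s)
bound = dual_path_bound(f_t, f_a, f_s, eps=1.0)
print(f"direct={direct:.4f}  dual-path bound={bound:.4f}")
```

The bound holds for any intermediate vector `f_a` and any positive `eps`; how tight it is depends on where the intermediate representation sits between teacher and student.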
