KDMOS: Knowledge Distillation for Motion Segmentation
Chunyu Cao, Jintao Cheng, Zeyu Chen, Linfan Zhan, Rui Fan, Zhijian He, Xiaoyu Tang

TL;DR
This paper introduces KDMOS, a logits-based knowledge distillation framework for motion object segmentation (MOS). It decouples the moving and non-moving classes, applies a distillation strategy tailored to each, and optimizes the network architecture, aiming to raise accuracy while keeping inference real-time. The authors report state-of-the-art results.
Contribution
The paper proposes a logits-based knowledge distillation method for MOS that uses a Bird's Eye View (BEV) projection-based model as the student and a non-projection model as the teacher, with distillation strategies tailored to the imbalance between moving and non-moving classes.
Findings
Achieves 78.8% IoU on the SemanticKITTI-MOS dataset
Reduces model parameters by 7.69%
Reduces false positives and false negatives
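To make the link between the IoU figure and the false positive / false negative finding concrete, here is a minimal sketch of the standard intersection-over-union computation for a single class. The function name `moving_iou`, the label convention (1 = moving, 0 = non-moving), and the toy data are assumptions made for illustration; they are not taken from the paper or from the SemanticKITTI-MOS evaluation code.

```python
def moving_iou(pred, target, moving_label=1):
    """IoU for the moving class: TP / (TP + FP + FN).

    pred and target are equal-length sequences of per-point labels.
    The label convention (1 = moving) is an assumption for this sketch.
    """
    tp = fp = fn = 0
    for p, t in zip(pred, target):
        if p == moving_label and t == moving_label:
            tp += 1          # moving point correctly detected
        elif p == moving_label:
            fp += 1          # static point wrongly marked as moving
        elif t == moving_label:
            fn += 1          # moving point that was missed
    denom = tp + fp + fn
    return tp / denom if denom else 0.0


# Toy example: 2 true positives, 1 false positive, 1 false negative.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 1, 1, 0]
print(moving_iou(pred, target))  # 0.5
```

Because false positives and false negatives both sit in the denominator, reducing either one raises the IoU for a fixed number of true positives.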
Abstract
Motion Object Segmentation (MOS) is crucial for autonomous driving, as it enhances localization, path planning, map construction, scene flow estimation, and future state prediction. While existing methods achieve strong performance, balancing accuracy and real-time inference remains a challenge. To address this, we propose a logits-based knowledge distillation framework for MOS, aiming to improve accuracy while maintaining real-time efficiency. Specifically, we adopt a Bird's Eye View (BEV) projection-based model as the student and a non-projection model as the teacher. To handle the severe imbalance between moving and non-moving classes, we decouple them and apply tailored distillation strategies, allowing the student model to better learn key motion-related features. This approach significantly reduces false positives and false negatives. Additionally, we introduce dynamic upsampling,…
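The abstract does not give the loss itself, so the following is only a rough sketch of what a logits-based distillation loss with decoupled classes could look like. It uses the common temperature-softened KL divergence between teacher and student outputs and averages the moving and non-moving points separately, so the rare moving class is not swamped by the static majority. The function `decoupled_kd_loss`, the weights `w_moving` and `w_static`, the temperature value, and the label convention are all hypothetical and are not the authors' formulation.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over one point's class logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def decoupled_kd_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, w_moving=1.0, w_static=1.0,
                      moving_label=1):
    """Hypothetical class-decoupled distillation loss (illustration only).

    Each point contributes KL(teacher || student) on softened logits.
    Moving and non-moving points are averaged in separate groups and
    then combined with separate weights.
    """
    moving, static = [], []
    for s, t, y in zip(student_logits, teacher_logits, labels):
        p_teacher = softmax(t, temperature)
        p_student = softmax(s, temperature)
        # The T^2 factor is the usual rescaling for softened targets.
        kl = kl_divergence(p_teacher, p_student) * temperature ** 2
        (moving if y == moving_label else static).append(kl)

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return w_moving * mean(moving) + w_static * mean(static)


# Toy example: three points, two classes (index 0 = static, 1 = moving).
teacher = [[0.2, 3.0], [2.5, 0.1], [3.0, -1.0]]
student = [[0.5, 1.0], [2.0, 0.3], [2.8, -0.5]]
labels = [1, 0, 0]
loss = decoupled_kd_loss(student, teacher, labels, w_moving=2.0)
print(loss)
```

Averaging per group rather than over all points is one simple way to keep a handful of moving points from being outweighed; raising `w_moving` pushes the student harder toward the teacher on those points.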
Taxonomy
Topics: Advanced Vision and Imaging
Methods: Knowledge Distillation · Adaptive Parameter-wise Diagonal Quasi-Newton Method · ADaptive gradient method with the OPTimal convergence rate · Sparse Evolutionary Training
