VISC: mmWave Radar Scene Flow Estimation using Pervasive Visual-Inertial Supervision

Kezhong Liu; Yiwen Zhou; Mozi Chen; Jianhua He; Jingao Xu; Zheng Yang; Chris Xiaoxuan Lu; Shengkai Zhang

arXiv:2507.03938 · cs.CV · July 8, 2025


TL;DR

This paper introduces VISC, a mmWave radar scene flow estimation framework supervised by visual-inertial data. The approach enables cost-effective, crowdsourced training for autonomous vehicles and outperforms LiDAR-based methods in challenging environments such as smoke.

Contribution

The paper presents a drift-free rigid transformation estimator and an optical-mmWave supervision extraction module, which improve scene flow estimation without relying on expensive LiDAR data.

Findings

01

Outperforms LiDAR-based methods in smoke-filled environments

02

Enables crowdsourced training using visual-inertial data

03

Improves static and dynamic scene flow estimation

Abstract

This work proposes a scene flow estimation framework for mmWave radar, supervised by data from a widespread visual-inertial (VI) sensor suite, which allows crowdsourced training data from smart vehicles. Current scene flow estimation methods for mmWave radar are typically supervised by dense point clouds from 3D LiDARs, which are expensive and not widely available in smart vehicles. While VI data are more accessible, visual images alone cannot capture the 3D motions of moving objects, making it difficult to supervise their scene flow. Moreover, the temporal drift of the VI rigid transformation also degrades the scene flow estimation of static points. To address these challenges, we propose a drift-free rigid transformation estimator that fuses kinematic model-based ego-motions with neural network-learned results. It provides strong supervision signals to radar-based rigid transformation and…
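
The abstract's link between the rigid transformation and the scene flow of static points can be illustrated with a small sketch. This is not the paper's estimator. It only shows the standard geometric relation that a static point's apparent flow in the sensor frame is fully determined by the ego-motion, which is why drift in the estimated transformation corrupts static-point flow. The function name `rigid_scene_flow`, the convention `p2 = R * p1 + t`, and the example values are assumptions made for illustration.

```python
def rigid_scene_flow(points, rotation, translation):
    """Return the flow that a rigid transform induces on static points.

    points: list of (x, y, z) tuples in frame-1 sensor coordinates.
    rotation: 3x3 rotation matrix as nested lists; translation: (tx, ty, tz).
    Assumed convention (not taken from the paper): p2 = R * p1 + t gives the
    same static point expressed in frame-2 sensor coordinates.
    """
    flows = []
    for p in points:
        # Apply the rigid transform to the point.
        moved = [
            sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
            for i in range(3)
        ]
        # Scene flow is the displacement between the two frames.
        flows.append(tuple(moved[i] - p[i] for i in range(3)))
    return flows


# Hypothetical example: the sensor drives 1 m forward along +x without
# rotating, so every static point appears to move 1 m backward.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
static_points = [(5.0, 0.0, 0.0), (2.0, 3.0, 1.0)]
flow = rigid_scene_flow(static_points, identity, (-1.0, 0.0, 0.0))
```

Under this relation, any error in the rotation or translation shifts the flow of every static point at once, which is consistent with the abstract's concern about VI drift.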
